Jillian Palmer was granted an Anna Radkowski-Lee Graduate Assistantship for the 2020-2021 academic year, working with Amanda Lowe, Outreach and Marketing Librarian, as the Outreach and Marketing Graduate Assistant. Jillian is a student in the MPA program at UAlbany’s Rockefeller College of Public Affairs and Policy, where she is focusing on environmental policy.
After graduating from SUNY Geneseo with a degree in International Relations, she joined UAlbany as a program assistant, where she was able to expand her interests in information technologies and communications. She now joins the University Libraries in a role designed to help build a foundation for the new Libraries Student Ambassadors program, as well as boost our contributions to social media, focusing mainly on Instagram operations.
We are glad to have Jillian on board this academic year and look forward to all the wonderful contributions she will surely make!
Amanda Greenwood is the Anna Radkowski-Lee Graduate Assistant for Web Archives for the 2020-2021 academic year. She is currently a graduate student in the Information Studies program, focusing on Archives and Record Administration, and also has a background in English Literature as a scholar of James Joyce. She recently moved back to Albany after teaching literature and writing for the past 15 years in France, South Korea, and the United States.
The University Libraries has been collecting the web since 2012. The university now keeps many of its records online, and state records laws require us to keep and preserve certain documentation, just as they do for paper records. The primary focus of the web archiving program is to preserve all of Albany.edu.
The program was expanded in 2016 to support the mission of the M.E. Grenander Department of Special Collections & Archives to document New York State politics. We began preserving the websites of New York political organizations, like the NYCLU, Environmental Advocates of New York State, and the Business Council of New York State, whose administrative records are also preserved in the Science Library.
Amanda uses complex tools such as Archive-It, which sends out web crawlers to recursively read all the links on a page and package whole websites into WARC files, which can be preserved and later “replayed” using tools such as the Internet Archive’s Wayback Machine. To control the crawlers, Amanda customizes specific sets of programmatic rules, called scoping rules, which tell the crawlers which links to include or ignore.
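To make the idea of scoping rules concrete, here is a minimal sketch in Python of the kind of decision a rule encodes. It is not Archive-It's actual rule format, which is configured through the service itself; the function names and the example host lists are assumptions chosen purely for illustration.

```python
from urllib.parse import urlparse

# Hypothetical rule sets for illustration only; real Archive-It
# scoping rules are configured in the service, not in Python.
SEED_DOMAINS = {"albany.edu"}      # domains we intend to preserve
BLOCKED_DOMAINS = {"youtube.com"}  # domains the crawler must never follow


def host_matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def in_scope(url):
    """Decide whether a discovered link should be crawled."""
    host = (urlparse(url).hostname or "").lower()
    # Block rules win over include rules.
    if any(host_matches(host, d) for d in BLOCKED_DOMAINS):
        return False
    return any(host_matches(host, d) for d in SEED_DOMAINS)


def filter_links(links):
    """Keep only the links a crawler should follow."""
    return [link for link in links if in_scope(link)]
```

For example, `filter_links(["https://www.albany.edu/libraries", "https://www.youtube.com/watch?v=abc"])` keeps only the first link, so an embedded video on a university page would not pull the crawler off the site.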
While the web crawling itself is automated, the web archives require constant maintenance to adjust the crawlers as websites change. The web is a very complex and ever-changing network, and the crawlers often run into unexpected challenges. It’s quite easy for a crawler to go awry and accidentally try to download all of YouTube. Amanda constantly reads crawl reports and adjusts the scoping rules to ensure we collect only what we intend to preserve.
Amanda’s big focus for the year is to make the crawlers more efficient. Since we haven’t been able to dedicate this much time to the web archives previously, the amount of data the crawlers are preserving has steadily accumulated. We hope that tuning the crawlers more precisely will free up data storage so the web archives can support additional web archiving efforts throughout the University Libraries. Other areas, such as Scholarly Communication and the Scholars Archive institutional repository, have discovered that some faculty create websites as part of their research efforts, and that it makes sense to use the web archives to preserve these pages. We hope that Amanda’s work will allow us to better support this and preserve this valuable research output on the web.
Amanda will also help support our Twitter collecting for the 2020 New York State Elections. This year we’ve been preserving the Tweets of every State Senate and State Assembly candidate for future researchers – all 424 of them! We use a tool called Twarc to download this data using the Twitter Application Programming Interface (API). Right now this information is just piles of text files, but Amanda will help work on transforming them into a format that’s more accessible to researchers.
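As one illustration of what such a transformation might look like, the sketch below converts tweets into a CSV spreadsheet using only Python's standard library. It assumes the text files hold one JSON tweet per line with the field names used by version 1.1 of the Twitter API (`id_str`, `created_at`, `user.screen_name`, `full_text` or `text`); the function names and the chosen columns are hypothetical, and this is not necessarily the approach the Libraries will take.

```python
import csv
import io
import json

# Columns chosen for illustration; a real project might keep more fields.
FIELDS = ["id", "created_at", "screen_name", "text"]


def tweet_to_row(tweet):
    """Flatten one tweet (assumed Twitter API v1.1 JSON) into a flat row."""
    user = tweet.get("user", {})
    return {
        "id": tweet.get("id_str", ""),
        "created_at": tweet.get("created_at", ""),
        "screen_name": user.get("screen_name", ""),
        # Extended tweets use "full_text"; older ones use "text".
        "text": tweet.get("full_text") or tweet.get("text", ""),
    }


def jsonl_to_csv(lines, out):
    """Write one CSV row per non-empty JSON line; return the row count."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        writer.writerow(tweet_to_row(json.loads(line)))
        count += 1
    return count
```

Because `jsonl_to_csv` accepts any iterable of lines and any writable text stream, it can be pointed at an open input file and an open output file (opened with `newline=""`, as the `csv` module recommends), and the resulting CSV can be opened directly in a spreadsheet program.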
Finally, in the Spring semester, Amanda will help evaluate the web archiving program and provide feedback on how effectively we are preserving Albany.edu and the New York political organizations. She’ll work on formally applying our existing collecting policy to the web archives, helping develop a vision for preserving New York State politics on the web that will ground our work in the future.