Collaboration

In collaboration with Shawn Walker (University of Arizona) and Ed Summers (University of Maryland Institute for Technology in the Humanities), the project is investigating the use and circulation of web archives in the public online sphere. Using the Internet Archive’s Save Page Now tool, this pilot project focuses on the ways that participatory web archiving tools enable and constrain the creation of public stakeholders in the curation of the Web’s past. Beyond these substantive concerns, the project also addresses the methodological challenges of tracing practices and processes of big data creation and curation that are largely obscured by the sociotechnical infrastructures that enable their production.

(Left) Brewster Kahle, founder of the Internet Archive, tweets about the Save Page Now tool (Source: https://twitter.com/brewster_kahle/status/994380510011928578). (Right) The Internet Archive logo.

Project Pitch

As both a publishing and archiving platform, the Web promises new opportunities for participatory archives that can provide a more democratic, inclusive and diverse historical record. For example, the Internet Archive’s Save Page Now (SPN) function allows anyone with a web browser and an Internet connection to add a particular web resource to a public archive of over 600 billion resources. SPN was recently estimated to be adding close to 100 URLs per second (Source: https://twitter.com/brewster_kahle/status/994380510011928578). While journalists and scholars are making increasing use of web archives, we currently understand very little about the nature of the participatory archiving process itself. Which communities of users are using archive-on-demand services? How are the archival records created in this way circulating on the web? To what extent is automation a factor in archival production, and what purposes does it serve? In this study we will answer these questions using a mixed-methods approach that analyzes SPN WARC data in conjunction with signals from social media, together with in-person interviews with individuals who use archive-on-demand services.

Project Outputs

Acknowledgements

This work used the Extreme Science and Engineering Discovery Environment (XSEDE) Bridges and Bridges Storage at the Pittsburgh Supercomputing Center through allocation TG-ECS180012. XSEDE is supported by National Science Foundation grant number ACI-1548562.