May 1

The End of Term Archive: Collecting & Preserving the .gov Information Sphere

Description: In the fall of 2016, a group of institutions – Internet Archive, Library of Congress, California Digital Library, and libraries from the University of North Texas, Stanford University, and George Washington University – organized to preserve a snapshot of the federal government’s web presence. This is the third time this End of Term (EOT) group has organized with the goals of identifying, harvesting, preserving, and providing access to a snapshot of federal government websites. They do this for two important reasons. The first is that the transition of elected officials in the federal government’s executive branch prompts a reset of sites like www.WhiteHouse.gov, so it’s critical to document the changes. The second is that the EOT group’s work provides a broad snapshot of the federal domain once every four years, which is replicated among a number of organizations for long-term preservation.

Jefferson Bailey from the Internet Archive and James Jacobs from Stanford University Libraries will discuss the project’s methods for identifying and selecting in-scope content, strategies for capturing web content, and access models for collected content. The two will highlight the challenges and opportunities of large-scale, distributed, multi-institutional, born-digital collecting and preservation efforts; how the project aligns with participant institutions’ collection mandates; the project’s importance for archiving historically valuable but highly ephemeral web content without a clear steward; and how the breadth and size of the EOT Web Archive inform both new methods of collaboration and new models for data-driven access and analysis by researchers. Our speakers will also discuss the project’s alliance with other government data preservation projects, as well as ideas and future plans for long-term, sustainable methods for collecting, preserving, and maintaining the .gov information ecosystem.

James R. Jacobs, US Government Information Librarian at Stanford University Libraries

Jacobs (jrjacobs@stanford.edu) works on traditional collection development and research support as well as digital projects like LOCKSS-USDOCS, the End of Term Crawl, and other Web harvesting projects. He received his MSLIS in 2002 from the University of Illinois at Urbana-Champaign. He is a member of ALA’s Government Documents Roundtable (GODORT) and served a three-year term (2009–2012) on the Depository Library Council to the Public Printer, including serving as DLC Chair from 2011 to 2012. He is a co-founder of Free Government Information (freegovinfo.info) and Radical Reference (radicalreference.info) and is on the board of Question Copyright, a 501(c)(3) non-profit organization that promotes a better public understanding of the history and effects of copyright and encourages the development of alternatives to information monopolies.

More on James can be found at http://freegovinfo.info/node/972.

Jefferson Bailey, Director, Web Archiving Programs

Jefferson joined Internet Archive in summer 2014. Prior to joining IA, he worked on strategic initiatives, digital preservation, archives, and digital collections at institutions such as Metropolitan New York Library Council, Library of Congress, Brooklyn Public Library, and Frick Art Reference Library. He has also worked in the archives at NARA, NASA, and Atlantic Records. He has an MLIS in Archival Studies from University of Pittsburgh and a BA in English from Oberlin College. He once flew NASA’s Space Shuttle Simulator and caused, according to the flight engineer, “minor landing gear damage.” He has deaccessioned all records of this event from his personal archive.