
Creating a living archive using Internet Archive + Archive-It

February 5, 2012

Recently, I had the opportunity to interview Alexis Rossi from the Internet Archive about the work that they are doing to preserve the videos, still images, and minutes of the Occupy movement worldwide. The Internet Archive is a non-profit library that aims to provide permanent access to historical collections that exist in digital format for researchers, historians, scholars, people with disabilities, and the general public. Read on to find out how activists and media creators can use this excellent resource to save their digital materials!

What are you collecting in relation to Occupy, and why?

Internet Archive accepts any media that people choose to upload to archive.org. If uploaders put an Occupy-related keyword on their item (e.g. “OWS”), it will be automatically included in our Occupy collection. We are pulling Occupy-tagged items from YouTube and Flickr that have appropriate Creative Commons licenses attached, and working with the Occupy Wall Street Minutes Working Group to save audio files documenting assemblies and meetings. We are also saving websites and URLs that have been suggested by individuals and institutions through the web archiving service, Archive-It. Additionally, we collect many television channels that carry coverage of the movement, although that content is not currently available to the public on archive.org. We created these collections because we believe it is important to record major events that occur in the world, whether they are political, cultural, or natural.

How do you collect these materials?

We chose to create a collection that anyone can contribute content to because we think the people involved in the Occupy movement are the ones best qualified to tell us what should be saved. People who wish to contribute audio, video, or text materials can use the upload button on archive.org to give us files (an alternate method may be needed for files over 2 GB).

People can also email websites or specific web URLs to graham@archive.org to be crawled through Archive-It. After the sites are captured, they will be accessible from the Archive-It website and added to the archive.org access page. Submission of materials does not require Internet Archive’s approval, so content is available to anyone shortly after upload.

How will you catalog, archive, and preserve these materials?

Internet Archive does not provide any additional cataloging or curation for materials, so it is important that people include sufficient metadata with their uploaded files. All user-uploaded media is stored redundantly in two separate data centers, and files are audited regularly to ensure we have not lost any bits.
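The interview does not describe how Internet Archive actually runs its audits. As a generic illustration only, here is a minimal sketch of what a bit-level check between two stored copies of a file can look like, using SHA-256 checksums. The function names and the choice of SHA-256 are assumptions made for this example, not Internet Archive's tooling.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until fh.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(path_a, path_b):
    """Hypothetical helper: True when two copies of a file have the same checksum."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Media creators can apply the same idea to their own files: record a checksum before uploading, then recompute it later on any copy to confirm that nothing has changed.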

How will people access this collection?

Most media (video, audio, texts, etc.) is available here.

Web pages collected via Archive-It are here and these will in time be included in the larger archive.org collection.

What advice would you give to media creators who want to ensure their content is usable in the future?

Internet Archive wants to save the best digital artifacts possible. For our purposes, that means people should upload the highest quality files they have and give us plentiful metadata about the files: title, description, keywords, date, time, location, creator, license information, etc. The more metadata we have, the easier these items are to find and use.
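For creators who prepare many files at once, a small script can flag missing metadata before upload. The sketch below is an illustration under stated assumptions: the field names mirror the list in the answer above, but they are not an official archive.org schema, and the sample record is invented for the example.

```python
# Fields recommended in the interview; these names are assumptions,
# not an official archive.org metadata schema.
RECOMMENDED_FIELDS = [
    "title", "description", "keywords", "date",
    "time", "location", "creator", "license",
]


def missing_metadata(record):
    """Return the recommended fields that are absent or blank in a record."""
    missing = []
    for field in RECOMMENDED_FIELDS:
        value = record.get(field)
        # Treat None, empty strings, empty lists, and whitespace-only text as missing.
        if not value or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record describing one audio file.
record = {
    "title": "General assembly audio",
    "keywords": "OWS",
    "date": "2011-10-15",
    "creator": "",
}

print(missing_metadata(record))
# prints ['description', 'time', 'location', 'creator', 'license']
```

Filling in every field the check reports before uploading matters because, as noted earlier in the interview, Internet Archive does not add cataloging of its own.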