Wired Campus

The latest on tech and education.

Old Dominion U. Researchers Ask How Much of the Web Is Archived

By Jie Jenny Zou July 6, 2011
Researchers at Old Dominion University in Virginia are trying to figure out what percentage of the Web is archived by sampling from the four different sources shown here.

Researchers at Old Dominion University in Virginia are trying to figure out how much of the public Web is archived and who is storing it, as part of a larger effort to preserve the digital record.

Michael L. Nelson, a computer-science professor, has been working with professors and students since September to determine how much of the Web’s history has been preserved in Internet databases around the world.

Mr. Nelson’s team sampled 4,000 Web addresses, known as URI’s, or uniform-resource identifiers, and estimated what percentage of them had been archived. An identifier is a label for a specific Web-page address or name. The researchers used Memento, a browser plug-in they developed in 2009, to find old versions of the pages across various Internet archives.

The URI’s were compiled from four sources: the search-engine caches of Google, Bing, and Yahoo!; a Web directory called the Open Directory Project; a link-sharing service called Delicious; and a Web-address-shortening service called Bitly.

The report showed that 35 percent to 90 percent of Web pages have at least one archived copy and that the chance of a page being archived depended on the source. For instance, URI’s gathered from Delicious were much more likely to be archived than Bitly URI’s, but the reason for that is not entirely clear. Mr. Nelson plans to continue the project, as he felt that no “final answer” had yet been reached.
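For readers curious how a per-source archival rate of this kind might be tallied, the following is a minimal sketch and not the researchers’ own code. It assumes each sampled URI has already been looked up and paired with a count of archived copies; the function name, the record layout, and the sample records are all hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def archived_share_by_source(samples):
    """Return {source: percent of sampled URIs with at least one archived copy}.

    `samples` is an iterable of (source, uri, archived_copy_count) tuples.
    This record layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)    # URIs sampled per source
    archived = defaultdict(int)  # URIs with one or more archived copies
    for source, _uri, copies in samples:
        totals[source] += 1
        if copies >= 1:
            archived[source] += 1
    return {s: 100.0 * archived[s] / totals[s] for s in totals}


# Made-up placeholder records, NOT figures from the study.
toy_samples = [
    ("delicious", "http://example.com/a", 3),
    ("delicious", "http://example.com/b", 1),
    ("bitly", "http://example.com/c", 0),
    ("bitly", "http://example.com/d", 2),
]
shares = archived_share_by_source(toy_samples)
print(shares)
```

Keeping the tally separate for each source reflects the report’s finding that the chance of a page being archived depended on where its URI was collected.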

Alexis Rossi, the Web-collections manager at Internet Archive, found the university’s efforts interesting, but she wondered whether it is even possible to accurately assess archival rates in a continually changing landscape.

“It’s such a moving target—the Web is expanding all the time,” Ms. Rossi said. Internet Archive was one of several archives used in the study and has been preserving the Web since 1996.

“People are coming to the realization that if nobody saves the Internet, their work will just be gone,” Ms. Rossi said. She also said the project may shed light on the efficacy of Web archiving as libraries and Internet users begin to think more about preserving the Web.

For Mr. Nelson, the study is another step toward creating a browsing experience that links the past to the present, one in which users can replay events as they unfolded, such as media coverage of Hurricane Katrina in 2005 or the Virginia Tech shootings in 2007.

“You relive the experience in a way that a summary page can’t even begin to capture,” Mr. Nelson said, imagining a day when such historical searches become common.

Scott G. Ainsworth, the project’s lead student researcher, compared saving old Web pages to the historical preservation of old Sears catalogs. “You never know what’s going to be important in 100 or 150 years,” he said.
