Wired Campus

The latest on tech and education.

What Wikipedia Deletes, and Why

By Alexandra Rice
October 26, 2011

Wikipedia, the online encyclopedia, famously allows anyone to write or revise its entries, and the history of each item is open for anyone to review. Except for material that leaders of the effort consider too “dangerous” to leave online.

The fine print of its stated practices notes that in some cases, material is completely spiked from the record. Or, as the policy reads: “a revision with libelous content, criminal threats or copyright infringements may be removed afterwards.”

These total redactions are what a University of Pennsylvania research team has been mining for the past year in the hopes of shedding some light on what Wikipedia deletes forever and why. In 2010 redactions accounted for more than 56,000 of the 47.1 million revisions, according to the research team.

The researchers, Andrew G. West and Insup Lee, wondered what content on the enormously popular Web site could be so troubling that Wikipedia administrators would decide to remove it forever. “Wikipedia is at that paramount example of open-source transparency,” Mr. Lee said. “So when you see them behaving in a nontransparent manner, you want to see what motivates them to do this.”

Copyright infringement was the most common reason Wikipedia stated for deleting material, Mr. West and Mr. Lee found.

The Wikimedia Foundation has been sued over copyright and privacy issues in the past. While only 0.007 percent of 2010 page views on the English Wikipedia site displayed content that was later redacted, that’s enough to land the organization and its operators in hot water. That’s why leaders of the encyclopedia refer to the material it redacts as “dangerous content.”

“We’ve identified that on the surface these copyright cases are the worst,” said Mr. Lee.

“The research goal for us is, how can we provide some automated way to detect the problems so they can be removed immediately?” Mr. West added. “It’s very difficult to stop people from adding something, but we can find a way to get rid of it quickly.”

The difficulty in identifying instances of plagiarism, the pair said, is evident in the numbers. Most “dangerous content,” such as libel or invasions of privacy, is taken down within two minutes, on average. But copyright-related issues stayed up for an average of 21 days, they found.

Wikipedia’s leaders have recently increased the number of people with the ability to permanently delete text, including entries in the history pages. In May 2010, approximately 40 people held these rights; now more than 1,800 people do, Mr. West and Mr. Lee said.

The larger work force has helped to reduce the amount of dangerous content found on the site, the researchers said. But humans alone won’t solve the problem in its entirety. Sometimes they even introduce problems of their own, removing beneficial revisions in the process of deleting dangerous content, which the research team refers to as “collateral damage.” That raises the question of who gets to decide whether something is dangerous content.

“For all the problems on Wikipedia,” Mr. West said, “I feel strongly that the solutions have to be automatic in nature because these attackers increasingly have these machines doing their bidding for them.”

The biggest hurdle the Wikipedia operators need to overcome, in the minds of the research team, is trust. If the encyclopedia hopes to see continued success, that will be the main obstacle, they said.

More on the authors’ Wikipedia redaction research can be viewed in their full paper, “What Wikipedia Deletes: Characterizing Dangerous Collaborative Content.”

© 2023 The Chronicle of Higher Education