The Spoken Word, Searchable for Scholarship

By Mary Helen Miller
May 28, 2010

If audio content were as easy to search as text, linguists could analyze all the speech that has been recorded for the past 100 years with little hassle. No Google-like technology exists for searching all recorded speech yet, but linguists at the Universities of Oxford and of Pennsylvania are starting by making the equivalent of one year of speech easily searchable.

With quick access to such a large and varied collection of speech, linguists could more thoroughly analyze questions such as how social status affects dialect or what strategies people use to interrupt each other in arguments. A researcher could search for something as specific as “all examples of O’s in nouns spoken by women in Birmingham over the age of 40,” says John Coleman, director of the phonetics laboratory at Oxford.

He and Mark Y. Liberman, a professor of phonetics at Pennsylvania, have developed technology that will make 9,000 hours of British and American recorded speech searchable by criteria such as context, phrase, vowel, or consonant. The recordings include conversations among women talking about the day’s newspaper, people talking to their dogs, and political speeches. Mr. Coleman is using the British National Corpus, an audio collection recorded in the early 1990s, and Mr. Liberman is primarily working with selections from the Linguistic Data Consortium, a group of universities, companies, and government research labs that creates, collects, and distributes speech recordings.

Mr. Coleman and Mr. Liberman are modifying existing speech-recognition technology that aligns transcripts of the files with the audio component. Their version marks up words, or parts of words, and adds available context like the date of the recording, the setting, and details about the speakers. Using this kind of technology for research is unusual, and applying it to such a huge set of data is unprecedented.
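
To make the idea concrete, here is a minimal sketch in Python of what a query over time-aligned, annotated speech might look like. It is not the researchers' software: the record layout, the field names, the `search` function, and the sample data are all hypothetical, and matching the letter "o" in the spelling is only a stand-in for matching an actual vowel sound.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Speaker:
    """Hypothetical speaker details attached to a recording."""
    sex: str
    age: int
    city: str


@dataclass(frozen=True)
class AlignedWord:
    """One transcript word tied to its position in the audio (assumed layout)."""
    text: str
    part_of_speech: str
    start_s: float  # where the word begins in the audio, in seconds
    end_s: float    # where the word ends in the audio, in seconds
    speaker: Speaker
    recording_id: str


def search(
    words: Iterable[AlignedWord],
    text_contains: str,
    part_of_speech: str,
    speaker_ok: Callable[[Speaker], bool],
) -> Iterator[AlignedWord]:
    """Yield words that match the spelling, word class, and speaker filters."""
    for word in words:
        if (
            text_contains in word.text.lower()
            and word.part_of_speech == part_of_speech
            and speaker_ok(word.speaker)
        ):
            yield word


# Invented sample data, for illustration only.
corpus = [
    AlignedWord("dog", "noun", 12.40, 12.71, Speaker("F", 52, "Birmingham"), "rec-001"),
    AlignedWord("go", "verb", 3.10, 3.32, Speaker("F", 47, "Birmingham"), "rec-001"),
    AlignedWord("boat", "noun", 8.05, 8.44, Speaker("M", 61, "Birmingham"), "rec-002"),
    AlignedWord("coat", "noun", 20.00, 20.35, Speaker("F", 33, "Leeds"), "rec-003"),
]

# Nouns containing "o", spoken by women in Birmingham over the age of 40.
hits = list(
    search(
        corpus,
        text_contains="o",
        part_of_speech="noun",
        speaker_ok=lambda s: s.sex == "F" and s.age > 40 and s.city == "Birmingham",
    )
)

for hit in hits:
    print(hit.recording_id, hit.text, hit.start_s, hit.end_s)
```

Because each match carries a recording identifier and start and end times, a researcher could in principle jump straight to that stretch of audio rather than listening through the whole file.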

The technology may also help achieve something more monumental, says Mr. Coleman: “It might help the field to settle upon standards for preparing audio material for large-scale distribution over the Web.”

Mr. Coleman explains that a large proportion of material on the Web is audio or visual, but that there is no efficient way to search it. If you want to find a song, he says, you look up the name of the track or the artist with a text search. He envisions using segments of sound to identify other, similar segments. “What you want to do is use a bit of language to find another bit of language,” he says.

© 2023 The Chronicle of Higher Education