Using DataBasic for Simple Data Analysis

I love teaching with visualized data: something as simple as running a text through Wordle (a word cloud tool) can reveal all sorts of fun things about what’s going on in a text by making patterns more visible. However, a lot of the tools for next-level data analysis aren’t particularly easy to use and can be overkill for simple lessons. A set of tools that I find nicely streamlined for classroom use is DataBasic, created by Rahul Bhargava, Catherine D’Ignazio, and the Emerson College Engagement Lab.

There are three tools in the DataBasic platform, which is accessible through the browser:

  • WordCounter is exactly what it sounds like: a tool for counting words in a text and revealing the most common ones. It can handle two- and three-word phrases, which makes it great for noticing bad habits in one’s own creative writing or finding patterns in another person’s work or rhetoric. The results are presented in a straightforward frequency chart, which offers a detailed numerical snapshot to go with the more visual word cloud. Also, there are some fun sample datasets to play with, like Beyoncé’s lyrics.

  • SameDiff is a lot like WordCounter, but for multiple text sources. The sample data sets are fun: try comparing Lady Gaga’s and Madonna’s lyrics to see the overlap and differences in vocabulary. On a conceptual level this is rather like plagiarism services, but with a much more analytic focus. The resulting visualizations can be a simple way to see shared rhetoric across discourses, for example by taking two different politicians’ debate speeches and placing them side by side.

  • WTFCSV is a visualization tool for interpreting spreadsheets. It takes every column and turns it into a representation: bar charts for numerical data, and word clouds for text fields. The results are really only as good as the labeling system, but that in itself can be useful for analyzing strategies of structuring data. It can be particularly helpful when looking at large spreadsheets of data that is otherwise hard to parse, such as the example data set of trends in UFO sightings in Massachusetts.
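For anyone curious about what sits behind this kind of tool, here is a rough Python sketch of the two text ideas described above: counting words and short phrases, and comparing the vocabulary of two texts. This is not DataBasic's own code. The function names (`tokenize`, `ngram_counts`, `shared_and_unique`) and the sample sentences are made up for illustration, and the tokenizer is deliberately naive (lowercase letters and straight apostrophes only, no stop-word filtering).

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out simple word tokens (naive on purpose)."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(text, n=1):
    """Count every run of n consecutive words (n=1 words, n=2 two-word phrases)."""
    words = tokenize(text)
    # zip over n staggered copies of the word list to form the n-word windows
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def shared_and_unique(text_a, text_b):
    """Return (shared words, words only in A, words only in B)."""
    vocab_a = set(tokenize(text_a))
    vocab_b = set(tokenize(text_b))
    return vocab_a & vocab_b, vocab_a - vocab_b, vocab_b - vocab_a


# Hypothetical sample texts, standing in for song lyrics or speeches
sample_a = "the cat sat on the mat and the cat slept"
sample_b = "the dog sat on the rug"

print(ngram_counts(sample_a).most_common(2))       # most frequent single words
print(ngram_counts(sample_a, n=2).most_common(1))  # most frequent two-word phrase

shared, only_a, only_b = shared_and_unique(sample_a, sample_b)
print(sorted(shared), sorted(only_a), sorted(only_b))
```

Swapping the sample strings for the contents of two text files gives a bare-bones version of the side-by-side comparison, which can be a useful follow-up exercise once students have seen the visual results.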

Have you tried DataBasic, or do you have a favorite tool for classroom data analysis? Share your tips in the comments!
