
[This is a guest post by Stéfan Sinclair, Associate Professor of Communication Studies & Multimedia at McMaster University. He is a member of the ACH Executive and an associate editor of Literary and Linguistic Computing and Digital Humanities Quarterly. His research focuses on the design, development and theorization of tools for the digital humanities, including Voyeur Tools. You can follow him on Twitter at @sgsinclair.--@jbj]
You have published content online – it could be a blog, a syllabus, a scholarly resource, or any other form of web-based creation. This is the fruit of your hard work that you’ve made public for others to find and engage with. Now, wouldn’t it be nice to know a bit about who your visitors are (though not their identities) and how they’re interacting with your content? What country are visitors from? How long are they staying on your site and what are they looking at? What search terms are people using to find your content? How does last month’s traffic compare with this month’s? How many of your visitors are using mobile devices? These are just some of the questions that web analytics can help answer. I’ll explain a bit more about web analytics and then we’ll have a look at some data from the DHAnswers site, a community-driven forum where members can ask and answer questions related to digital humanities.
One way to better understand traffic to your website is to read and analyze web server logs (wherever your content is being hosted). Almost all servers are configured to capture some data about every request that a browser (or other web client) makes, including the time, the requested resource, the originating IP address, and the user agent. Dear reader, I’m going to assume that you’re less interested in running arcane command-line scripts or installing and running specialized web analytics software.
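For the curious, here is a rough idea of what working with those logs involves. The sketch below is in Python and assumes the server writes the widely used Combined Log Format (the default on many Apache and nginx installations, though yours may differ). The function names, the sample lines, and the paths in them are made up for illustration; they are not taken from the DHAnswers server.

```python
import re
from collections import Counter

# Assumed layout: Combined Log Format. Adjust the pattern if your server
# is configured to log different fields or a different order.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def top_resources(lines, n=10):
    """Count successful (status 200) requests per resource; return the n most requested."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == "200":
            counts[entry["resource"]] += 1
    return counts.most_common(n)


# Made-up sample lines: documentation-range IP addresses, hypothetical paths.
SAMPLE = [
    '203.0.113.7 - - [12/Jan/2011:10:15:32 -0500] "GET /topic/what-is-digital-humanities HTTP/1.1" 200 5120 "http://www.google.com/search?q=digital+humanities" "Mozilla/5.0 (Macintosh)"',
    '198.51.100.23 - - [12/Jan/2011:10:16:01 -0500] "GET /topic/what-is-digital-humanities HTTP/1.1" 200 5120 "http://twitter.com/" "Mozilla/5.0 (iPhone)"',
    '198.51.100.23 - - [12/Jan/2011:10:16:40 -0500] "GET /topic/alternatives-to-cocoon HTTP/1.1" 200 3301 "-" "Mozilla/5.0 (X11; Linux)"',
    '192.0.2.99 - - [12/Jan/2011:10:17:05 -0500] "GET /missing-page HTTP/1.1" 404 210 "-" "ExampleBot/1.0"',
]

if __name__ == "__main__":
    for resource, hits in top_resources(SAMPLE):
        print(hits, resource)
```

Even this small example hints at why log analysis gets fiddly: a real script would also need to filter out search engine crawlers, images, and stylesheets before the counts mean much.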
An alternative to analyzing the local logs on your web server is to integrate a hosted web analytics solution. Doing so is usually surprisingly quick and easy; it’s essentially a matter, firstly, of creating an account with the service provider and, secondly, of embedding a small snippet of code into your web page (or into a template that’s used in several pages). If you have a Google account, the first step is already mostly complete, and the second is no more difficult than embedding a YouTube clip. One of my students this past term created a Prezi that provides more information about Google Analytics and a short tutorial for getting started. There are also modules available for most content management systems, like this one for WordPress and this one for Drupal. Hosted web analytics have some additional limitations due to their reliance on JavaScript and cookies (and counter-measures that users can install in their browsers), but they tend to be considerably easier to use and still provide robust aggregate data.
(An adequate treatment of reasonable privacy concerns related to web analytics would require a whole other post – or perhaps a book – but for now suffice it to say that a user’s IP address is probably the datum most specific to individual users, that IP addresses are an integral part of Internet protocols, and that the practice of logging them is unlikely to disappear anytime soon.)
Let’s look at some Google Analytics data from the DHAnswers website. One of the first things we might be interested in viewing is the overall trend in traffic since the site’s launch in September. Whereas many sites start off with modest traffic and build progressively, DHAnswers shows an early spike in traffic – our launch blitz was clearly successful – and then a more modest but steady flow of traffic ever since (the graph is showing unique visitors by day).

A Google Analytics graph showing visitors to DHAnswers since September 2010.
Next we might consider geographical information. As can be seen from the image below, DHAnswers has been visited by people representing some 130 countries. In this case the map is less helpful beyond indicating the significant dominance of visitors from the USA, but the embedded ranked list also shows considerable activity from Australia and a noteworthy surge from India (possibly thanks to answers being indexed and found by search engines, in addition to Indian scholars self-identifying as digital humanists).
A Google Analytics graph showing global geographical data about visitors.

A similar type of map provides an interesting perspective of digital humanities activity in the USA, from the particular bias of the DHAnswers site.
Shifting now to content, we might be interested in determining the most visited topics on the site – what is of special interest to users? This information can be particularly useful for the DHAnswers admin team as we strive to ensure that the top topics have at least some answers. The list is also a good expression of the diversity of content on the site.
- collaborative-software-for-transcribing-digital-images-of-handwritten-documents (4.02%)
- what-is-digital-humanities (3.33%)
- iannotate-vs-goodreader (2.60%)
- what-are-the-essential-dh-books-on-your-class-reading-list (2.39%)
- alternatives-to-indesign-for-creating-multimedia-ebooks (1.65%)
- how-do-we-introduce-undergraduates-to-the-digital-humanities (1.65%)
- visualization-in-digital-humanities-what-are-the-possibilities (1.50%)
- what-are-some-useful-tools-for-creating-timelines (1.44%)
- is-there-a-list-anywhere-of-all-the-graduate-programs-that-study-dh (1.30%)
- alternatives-to-cocoon (1.29%)
It’s of course useful to keep in mind there are various ways of arriving at those topics, including the front page, the in-site search, external links, and search engines. The table below shows that a majority of traffic comes from Google, an impressive amount from ProfHacker (Chronicle.com), and a remarkable amount of traffic from twitter.com (from people using the Twitter website as a client), which isn’t surprising given that new questions from DHAnswers are automatically tweeted to more than 750 followers. In terms of search engine keywords (from Google or elsewhere), there are clearly a lot of people interested in comparing iAnnotate and GoodReader iOS apps.
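Google Analytics produces this kind of breakdown for you, but if you were working from raw server logs instead, something similar could be approximated by grouping referrer URLs by hostname. The following is only a minimal sketch in Python: the function names, the "(direct)" label for requests without a referrer, and the sample URLs are my own assumptions for illustration, not real DHAnswers data.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_host(referrer):
    """Reduce a referrer URL to a bare hostname, dropping a leading 'www.'."""
    host = urlparse(referrer).hostname
    if not host:
        # Empty or "-" referrers usually mean a typed URL or a bookmark.
        return "(direct)"
    return host[4:] if host.startswith("www.") else host


def tally_sources(referrers):
    """Return (hostname, count) pairs, most frequent source first."""
    return Counter(referrer_host(r) for r in referrers).most_common()


# Made-up referrer values for illustration.
SAMPLE_REFERRERS = [
    "http://www.google.com/search?q=iannotate+vs+goodreader",
    "http://google.com/",
    "http://twitter.com/",
    "-",
]

if __name__ == "__main__":
    for host, count in tally_sources(SAMPLE_REFERRERS):
        print(count, host)
```

Note that this simple grouping treats country-specific search domains as separate sources, so a fuller version would need a rule for merging them.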

One final aspect of data that might be of interest is the type of operating system and browser used by visitors. As the graph below indicates, a disproportionately high share of visitors to DHAnswers run Macintosh and Linux operating systems (which may be true of the DH community in general compared to common market share stats). Though much smaller, the representation of mobile devices is also significant, probably again in part due to the Twitter component of the site.
Google Analytics data for operating systems used by visitors to DHAnswers.
Google Analytics offers a huge number of functions and a lot of flexibility in customizing parameters and views (though the default views are very helpful as well). This is just a taste of some of the ways that we can better understand visitors to DHAnswers, and it can suggest ways that we might improve the site to better serve the community.
What do you want to know about your visitors? Let us know in comments!