In my last posting, I imagined what it might look like to fork the academy, that is, to create a space within the world of academic writing and publishing where we could directly reuse, adapt, and expand each other’s work. I also discussed some of the most significant obstacles that stand in the way, both at the disciplinary level and in the personal concerns raised by friends and colleagues with whom I have discussed the idea.
In an earlier posting I looked at some of the reasons why GitHub.com, which has led the way in making the practice of “forking” repositories of code and text possible, is not really an ideal environment for scholars to use for writing and collaboration. It works, but it has been developed more for building software than for writing books, academic papers, syllabi, and the other genres of writing we engage in.
Over the next few weeks we will take a much closer look at some of the new and old collaborative writing tools that might serve as alternatives to GitHub. Some of them are built on top of git, the version control system that underlies GitHub, while others are not necessarily aiming for the kind of large-scale collaboration GitHub permits. Stepping back, though, I thought it might be worth thinking about the kinds of things we might look for in a “dream environment” for collaborative writing that incorporates some of the powerful benefits of a platform like GitHub while including other elements that make our job of writing and sharing scholarship easier.
Here is my stab at a wish list:
- Simple Clean Interface - As I have argued in some earlier postings here, GitHub, like the raw git version control system beneath it, has a steep learning curve. Some of what you learn offers a great return: version control, and especially distributed version control, is a new way of thinking about collaboration. But more can be done to make it easy to grasp and use, and to make simple use cases easy from scratch. More than that, the overall writing and administrative interfaces should be fun and beautiful because that is what we have come to expect.
- Private and Public - GitHub has a great model: you make things public, it is free. You keep things private, it costs you. We’ll look at one platform, Authorea, which is aiming for this model. Other alternatives out there mirror the Google Docs approach of keeping everything private until you share. I personally admire and like the GitHub revenue model because it provides an instant avenue for monetization for the company that builds it while rewarding sharing. Since I’m imagining a platform for collaborative and open sharing, I prefer that to a system which makes users take the extra step to share and locks up their files by default. Others may feel differently. Either way, there will have to be a simple and easy way to control how and what is public on the site.
- License Control - GitHub repositories often, but not often enough, have a license file which tells you the terms on which a collection of files is available. It would be great if this were made even easier. There should be a simple setting that lets you choose from a set of licenses, such as CC-BY, or designate a custom one.
- Presentation Mode - When you open a Google Doc someone has shared with you, you are dumped into an editing environment. Even if they only grant you view access, there is a lot of Google scaffolding that surrounds your file. It would be nice to have an easy and clean presentation mode which allows anyone to read and enjoy your work without any distraction - but, with a click, be able to engage with your work (by commenting, forking, pull requests, resharing, etc.) Keep it clean!
- Repository Based - One thing which has really grown on me as I work with GitHub is the idea that it is not all about “the article” - a single final file that we share with everyone. For every article, every presentation, and every syllabus, there are usually several files which support, complement, or together compose a “final” document, even in those cases where there is only one final document. Ideally a collaborative writing platform should account for this by making it possible to organize collections of files into repositories or sets that form a coherent unit and can be forked or shared together. Even better would be if it improved on the GitHub conception to allow the creation of ad hoc file sets as a way to solve the challenge of multiple forked inheritance (see below, though I’m not sure how that would work if it is built on top of git).
- Metadata - In addition to just a raw repository of text, it should be easy for users to add other relevant metadata about authorship, the motivation behind the project, its direction, its main collaborators, and so on to make it easily discoverable and easy for readers to find very basic information about it.
- Import - A lot of us have great material trapped within Word documents, OpenOffice or LibreOffice files, rich text files, and so on. Though raw plain text, with simple markup like Markdown or more powerful markup like LaTeX, is ideal for portability, a great collaborative writing tool will bake in import tools, even if they are not perfect, to help us get our material into the platform quickly and easily.
- Export - As with data in, there needs to be easy data out. I am ever more allergic to platforms which lock my content in. A strong collaborative writing platform should be able to give users ways to get their scholarship out in a format they can use. Markdown export would be great, LaTeX will please the masters of text, but ideally it would also include other forms of export including doc, pdf, rtf, odt, and html. This may be more limited if there is support for alternative or dynamic content elements (see below). Lincoln has introduced the open source conversion utility pandoc and a number of web services have begun to bake pandoc into their feature set. That would be a great start.
- Markdown and Friends - Ideally the platform would allow live composition of files within the platform for convenience, and offer baked-in support for one or more simple markup languages such as Markdown. Bonus points for LaTeX or other text processors, and even more bonus points for a modular approach that allows you to add your own text processor for your favorite markup language.
- Version Control - Version control should be a strong component of any collaborative writing environment. It helps track changes over time, which is vital when you are working with many versions of a text or many authors, and it is also a great way to save drafts and return to older approaches when there is a need. Unlike GitHub, however, “commit” messages that indicate what has been added to a file should be optional rather than required. Like GitHub, however, it should be easy to add “tags” or milestones to indicate when a file has reached a major stable version. For example, “Talk as delivered April, 2013” or “Article as published in Journal of Awesomeness vol 2” etc. Also, version control should ideally be distributed, like git, not centralized, so that collaborators can work on the texts at the same time. This adds complexity to the version control system as merging becomes an issue, but the benefits are significant.
- Offline Support - Some of us are actually not online all the time. I personally would find it hard to adopt any collaborative writing platform that does not allow me to work fully offline when I have to, and that means offline access to my full repositories and the ability to edit the files, ideally in an editor of my own choice. I’m looking at you, Vim.
- Forking and Pull Requests - My dream collaborative writing platform would of course allow you to open your repository up to be edited by others, and establish a small or large group of collaborators working on a project. I really believe, however, that forks and pull requests are fantastic features of GitHub that change the game: they create relationships between related repositories in a system of inheritance, and they allow people who do not normally have permission to edit a file to change it in a copy they make and then propose that change to you, in a way that preserves your power over your own repository and allows the contributor to preserve their proposed change in their own copy. This doesn’t need to be implemented the same way that GitHub does it - in fact, I think it can be significantly improved on. Currently forked GitHub repositories can have a single ancestor repository (even if its files might have many contributors) and I think it would be more natural for us as scholars to have a system that allowed our work to have multiple ancestors to solidify the chain of attribution at a macro level.
- History Tools - There are many great features in GitHub to view the “diff” or difference between different versions of files. For a collaborative writing platform, this would ideally be taken to an even higher level: better word- or sentence-based (rather than line-based) diffs, better ways to get an idea, at a glance, of who is a major contributor to a repository, and ways to incorporate those contributors easily into authorship/contributor metadata. This might also acknowledge that there are different roles or types of contribution, or provide methods to mark particularly central or important contributions.
- Footnotes and Citations - Footnotes and traditional models of attribution will remain a dominant form of scholarly communication and attribution for a long time to come. Any collaborative writing platform that wants to take scholarship seriously needs to have easy, robust, and convenient ways to organize and handle footnoting and citations. This need not all fall on the platform design, but making it easy to integrate with other existing tools, such as Zotero, would be ideal.
- Feedback - A platform can benefit from strong support for the optional ability to provide feedback at the level of word, sentence, or paragraph. It should be easy to control who this is visible to and when it is to be shown.
- Alternative Content Forms - This is the hardest to implement but perhaps the most important in the long term. The world is moving past the model of scholarship that is simply text and perhaps a few photos. Even as many of us are rediscovering the power and beauty of a simple text format for portability and compatibility, we are also including more images, videos, sounds, datasets, and interactive content in our work. APIs make it possible to incorporate material that is not hosted in one place. Platforms like Zeega allow the creation of incredibly rich mixed media productions and our scholarship too will evolve to accommodate a wider range of forms of argumentation and expression. An ideal platform would allow us to incorporate a variety of rich content hopefully without destroying our ability to work offline with most of our local materials when we have to.
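To make the History Tools item above a little more concrete, here is a minimal sketch of a word-based (rather than line-based) diff. It uses only Python’s standard `difflib` module; the function name `word_diff` and the `[-removed-]{+added+}` notation are my own assumptions for illustration (the notation loosely imitates git’s word-diff output), not something any existing platform is known to do this way.

```python
import difflib


def word_diff(old: str, new: str) -> list[str]:
    """Return the word-level changes between two versions of a text.

    Hypothetical helper for illustration: each change is rendered as
    [-removed words-] and/or {+added words+}.
    """
    # Compare sequences of words instead of lines, so that one edited
    # word in a long paragraph does not mark the whole paragraph as changed.
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged words are skipped
        removed = " ".join(old_words[i1:i2])
        added = " ".join(new_words[j1:j2])
        if tag == "replace":
            changes.append(f"[-{removed}-]{{+{added}+}}")
        elif tag == "delete":
            changes.append(f"[-{removed}-]")
        elif tag == "insert":
            changes.append(f"{{+{added}+}}")
    return changes


if __name__ == "__main__":
    draft = "The archive was closed in 1950"
    revision = "The archive was reopened in 1950"
    for change in word_diff(draft, revision):
        print(change)
```

A line-based diff would flag the entire sentence here, while this version reports only the single word that changed, which is much closer to how we read revisions of prose.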
Other wish list grab bag items:
- Cloud Support - Many of us use cloud services like Dropbox or SpiderOak to keep our files accessible wherever we go. Being able to integrate our collection of repositories into a system like this would be wonderful.
- Play Nice - Ideally, the platform should have a robust API that allows us to build other services and functionality on top of what it offers.
- Metrics - A great platform would make it easy to get stats on our repositories, their size, word length, various stats about history, commits, and collaboration. Metrics of how many people viewed our repository, forked it, shared it, liked it, etc.
- Branching - Though tricky to implement, the ability to easily save and create “branches”, as you can with repositories in GitHub, does have its applications in the world of text as well, especially when targeting materials for different audiences. This is more elegant than making duplicates of files and then editing them.
- Free and Open Source - Ideally the platform would itself be a creature of free and open source software. It would be built, like GitHub is in the case of git, on a core set of open code that can be used to build competitors or be hacked for custom needs.
A platform cannot by itself allow us to fork the academy, or rather, fork a part of the academy. But if it can draw writers to it as a comfortable and easy place to work, it might, as GitHub did for many developers who had never tried git, draw them into a world where collaboration seems not only desirable but natural.
Are there other things you would love to see in a powerful collaborative writing platform? What is this list missing? Are there things in the list you would rather not see included?
Image Presents Creative Commons Licensed by Nina Matthews Photography