In the late 1700s, a Hungarian inventor named Wolfgang von Kempelen debuted a machine he called “The Mechanical Turk.” It was purported to be an automaton capable of playing chess, and during its reign it defeated opponents such as Napoleon Bonaparte and Benjamin Franklin. It was eventually revealed that the machine was a hoax; a chess master was concealed within the contraption, his invisible labor controlling the Mechanical Turk’s movements.
Its namesake, Amazon’s Mechanical Turk (AMT), is a microtask service that allows programmers and companies to contract with individuals to complete a series of repetitive tasks that computers are as yet unable to do, such as identifying objects in a photo or transcribing audio recordings. AMT is one of many commercial services through which humans provide the raw material on which artificial intelligence is trained.
Such developments should make us question the growing number of education-technology practices now in place at colleges across the country. Predictive analytics, plagiarism-detection software, facial-recognition technology, chatbots — all the things we talk about lately when we talk about ed tech — are built, maintained, and improved by extracting the work of the people who use the technology: the students. In many cases, student labor is rendered invisible (and uncompensated), and student consent is not taken into account. In other words, students often provide the raw material by which ed tech is developed, improved, and instituted, but their agency is, for the most part, never considered.
It should be.
One perspective describes data as “the new oil.” That analogy may work when the data in question relates to such things as commodity prices, the weather, or warehouse inventories. But when it comes to people, and students in particular, that perspective is troubling. Behind the data are people, and their labor. Such data not only reflects the most intimate selves of those from whom it is extracted but also reveals intimate information about others who may never have directly shared their information at all. That is the point of ideas like predictive analytics and plagiarism detection: using data generated by some people in the past to predict and understand the behavior of other people in the present or future.
Increasingly, corporations, researchers, and colleges take it as a given that they are entitled to the extraction, retention, remixing, and continued use of that data — and we do not have adequate language or institutional policies in place to push back. When it comes to collecting student data, the notion of informed consent is simply not a part of the model. At every stage of our interaction with current digital technologies, consent is systematically stripped from people, often referred to as “users,” in a telling bit of terminology.
Further, what informed consent means when information about me will be used to evaluate the behavior of other people is a question we don’t yet know how to ask very well. Yet school systems and colleges have deployed these assumptions against students, who provide most of the data that trains machine-learning algorithms and informs product development. This exploitation is exacerbated by the relative lack of expiration dates on permission to use student data, and by overly broad claims that student data will be used for “product improvement”: Not only is student data extracted and made to work for the institution and the company, but it could continue to do that work long after the student has moved on.
Consider some recent examples:
Eric Tornoe, a professor at the University of St. Thomas, in Minnesota, and his team members saw a Microsoft demonstration of technology that could guess, with reasonable accuracy, the ages and genders of people in the room. Thinking about other applications of the technology, Tornoe quickly landed on the potential information hidden within students’ emotions. He envisioned professors’ using that information to ensure that a greater percentage of their lecture material engaged students and stuck in their minds. Microsoft lent Tornoe’s team the facial-recognition and emotion interfaces, as well as the necessary software, free of charge.
Ashok K. Goel, a professor at the Georgia Institute of Technology, used IBM’s “Watson” technology to test a chatbot teaching assistant on his students for a semester. He told them to email “Jill” with any questions but did not tell them that Jill was a bot.
At the University of Arizona, a professor named Sudha Ram gathered and analyzed data on how freshmen used their “CatCard” student IDs over a three-year period. The data, collected whenever students used the cards to check in at the library, the rec center, or other places, was then used to create networks mapping which students interacted with one another and how often. Ram also looked at the regularity of students’ routines and whether they had established patterns of activity. She and her collaborators developed a machine-learning algorithm to quantify those patterns.
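To make concrete what that kind of analysis involves, here is a rough sketch in Python. Everything in it is hypothetical: the records, the one-hour time slots, and the entropy-based regularity score are stand-ins invented for illustration, not Ram’s actual data or method. It shows only one plausible way card-swipe logs could become an interaction network and a measure of how predictable a student’s week is.

```python
# Hypothetical sketch, not the University of Arizona's actual pipeline:
# turning card-swipe logs into (1) a co-occurrence network and
# (2) a crude routine-regularity score.
from collections import Counter, defaultdict
from itertools import combinations
from math import log2

# Each record: (student_id, location, hour index since the start of the term).
swipes = [
    ("s1", "library", 10), ("s2", "library", 10),
    ("s1", "rec_center", 40), ("s3", "rec_center", 40),
    ("s1", "library", 178),  # same weekly slot as hour 10, one week later
]

# Link students who check in at the same place during the same hour of the week.
by_slot = defaultdict(set)
for student, location, hour in swipes:
    by_slot[(location, hour % 168)].add(student)

edges = Counter()
for students in by_slot.values():
    for a, b in combinations(sorted(students), 2):
        edges[(a, b)] += 1  # edge weight = number of shared check-ins

def routine_regularity(student_id):
    """Entropy of a student's weekly check-in times; lower means more regular."""
    slots = Counter(hour % 168 for sid, _, hour in swipes if sid == student_id)
    total = sum(slots.values())
    return -sum((c / total) * log2(c / total) for c in slots.values())

print(dict(edges))               # e.g. {('s1', 's2'): 1, ('s1', 's3'): 1}
print(routine_regularity("s1"))  # lower values indicate a steadier routine
```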
Arizona now uses some 800 data points — on academic performance, financial aid, and use of the university’s course-management system, among other things — to identify which students are most at risk for dropping out. Those predictions are about 73 percent accurate from the first day of classes, with the accuracy rate improving over time, according to a university official.
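Likewise, a claim that predictions are “73 percent accurate from the first day of classes” usually means a model was scored on students it had not seen during training. The sketch below, again entirely hypothetical and using synthetic numbers in place of Arizona’s 800 data points, shows the mechanics behind that kind of figure.

```python
# Hypothetical sketch, not Arizona's actual model: a dropout-risk classifier
# trained on a few synthetic stand-ins for the "800 data points" and scored
# on held-out students.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(3.0, 0.5, n),   # GPA
    rng.integers(0, 2, n),     # receives financial aid (0/1)
    rng.poisson(20, n),        # course-management-system logins
])
# Synthetic label (1 = left the university), built to loosely track the features.
risk = -1.5 * (X[:, 0] - 3.0) - 0.02 * (X[:, 2] - 20) + rng.normal(0, 1, n)
y = (risk > 0.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("accuracy on held-out students:", accuracy_score(y_test, model.predict(X_test)))
```

The point is not the particular algorithm but the arrangement: the model is built from, and evaluated on, data that students generated in the ordinary course of being students.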
At stake in the collection of such data are two things: how we think about labor and how we think about consent. Students must be given the choice to participate, and must be fully informed that they are part of an experiment or that their work will be used to improve corporate products. On a university website that describes his research at St. Thomas, Tornoe says: “One thing we learned from this experience is that the people have a strong reaction to the technology! This taught us that we need to be careful in how we describe and position technology that interacts with humans, to make sure that the people using it understand exactly what it is doing and what happens to the data collected, and that people are fully informed as to what data is being collected and how it will be used.”
At Arizona, Ram, too, has acknowledged privacy concerns. According to university news coverage, the data collected was anonymized so that students could not be personally identified, and was shared only with students’ advisers.
But the students could not have been told about all possible future uses of these data, since those uses are not fully known yet, or even knowable. Microsoft, IBM, and Arizona could all profit from the data produced from these experiments. (While the university is nonprofit, I would argue that its own bottom line depends in part on quantifiable data on student retention, among other things.) However, I wonder whether those involved in student-data projects sufficiently understand that the students — not the institutions or the algorithms — are the people whose worth, selves, and labor are supposed to be the primary focus of universities. Students have, in effect, been made into the Mechanical Turks of education technology, often without their knowledge or consent.
The time for having “the talk” about consent within the ed-tech community is long overdue. While improving ed tech, making classrooms more efficient, raising retention rates, and guarding against plagiarism are valuable goals, the value of creating a culture of consent and fair compensation for labor ranks higher. Faculty members may not stipulate on the syllabus how they treat their students (although they probably should), but their practices are as much a form of pedagogy as any list of required texts, assignments, or due dates, and arguably more so.
When we draft students into education technologies and enlist their labor without their consent or even their ability to choose, we enact a pedagogy of extraction and exploitation. It’s time to stop.
Chris Gilliard, a professor of English at Macomb Community College, often writes about privacy and surveillance in higher education.