
# If you don’t succeed, try again: Timed tests using specs grading

February 16, 2015, 9:00 am

It’s been six weeks since the start of the semester, so it’s time for a brief update on the specifications grading “experiment” (although something being carried out in real life probably shouldn’t be called an “experiment”). So far it’s going quite well.

In this post I want to talk about timed testing under specs grading. This idea doesn’t get much attention in Linda Nilson’s book on specs grading, which got me started down this road. Mathematics is a subject that typically has a significant amount of procedural knowledge, unlike a lot of the subjects represented in Linda’s book. So there is a need to assess students’ ability to show that they can perform certain tasks on demand, without the benefit of virtually unlimited time and resources: things like calculating derivatives, interpreting graphs, and instantiating definitions.

Don’t misunderstand: Those tasks don’t make up mathematics any more than the ABC’s make up the discipline of literature. But it seems reasonable to assess beginning writers on their ABC’s, and mathematical tasks that admit timed assessment are the ABC’s of the subject.

Timed assessments were in the design of the Discrete Structures course from the beginning through the use of Concept Checks, which are 15-minute weekly quizzes over the “CC” objectives. But another layer of timed testing needed to be built to assess what I called the “CORE-M” objectives: the learning outcomes that are of central importance but are not basic, and so can’t be assessed through objective, easy-to-grade items. The solution I came up with turns the whole process of timed testing sideways, and I really like it.

Four times during the semester, we’ve set aside an entire period that I call a timed assessment period. During that timed assessment period, students work individually to do a combination of the following: (1) work problems over CORE-M objectives that are new to them, (2) re-work new versions of problems over old CORE-M objectives that they haven’t passed yet, and (3) re-work new versions of problems over old CC objectives that they haven’t passed yet. About a week prior to the timed assessment period, I put up a survey using a Google Form so students can select the problems they wish to work during the period. Then I make out one problem for each CORE-M objective and each CC objective that was requested.
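The request tally behind that survey is simple enough to sketch in a few lines of Python. This is only an illustration: the student names, the objective label `CC.3`, and the `copies_needed` helper are made up here, and real responses would come from the Google Form export rather than a hand-typed dictionary.

```python
from collections import Counter

# Hypothetical survey responses: each student lists the objectives
# they want to attempt during the timed assessment period.
requests = {
    "Adams": ["M.1", "M.2", "CC.3"],
    "Brown": ["M.1", "CC.3"],
    "Chen": ["M.2"],
}


def copies_needed(requests):
    """Count how many students requested each objective."""
    return Counter(obj for chosen in requests.values() for obj in chosen)


counts = copies_needed(requests)
for objective, n in sorted(counts.items()):
    print(f"{objective}: {n} copies")
```

The keys of `counts` are also the list of objectives that need a problem written, and the values are the number of copies to print of each.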

I print out the appropriate number of copies of each problem and bring them to class on the assessment period day and lay them out on two tables. Students then come up at the beginning of the hour and get the problems that they want to work. They work those, and submit them when done. I then grade all of those on a Pass/No Pass basis using the specifications we laid out at the beginning of the semester. If a student Passes an objective, they are done with it, and they are one step closer to attaining the goal grade they wanted for the semester. If the student doesn’t Pass, then they can request that objective again during the next timed assessment period, at which point I’ll make out a new problem that assesses it.
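For anyone who wants to track this in something other than a gradebook, here is a minimal sketch of the bookkeeping, assuming each student's record is just a set of passed objectives. The function names and the sample objectives are hypothetical, not part of the course materials.

```python
def record_results(passed, results):
    """Return the student's passed objectives after one assessment period.

    `passed` is the set of objectives already passed; `results` maps each
    attempted objective to True (Pass) or False (No Pass). A Pass is
    permanent, so objectives are only ever added.
    """
    return passed | {obj for obj, ok in results.items() if ok}


def still_open(all_objectives, passed):
    """Objectives the student may still request at a later period."""
    return sorted(set(all_objectives) - passed)


objectives = ["M.1", "M.2", "M.3"]

# First period: Pass on M.1, No Pass on M.2, M.3 not attempted.
passed = record_results(set(), {"M.1": True, "M.2": False})
print(still_open(objectives, passed))

# Second period: a new version of the M.2 problem is passed.
passed = record_results(passed, {"M.2": True})
print(still_open(objectives, passed))
```

A No Pass changes nothing in the record, which is the point: the objective simply stays available for the next period.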

At the end of the semester, we have the entire 110-minute final exam period set aside as a massive, last-chance timed assessment period. I do not give a comprehensive final exam in the class. Instead, students spend the time making one last attempt at any CC or CORE-M objectives they have not yet passed.

Here are some pictures I took of the process of putting together the first timed assessment period for the discrete structures class. Going into the assessment period, I had copies of the six CORE-M objectives available and a single pack containing all the CC objectives. I also had envelopes labelled with the objective numbers for students to return their work when done.

Here’s how it looked when I got set up in class:

This was the first assessment period, so we only had six CORE-M objectives. The next one, happening this Friday, has seven more new ones in addition to a smattering of older ones, so I’ll need more space. To avoid a human logjam when coming to get the papers, I released students by rows to come down one at a time to get their papers. It dawned on me as this was happening that this process was exactly like the process of receiving Communion at my church on Sunday mornings. I suppose that makes me some sort of Eucharistic minister.

Some students decided to try all six CORE-M objectives, while others opted only to work on three or four of them, just to make sure they had time to work. One of the things I like the most about this system is that I will never again have the horrible situation of having to pry tests out of students’ hands when time runs out. That is, running out of time can still happen (and did happen in this particular session), but each student can estimate how much time they need on each objective and only ask for problems that fit within the time frame; and if they misestimate, then no worries: just try again at the next session.

Students handed in their work by putting the problems into the appropriate envelope. Actually, it turned out the envelopes were a bad idea because they were only slightly larger than the papers themselves, and it was hard (and noisy!) to shove a paper into an already full envelope. At the second section’s period, I switched from envelopes to hanging file folders, and that solved the problem.

When it was time to grade, I pulled out the papers from each envelope/folder and alphabetized them (note to self: hire a student to do this in the future), and put them in piles:

Then I graded everyone’s work on objective M.1, then everyone’s work on M.2 and so on. When the grading was done and Pass/No Pass marks entered into Blackboard, I laid out the papers on a big table and put together a packet for each student. This was simple because the stacks were alphabetized, so “Adams” was on top, then “Brown”, etc. and I just had to skim the current student’s work off the tops of the stacks and staple them together.
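The packet assembly is a regrouping of papers from "by objective" to "by student". A rough sketch of that step, with a hypothetical `build_packets` helper and made-up stacks, looks like this:

```python
def build_packets(stacks):
    """Group graded papers by student.

    `stacks` maps an objective to its alphabetized list of student names.
    Returns a dict mapping each student (in alphabetical order) to the
    objectives whose papers go into that student's packet.
    """
    packets = {}
    for objective in sorted(stacks):
        for student in stacks[objective]:
            packets.setdefault(student, []).append(objective)
    return dict(sorted(packets.items()))


# Hypothetical stacks after grading; not every student attempts every objective.
stacks = {
    "M.1": ["Adams", "Brown"],
    "M.2": ["Adams"],
    "M.3": ["Brown"],
}
packets = build_packets(stacks)
for student, objectives in packets.items():
    print(student, objectives)
```

With paper stacks the alphabetizing does the work of the final sort, which is why skimming the top of each stack is enough.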

This was probably the same amount of work that I would expend on an ordinary timed test. Making out the problems didn’t take any more time. Grading them was about the same as for an ordinary test, with a slight speedup because I am not assigning partial credit to the work. (It’s only a slight speedup because I am giving feedback that’s more detailed than what I’d give on a regular test.) I have to alphabetize seven stacks of papers instead of one, and that’s time-consuming but trivial. So really, once the logistics are ironed out, this isn’t that much more work. For future periods, I’ll have to make out new instances of problems that assess particular objectives, but that’s not usually very hard.

And there are many things I like, and the students like, about this way of testing. First there is minimal pressure; if you don’t Pass an objective, just do it again later. Second, because students are choosing what they want to be assessed on, it makes them think intentionally about their preparation, rather than engaging in “studying” (which usually isn’t very purposeful) and showing up hoping they do well. Third — and multiple students mentioned this as a positive — this system does not let you disengage from course material once the test has been given. Because the test is not over yet! If you don’t pass, try again later — but you do need to try again.

That last item is the main thing that makes me feel OK about not having a comprehensive final exam. If students are continuously revisiting material that they had not previously mastered and re-attempting problems to demonstrate mastery, the need for a comprehensive final exam diminishes.

As a last word, I think that even if you aren’t into specs grading and the whole no-partial-credit idea, this way of doing timed testing could still work. I should give credit where it’s due and say that the main idea for doing tests like this came from this Calculus 2 course that does not use specs grading. Just make up one problem per objective and assign a uniform number of points to each objective — say, 10 points per objective. Then students do their work and submit it as described above, and you grade it using your system and rubric for partial credit. Then, the sum total of all the points earned flows into a large pool of points for “Tests”. Over the course of a semester using regular grading, you might give four 100-point tests; in this system you’d be giving 40 problems worth 10 points each. So it’s roughly equivalent in terms of contributions to the course grade. There are details to work out of course, but in principle this way of testing fits into ordinary courses in addition to specs-grading courses.