High in Sky, a Refrain: ‘Squawk, Data’

A red kite, sans blog. Copyright Sean Gray

Like any young adult moving to a strange new land—a common occurrence at this time of year—Wyvis, a resident of Scotland, took to blogging about her new home in late August.

Out on her own, far from relatives, she was making long trips around the farms of Durisdeer Mill, a village in the country’s southwest lowlands, she wrote. She loved the isolation. She would rest in the woodlands around Sanquhar. Then came the odd journey to the moors, worms squishing underfoot. It was good land. She seemed likely to stay.

Wyvis’s online journal was not so different from that of any young adult, scribbling away on a laptop, exploring a new home. Except for one notable quirk. Wyvis isn’t an incoming freshman. Wyvis is a bird.

Specifically, Wyvis is one of four GPS-tagged birds of prey featured in Blogging Birds, a pioneering effort at the University of Aberdeen that uses algorithms and ecological insight to generate, without human intervention, written narratives of the birds’ weekly explorations. And not only does the project seek to make such data comprehensible to the public; it seeks to create language that soars.

Like her three peers, Wyvis is a red kite, a raptor that human beings drove locally extinct in Scotland and then reintroduced in the late 1980s. A Scottish nonprofit group, the Royal Society for the Protection of Birds, monitors the birds, and eventually paid for GPS data from the birds to be translated by its staff into weekly blog updates for schoolchildren. Yet the updates provided little ecological insight to the kids, and proved mundane and expensive to maintain.

That problem led the charity to the University of Aberdeen, which has long worked at the frontier of natural-language generation, a computer-science discipline that seeks to translate structured data into simple, clear texts. It’s a technique useful for monitoring complex systems like offshore oil platforms; if a turbine goes awry, the NLG system might say: “There was a High Vibration alert on Turbine1 Gearbox at 19 Sept 2012 19:22. The alert had been intermittently active for two days. …”

Still with me?

Obviously such technical language is ill suited for the public. (Trust that ecological jargon is not much better.) So the Aberdeen team, which consists of ecologists and coders, integrated its NLG software with lessons about kite habits, like their tendency to explore from home ranges, and local weather and habitat information. Into that stew the team added techniques from “computational narratology,” a nascent field that renders narrative elements, from high-level concepts like framing to low-level bits like actors and events, into computer-comprehensible chunks.

Here’s where all bloggers should get excited, or get chills, depending on their computational fluency. Algorithms have already proved, if not adept, at least mildly competent at generating basic stories out of the data from, say, a baseball game. (A company, Narrative Science, has built a business out of it.) But no one had brought the two realms together to spin scientific stories for the public.

“As far as we are aware, Blogging Birds is indeed the first attempt at bridging these fields to generate narratives from data for regular folk,” said Advaith Siddharthan, a lecturer in computing science at Aberdeen and a collaborator on the project.

The result is a Web site that went live in August. Each of the four birds profiled (Wyvis, Moray, Millie, and Ussie) has a blog, where visitors can see weekly or daily updates overlaid on a map. (Want a group blog? That’s more complicated.) The researchers weren’t shy about drawing some inferences based on their ecological knowledge. For instance, one early rule added to the system was that, if a kite visited farmland that had recently experienced rain, then it was likely to be feeding on earthworms. Or, as the software might put it:

Early that evening, she was seen in farmland near Torness. Here, the rain must have brought up earthworms to the surface—snacks well worth catching!
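For the technically curious, a rule like that one can be pictured as a few lines of code. What follows is only an illustrative sketch, not the Aberdeen team's software: the `Sighting` record, its field names, and the `describe` function are all hypothetical, and the wording of the output is modeled loosely on the sample sentence above.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    """One hypothetical, simplified observation derived from GPS data."""
    when: str          # time phrase, e.g. "Early that evening"
    habitat: str       # land-cover type, e.g. "farmland"
    place: str         # nearest named place
    recent_rain: bool  # assumed to come from local weather records
    pronoun: str = "she"


def describe(sighting: Sighting) -> str:
    """Turn one sighting into a sentence, adding an ecological inference."""
    text = (
        f"{sighting.when}, {sighting.pronoun} was seen in "
        f"{sighting.habitat} near {sighting.place}."
    )
    # The rule from the article: farmland plus recent rain suggests
    # the kite was probably feeding on earthworms.
    if sighting.habitat == "farmland" and sighting.recent_rain:
        text += (
            " Here, the rain must have brought up earthworms"
            " to the surface."
        )
    return text


print(describe(Sighting("Early that evening", "farmland", "Torness", True)))
```

The point of the sketch is that the inference is made from context (habitat and weather), not from anything the GPS tag itself records about feeding.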

René van der Wal, a reader in ecology at Aberdeen, has found that the blog posts aid his own monitoring of the kites. It would take him much longer to build a snapshot of the week, and the summaries written by Royal Society staff members often missed some of the birds’ actions. Now he quickly knows if the kites are up to anything unusual.

“Our tool thus allows even ourselves to understand the movement of these reintroduced birds to the countryside,” van der Wal said. “Something I did not expect!”

When it all comes together, Siddharthan said, a blog entry can run like this summary of Wyvis’s explorations during a week in mid-June:

This week, Wyvis flew around an astonishing 263 miles and made a journey to Banchory outside her home range and back to Crieff. Wyvis’s foraging patterns during this week have been varied and Wyvis roosted largely in woodlands around Banchory and Crieff.

Monday and Tuesday Wyvis spent most of her time around Tullybannocher. During this time she was seen mainly on heather while making odd journeys to moorland. On Wednesday evening she reached farmland near Banchory outside her home usual ranges flying about 64 miles amid cloudy conditions and averaging a remarkable 11 miles per hour. On Thursday afternoon she reached heather near Crieff flying about 54 miles amid cloudy conditions. Friday and Saturday Wyvis spent most of her time around Aberfeldy, Crieff and St David’s. During this time she was seen mainly on acid grassland while making odd journeys to moorland.

Clearly Wyvis is fond of her home as she was swift to return back to the Crieff.

This is a best case. As the team’s own research shows, the automated blogs still rate worse with readers than GPS data salted with handcrafted insight from an ecologist. The language gets stilted and repetitive. The birds are always taking “odd journeys.” Their behavior is often “varied.” Entries tend to end with leading questions: “Will Moray be settled down here or will she be swapping her home range again?” “Will Ussie get bored and stretch his wings next week?” Acid grassland, we learn time and again, is a poor species habitat.

Rest easy, at least for now, dear blog writers.

But not for long. As large-scale data analysis becomes an ever-increasing part of scientific life, the value that can come from collapsing those numbers into narrative is obvious. Human beings are storytellers. We think in narrative. Perhaps it’s time that we make it so the computers can talk to us?

As for Blogging Birds, Siddharthan sees many ways to improve the team’s first run of the software. It can get more visual in describing landscape, as the human-written blogs do. Stories tend to have a beginning, middle, and end, but how do you find an “end” in real-time data? The ecology could dictate that. They’re still discovering how kites behave in Scotland, he said.

“The ecology forces a top-down interpretation of the data, this sort of creates plans, though not always particularly exciting ones,” Siddharthan said. “The more we learn about kite ecology, the more complex the plots can get.”

Will the Aberdeen researchers solve the plot problem next year?
