Curt Conrad still remembers heading to bed at 4 a.m. on football Friday nights. With multiple games to cover, quotes to cull and stats to sort, football Friday inevitably led to sleepy Saturday.
Not now.
Conrad, a sports reporter with the all-digital Richland Source in Mansfield, Ohio, relies on automated journalism to generate game stories independently, covering brass tacks such as the final score, scoring plays by quarter, team records, basic stats and future schedules. Conrad, a 20-year journalism veteran, is freed up to add quotes and color, spend more time shooting photos, write in-depth features or columns and focus on higher-level work.
“I’ve always felt obligated to be out covering events as frequently as I can,” said Conrad, whose company uses the artificial intelligence platform Lede AI. “This allows us to not have to do that. We still go out and cover a lot of events, but it frees me up to do what I’d rather be doing. It’s a lot more enterprise work.”
And now he’s in bed by midnight on football Fridays.
For most contests Conrad lets the AI-generated story stand. For bigger games, he and other Richland Source staffers fill the automated story out with quotes and analysis, along with photo galleries. Conrad isn’t alone in sharing his bylines with a computer program. The use of automation to produce and aid journalism has progressed from Brave New World proposal to accepted professional practice in just a few years.
Examples abound across the country and overseas. Using natural language generation (NLG), the AP produces 4,400 quarterly earnings report recaps, nearly 15 times the output of its previous manual efforts, plus previews and summaries of sports contests.
The Washington Post’s Heliograf covers election results as well as crime, real estate and high school sports. Bloomberg’s Cyborg produces reports in mere seconds, with the company estimating that one-fourth of its content now includes some degree of automation.
AI writes in multiple languages and can even suggest story ideas by spotting statistical anomalies. It’s a tool proven effective from the national level down to the hyperlocal, including the handful of communities covered by Richland Source and its sister publications Ashland Source and Knox Pages. Some Lede AI football recaps appear on all three sites, allowing domination of Google search results. After rolling it out locally, the outlet now “covers” games across the entire state.
“There are 1,600 high school football teams in Ohio, which means 800 or so contests,” said Richland Source publisher Jay Allred. “Our software produces 800 pieces of content every fall Friday night, plus other sports during the week.”
He describes it as giving superpowers to newsrooms.
Automation Age
AI, or “robot journalism” as it’s sometimes called, took off in the middle of the decade as emerging technologies began to make the practice a newsroom reality. In a competitive environment, automation proved effective at generating readable stories nearly instantly, flooding the internet with content many readers never know is computer generated.
“People probably don’t realize the extent to which automation is already being used in newsrooms and across different sides of the journalism business,” said Damian Radcliffe, a University of Oregon professor who follows consumer trends and business models for journalism.
Along with in-house products created by and for traditional media companies, several stand-alone companies compete in the space. To name but a few: ScoreStream, Abundat, SAM, Trint, Vidrovr, NewsWhip, Parse.ly, Chartbeat, Jigsaw and Otter all feature automation products. Fourteen months after launch, Richland Source’s Lede AI now counts seven paying customers.
The programs at first increase the human-powered workload. Programmers design the AI, then editors pen a set of templates. Extensive beta testing follows, with AI stories generated using sample data. Next, stories see the light of day with prior approval from a human. The last step is to run completely on autopilot.
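The template step can be pictured with a minimal Python sketch. It is an illustration only, not Lede AI’s or any newsroom’s actual code: the GameResult record, the template wording and the margin thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class GameResult:
    """Hypothetical structured record for one finished game."""
    home_team: str
    away_team: str
    home_score: int
    away_score: int


# Editor-written templates, keyed by the kind of game (illustrative wording).
TEMPLATES = {
    "blowout": "{winner} rolled past {loser} {w_score}-{l_score} on Friday night.",
    "close": "{winner} edged {loser} {w_score}-{l_score} on Friday night.",
    "default": "{winner} defeated {loser} {w_score}-{l_score} on Friday night.",
    "tie": "{home} and {away} played to a {score}-{score} tie on Friday night.",
}


def write_recap(game: GameResult) -> str:
    """Pick a template from the final margin and fill it with game data."""
    if game.home_score == game.away_score:
        return TEMPLATES["tie"].format(
            home=game.home_team, away=game.away_team, score=game.home_score
        )
    if game.home_score > game.away_score:
        winner, loser = game.home_team, game.away_team
        w_score, l_score = game.home_score, game.away_score
    else:
        winner, loser = game.away_team, game.home_team
        w_score, l_score = game.away_score, game.home_score

    margin = w_score - l_score
    # The 21-point and 3-point cutoffs are arbitrary assumptions.
    if margin >= 21:
        key = "blowout"
    elif margin <= 3:
        key = "close"
    else:
        key = "default"
    return TEMPLATES[key].format(
        winner=winner, loser=loser, w_score=w_score, l_score=l_score
    )
```

Feeding such a function sample results, as in the beta-testing phase described above, lets editors read every variant before anything is published.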
Prepping to cover 2016 Summer Olympic Games results with Heliograf, the Washington Post created dummy stories with historic results data to ensure everything checked out. Snags fixed during the testing phase included adjusting to the naming conventions of mainland Chinese athletes, and accounting for the possibility of multiple bronze medals in judo.
The Post learned to flag any data featuring results 10 percent over or under the standing Olympic record, which triggers a human review. Similar tactics abound across AI systems.
“In all cases, the program sent Slack alerts [notifications of possible problems], and we told it not to publish anything because we were unsure about the information,” said Jeremy Gilbert, director of strategic initiatives at The Washington Post. “We made some modifications over time.”
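A check of this kind might look something like the Python sketch below. It is not the Post’s code. The function name and the reading of the rule (hold any result that differs from the standing record by more than 10 percent in either direction) are assumptions made for illustration.

```python
def needs_human_review(result: float, record: float, tolerance: float = 0.10) -> bool:
    """Return True when a result should be held for an editor.

    Assumed rule: flag any result that differs from the standing record
    by more than `tolerance` (10 percent by default), over or under.
    """
    if record <= 0:
        # Without a usable record there is nothing to compare against,
        # so this sketch errs on the side of a human look.
        return True
    deviation = abs(result - record) / record
    return deviation > tolerance
```

A flagged result would then be routed to an editor, for example through a chat alert, rather than published automatically.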
With the bugs worked out, Heliograf covered 2016 Olympic results. That fall it reported the outcome of several U.S. Senate and House races, often relatively low-interest contests where the outcome wasn’t in much doubt and the result wouldn’t have been covered at all otherwise. In 2018 Heliograf began laying in geographic data, such as how cities voted versus rural areas. Next year the program will loop in state legislatures and cover some 12,000 elections.
Scale is undoubtedly one of AI’s most attractive features. Richland Source covers those hundreds of Ohio high school football games each week using the same methods it developed to cover its home county.
Of course, even the best AI is only as good as its data. A famous blunder, cited by multiple sources for this story, is that of Quakebot, a Los Angeles Times product designed to cover earthquakes.
In 2017, Quakebot reported on a California earthquake measuring a 6.8 magnitude. The earthquake was real. It was also from 1925. The newspaper wasn’t entirely at fault, as the bot received incorrect data from the U.S. Geological Survey. Still, it taught a powerful lesson.
“Quakebot used to publish automatically,” said Radcliffe. “After the false alert, the LA Times realized there needs to be a person hitting publish rather than having it go live right away.”
No one felt the ground move under their feet despite the gaffe. But should a similar error take place with corporate earnings reports it would mean serious consequences for a lot of stakeholders.
Thus, though many stories now run without any human editing, all AI must be tested from time to time to ensure accuracy. For elections and major sports with reliable data, it’s often not an issue. When it comes to lower-level sports, especially at the high school level, things get more complicated.
Richland Source covers not only high school football, but basketball, baseball, softball and soccer. To do so, it teams with a company called ScoreStream and relies largely on user-generated data from coaches and fans in the stands. When data doesn’t add up, such as when one team’s coach fills in different information than the other team’s coach, it doesn’t get published.
“We have sports clerks who actually approve the data,” said Allred. “We’re always more cautious when the data is self-reported.”
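The cross-check on self-reported scores can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ScoreStream’s or Richland Source’s system: the function name and the shape of a report (a mapping of team name to points) are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical report shape: team name mapped to points scored.
Score = Dict[str, int]


def reconcile(reports: List[Score]) -> Optional[Score]:
    """Return the score only if every submitted report agrees.

    Returns None when there are no reports or when any two conflict,
    meaning the game is held rather than published.
    """
    if not reports:
        return None
    first = reports[0]
    if all(report == first for report in reports[1:]):
        return dict(first)
    return None
```

In this sketch, agreement between reports only clears the first hurdle; as Allred notes, a human clerk still approves the data.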
Assuming correct data still leaves philosophical dilemmas. Does it matter to the audience if a human or robot wrote it?
“There is an ethical question around transparency,” Radcliffe said. “How are these stories labeled? Do audiences react differently or do they even care if it’s being automatically generated or written by a human? Labeling that is important. People may well judge it differently if they knew it was written by a computer.”
Beyond Templates
Computer-generated stories are just the beginning of journalism AI, and in some cases have become old hat. The AP rolled out its first AI stories more than five years ago and has long since turned its focus to newer technology.
One development long promised but perhaps never fully delivered is transcription software. The AP is working with a company called Trint to provide real-time transcription of video to text. Editors can use it to streamline the production process and send info to AP subscribers. The software learns a user’s habits over time and can be set to either provide rapid transcriptions with less accuracy, or more accurate representations that take a little longer.
Another company, SAM, creates algorithms which search social media for newsy keywords like “disaster,” “shooting” or “fire.” SAM assesses both the content and its creator to decide if they’re reliable, then alerts newsroom editors.
“We conducted a five-week test with 24 AP journalists around the globe to experiment with SAM alerts,” said Lisa Gibbs, director of news partnerships at the Associated Press. “In that five-week period SAM identified at least 50 instances where journalists were alerted to news before they’d otherwise have known about it.”
In one example, SAM used police scanner information and tweets to alert editors of a mass shooting at Henry Pratt Co. in Aurora, Illinois.
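In rough outline, that kind of keyword-and-source screening could resemble the Python sketch below. It is not SAM’s algorithm, which is not described here in detail. The Post record, the reliability score and the 0.5 cutoff are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List

# Keywords taken from the examples in the article.
NEWSY_KEYWORDS = {"disaster", "shooting", "fire"}


@dataclass
class Post:
    """Hypothetical social media post with a precomputed author score."""
    author: str
    text: str
    author_reliability: float  # assumed scale: 0.0 (unknown) to 1.0 (trusted)


def find_alerts(posts: List[Post], min_reliability: float = 0.5) -> List[Post]:
    """Keep posts that mention a newsy keyword and come from a reliable author."""
    alerts = []
    for post in posts:
        # Lowercase each word and trim surrounding punctuation before matching.
        words = {w.strip('.,!?;:"\'').lower() for w in post.text.split()}
        if words & NEWSY_KEYWORDS and post.author_reliability >= min_reliability:
            alerts.append(post)
    return alerts
```

A production system would weigh far more signals than a word list and a single score, but the two-part test of content and creator is the same idea.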
Similar to a reporter staring at TweetDeck all afternoon, algorithms like SAM keep journalists up to date, yet simultaneously free them from the computer. For investigative journalists, AI can spot and point out statistical outliers, poring over hundreds of pages of documents in seconds.
Still other uses include A/B testing of different headlines on automated content and automated management of comment sections.
‘Robot’ Reporters
Excitement about AI, in journalism and beyond, runs high. A number of academics study the issue and stories about the phenomenon abound not only in niche publications but in mainstream print and broadcast outlets.
In fact, sometimes excitement can run too high.
University of Minnesota professor Matt Carlson penned a well-known 2014 paper called “The Robotic Reporter,” but said he sometimes wishes he’d chosen a different phrase.
“I regret using the term ‘robot journalism,’” Carlson said. “It makes us think of R2-D2 instead of a server rack. We romanticize it, but it’s really just servers spitting out data.”
Stock images of humanoid robots typing at a computer or sporting a fedora don’t help either.
“It’s way more boring than that,” said Carlson.
Banks of computer servers may not be breathtaking, but they provide a much more precise picture of AI. Some of the hype originates within the industry, Carlson said, since for-profit companies naturally want to promote their own offerings.
Related to the hype issue, and transcending journalism, is the question of whether AI portends layoffs.
Most sources for this piece said no.
“Our human team covering politics is bigger than it’s ever been,” said Gilbert. “Our newsroom has grown every year since I joined the Post in 2014.”
Gibbs said the presence of video transcription software has to date led to zero video editor layoffs. Instead, they’re assigned either more volume of video or more sophisticated video, both of which translate to better outcomes for the AP and for consumers.
Others agreed, arguing that layoffs may continue across journalism due to the same factors disrupting the industry for the better part of two decades, but not due to nascent AI programs.
Carlson is less convinced.
“Any software, program or innovation that can automate labor is going to, at some point, lead to fewer people doing that labor,” he said. “The question is what do newsrooms do with the excess labor. Do they fire people? Or do they give them more important stories that are better for humans?”
AI-written stories, by definition, consist of straightforward reporting without a lot of analysis, certainly without the type of color a columnist or features writer could add.
Instead, for ground-level journalists, supporters argue AI is about freedom from repetitive tasks and allowing space to work at a higher level: more features, more investigations, more records requests and more in-depth reporting, with fewer game recaps, numbers-heavy earnings reports and straightforward election results.
“Think about how much a journalist can do with a smartphone today that wasn’t possible before,” said Gibbs. “In a way, this is just continuing along that path of new technology and new tools shaping what we do.”