Wikipedia Uses AI to Assist Human Editors
Dec 4, 2015 11:02 AM PT
The Wikimedia Foundation this week rolled out a service designed to improve the quality of Wikipedia articles.
The Objective Revision Evaluation Service, or ORES, uses artificial intelligence and machine learning to help Wikipedia editors identify damaging edits more quickly and assign quality scores to articles more rapidly.
Every day, Wikipedia is edited some 500,000 times, Wikimedia said. Editors, most of them volunteers, have to review all those changes.
ORES allows those editors to peer into incoming content to identify potentially damaging edits swiftly and quarantine them for future scrutiny.
A damaging edit might include the insertion of personal opinion or obscenity into an article.
"If you're in the media at all, there's a chance that someone is going to dislike something that you said and is going to try to damage your Wikipedia page," said Rob Enderle, principal analyst at the Enderle Group.
That behavior, though, fits a pattern -- a pattern that a system like ORES can address. "Low-level AI is really good at identifying patterns and taking prescribed action against the patterns it recognizes," he told TechNewsWorld.
"Unless you have a ton more people than Wikipedia has, you'd never be able to keep up with the bad edits," Enderle said.
With ORES, "Wikipedia can be more trusted and less likely to be used as a tool to harm somebody," he added.
ORES provides Wikipedia editors with a suite of tools they can use to sort edits by the probability that they're damaging.
"That allows that editor to review the most likely to be damaging edits first," said Wikimedia Senior Research Scientist Aaron Halfaker. "That can reduce the workload of reviewing edits by about 90 percent."
Less Editing, More Articles
ORES predicts the probability that an edit is damaging by drawing on knowledge it gains from comparing the before-and-after versions of edits to all articles that appear in Wikipedia.
It uses that knowledge to assign a score to a proposed edit. Scores can be retrieved quickly -- in 50 to 100 milliseconds -- and can be used to identify problematic edits rapidly.
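The triage described above, scoring each edit and then reviewing the likeliest-damaging ones first, can be illustrated with a minimal Python sketch. It does not call ORES or reproduce its interface. The `ScoredEdit` class, its field names and the sample probabilities are all hypothetical, invented only to show the ordering step.

```python
from dataclasses import dataclass


@dataclass
class ScoredEdit:
    rev_id: int           # hypothetical revision identifier
    damaging_prob: float  # hypothetical model score between 0 and 1


def review_queue(edits):
    """Return edits ordered from most to least likely to be damaging."""
    return sorted(edits, key=lambda e: e.damaging_prob, reverse=True)


# Made-up scores for three edits; not real ORES output.
edits = [ScoredEdit(101, 0.02), ScoredEdit(102, 0.91), ScoredEdit(103, 0.40)]

queue = review_queue(edits)
print([e.rev_id for e in queue])  # [102, 103, 101]
```

An editor working from the front of such a queue sees the highest-risk edit first and can stop once the remaining scores fall below whatever cutoff is considered safe.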
About 3 to 4 percent of the daily edits to Wikipedia articles are damaging. Weeding those damaging edits from the firehose of edits flooding Wikipedia every day is an onerous task for editors. It's less so with ORES, Wikimedia said.
"Our machine learning model is good enough at sorting those edits by the probability that they're damaging that you would have to review 10 percent of incoming edits to know that you caught all of the damaging edits," Halfaker told TechNewsWorld. "Without this tool, you'd have to review all the edits to know you caught all the damaging edits."
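The arithmetic behind that 10 percent figure can be sketched in Python. If edits are read in descending order of score, the share that must be reviewed is set by how far down the list the lowest-ranked damaging edit sits. The function name and the toy data below are assumptions for illustration. The 100 edits and the four damaging ones are invented to echo the figures quoted in this article, and they are not measurements from ORES.

```python
def fraction_to_review(scored_edits):
    """Share of edits, read in descending score order, that must be
    reviewed before every damaging edit has been seen.

    scored_edits: list of (damaging_prob, is_damaging) pairs.
    """
    if not scored_edits:
        return 0.0
    ranked = sorted(scored_edits, key=lambda pair: pair[0], reverse=True)
    last_damaging = -1  # stays -1 when nothing is damaging
    for position, (_, is_damaging) in enumerate(ranked):
        if is_damaging:
            last_damaging = position
    return (last_damaging + 1) / len(ranked)


# Toy data: 100 edits with strictly decreasing scores, 4 of them damaging,
# all ranked within the top 10 by the hypothetical model.
damaging_ranks = {0, 2, 5, 9}
toy = [((100 - i) / 100, i in damaging_ranks) for i in range(100)]

print(fraction_to_review(toy))  # 0.1
```

With a model that ranked the damaging edits no better than chance, the lowest-ranked one could land anywhere in the list, which is why an editor without scores would have to read every edit to be sure.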
Although ORES itself doesn't directly improve the quality of the articles in Wikipedia, it does so indirectly by helping editors catch damaging content and by freeing them to create more articles of their own.
"One of the reasons we want to reduce the workload around quality control is so that editors can spend more of their time working on new article content rather than removing vandalism," Halfaker said.
Wikipedia receives about 12 million hours of volunteer labor a year. "A lot of what these artificial intelligence systems do is more effectively apply that human attention to the problem of writing a high-quality encyclopedia," he said.
While ORES addresses one aspect of quality control at Wikipedia, a bigger issue remains, noted Sorin Adam Matei, an associate professor at Purdue University who studies the relationship between information technology and social structures in knowledge markets.
Pure science articles that encompass a narrow band of factual truth, like the entry on pi, are very accurate in Wikipedia. "It's not easy to fake those kinds of articles," he told TechNewsWorld.
In the social science and humanities areas, however, the articles become more and more ambiguous as information is added to them, Matei continued.
"Accuracy becomes a moot point. It's not that they're not accurate. It's the stories that they build are so complex and difficult to read that by the time you get to the end of them, you don't know what to believe," he said.
"I think Wikipedia's problem is it's the product of many minds pulling it in all kinds of directions, and a mere technical solution for that hasn't been invented yet," Matei said.
"They're not trying to deal with the real meaningful issue that Wikipedia confronts us with," he added. "It's not that Wikipedia leads us astray from the one truth, but it has built into it all kinds of truths that you need to see through carefully."