TY - JOUR
T1 - Interactive Quality Analytics of User-generated Content: An Integrated Toolkit for the Case of Wikipedia
AU - Di Sciascio, Cecilia
AU - Strohmaier, David
AU - Errecalde, Marcelo
AU - Veas, Eduardo
PY - 2019/4
Y1 - 2019/4
AB - Digital libraries and services enable users to access large amounts of data on demand. Yet, quality assessment of information encountered on the Internet remains an elusive open issue. For example, Wikipedia, one of the most visited platforms on the Web, hosts thousands of user-generated articles and undergoes 12 million edits/contributions per month. User-generated content is undoubtedly one of the keys to its success but also a hindrance to good quality. Although Wikipedia has established guidelines for the “perfect article,” authors find it difficult to assert whether their contributions comply with them and reviewers cannot cope with the ever-growing amount of articles pending review. Great efforts have been invested in algorithmic methods for automatic classification of Wikipedia articles (as featured or non-featured) and for quality flaw detection. Instead, our contribution is an interactive tool that combines automatic classification methods and human interaction in a toolkit, whereby experts can experiment with new quality metrics and share them with authors that need to identify weaknesses to improve a particular article. A design study shows that experts are able to effectively create complex quality metrics in a visual analytics environment. In turn, a user study evidences that regular users can identify flaws, as well as high-quality content based on the inspection of automatic quality scores.
KW - Information quality assessment
KW - Text analytics
KW - User-generated content
KW - Visual analytics
KW - Wikipedia
UR - http://www.scopus.com/inward/record.url?scp=85065191460&partnerID=8YFLogxK
U2 - 10.1145/3150973
DO - 10.1145/3150973
M3 - Article
SN - 2160-6463
VL - 9
SP - 1
EP - 42
JO - ACM Transactions on Interactive Intelligent Systems
JF - ACM Transactions on Interactive Intelligent Systems
IS - 2-3
M1 - 13
ER -