Interactive Quality Analytics of User-generated Content: An Integrated Toolkit for the Case of Wikipedia

Cecilia Di Sciascio; David Strohmaier; Marcelo Errecalde; Eduardo Veas

doi:10.1145/3150973

Publication: Contribution to journal › Article › Peer-reviewed

Abstract

Digital libraries and services enable users to access large amounts of data on demand. Yet, quality assessment of information encountered on the Internet remains an elusive open issue. For example, Wikipedia, one of the most visited platforms on the Web, hosts thousands of user-generated articles and undergoes 12 million edits/contributions per month. User-generated content is undoubtedly one of the keys to its success but also a hindrance to good quality. Although Wikipedia has established guidelines for the “perfect article,” authors find it difficult to assert whether their contributions comply with them and reviewers cannot cope with the ever-growing amount of articles pending review. Great efforts have been invested in algorithmic methods for automatic classification of Wikipedia articles (as featured or non-featured) and for quality flaw detection. Instead, our contribution is an interactive tool that combines automatic classification methods and human interaction in a toolkit, whereby experts can experiment with new quality metrics and share them with authors that need to identify weaknesses to improve a particular article. A design study shows that experts are able to effectively create complex quality metrics in a visual analytics environment. In turn, a user study evidences that regular users can identify flaws, as well as high-quality content, based on the inspection of automatic quality scores.

Original language	English
Article number	13
Pages (from-to)	1-42
Number of pages	42
Journal	ACM Transactions on Interactive Intelligent Systems
Volume	9
Issue number	2-3
DOIs	https://doi.org/10.1145/3150973
Publication status	Published - Apr 2019

ASJC Scopus subject areas

Artificial intelligence
Human-computer interaction

Access to Document

10.1145/3150973

Other files and links

http://www.scopus.com/inward/record.url?scp=85065191460&partnerID=8YFLogxK

Cite this

@article{f3d142f4f31b4fbc8a7b55647f3d1c61,

title = "Interactive Quality Analytics of User-generated Content: An Integrated Toolkit for the Case of Wikipedia",

abstract = "Digital libraries and services enable users to access large amounts of data on demand. Yet, quality assessment of information encountered on the Internet remains an elusive open issue. For example, Wikipedia, one of the most visited platforms on the Web, hosts thousands of user-generated articles and undergoes 12 million edits/contributions per month. User-generated content is undoubtedly one of the keys to its success but also a hindrance to good quality. Although Wikipedia has established guidelines for the “perfect article,” authors find it difficult to assert whether their contributions comply with them and reviewers cannot cope with the ever-growing amount of articles pending review. Great efforts have been invested in algorithmic methods for automatic classification of Wikipedia articles (as featured or non-featured) and for quality flaw detection. Instead, our contribution is an interactive tool that combines automatic classification methods and human interaction in a toolkit, whereby experts can experiment with new quality metrics and share them with authors that need to identify weaknesses to improve a particular article. A design study shows that experts are able to effectively create complex quality metrics in a visual analytics environment. In turn, a user study evidences that regular users can identify flaws, as well as high-quality content, based on the inspection of automatic quality scores.",

keywords = "Information quality assessment, Text analytics, User-generated content, Visual analytics, Wikipedia",

author = "{Di Sciascio}, Cecilia and David Strohmaier and Marcelo Errecalde and Eduardo Veas",

year = "2019",

month = apr,

doi = "10.1145/3150973",

language = "English",

volume = "9",

pages = "1--42",

journal = "ACM Transactions on Interactive Intelligent Systems",

issn = "2160-6463",

publisher = "Association for Computing Machinery",

number = "2-3",

}

TY - JOUR

T1 - Interactive Quality Analytics of User-generated Content: An Integrated Toolkit for the Case of Wikipedia

AU - Di Sciascio, Cecilia

AU - Strohmaier, David

AU - Errecalde, Marcelo

AU - Veas, Eduardo

PY - 2019/4

Y1 - 2019/4

N2 - Digital libraries and services enable users to access large amounts of data on demand. Yet, quality assessment of information encountered on the Internet remains an elusive open issue. For example, Wikipedia, one of the most visited platforms on the Web, hosts thousands of user-generated articles and undergoes 12 million edits/contributions per month. User-generated content is undoubtedly one of the keys to its success but also a hindrance to good quality. Although Wikipedia has established guidelines for the “perfect article,” authors find it difficult to assert whether their contributions comply with them and reviewers cannot cope with the ever-growing amount of articles pending review. Great efforts have been invested in algorithmic methods for automatic classification of Wikipedia articles (as featured or non-featured) and for quality flaw detection. Instead, our contribution is an interactive tool that combines automatic classification methods and human interaction in a toolkit, whereby experts can experiment with new quality metrics and share them with authors that need to identify weaknesses to improve a particular article. A design study shows that experts are able to effectively create complex quality metrics in a visual analytics environment. In turn, a user study evidences that regular users can identify flaws, as well as high-quality content, based on the inspection of automatic quality scores.

KW - Information quality assessment

KW - Text analytics

KW - User-generated content

KW - Visual analytics

KW - Wikipedia

UR - http://www.scopus.com/inward/record.url?scp=85065191460&partnerID=8YFLogxK

U2 - 10.1145/3150973

DO - 10.1145/3150973

M3 - Article

SN - 2160-6463

VL - 9

SP - 1

EP - 42

JO - ACM Transactions on Interactive Intelligent Systems

JF - ACM Transactions on Interactive Intelligent Systems

IS - 2-3

M1 - 13

ER -
