TY - GEN
T1 - Automatic News Article Generation from Legislative Proceedings
T2 - 9th International Conference on Statistical Language and Speech Processing, SLSP 2021
AU - Klimashevskaia, Anastasiia
AU - Gadgil, Richa
AU - Gerrity, Thomas
AU - Khosmood, Foaad
AU - Gütl, Christian
AU - Howe, Patrick
N1 - Funding Information:
The authors thank the John S. and James L. Knight Foundation. The collaboration between Graz University of Technology and Cal Poly was made possible thanks to funding by the Austrian Marshall Plan Foundation. We also thank the Institute for Advanced Technology and Public Policy, and Ms. Christine Robertson for her valuable insights.
Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Algorithmic journalism refers to automatic AI-constructed news stories. There have been successful commercial implementations for news stories in sports, weather, financial reporting and similar domains with highly structured, well-defined tabular data sources. Other domains, such as local reporting, have not seen adoption of algorithmic journalism, so no automated reporting systems are available in these categories, which can have important implications for the industry. In this paper, we demonstrate a novel approach for producing news stories on government legislative activity, an area that has not widely adopted algorithmic journalism. Our data source is state legislative proceedings, primarily the transcribed speeches and dialogue from floor sessions and committee hearings in US state legislatures. Specifically, we create a library of potential events called phenoms. We systematically analyze the transcripts for the presence of phenoms using a custom partial order planner. Each phenom, if present, contributes some natural language text to the generated article: either stating facts, quoting individuals or summarizing some aspect of the discussion. We evaluate two randomly chosen articles with a user study on Amazon Mechanical Turk with mostly Likert scale questions. Our results indicate a high degree of achievement for accuracy of facts and readability of final content, with 13 of 22 participants for the first article and 19 of 20 participants for the second article agreeing or strongly agreeing that the articles included the most important facts of the hearings. Other results strengthen this finding in terms of accuracy, focus and writing quality.
KW - Algorithmic journalism
KW - Artificial intelligence
KW - Automatic summarization
KW - Digital government
KW - Natural language generation
KW - Partial order planning
UR - http://www.scopus.com/inward/record.url?scp=85118181043&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-89579-2_2
DO - 10.1007/978-3-030-89579-2_2
M3 - Conference paper
AN - SCOPUS:85118181043
SN - 9783030895785
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 15
EP - 26
BT - Statistical Language and Speech Processing - 9th International Conference, SLSP 2021, Proceedings
A2 - Espinosa-Anke, Luis
A2 - Martín-Vide, Carlos
A2 - Spasic, Irena
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 November 2021 through 25 November 2021
ER -