TY - GEN
T1 - A Study on Accuracy, Miscalibration, and Popularity Bias in Recommendations
AU - Kowald, Dominik
AU - Mayr, Gregor
AU - Schedl, Markus
AU - Lex, Elisabeth
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Recent research has suggested different metrics to measure the inconsistency of recommendation performance, including the accuracy difference between user groups, miscalibration, and popularity lift. However, a study that relates miscalibration and popularity lift to recommendation accuracy across different user groups is still missing. Additionally, it is unclear if particular genres contribute to the emergence of inconsistency in recommendation performance across user groups. In this paper, we present an analysis of these three aspects of five well-known recommendation algorithms for user groups that differ in their preference for popular content. Additionally, we study how different genres affect the inconsistency of recommendation performance, and how this is aligned with the popularity of the genres. Using data from Last.fm, MovieLens, and MyAnimeList, we present two key findings. First, we find that users with little interest in popular content receive the worst recommendation accuracy, and that this is aligned with miscalibration and popularity lift. Second, our experiments show that particular genres contribute to a different extent to the inconsistency of recommendation performance, especially in terms of miscalibration in the case of the MyAnimeList dataset.
KW - Accuracy
KW - Miscalibration
KW - Popularity bias
KW - Popularity lift
KW - Recommendation inconsistency
KW - Recommender systems
UR - http://www.scopus.com/inward/record.url?scp=85169072189&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-37249-0_1
DO - 10.1007/978-3-031-37249-0_1
M3 - Conference paper
AN - SCOPUS:85169072189
SN - 9783031372483
T3 - Communications in Computer and Information Science
BT - Advances in Bias and Fairness in Information Retrieval - 4th International Workshop, BIAS 2023, Revised Selected Papers
A2 - Boratto, Ludovico
A2 - Marras, Mirko
A2 - Faralli, Stefano
A2 - Stilo, Giovanni
PB - Springer Science and Business Media Deutschland GmbH
T2 - 4th International Workshop on Algorithmic Bias in Search and Recommendation, part of the 45th European Conference on Information Retrieval
Y2 - 2 April 2023
ER -