Research with LensKit
LensKit is intended to be particularly useful in recommender systems research.
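A typical LensKit research experiment partitions rating data into train/test splits, trains one or more algorithms, generates top-N recommendations in batch, and scores the lists with ranking metrics. Below is a minimal sketch of that workflow using the classic lkpy batch API (LensKit 0.x); the data path, algorithm choice, and list length are illustrative, and the module layout differs in newer releases.

# Minimal sketch of an offline LensKit experiment (classic lkpy 0.x API;
# newer releases organize these modules differently). Paths and parameters
# are illustrative only.
import pandas as pd
from lenskit import batch, topn
from lenskit.algorithms import Recommender, als
from lenskit.crossfold import partition_users, SampleN
from lenskit.datasets import ML100K

ratings = ML100K('ml-100k').ratings  # assumes the MovieLens 100K files are unpacked in ./ml-100k

all_recs = []
all_test = []
for train, test in partition_users(ratings, 5, SampleN(5)):
    algo = Recommender.adapt(als.BiasedMF(50))  # wrap the scorer so it can produce top-N lists
    algo.fit(train)
    users = test['user'].unique()
    all_recs.append(batch.recommend(algo, users, 10))
    all_test.append(test)

recs = pd.concat(all_recs, ignore_index=True)
truth = pd.concat(all_test, ignore_index=True)

# Score each user's recommendation list against their held-out ratings.
rla = topn.RecListAnalysis()
rla.add_metric(topn.ndcg)
results = rla.compute(recs, truth)
print(results['ndcg'].mean())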
If you use LensKit in published research, please cite:
BibTeX
@inproceedings{LKPY,
  title = {LensKit for Python: Next-Generation Software for Recommender Systems Experiments},
  booktitle = {Proceedings of the 29th ACM International Conference on Information and Knowledge Management},
  doi = {10.1145/3340531.3412778},
  author = {Ekstrand, Michael D.},
  year = {2020},
  month = {Oct},
  extra = {arXiv:1809.03125}
}
We would appreciate it if you sent a copy of your published paper to ekstrand@acm.org, so that we know where LensKit is being used and can add your paper to this list. The following is a list of papers that have used the Python version of LensKit; we maintain a separate list of papers that used the Java version.

@article{diaz_recall_2025, title = {Recall, robustness, and lexicographic evaluation}, volume = {4}, copyright = {Creative Commons Attribution 4.0 International License}, url = {https://dl.acm.org/doi/10.1145/3728373}, doi = {10.1145/3728373}, abstract = {Although originally developed to evaluate sets of items, recall is often used to evaluate rankings of items, including those produced by recommender, retrieval, and other machine learning systems. The application of recall without a formal evaluative motivation has led to criticism of recall as a vague or inappropriate measure. In light of this debate, we reflect on the measurement of recall in rankings from a formal perspective. Our analysis is composed of three tenets: recall, robustness, and lexicographic evaluation. First, we formally define “recall orientation” as the sensitivity of a metric to a user interested in finding every relevant item. Second, we analyze recall orientation from the perspective of robustness with respect to possible content consumers and providers, connecting recall to recent conversations about fair ranking. Finally, we extend this conceptual and theoretical treatment of recall by developing a practical preference-based evaluation method based on lexicographic comparison. Through extensive empirical analysis across multiple recommendation and retrieval tasks, we establish that our new evaluation method, lexirecall, has convergent validity (i.e., it is correlated with existing recall metrics) and exhibits substantially higher sensitivity in terms of discriminative power and stability in the presence of missing labels. Our conceptual, theoretical, and empirical analysis substantially deepens our understanding of recall and motivates its adoption through connections to robustness and fairness.}, number = {1}, urldate = {2025-08-08}, journal = {ACM Trans. Recomm. Syst.}, author = {Diaz, Fernando and Ekstrand, Michael D. and Mitra, Bhaskar}, month = jul, year = {2025}, pages = {13:1--13:50}, }
@inproceedings{pathak_circumventing_2025, address = {New York, NY, USA}, series = {{UMAP} '25}, title = {Circumventing {Misinformation} {Controls}: {Assessing} the {Robustness} of {Intervention} {Strategies} in {Recommender} {Systems}}, isbn = {979-8-4007-1313-2}, shorttitle = {Circumventing {Misinformation} {Controls}}, url = {https://dl.acm.org/doi/10.1145/3699682.3728350}, doi = {10.1145/3699682.3728350}, abstract = {Recommender systems are essential on social media platforms, shaping the order of information users encounter and facilitating news discovery. However, these systems can inadvertently contribute to the spread of misinformation by reinforcing algorithmic biases, fostering excessive personalization, creating filter bubbles, and amplifying false narratives. Recent studies have demonstrated that intervention strategies, such as Virality Circuit Breakers and accuracy nudges, can effectively mitigate misinformation when implemented on top of recommender systems. Despite this, existing literature has yet to explore the robustness of these interventions against circumvention—where individuals or groups intentionally evade or resist efforts to counter misinformation. This research aims to address this gap, examining how well these interventions hold up in the face of circumvention tactics. Our findings highlight that these intervention strategies are generally robust against misinformation circumvention threats when applied on top of recommender systems.}, urldate = {2025-06-16}, booktitle = {Proceedings of the 33rd {ACM} {Conference} on {User} {Modeling}, {Adaptation} and {Personalization}}, publisher = {Association for Computing Machinery}, author = {Pathak, Royal and Spezzano, Francesca}, month = jun, year = {2025}, pages = {279--284}, }
@inproceedings{danesi_estudo_2025, title = {Estudo {Comparativo} de {Bibliotecas} para {Sistemas} de {Recomendação} em {Python}}, copyright = {Copyright (c)}, url = {https://sol.sbc.org.br/index.php/erbd/article/view/35419}, doi = {10.5753/erbd.2025.6883}, abstract = {Este artigo apresenta uma análise de bibliotecas em linguagem Python que implementam algoritmos de Filtragem Colaborativa usados em Sistemas de Recomendação. Por meio de duas bibliotecas, Surprise e LensKit, são explorados os algoritmos K-Nearest Neighbors (K-NN) e Slope One e realizados testes comparativos para avaliar as bibliotecas.}, language = {pt}, urldate = {2025-08-01}, booktitle = {Anais da {XX} {Escola} {Regional} de {Banco} de {Dados}}, publisher = {SBC}, author = {Danesi, Lorenzo Dalla Corte and Lichtnow, Daniel}, month = apr, year = {2025}, note = {ISSN: 2595-413X}, pages = {157--160}, }
@misc{jakobsen_agent-based_2025, title = {Agent-{Based} {Exploration} of {Recommendation} {Systems} in {Misinformation} {Propagation}}, url = {http://arxiv.org/abs/2507.21724}, doi = {10.48550/arXiv.2507.21724}, abstract = {This study uses agent-based modeling to examine the impact of various recommendation algorithms on the propagation of misinformation on online social networks. We simulate a synthetic environment consisting of heterogeneous agents, including regular users, bots, and influencers, interacting through a social network with recommendation systems. We evaluate four recommendation strategies: popularity-based, collaborative filtering, and content-based filtering, along with a random baseline. Our results show that popularity-driven algorithms significantly amplify misinformation, while item-based collaborative filtering and content-based approaches are more effective in limiting exposure to fake content. Item-based collaborative filtering was found to perform better than previously reported in related literature. These findings highlight the role of algorithm design in shaping online information exposure and show that agent-based modeling can be used to gain realistic insight into how misinformation spreads.}, urldate = {2025-08-01}, publisher = {arXiv}, author = {Jakobsen, Lise and Holden, Anna Johanne and Gürcan, Önder and Özgöbek, Özlem}, month = jul, year = {2025}, note = {arXiv:2507.21724 [cs]}, }
@article{dokoupil_sm-rs_2025, title = {{SM}-{RS} 2.0: {User}-perceived {Qualities} of {Single}- and {Multi}-{Objective} {Recommender} {Systems}}, shorttitle = {{SM}-{RS} 2.0}, url = {https://dl.acm.org/doi/10.1145/3754459}, doi = {10.1145/3754459}, abstract = {Recommender systems (RS) rely on interaction data between users and items to generate effective results. Originally, RS aimed solely at predicting items’ relevance, but additional (beyond-relevance) quality criteria gained increased attention over time. Objectives such as diversity, novelty, fairness, or serendipity are nowadays at the center of RS research and also among the core components in production systems. Naturally, to properly steer towards such objectives, the system has to gain an understanding of how the users perceive these objectives, to what extent they require them in the recommendations, and how they evaluate the sufficiency of the results w.r.t. these objectives. However, so far, there is no publicly available dataset that would capture all the necessary knowledge. This results in a half-blind algorithmic design and evaluation, where the importance of individual objectives or the metrics for their evaluation cannot be validated from the users’ perspective. To address this issue, we present SM-RS 2.0, an expansion of the original single- and multi-objective recommendations dataset. The dataset links the self-declared propensity towards individual objectives with impressions, item selections, and explicit evaluation of individual quality criteria. Together with the dataset, we also distribute an evaluation framework containing six rather unique tasks that are rarely available to conduct on existing RS datasets. These include impression-aware click prediction, predicting propensity towards individual objectives, construction of proportional recommendations, and predicting the user-perceived fulfillment of individual objectives as well as their overall satisfaction. The dataset is available at https://osf.io/wsakx.}, urldate = {2025-07-27}, journal = {ACM Trans. Recomm. Syst.}, author = {Dokoupil, Patrik and Peska, Ladislav}, month = jul, year = {2025}, note = {Just Accepted}, }
@inproceedings{vente_potential_2025, address = {New York, NY, USA}, series = {{UMAP} {Adjunct} '25}, title = {The {Potential} of {AutoML} for {Recommender} {Systems}}, isbn = {979-8-4007-1399-6}, url = {https://doi.org/10.1145/3708319.3734173}, doi = {10.1145/3708319.3734173}, abstract = {Automated Machine Learning (AutoML) has significantly advanced Machine Learning (ML) applications, including model compression, machine translation, and computer vision. Recommender Systems (RecSys) can be seen as an application of ML. Yet AutoML has received little attention from the RecSys community, and RecSys has not received notable attention from the AutoML community. Only a few relatively simple Automated Recommender Systems (AutoRecSys) libraries exist that adopt AutoML techniques. However, these libraries are based on student projects and do not offer the features and thorough development of AutoML libraries. We set out to determine how AutoML libraries perform in the scenario of an inexperienced user who wants to implement a recommender system. We compared the predictive performance of 60 AutoML, AutoRecSys, ML, and RecSys algorithms from 15 libraries, including a mean predictor baseline, on 14 explicit feedback RecSys datasets. We found that AutoML and AutoRecSys libraries performed best. AutoML libraries performed best for six of the 14 datasets (43\%), but the same AutoML library did not always perform best. The single-best library was the AutoRecSys library Auto-Surprise, which performed best on five datasets (36\%). On three datasets (21\%), AutoML libraries performed poorly, and RecSys libraries with default parameters performed best. Although while obtaining 50\% of all placements in the top five per dataset, RecSys algorithms fall behind AutoML on average. ML algorithms generally performed the worst.}, urldate = {2025-07-18}, booktitle = {Adjunct {Proceedings} of the 33rd {ACM} {Conference} on {User} {Modeling}, {Adaptation} and {Personalization}}, publisher = {Association for Computing Machinery}, author = {Vente, Tobias and Wegmeth, Lukas and Beel, Joeran}, month = jun, year = {2025}, pages = {371--378}, }
@inproceedings{smucker_extending_2025, address = {New York, NY, USA}, series = {{SIGIR} '25}, title = {Extending {MovieLens}-{32M} to {Provide} {New} {Evaluation} {Objectives}}, isbn = {979-8-4007-1592-1}, url = {https://dl.acm.org/doi/10.1145/3726302.3730328}, doi = {10.1145/3726302.3730328}, abstract = {Offline evaluation of recommender systems has traditionally treated the problem as a machine learning problem. In the classic case of recommending movies, where the user has provided explicit ratings of which movies they like and don't like, each user's ratings are split into test and train sets, and the evaluation task becomes to predict the held out test data using the training data. This machine learning style of evaluation makes the objective to recommend the movies that a user has watched and rated highly, which is not the same task as helping the user find movies that they would enjoy if they watched them. This mismatch in objective between evaluation and task is a compromise to avoid the cost of asking a user to evaluate recommendations by watching each movie. As a resource available for download, we offer an extension to the MovieLens-32M dataset that provides for new evaluation objectives. Our primary objective is to predict the movies that a user would be interested in watching, i.e. predict their watchlist. To construct this extension, we recruited MovieLens users, collected their profiles, made recommendations with a diverse set of algorithms, pooled the recommendations, and had the users assess the pools. This paper demonstrates the feasibility of using pooling to construct a test collection for recommender systems. Notably, we found that the traditional machine learning style of evaluation ranks the Popular algorithm, which recommends movies based on total number of ratings in the system, in the middle of the twenty-two recommendation runs we used to build the pools. In contrast, when we rank the runs by users' interest in watching movies, we find that recommending popular movies as a recommendation algorithm becomes one of the worst performing runs. It appears that by asking users to assess their personal recommendations, we can alleviate the issue of popularity bias in the evaluation of top-n recommendation.}, urldate = {2025-07-18}, booktitle = {Proceedings of the 48th {International} {ACM} {SIGIR} {Conference} on {Research} and {Development} in {Information} {Retrieval}}, publisher = {Association for Computing Machinery}, author = {Smucker, Mark D. and Chamani, Houmaan}, month = jul, year = {2025}, pages = {3520--3529}, }
@inproceedings{mancino_datarec_2025, address = {New York, NY, USA}, series = {{SIGIR} '25}, title = {{DataRec}: {A} {Python} {Library} for {Standardized} and {Reproducible} {Data} {Management} in {Recommender} {Systems}}, isbn = {979-8-4007-1592-1}, shorttitle = {{DataRec}}, url = {https://dl.acm.org/doi/10.1145/3726302.3730320}, doi = {10.1145/3726302.3730320}, abstract = {Recommender systems have demonstrated a significant impact across diverse domains, yet ensuring the reproducibility of experimental findings remains a persistent challenge. A primary obstacle lies in the fragmented and often opaque data management strategies employed during the preprocessing stage, where decisions about dataset selection, filtering, and splitting can substantially influence outcomes. To address these limitations, we introduce DataRec, an open-source Python-based library specifically designed to unify and streamline data handling in recommender system research. By providing reproducible routines for dataset preparation, data versioning, and seamless integration with other frameworks, DataRec promotes methodological standardization, interoperability, and comparability across different experimental setups. Our design is informed by an in-depth review of 55 state-of-the-art recommendation studies, ensuring that DataRec adopts best practices while addressing common pitfalls in data management. Ultimately, our contribution facilitates fair benchmarking, enhances reproducibility, and fosters greater trust in experimental results within the broader recommender systems community. The DataRec library, documentation, and examples are freely available at https://github.com/sisinflab/DataRec.}, urldate = {2025-07-18}, booktitle = {Proceedings of the 48th {International} {ACM} {SIGIR} {Conference} on {Research} and {Development} in {Information} {Retrieval}}, publisher = {Association for Computing Machinery}, author = {Mancino, Alberto Carlo Maria and Bufi, Salvatore and Di Fazio, Angela and Ferrara, Antonio and Malitesta, Daniele and Pomo, Claudio and Di Noia, Tommaso}, month = jul, year = {2025}, pages = {3478--3487}, }
@inproceedings{vaez_barenji_user_2025, address = {New York, NY, USA}, series = {{UMAP} {Adjunct} '25}, title = {User and {Recommender} {Behavior} {Over} {Time}: {Contextualizing} {Activity} {Effectiveness} {Diversity} and {Fairness} in {Book} {Recommendation}}, isbn = {979-8-4007-1399-6}, shorttitle = {User and {Recommender} {Behavior} {Over} {Time}}, url = {https://dl.acm.org/doi/10.1145/3708319.3733710}, doi = {10.1145/3708319.3733710}, abstract = {Data is an essential resource for studying recommender systems. While there has been significant work on improving and evaluating state-of-the-art models and measuring various properties of recommender system outputs, less attention has been given to the data itself, particularly how data has changed over time. Such documentation and analysis provide guidance and context for designing and evaluating recommender systems, particularly for evaluation designs making use of time (e.g., temporal splitting). In this paper, we present a temporal explanatory analysis of the UCSD Book Graph dataset scraped from Goodreads, a social reading and recommendation platform active since 2006. We measure the book interaction data using a set of activity, diversity, and fairness metrics; we then train a set of collaborative filtering algorithms on rolling training windows to observe how the same measures evolve over time in the recommendations. Additionally, we explore whether the introduction of algorithmic recommendations in 2011 was followed by observable changes in user or recommender system behavior.}, urldate = {2025-06-22}, booktitle = {Adjunct {Proceedings} of the 33rd {ACM} {Conference} on {User} {Modeling}, {Adaptation} and {Personalization}}, publisher = {Association for Computing Machinery}, author = {Vaez Barenji, Samira and Parajuli, Sushobhan and Ekstrand, Michael D.}, month = jun, year = {2025}, pages = {280--287}, }
@misc{dilworth_privacy_2025, title = {Privacy {Preservation} through {Practical} {Machine} {Unlearning}}, url = {http://arxiv.org/abs/2502.10635}, doi = {10.48550/arXiv.2502.10635}, abstract = {Machine Learning models thrive on vast datasets, continuously adapting to provide accurate predictions and recommendations. However, in an era dominated by privacy concerns, Machine Unlearning emerges as a transformative approach, enabling the selective removal of data from trained models. This paper examines methods such as Naive Retraining and Exact Unlearning via the SISA framework, evaluating their Computational Costs, Consistency, and feasibility using the \${\textbackslash}texttt\{HSpam14\}\$ dataset. We explore the potential of integrating unlearning principles into Positive Unlabeled (PU) Learning to address challenges posed by partially labeled datasets. Our findings highlight the promise of unlearning frameworks like \${\textbackslash}textit\{DaRE\}\$ for ensuring privacy compliance while maintaining model performance, albeit with significant computational trade-offs. This study underscores the importance of Machine Unlearning in achieving ethical AI and fostering trust in data-driven systems.}, urldate = {2025-05-22}, publisher = {arXiv}, author = {Dilworth, Robert}, month = feb, year = {2025}, note = {arXiv:2502.10635 [cs]}, }
@inproceedings{slokom_how_2025, address = {Cham}, title = {How to {Diversify} any {Personalized} {Recommender}?}, isbn = {978-3-031-88717-8}, doi = {10.1007/978-3-031-88717-8_23}, abstract = {In this paper, we introduce a novel approach to improve the diversity of Top-N recommendations while maintaining accuracy. Our approach employs a user-centric pre-processing strategy aimed at exposing users to a wide array of content categories and topics. We personalize this strategy by selectively adding and removing a percentage of interactions from user profiles. This personalization ensures we remain closely aligned with user preferences while gradually introducing distribution shifts. Our pre-processing technique offers flexibility and can seamlessly integrate into any recommender architecture. We run extensive experiments on two publicly available data sets for news and book recommendations to evaluate our approach. We test various standard and neural network-based recommender system algorithms. Our results show that our approach generates diverse recommendations, ensuring users are exposed to a wider range of items. Furthermore, using pre-processed data for training leads to recommender systems achieving performance levels comparable to, and in some cases, better than those trained on original, unmodified data. Additionally, our approach promotes provider fairness by facilitating exposure to minority categories. (Our GitHub code is available at: https://github.com/SlokomManel/How-to-Diversify-any-Personalized-Recommender-).}, language = {en}, booktitle = {Advances in {Information} {Retrieval}}, publisher = {Springer Nature Switzerland}, author = {Slokom, Manel and Daniil, Savvina and Hollink, Laura}, editor = {Hauff, Claudia and Macdonald, Craig and Jannach, Dietmar and Kazai, Gabriella and Nardini, Franco Maria and Pinelli, Fabio and Silvestri, Fabrizio and Tonellotto, Nicola}, year = {2025}, pages = {307--323}, }
@phdthesis{lansman_using_2025, type = {Ph.{D}. {Dissertation}}, title = {Using emotion diversification based on movie reviews to improve the user experience of movie recommender systems}, url = {https://www.proquest.com/docview/3196617694}, abstract = {Movies are made with the intention of evoking an emotional response. In recent years, researchers have hypothesized that the emotional response evoked by a movie can be leveraged to augment recommender system algorithms. In this work, we demonstrate that emotion diversification improves the user experience of a movie recommender system. We augmented the 10M MovieLens dataset with values of the eight dimensions of Plutchik’s wheel of emotions by leveraging an emotion analysis method that extracts these eight dimensions from movie reviews on IMDB to form an ’emotional signature’. Based on the finding of Mokryn et al. (October 2020) that showed that a film’s emotional signature reflects the emotions the film elicits in viewers, we used each movie’s emotional signature to diversify the output of our recommender algorithm. We tested this novel emotion diversification method against an existing latent diversification method and a baseline version without diversification in an online user experiment with a custom-built movie recommender system. We also tested two different types of visualization, a graph view against a baseline of a list view, as the graph view would increase user understandability regarding the reason behind the recommended items provided. The results of this study show that the emotion diversification method significantly improves the user experience of the movie recommender system, surpassing both the baseline system and the latent diversification method in terms of perceived taste coverage and system satisfaction without significantly reducing the perceived recommendation quality or increasing the trade-off difficulty. Going beyond the traditional rating and/or interaction data used by traditional recommender systems, our work demonstrates the user experience benefits of extracting emotional data from rich, qualitative user feedback and using it to give users a more emotionally diverse set of recommendations.}, language = {English}, author = {Lansman, Lior}, year = {2025}, note = {ISBN: 9798311930970 Pages: 83}, keywords = {0489:Information Technology, Decision making, Emotions, Information technology, Motion pictures, Recommender systems, User behavior}, }
@article{daniil_challenges_2025, title = {On the challenges of studying bias in {Recommender} {Systems}: {The} effect of data characteristics and algorithm configuration}, volume = {1}, copyright = {Copyright (c) 2025 Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia Liem, Jacco van Ossenbruggen, Laura Hollink (Author)}, issn = {3050-9114}, shorttitle = {On the challenges of studying bias in {Recommender} {Systems}}, url = {https://irrj.org/article/view/19607}, doi = {10.54195/irrj.19607}, abstract = {Statements on the propagation of bias by recommender systems are often hard to verify or falsify. Research on bias tends to draw from a small pool of publicly available datasets and is therefore bound by their specific properties. Additionally, implementation choices are often not explicitly described or motivated in research, while they may have an effect on bias propagation. In this paper, we explore the challenges of measuring and reporting popularity bias. We showcase the impact of data properties and algorithm configurations on popularity bias by combining real and synthetic data with well known recommender systems frameworks. First, we identify data characteristics that might impact popularity bias, and explore their presence in a set of available online datasets. Accordingly, we generate various datasets that combine these characteristics. Second, we locate algorithm configurations that vary across implementations in literature. We evaluate popularity bias for a number of datasets, three real and five synthetic, and configurations, and offer insights on their joint effect. We find that, depending on the data characteristics, various configurations of the algorithms examined can lead to different conclusions regarding the propagation of popularity bias. These results motivate the need for explicitly addressing algorithmic configuration and data properties when reporting and interpreting bias in recommender systems.}, language = {en}, number = {1}, urldate = {2025-04-15}, journal = {Information Retrieval Research}, author = {Daniil, Savvina and Slokom, Manel and Cuper, Mirjam and Liem, Cynthia and Ossenbruggen, Jacco van and Hollink, Laura}, month = feb, year = {2025}, note = {Number: 1}, pages = {3--27}, }
@mastersthesis{arabzadeh_optimal_2025, title = {Optimal {Dataset} {Size} for {Recommender} {Systems}: {Evaluating} {Algorithms}' {Performance} via {Downsampling}}, shorttitle = {Optimal {Dataset} {Size} for {Recommender} {Systems}}, url = {http://arxiv.org/abs/2502.08845}, abstract = {The analysis reveals that algorithm performance under different downsampling portions is influenced by factors such as dataset characteristics, algorithm complexity, and the specific downsampling configuration (scenario dependent). In particular, some algorithms, which generally showed lower absolute nDCG@10 scores compared to those that performed better, exhibited lower sensitivity to the amount of training data provided, demonstrating greater potential to achieve optimal efficiency in lower downsampling portions. For instance, on average, these algorithms retained ∼81\% of their full-size performance when using only 50\% of the training set. In certain configurations of the downsampling method, where the focus was on progressively involving more users while keeping the test set fixed in size, they even demonstrated higher nDCG@10 scores than when using the original full-size dataset. These findings underscore the feasibility of balancing sustainability and effectiveness, providing practical insights for designing energy-efficient recommender systems and advancing sustainable AI practices.}, language = {en}, urldate = {2025-04-15}, school = {University of Siegen}, author = {Arabzadeh, Ardalan}, month = feb, year = {2025}, note = {arXiv:2502.08845 [cs]}, }
@article{akhadam_comparative_2025, title = {A {Comparative} {Evaluation} of {Recommender} {Systems} {Tools}}, volume = {13}, issn = {2169-3536}, url = {https://ieeexplore.ieee.org/abstract/document/10879478}, doi = {10.1109/ACCESS.2025.3541014}, abstract = {Due to the vast flow of information on the Internet, easy and effective access to information has become crucial. Recommender systems are important in information filtering, as they significantly impact large-scale internet web services such as YouTube, Netflix, and Amazon. As the demand for personalized recommendations continues to grow, researchers and practitioners alike strive to develop tools specifically designed for this purpose to meet the increasing need. In this work, we address the challenges associated with selecting software frameworks and Machine Learning (ML) algorithms for Recommender Systems (RSs), thus, we offer a detailed comparison of 42 open-source RS software to provide insights into their different features and capabilities. Furthermore, the paper presents a concise overview of various ML algorithms to generate recommendations, reviews the most used performance metrics to evaluate RS, and then compares several ML algorithms provided by four popular recommendation tools: Microsoft Recommenders, Lenskit, Turi Create, and Cornac.}, urldate = {2025-04-15}, journal = {IEEE Access}, author = {Akhadam, Ayoub and Kbibchi, Oumayma and Mekouar, Loubna and Iraqi, Youssef}, year = {2025}, pages = {29493--29522}, }
@phdthesis{danesi_um_2024, title = {Um estudo sobre bibliotecas para sistemas de recomendação em {Python}}, copyright = {Acesso Aberto}, url = {http://repositorio.ufsm.br/handle/1/33964}, abstract = {This paper presents a study on recommendation systems, with an emphasis on the analysis and implementation of algorithms using Python libraries for the Collaborative Filtering approach. Identifying the relevance of personalized recommendations in various applications, this research explores algorithms available for the development of such systems, using libraries as tools that facilitate their implementation. In particular, libraries implemented in the Python programming language are examined in the context of recommendation systems, such as Surprise and LensKit for Python (LKPY), presenting the functioning of their main algorithms, K -Nearest Neighbors (K-NN) and Slope One. Thus, the theoretical analysis of these tools is complemented by practical implementation and application in a real scenario demonstrating the performance and applicability of the libraries.}, language = {por}, urldate = {2025-05-22}, school = {Universidade Federal de Santa Maria}, author = {Danesi, Lorenzo Dalla Corte}, month = dec, year = {2024}, note = {Accepted: 2025-01-28T16:09:12Z Publisher: Universidade Federal de Santa Maria}, }
@article{lopes_recommendations_2024, title = {Recommendations with minimum exposure guarantees: a post-processing framework}, volume = {236}, issn = {0957-4174}, shorttitle = {Recommendations with minimum exposure guarantees}, url = {https://www.sciencedirect.com/science/article/pii/S0957417423016664}, doi = {10.1016/j.eswa.2023.121164}, abstract = {Relevance-based ranking is a popular ingredient in recommenders, but it frequently struggles to meet fairness criteria because social and cultural norms may favor some item groups over others. For instance, some items might receive lower ratings due to some sort of bias (e.g. gender bias). A fair ranking should balance the exposure of items from advantaged and disadvantaged groups. To this end, we propose a novel post-processing framework to produce fair, exposure-aware recommendations. Our approach is based on an integer linear programming model maximizing the expected utility while satisfying a minimum exposure constraint. The model has fewer variables than previous work and thus can be deployed to larger datasets and allows the organization to define a minimum level of exposure for groups of items. We conduct an extensive empirical evaluation indicating that our new framework can increase the exposure of items from disadvantaged groups at a small cost of recommendation accuracy.}, urldate = {2023-09-19}, journal = {Expert Systems with Applications}, author = {Lopes, Ramon and Alves, Rodrigo and Ledent, Antoine and Santos, Rodrygo L. T. and Kloft, Marius}, month = feb, year = {2024}, keywords = {Exposure, Fairness, Integer linear programming, Recommender systems, to-read}, pages = {121164}, }
@article{chamani_test_2024, title = {A {Test} {Collection} for {Offline} {Evaluation} of {Recommender} {Systems}}, url = {https://hdl.handle.net/10012/21175}, abstract = {Recommendation systems have long been evaluated by collecting a large number of individuals' ratings for items, and then dividing these ratings into test and train sets to see how well recommendation algorithms can predict individuals' preferences. A complaint about this approach is that the evaluation measures can only use a small number of known preferences and have no information about the majority of recommended items. Prior research has shown that offline evaluation of recommendation systems using a test/train split methodology may not agree with actual user preferences when all recommended items are judged by the user. To address this issue, we apply traditional information retrieval test collection construction techniques for movie recommendations. An information retrieval test collection is composed of documents, search topics, and relevance judgments that tell us which documents are relevant for each topic. For our test collection, each search topic is an individual who is looking for movies to watch. In other words, while the search topic is always ``Please recommend me movies that I will be interested in watching,'' the context of the search topic changes to be the individual who is requesting the recommendations. When document collections are too large to be completely judged by assessors, the traditional approach is to use pooling. We followed this same approach in the construction of our test collection. For each individual, we used their existing profile of rated movies as input to a wide range of recommendation algorithms to produce recommendations for movies not found in their profile. We then pooled these recommendations separately for each person and asked them to rate the movies. In addition to rating, we also had each individual rate a random sample of movies selected from their ratings profile to measure their consistency in rating. The resulting new test collection consists of 51 individual ratings profiles totaling 123,104 ratings and 31,236 relevance judgments. In this thesis, we detail the creation of the test collection and provide an analysis of the individuals that comprise its search topics, and we analyze the collection's relevance judgments as well as other aspects.}, language = {en}, urldate = {2024-11-22}, author = {Chamani, Houmaan}, month = nov, year = {2024}, note = {Publisher: University of Waterloo}, keywords = {⛔ No DOI found}, }
@inproceedings{baumgart_e-fold_2024, title = {e-{Fold} {Cross}-{Validation} for {Recommender}-{System} {Evaluation}}, url = {https://isg.beel.org/pubs/2024-e-folds-recsys-baumgart.pdf}, abstract = {To combat the rising energy consumption of recommender systems we implement a novel alternative for k-fold cross validation. This alternative, named e-fold cross validation, aims to minimize the number of folds to achieve a reduction in power usage while keeping the reliability and robustness of the test results high. We tested our method on 5 recommender system algorithms across 6 datasets and compared it with 10-fold cross validation. On average e-fold cross validation only needed 41.5\% of the energy that 10-fold cross validation would need, while it’s results only differed by 1.81\%. We conclude that e-fold cross validation is a promising approach that has the potential to be an energy efficient but still reliable alternative to k-fold cross validation.}, language = {en}, booktitle = {First {International} {Workshop} on {Recommender} {Systems} for {Sustainability} and {Social} {Good} ({RecSoGood})}, author = {Baumgart, Moritz and Wegmeth, Lukas and Vente, Tobias and Beel, Joeran}, month = oct, year = {2024}, keywords = {⛔ No DOI found}, }
@inproceedings{pathak_advancing_2024, address = {New York, NY, USA}, series = {{CIKM} '24}, title = {Advancing {Misinformation} {Awareness} in {Recommender} {Systems} for {Social} {Media} {Information} {Integrity}}, isbn = {9798400704369}, url = {https://dl.acm.org/doi/10.1145/3627673.3680259}, doi = {10.1145/3627673.3680259}, abstract = {Recommender systems play an essential role in determining the content users encounter on social media platforms and in uncovering relevant news. However, they also present significant risks, such as reinforcing biases, over-personalizing content, fostering filter bubbles, and inadvertently promoting misinformation. The spread of false information is rampant across various online platforms, such as Twitter (now X), Meta, and TikTok, especially noticeable during events like the COVID-19 pandemic and the US Presidential elections. These instances underscore the critical necessity for transparency and regulatory oversight in the development of recommender systems. Given the challenge of balancing free speech with the risks of outright removal of fake news, this paper aims to address the spread of misinformation from algorithmic biases in recommender systems using a social science perspective.}, urldate = {2024-11-04}, booktitle = {Proceedings of the 33rd {ACM} {International} {Conference} on {Information} and {Knowledge} {Management}}, publisher = {Association for Computing Machinery}, author = {Pathak, Royal}, month = oct, year = {2024}, pages = {5471--5474}, }
@misc{arabzadeh_green_2024, title = {Green {Recommender} {Systems}: {Optimizing} {Dataset} {Size} for {Energy}-{Efficient} {Algorithm} {Performance}}, shorttitle = {Green {Recommender} {Systems}}, url = {http://arxiv.org/abs/2410.09359}, doi = {10.48550/arXiv.2410.09359}, abstract = {As recommender systems become increasingly prevalent, the environmental impact and energy efficiency of training large-scale models have come under scrutiny. This paper investigates the potential for energy-efficient algorithm performance by optimizing dataset sizes through downsampling techniques in the context of Green Recommender Systems. We conducted experiments on the MovieLens 100K, 1M, 10M, and Amazon Toys and Games datasets, analyzing the performance of various recommender algorithms under different portions of dataset size. Our results indicate that while more training data generally leads to higher algorithm performance, certain algorithms, such as FunkSVD and BiasedMF, particularly with unbalanced and sparse datasets like Amazon Toys and Games, maintain high-quality recommendations with up to a 50\% reduction in training data, achieving nDCG@10 scores within approximately 13\% of full dataset performance. These findings suggest that strategic dataset reduction can decrease computational and environmental costs without substantially compromising recommendation quality. This study advances sustainable and green recommender systems by providing insights for reducing energy consumption while maintaining effectiveness.}, urldate = {2024-10-17}, publisher = {arXiv}, author = {Arabzadeh, Ardalan and Vente, Tobias and Beel, Joeran}, month = oct, year = {2024}, note = {Presented at International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood)}, }
@techreport{silva_aprimorando_2024, address = {Ouro Preto, BR}, type = {Bachelor {Thesis}}, title = {Aprimorando a instalação e a configuração de experimentos do {RecSysExp}.}, url = {http://www.monografias.ufop.br/handle/35400000/6571}, abstract = {The paper presents significant enhancements to the RecSysExp framework, used for conducting experiments in recommendation systems. These improvements were aimed at enhancing the usability, scalability, and readability of the system. The new functionalities cover three distinct areas: the development of a graphical user interface, the encapsulation of the framework using Docker, and the restructuring of a class for more cohesive integration with datasets, following established design patterns. The primary goal was to enhance the value provided by the framework, aligned with the vision of its creators, aiming at its use as an academic tool in classroom or research environments. The methodological approach adopted employed specific technologies for each addressed context. For the creation of the user interface, React and Next.js frontend frameworks were employed, while Dockerfile and docker-compose were used for the encapsulation of RecSysExp. Finally, the modification of the class responsible for datasets was carried out following the Template Method design pattern. The project successfully achieved all proposed objectives. The implementation of a container structure simplified the installation of the system, while improvements in the visualization of configurations made experiment creation more intuitive. Additionally, the ability to upload files expanded user options. Although the final version of RecSysExp functions similarly to its original iteration, the additions from this work resulted in an enhanced and more user-friendly version. However, it is important to note that configuration through the graphical interface has limitations, as it is only possible to configure algorithms and modules that can be instantiated via configuration files in the framework. Algorithms and modules implemented solely as libraries in other projects cannot be configured via the frontend.}, language = {pt\_BR}, urldate = {2024-10-11}, institution = {Universidade Federal de Ouro Preto}, author = {Silva, San Cunha da}, year = {2024}, note = {Accepted: 2024-02-29T14:36:20Z}, }
@mastersthesis{stijger_active_2024, address = {Utrecht, NL}, title = {Active learning in recommender systems for predicting vulnerabilities in software}, copyright = {CC-BY-NC-ND}, url = {https://studenttheses.uu.nl/handle/20.500.12932/45783}, abstract = {Due to a rapid advancement of digital technology and growing reliance on the internet, cybersecurity has become a paramount issue for individuals, organizations, and governments. To address this challenge, penetration testing has emerged as a critical tool to ensure the security of computer systems and networks. The reconnaissance phase of penetration testing plays a crucial role in identifying vulnerabilities in a system by gathering relevant information. Although various tools are available to automate this process, most of them are limited to identifying reported vulnerabilities, and they do not provide suggestions or predictions about vulnerabilities. Therefore, this research aims to investigate the application of recommender systems to predict common vulnerabilities during the reconnaissance phase. The main objective of this research is to investigate how active learning affects the performance of a recommender system to identify vulnerabilities in software products. Item-Based k-NN Collaborative Filtering, a recommender system, can improve the identification of potential vulnerabilities and the effectiveness of penetration testing by analyzing information from similar data points. This research involves a comprehensive data preprocessing phase, which utilizes data from the National Vulnerability Database (NVD). Several recommender systems are built using this data, which enables the prediction of potential vulnerabilities during the reconnaissance phase of penetration testing. The performances of these recommender systems are evaluated, and the topperforming recommender system implements active learning to enhance its performance. The findings of this research demonstrate that Item-Based k-NN Collaborative Filtering outperforms other recommender systems in terms of overall performance when it comes to identifying software vulnerabilities. Furthermore, when compared to Item-Based k-NN Collaborative Filtering prior to active learning or with active learning and a random sampling technique, Item-Based k-NN Collaborative Filtering with active learning incorporating a 4- or 10-batch sampling technique with 20 or 40 items added yields a statistically significant improvement in the precision score. This indicates that a greater proportion of the predicted vulnerabilities are correct. Item-Based k-NN Collaborative Filtering with active learning and a single-batch sampling strategy only results in a statistically significant improvement in precision, compared to Item-Based k-NN Collaborative Filtering prior active learning or with active learning and a random sampling technique, when 20 items are added instead of 40. Furthermore, only Item-Based k-NN Collaborative Filtering with a 10-batch sampling strategy adding 20 items demonstrated a statistically significant improvement in nDCG scores compared to Item-Based k-NN Collaborative Filtering prior to active learning. This implies a more accurate ranking of the vulnerabilities. However, this could potentially be a type I error. From these findings, it can be concluded that introducing active learning in Item-Based k-NN Collaborative Filtering, using the approaches outlined, leads to significant improvement in precision score but not necessarily in nDCG score. 
Considering this conclusion, it is advised to use Item-Based k-NN Collaborative Filtering with active learning to predict vulnerabilities in software products and enhance the reconnaissance phase of penetration testing. This can be achieved by incorporating a single-batch sampling technique with 20 items added or a 4- or 10-batch sampling technique with 20 or 40 added. The insights gained from this research can help individuals, organizations, and governments strengthen their cybersecurity defences and protect against potential cyber threats.}, language = {EN}, urldate = {2024-10-11}, school = {Utrecht University}, author = {Stijger, Elise}, year = {2024}, note = {Accepted: 2024-01-06T00:01:00Z}, }
@misc{daniil_challenges_2024, title = {On the challenges of studying bias in recommender systems: a {UserKNN} case study}, shorttitle = {On the challenges of studying bias in {Recommender} {Systems}}, url = {http://arxiv.org/abs/2409.08046}, doi = {10.48550/arXiv.2409.08046}, abstract = {Statements on the propagation of bias by recommender systems are often hard to verify or falsify. Research on bias tends to draw from a small pool of publicly available datasets and is therefore bound by their specific properties. Additionally, implementation choices are often not explicitly described or motivated in research, while they may have an effect on bias propagation. In this paper, we explore the challenges of measuring and reporting popularity bias. We showcase the impact of data properties and algorithm configurations on popularity bias by combining synthetic data with well known recommender systems frameworks that implement UserKNN. First, we identify data characteristics that might impact popularity bias, based on the functionality of UserKNN. Accordingly, we generate various datasets that combine these characteristics. Second, we locate UserKNN configurations that vary across implementations in literature. We evaluate popularity bias for five synthetic datasets and five UserKNN configurations, and offer insights on their joint effect. We find that, depending on the data characteristics, various UserKNN configurations can lead to different conclusions regarding the propagation of popularity bias. These results motivate the need for explicitly addressing algorithmic configuration and data properties when reporting and interpreting bias in recommender systems.}, urldate = {2024-09-25}, publisher = {arXiv}, author = {Daniil, Savvina and Slokom, Manel and Cuper, Mirjam and Liem, Cynthia C. S. and van Ossenbruggen, Jacco and Hollink, Laura}, month = sep, year = {2024}, note = {Presented at FAccTRec 2024}, }
@inproceedings{wegmeth_recommender_2024, title = {Recommender systems algorithm selection for ranking prediction on implicit feedback datasets}, url = {http://arxiv.org/abs/2409.05461}, doi = {10.1145/3640457.3691718}, abstract = {The recommender systems algorithm selection problem for ranking prediction on implicit feedback datasets is under-explored. Traditional approaches in recommender systems algorithm selection focus predominantly on rating prediction on explicit feedback datasets, leaving a research gap for ranking prediction on implicit feedback datasets. Algorithm selection is a critical challenge for nearly every practitioner in recommender systems. In this work, we take the first steps toward addressing this research gap. We evaluate the NDCG@10 of 24 recommender systems algorithms, each with two hyperparameter configurations, on 72 recommender systems datasets. We train four optimized machine-learning meta-models and one automated machine-learning meta-model with three different settings on the resulting meta-dataset. Our results show that the predictions of all tested meta-models exhibit a median Spearman correlation ranging from 0.857 to 0.918 with the ground truth. We show that the median Spearman correlation between meta-model predictions and the ground truth increases by an average of 0.124 when the meta-model is optimized to predict the ranking of algorithms instead of their performance. Furthermore, in terms of predicting the best algorithm for an unknown dataset, we demonstrate that the best optimized traditional meta-model, e.g., XGBoost, achieves a recall of 48.6\%, outperforming the best tested automated machine learning meta-model, e.g., AutoGluon, which achieves a recall of 47.2\%.}, urldate = {2024-09-25}, booktitle = {{RecSys} '24 {Late}-{Breaking} {Results}}, author = {Wegmeth, Lukas and Vente, Tobias and Beel, Joeran}, month = sep, year = {2024}, note = {arXiv:2409.05461 [cs]}, }
@phdthesis{michiels_methodologies_2024, address = {Antwerp}, title = {Methodologies to evaluate recommender systems}, url = {https://hdl.handle.net/10067/2080040151162165141}, abstract = {In the current digital landscape, recommender systems play a pivotal role in shaping users' online experiences by providing personalized recommendations for relevant products, news articles, media content, and more. Their pervasive use makes the thorough evaluation of these systems of paramount importance. This dissertation addresses two key challenges in the evaluation of recommender systems. Part II of the dissertation focuses on improving methodologies for offline evaluation. Offline evaluation is a prevalent method for assessing recommendation algorithms in both academia and industry. Despite its widespread use, offline evaluations often suffer from methodological flaws that undermine their validity and real-world impact. This dissertation makes three key contributions to improving the reliability, internal and ecological validity, replicability, reproducibility, and reusability of offline evaluations. First, it presents an extensive review of the current state of practice and knowledge in offline evaluation, proposing a comprehensive set of better practices to address the reliability, replicability, and validity of offline evaluations. Next, it introduces RecPack, an open-source experimentation toolkit designed to facilitate reliable, reproducible, and reusable offline evaluations. Finally, it presents RecPack Tests, a test suite designed to ensure the correctness of recommendation algorithm implementations, thereby enhancing the reliability of offline evaluations. Part III of the dissertation examines the measurement of filter bubbles and serendipity. Both concepts have garnered significant attention due to concerns about the potential negative impacts of recommender systems on users of online platforms. One concern is that personalized content, especially on news and media platforms, may lock users into prior beliefs, contributing to increased polarization in society. Another concern is that exposure only to content previously expressed interest in may lead to boredom and eliminate surprise, preventing users from experiencing serendipity. This research makes three contributions to the study of filter bubbles and serendipity. First, it proposes an operational definition of technological filter bubbles, clarifying the ambiguity surrounding the concept. Second, it introduces a regression model for measuring their presence and strength in news recommendations, providing practitioners with the tools to rigorously study filter bubbles and gather real-world evidence of their (non-)existence. Finally, it proposes a feature repository for serendipity in recommender systems, offering a framework for evaluating how system design can influence users' experiences of serendipity in online information environments. In summary, the findings and tools developed in this dissertation advance the theoretical understanding of recommender system evaluation while offering practical tools for industry practitioners and researchers.}, language = {en}, urldate = {2024-09-25}, school = {University of Antwerp}, author = {Michiels, Lien}, year = {2024}, doi = {10.63028/10067/2080040151162165141}, }
@inproceedings{ferraro_its_2024, title = {It's not you, it's me: the impact of choice models and ranking strategies on gender imbalance in music recommendation}, shorttitle = {It's not you, it's me}, url = {http://arxiv.org/abs/2409.03781}, doi = {10.1145/3640457.3688163}, abstract = {As recommender systems are prone to various biases, mitigation approaches are needed to ensure that recommendations are fair to various stakeholders. One particular concern in music recommendation is artist gender fairness. Recent work has shown that the gender imbalance in the sector translates to the output of music recommender systems, creating a feedback loop that can reinforce gender biases over time. In this work, we examine that feedback loop to study whether algorithmic strategies or user behavior are a greater contributor to ongoing improvement (or loss) in fairness as models are repeatedly re-trained on new user feedback data. We simulate user interaction and re-training to investigate the effects of ranking strategies and user choice models on gender fairness metrics. We find re-ranking strategies have a greater effect than user choice models on recommendation fairness over time.}, urldate = {2024-09-25}, booktitle = {Proceedings of the 18th {ACM} {Conference} on {Recommender} {Systems}}, publisher = {ACM}, author = {Ferraro, Andres and Ekstrand, Michael D. and Bauer, Christine}, month = aug, year = {2024}, }
@inproceedings{raj_towards_2024, series = {{LNCS}}, title = {Towards optimizing ranking in grid-layout for provider-side fairness}, volume = {14612}, copyright = {All rights reserved}, url = {https://md.ekstrandom.net/pubs/ecir-fair-grids}, doi = {10.1007/978-3-031-56069-9_7}, abstract = {Information access systems, such as search engines and recommender systems, order and position results based on their estimated relevance. These results are then evaluated for a range of concerns, including provider-side fairness: whether exposure to users is fairly distributed among items and the people who created them. Several fairness-aware ranking and re-ranking techniques have been proposed to ensure fair exposure for providers, but this work focuses almost exclusively on linear layouts in which items are displayed in single ranked list. Many widely-used systems use other layouts, such as the grid views common in streaming platforms, image search, and other applications. Providing fair exposure to providers in such layouts is not well-studied. We seek to fill this gap by providing a grid-aware re-ranking algorithm to optimize layouts for provider-side fairness by adapting existing re-ranking techniques to grid-aware browsing models, and an analysis of the effect of grid-specific factors such as device size on the resulting fairness optimization.}, language = {en}, urldate = {2024-01-04}, booktitle = {Proceedings of the 46th {European} {Conference} on {Information} {Retrieval}}, publisher = {Springer}, author = {Raj, Amifa and Ekstrand, Michael D.}, month = mar, year = {2024}, pages = {90--105}, }
@article{ekstrand_distributionally-informed_2024, title = {Distributionally-informed recommender system evaluation}, volume = {2}, copyright = {All rights reserved}, url = {https://dl.acm.org/doi/10.1145/3613455}, doi = {10.1145/3613455}, abstract = {Current practice for evaluating recommender systems typically focuses on point estimates of user-oriented effectiveness metrics or business metrics, sometimes combined with additional metrics for considerations such as diversity and novelty. In this paper, we argue for the need for researchers and practitioners to attend more closely to various distributions that arise from a recommender system (or other information access system) and the sources of uncertainty that lead to these distributions. One immediate implication of our argument is that both researchers and practitioners must report and examine more thoroughly the distribution of utility between and within different stakeholder groups. However, distributions of various forms arise in many more aspects of the recommender systems experimental process, and distributional thinking has substantial ramifications for how we design, evaluate, and present recommender systems evaluation and research results. Leveraging and emphasizing distributions in the evaluation of recommender systems is a necessary step to ensure that the systems provide appropriate and equitably-distributed benefit to the people they affect.}, number = {1}, urldate = {2023-09-07}, journal = {ACM Transactions on Recommender Systems}, author = {Ekstrand, Michael D. and Carterette, Ben and Diaz, Fernando}, month = mar, year = {2024}, keywords = {distributions, evaluation, exposure, statistics}, pages = {6:1--27}, }
@article{wang_modeling_2023, title = {Modeling uncertainty to improve personalized recommendations via {Bayesian} deep learning}, volume = {16}, issn = {2364-4168}, url = {https://doi.org/10.1007/s41060-020-00241-1}, doi = {10.1007/s41060-020-00241-1}, abstract = {Modeling uncertainty has been a major challenge in developing Machine Learning solutions to solve real world problems in various domains. In Recommender Systems, a typical usage of uncertainty is to balance exploration and exploitation, where the uncertainty helps to guide the selection of new options in exploration. Recent advances in combining Bayesian methods with deep learning enable us to express uncertain status in deep learning models. In this paper, we investigate an approach based on Bayesian deep learning to improve personalized recommendations. We first build deep learning architectures to learn useful representation of user and item inputs for predicting their interactions. We then explore multiple embedding components to accommodate different types of user and item inputs. Based on Bayesian deep learning techniques, a key novelty of our approach is to capture the uncertainty associated with the model output and further utilize it to boost exploration in the context of Recommender Systems. We test the proposed approach in both a Collaborative Filtering and a simulated online recommendation setting. Experimental results on publicly available benchmarks demonstrate the benefits of our approach in improving the recommendation performance.}, language = {en}, number = {2}, urldate = {2024-03-17}, journal = {International Journal of Data Science and Analytics}, author = {Wang, Xin and Kadıoğlu, Serdar}, month = aug, year = {2023}, pages = {191--201}, }
@article{godinot_measuring_2023, title = {Measuring the effect of collaborative filtering on the diversity of users’ attention}, volume = {8}, issn = {2364-8228}, url = {https://hal.science/hal-03926906}, doi = {10.1007/s41109-022-00530-7}, abstract = {AbstractWhile the ever-increasing emergence of online services has led to a growing interest in the development of recommender systems, the algorithms underpinning such systems have begun to be criticized for their role in limiting the variety of content exposed to users. In this context, the notion of diversity has been proposed as a way of mitigating the side effects resulting from the specialization of recommender systems. In this paper, using a well-known recommender system that makes use of collaborative filtering in the context of musical content, we analyze the diversity of recommendations generated through the lens of the recently proposed information network diversity measure. The results of our study offer significant insights into the effect of algorithmic recommendations. On the one hand, we show that the musical selections of a large proportion of users are diversified as a result of the recommendations. On the other hand, however, such improvements do not benefit all users. They are in fact mainly restricted to users with a low level of activity or whose past musical listening selections are very narrow. Through more in-depth investigations, we also discovered that while recommendations generally increase the variety of the songs recommended to users, they nonetheless fail to provide a balanced exposure to the different related categories.}, number = {1}, urldate = {2023-04-30}, journal = {Applied Network Science}, author = {Godinot, Augustin and Tarissan, Fabien}, month = jan, year = {2023}, note = {Publisher: Springer Science and Business Media LLC}, }
@phdthesis{tan_critical_2023, address = {Singapore}, type = {B.{Eng}. {FYP}}, title = {A critical study on {MovieLens} dataset for recommender systems}, url = {https://hdl.handle.net/10356/171942}, abstract = {The growth in recommendation systems (RecSys) research has led to the development of many toolkits, which provide users, who may have varying levels of knowledge in the field, with the necessary tools to build, test, evaluate and benchmark different algorithms. The MovieLens datasets have garnered widespread popularity as a benchmark dataset for RecSys research, but exploratory analysis has shown that the datasets elicit certain issues such as popularity bias and data sparsity as a result. Therefore, evaluation results of baseline algorithms trained on this dataset may pick up these inherent signals present in the data, and therefore should not be generalised across other recommendation scenarios. A comprehensive and consistent experiment involving 3 Python-based Top-N recommendation toolkits: LensKit, RecPack, and daisyRec have shown that toolkits are often built with different purposes or to solve specific issues, which leads to inconsistency in implementation methodology and hence evaluation results. This can be attributed to several main factors: (1) unclear or inconsistent definition of concepts such as evaluation metrics and (2) differences in default preprocessing and splitting strategies being the most significant. The experiments also highlight the disadvantages of using a global time-aware split on the MovieLens dataset, such as eliminating unseen users which are present in the test set but not in the train set. Additionally, analysis showed that having a low absolute number of train interactions, e.g., less than 15, is detrimental to the performance of a model than having a low train to test interaction ratio, with the evaluation metrics showing relatively poorer performance on 2 out of 3 of the toolkits discussed. Lastly, this study proposes some possible improvements to the toolkits based on the issues highlighted, such as clearly defined default dataset preprocessing, fully customisable hyperparameters, and frameworks which allow for quick development of algorithms and metrics, with a possible future work of producing an actively managed, open source toolkit which can solve the problems surfaced during this study.}, language = {en}, urldate = {2025-08-01}, school = {Nanyang Technological University}, author = {Tan, Ernest Yan Heng}, year = {2023}, note = {Publisher: Nanyang Technological University}, keywords = {⛔ No DOI found}, }
@incollection{joshi_recommendation_2023, address = {Cham}, title = {Recommendation {Systems}}, isbn = {978-3-031-12282-8}, url = {https://doi.org/10.1007/978-3-031-12282-8_21}, abstract = {In this chapter, we will study an application of AI for building recommendation systems. We will look at the concept of collaborative filtering that lies at the heart of recommendation systems. We will also look at real-life examples of Netflix and Amazon and how they have used the technique to deliver personalized experiences in vastly different applications. These problems illustrate relatively novel concepts that were not well known to the field few decades before. The application of existing mathematical concepts in solving these problems and seeing the solutions in action in day-today life is quite exciting and satisfying and marks one of the greatest success stories of modern machine learning.}, language = {en}, urldate = {2025-08-01}, booktitle = {Machine {Learning} and {Artificial} {Intelligence}}, publisher = {Springer International Publishing}, author = {Joshi, Ameet V.}, year = {2023}, doi = {10.1007/978-3-031-12282-8_21}, pages = {251--260}, }
@article{silva_recsysexp_2023, title = {{RecSysExp} : um framework de alto nível de abstração para implementação e validação de sistemas de recomendação.}, copyright = {An error occurred on the license name.}, shorttitle = {{RecSysExp}}, url = {http://www.monografias.ufop.br/handle/35400000/5711}, abstract = {Este trabalho consiste na implementação de um framework para realização de experimentos em sistemas de recomendação, seu intuito é permitir que cenários de experimentação em sistemas de recomendação possam ser criados e analisados de forma prática, isso será possível dado que o projeto fornece uma abordagem ponta-a-ponta que conta com etapas como a de pré-processamento dos dados de entrada, modelagem e treinamento de algoritmos de recomendação, avaliação dos resultados através de diferentes métricas até a visualização dos resultados. Nesse framework temos um conjunto de conceitos e técnicas que serão base para criação de quase todo o projeto, como principais exemplos, destacam-se a recomendação e a reprodutibilidade de experimentos. A recomendação pode ser vista como um sistema capaz de sugerir a um usuário objetos úteis e/ou interessantes considerando um grande conjunto de opções. No caso da reprodutibilidade estamos nos referindo à capacidade de diferentes investigadores tirarem as mesmas conclusões a partir de um experimento, essa característica será garantida através do arquivo de configuração criado para o RecSysExp. Essa construção é baseada principalmente na extração das melhores características dentre as bibliotecas e frameworks que possuem componentes relacionados às variadas etapas do ciclo de desenvolvimento e experimentação em sistemas de recomendação, sendo assim, esse trabalho visa facilitar a inclusão de novos paradigmas, recomendadores, meta-features, métricas e outros recursos. Até o momento, esse trabalho conta com um conjunto de algoritmos base para o processo de predição e recomendação, alguns deles são: UserKNN, ItemKNN, Bias, BiasedSVD, ImplicitMF, BiasedMF, SlopeOne, PopScore e outros. Como resultados desse trabalhos foi obtida uma revisão da literatura que nos proporcionou a definição de componentes, interfaces e classes que permitem flexibilidade e extensibilidade na inclusão de novos recursos ao framework, uma estrutura de recomendação que abrange diferentes estratégias além de maneiras de analisar e reaproveitar os resultados de cada etapa. Além disso, a partir da definição de alguns experimentos foram encontradas diferentes formas de visualização dos resultados, cálculo e armazenamento dos resultados de predições e recomendações, esses resultados foram submetidos a um conjunto de métricas como RMSE, NDCG e MAE. Desses resultados, conclui-se que a base geral para o framework foi consolidada através de diferentes estruturas de classe que permitem a extensão do projeto, métodos base relacionados a pré-processamento, recomendação e avaliação, armazenamento dos resultados de forma padronizada garantindo que todos os artefatos gerados pelo experimentos estejam organizados e disponíveis, integração entre os projetos relacionados ao RecSysExp, além da documentação do framework e dos trabalhos relacionados.}, language = {pt\_BR}, urldate = {2025-08-01}, author = {Silva, Lucas Natali Magalhães}, year = {2023}, note = {Accepted: 2023-07-06T12:34:06Z}, keywords = {⛔ No DOI found}, }
@article{depessemier_recipe_2023, title = {Recipe recommendations for individual users and groups in a cooking assistance app}, volume = {53}, issn = {1573-7497}, url = {https://doi.org/10.1007/s10489-023-04909-6}, doi = {10.1007/s10489-023-04909-6}, abstract = {Recommender systems are commonly-used tools to assist people in making decisions. However, most research has focused on the domain of recommendations for audio-visual content and e-commerce, whereas the specific characteristics of recommendations for recipes and cooking did not receive enough attention. Since meals are often consumed in group (with friends or family), there is a need for group recommendations, taking into account the preferences of all group members. Also cuisine, allergies, disliked ingredients, diets, dish type, and required time to prepare are important factors for recipe selection. For 13 algorithms, we evaluated the recommendations for individuals and for groups using a dataset of recipe ratings. The best algorithm and a baseline algorithm based on popularity were selected for our mobile kitchen experience and recipe application, which assists users in the cooking process and provides recipe recommendations. Although significant differences between both algorithms were witnessed in the offline evaluation with the dataset, the differences were less noticeable in the online evaluation with real users. Because of the cold-start problem, the advanced algorithm failed to reach its full accuracy potential, but excelled in other quality features such as diversity, perceived usefulness, and confidence. We also witnessed a better evaluation (about half a star) of the recommendations by the more advanced cooks.}, language = {en}, number = {22}, urldate = {2025-08-01}, journal = {Applied Intelligence}, author = {De Pessemier, Toon and Vanhecke, Kris and All, Anissa and Van Hove, Stephanie and De Marez, Lieven and Martens, Luc and Joseph, Wout and Plets, David}, month = nov, year = {2023}, pages = {27027--27043}, }
@mastersthesis{huys_augmented_2023, title = {Augmented decision trees in active learning tackling the cold start problem in recommender systems}, url = {https://lib.ugent.be/catalog/rug01:003150166}, school = {Ghent University}, author = {Huys, Aaron}, month = may, year = {2023}, }
@mastersthesis{torjusen_privacy_2023, title = {Privacy in {Recommender} {Systems}: {Inferring} {User} {Personality} {Traits} {From} {Personalized} {Movie} {Recommendations}}, shorttitle = {Privacy in {Recommender} {Systems}}, url = {https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/3092170}, abstract = {Anbefalingssystemer er personaliserte systemer som samler inn brukerdata for å anbefale og skreddersy innhold. Denne masteroppgaven undersøker hvordan personlighetstrekk kan utledes fra personaliserte topp 10-filmanbefalinger for å undersøke potensiell lekkasje fra brukerdataen. Oppgaven utforsker effekten på treffsikkerheten av klassifikasjonen ved hjelp av ulike metoder, nærmere bestemt ulike inndelinger av personlighetstrekk, integrering av personlighetstrekk i et anbefalingssystem og resampling-teknikker. I eksperimentene ble det generert tilfeldige, vurderingsbaserte og personlighetsbaserte anbefalinger, og seks ulike klassifiseringsmetoder ble brukt til å utlede personlighetstrekk fra dem. Funnene indikerer potensiell informasjonslekkasje av personlighet fra de personaliserte anbefalingene. Likevel ble det ikke observert noen konsekvente mønstre for personlighetstrekkene, ettersom ulike eksperimentelle oppsett ga forskjellige resultater. Videre ble det ikke observert noen signifikant forskjell i treffsikkerheten på klassifiseringen etter at personlighetstrekkene ble integrert i anbefalingssystemet. Disse funnene bidrar til å øke forståelsen av personvernhensyn i forbindelse med brukeranbefalinger og gir innsikt til fremtidig forskning på personvern i anbefalingssystemer.}, language = {eng}, urldate = {2025-08-01}, school = {NTNU}, author = {Torjusen, Hanna and Barstad, Caroline}, year = {2023}, note = {Accepted: 2023-09-26T17:20:33Z}, }
@phdthesis{dawei_privacy_2023, address = {Singapore}, type = {Ph.{D}. {Dissertation}}, title = {Privacy protection : {Pets} by individuals, {PDPS} by firms}, url = {http://ezproxy2.library.drexel.edu/login?url=https://www.proquest.com/dissertations-theses/privacy-protection-pets-individuals-pdps-firms/docview/3143978418/se-2}, abstract = {Advances in data collection and mining techniques have given rise to the necessity of privacy protection. Apart from privacy regulations, individuals and firms also play considerable roles in the process of privacy protection. For example, to combat the threat of privacy invasion, individuals are proactively adopting privacy enhancing technologies (PETs) to protect their personal information. For enterprises, it takes great effort and resources, such as privacy dark patterns (PDPs) practices, for them to “wisely” comply with privacy regulations. This dissertation seeks to understand the role individuals and firms play in the process of privacy protection through two studies.The first study examines the impact of end-user PETs on firms’ analytics capabilities. After a comprehensive review of end-user PETs, we propose an inductively derived framework which qualitatively shows that end-user PETs induce measurement error and/or missing values with regards to attributes, entities, and relationships in firms’ customer databases, but the impact of specific end-user PETs may vary by analytics use case. We propose a value-oriented framework through which firms can study and quantify the impact of end-user PETs. We illustrate the value of this framework by applying it with simulation experiments in the context of product recommendations that quantitatively find that consumers’ adoption characteristics (i.e., adoption rate and pattern) and PETs protection characteristics (i.e., protection mechanism and intensity) significantly affect the performance of recommender systems. In addition, our results reveal the presence of spillover effects. In the presence of end-user PETs adoption, not only PET users but also non-users become worse off; moreover, PET users suffer more in term of recommendation accuracy. Even though observations from PET users are problematic, we find that their removal could actually further deteriorate recommendation accuracy.The second study investigates the economic implications of privacy dark patterns (PDPs) through which firms could “wisely” play privacy protection games. It is commonly believed that PDPs advantage firms by deceiving and collecting more information from consumers. Nevertheless, they could also hinder firms’ credibility and consumers might stop sharing information to and purchase products from firms. Thus, the second study, firstly, aims to examine whether PDPs always benefit firms and hurt consumers. We also try to answer whether market force is sufficient to keep PDPs at low levels. Our results show that the presence of PDPs indeed makes users weakly worse off and the seller weakly better off. Nevertheless, the seller has incentives to not utilize any PDPs when users’ privacy cost is high, and the ratio of privacy concern and the reduced search cost of opt-in is either too high or too low. This could be attributed to the fact that the market shrinkage effect dominates the market division effect under these conditions. In other words, the gain from making more users opt-in will be outweighed by the loss from total market shrinkage when the seller increases its level of PDPs. 
Finally, we show that a welfare maximizing social planner would allow the presence of PDPs when the users’ privacy cost is sufficiently low.}, language = {English}, school = {National University of Singapore}, author = {Dawei, Chen}, year = {2023}, note = {ISBN: 9798346780625 Pages: 157}, keywords = {0384:Behavioral psychology, 0386:Family and consumer sciences, 0404:Climate Change, 0501:Economics, 0800:Artificial intelligence, Artificial intelligence, Behavioral psychology, Climate change, Consent, Consumer behavior, Consumers, Data integrity, Economics, Home economics, Market segmentation, Personal information, Privacy, Regulation, Spillover effect, Value creation}, }
@misc{castellini_supplier_2023, address = {Rochester, NY}, type = {{SSRN} {Scholarly} {Paper}}, title = {Supplier competition on subscription-based platforms in the presence of recommender systems}, url = {https://papers.ssrn.com/abstract=4428125}, doi = {10.2139/ssrn.4428125}, abstract = {Subscription-based platforms offer consumers access to a large selection of content at a fixed subscription fee. Recommender systems (RS) can help consumers by reducing the size of this choice set by predicting consumers' preferences. However, because the prediction is based on limited information on the consumers and sometimes even on the content, the recommendations are susceptible to biases, a phenomenon widely evidenced in the computer science literature. Intuitively, if these biases systematically favour certain suppliers over others, this could impact competition between suppliers. To study this intuition, we introduce a simple framework of a platform that sells to consumers with quasi-linear utility functions via a recommender system. We find that RS biases lead to more concentrated markets and increased entry barriers even when the platform is not self-preferencing their own products, and users are rational. Limited-attention users can reduce the market concentrating impact of RS biases and harm top-selling products, but the platform can counteract this effect by a choice architecture that gives more prominence to popular items. Self-preferencing does not further increase concentration but it ensures that the winners are the products preferred by the platform. Although encouraging more exploration can reduce these market consolidating effects, we show that they also reduce recommendation relevance in the short-run.}, language = {en}, urldate = {2025-08-01}, publisher = {Social Science Research Network}, author = {Castellini, Jacopo and Fletcher, Amelia and Ormosi, Peter L. and Savani, Rahul}, month = apr, year = {2023}, }
@article{halpern_representation_2023, title = {Representation with {Incomplete} {Votes}}, volume = {37}, copyright = {Copyright (c) 2023 Association for the Advancement of Artificial Intelligence}, issn = {2374-3468}, url = {https://ojs.aaai.org/index.php/AAAI/article/view/25702}, doi = {10.1609/aaai.v37i5.25702}, abstract = {Platforms for online civic participation rely heavily on methods for condensing thousands of comments into a relevant handful, based on whether participants agree or disagree with them. These methods should guarantee fair representation of the participants, as their outcomes may affect the health of the conversation and inform impactful downstream decisions. To that end, we draw on the literature on approval-based committee elections. Our setting is novel in that the approval votes are incomplete since participants will typically not vote on all comments. We prove that this complication renders non-adaptive algorithms impractical in terms of the amount of information they must gather. Therefore, we develop an adaptive algorithm that uses information more efficiently by presenting incoming participants with statements that appear promising based on votes by previous participants. We prove that this method satisfies commonly used notions of fair representation, even when participants only vote on a small fraction of comments. Finally, an empirical evaluation using real data shows that the proposed algorithm provides representative outcomes in practice.}, language = {en}, number = {5}, urldate = {2025-08-01}, journal = {Proceedings of the AAAI Conference on Artificial Intelligence}, author = {Halpern, Daniel and Kehne, Gregory and Procaccia, Ariel D. and Tucker-Foltz, Jamie and Wüthrich, Manuel}, month = jun, year = {2023}, note = {Number: 5}, pages = {5657--5664}, }
@mastersthesis{solbjorg_using_2023, title = {Using {Sentiment} {Analysis} to {Improve} {Course} {Recommendations} for {MOOCs}}, url = {https://ntnuopen.ntnu.no/ntnu-xmlui/handle/11250/3094283}, abstract = {Massive åpne nettkurs (MOOCer) har blitt brukt i e-læring det siste tiåret, og fremveksten deres eksploderte under COVID-19-pandemien. Nye kurs blir stadig tilgjengeliggjort, noe som gjør at studentene blir overveldet og sliter med å finne kurs som passer deres interesser. Dette, kombinert med at frafallet på MOOCer er 90\%, gjør det enda viktigere for studentene å finne passende kurs. Som et resultat har anbefalingssystemer blitt utviklet for å redusere tiden som brukes på å finne kurs, ved å filtrere ut irrelevante kurs og anbefale de mest relevante til studentene. Disse systemene må imidlertid forbedres for å hjelpe studentene med å finne passende kurs og redusere frafallet fra MOOCer. Nylig forskning innen andre fagområder viser at det å kombinere anmeldelser med tallvurderinger forbedrer anbefalinger. Målet med denne oppgaven er å adoptere denne ideen ved å utvikle et anbefalingssystem som inkorporerer sentiment fra kursanmeldelser i kursenes tallvurderinger. Sentimentene uthentes gjennom sentimentanalyse ved hjelp av en BERT-modell, og kombineres deretter med de opprinnelige tallvurderingene ved bruk av vekter. En rekke anbefalingsalgoritmer ble implementert for å analysere innvirkningen av de justerte tallvurderingene. Deretter ble anbefalingssystemet evaluert på COCO-datasettet, som inneholder 4,5 millioner kursanmeldelser. Alle anbefalingsalgoritmene presterte noe bedre med de justerte tallvurderingene. Algoritmen med den største forbedringen var ALSImplicitMF, som forbedret sin nDCG-score med 1,54\%. Imidlertid var algoritmene generelt dårlige sammenlignet med lignenede forskning, blant annet siden datasettet har få interaksjoner per student og kurs.}, language = {eng}, urldate = {2025-08-01}, school = {NTNU}, author = {Solbjørg, Ingrid Amalie}, year = {2023}, note = {Accepted: 2023-10-04T17:21:54Z}, }
@inproceedings{ihemelandu_candidate_2023, title = {Candidate set sampling for evaluating top-{N} recommendation}, url = {https://doi.org/10.1109/WI-IAT59888.2023.00018}, doi = {10.1109/WI-IAT59888.2023.00018}, abstract = {The strategy for selecting candidate sets -- the set of items that the recommendation system is expected to rank for each user -- is an important decision in carrying out an offline top-\$N\$ recommender system evaluation. The set of candidates is composed of the union of the user's test items and an arbitrary number of non-relevant items that we refer to as decoys. Previous studies have aimed to understand the effect of different candidate set sizes and selection strategies on evaluation. In this paper, we extend this knowledge by studying the specific interaction of candidate set selection strategies with popularity bias, and use simulation to assess whether sampled candidate sets result in metric estimates that are less biased with respect to the true metric values under complete data that is typically unavailable in ordinary experiments.}, urldate = {2023-11-08}, booktitle = {Proceedings of the 22nd {IEEE}/{WIC} international conference on web intelligence and intelligent agent technology}, author = {Ihemelandu, Ngozi and Ekstrand, Michael D.}, month = oct, year = {2023}, note = {arXiv:2309.11723 [cs]}, keywords = {Computer Science - Information Retrieval}, pages = {88--94}, }
@mastersthesis{falch_measuring_2022, title = {Measuring the {Effect} of {Recommender} {Systems} in {Online} {Video} {Learning} {Platforms}: {A} {Case} {Study} with {Utdannet}.no}, shorttitle = {Measuring the {Effect} of {Recommender} {Systems} in {Online} {Video} {Learning} {Platforms}}, url = {https://hdl.handle.net/11250/3041057}, abstract = {Anbefalingssystemer er over alt i dagens samfunn. Deres nytteverdi gjør at de ser bruk i mange domener, fra søkemotorer, til handel, til utdanning. I dag finnes det mye billig, pålitelig teknologi, og dette har banet vei for e-læring og robuste systemer som presenterer gode læringsmaterialer. Men å evaluere nyanserte spørsmål om e-læring kan være vanskelig. I samarbeid med Utdannet utforsker denne avhandlingen personaliserte anbefalinger og deres effekt på brukerengasjement og tid brukt på Utdannet sin plattform. For å undersøke denne effekten ble to A/B-tester utført. To forskjellige anbefalingsstrategier ble brukt for å måle klikkraten og dveleraten mellom de forskjellige strategiene. Resultatene av eksperimentene er inkonklusive. Mangel på data, en for simplistisk modell, og en lav adopsjonsrate er hypotetiserte årsaker til hvorfor resultatet er som det er. Selv om ingen konklusjoner kan bli tatt, er det en signifikant forskjell i antall observasjoner mellom de forskjellige strategiene som ble brukt i hybridmodellen. På bakgrunn av dette anbefaler jeg mer forskning på mer tilpassede modeller, og en lengre eksperimentperiode.}, language = {eng}, urldate = {2025-08-01}, school = {NTNU}, author = {Falch, William Tallis}, year = {2022}, note = {Accepted: 2023-01-04T18:19:42Z}, }
@inproceedings{lin_privacy-preserving_2022, title = {Privacy-{Preserving} {Recommendation} with {Debiased} {Obfuscaiton}}, url = {https://ieeexplore.ieee.org/abstract/document/10063398}, doi = {10.1109/TrustCom56396.2022.00086}, abstract = {As people enjoy the personalized services recommended by Recommender Systems (RSs), the privacy disclosure risk increases with frequent interactions. Malicious adversary often collects public information online to infer private information for illicit profit. As privacy concerns grew, researchers introduced data obfuscation into recommender systems. However, there still exists several limitations in current work. First, although the existing methods effectively reduce the risk of privacy disclosure, they can be detrimental to the quality of the recommendation service. Second, a range of practical issues under the application of recommendation systems are not considered, e.g., long-tail, density, etc. To address those challenges, we propose a novel framework named Want User Defending Inference (WUDI), a high-performance privacy-preserving debiased framework based on data obfuscation. Unlike the original strategies, i.e., adding or removing user ratings, we introduced some novel strategies to generate an obfuscated matrix. Firstly, we define a new method called Cluster Recommend for alleviating the long-tail skewness and data sparsity in RSs. Then we investigate the gender bias in obfuscation and apply a bias mitigating strategy to RSs. Experiments on public datasets demonstrate that WUDI can outperform the state-of-the-art baselines in obfuscation.}, urldate = {2025-08-01}, booktitle = {2022 {IEEE} {International} {Conference} on {Trust}, {Security} and {Privacy} in {Computing} and {Communications} ({TrustCom})}, author = {Lin, Chennan and Liu, Baisong and Zhang, Xueyuan and Wang, Zhiye and Hu, Ce and Luo, Linze}, month = dec, year = {2022}, note = {ISSN: 2324-9013}, pages = {590--597}, }
@inproceedings{wei_recommender_2021, address = {New York, NY, USA}, title = {Recommender {Systems} for {Software} {Project} {Managers}}, url = {https://doi.org/10.1145/3463274.3463951}, doi = {10.1145/3463274.3463951}, abstract = {The design of recommendation systems is based on complex information processing and big data interaction. This personalized view has evolved into a hot area in the past decade, where applications might have been proved to help for solving problem in the software development field. Therefore, with the evolvement of Recommendation System in Software Engineering (RSSE), the coordination of software projects with their stakeholders is improving. This experiment examines four open source recommender systems and implemented a customized recommender engine with two industrial-oriented packages: Lenskit and Mahout. Each of the main functions was examined and issues were identified during the experiment.}, urldate = {2021-09-14}, booktitle = {{EASE} 2021}, publisher = {Association for Computing Machinery}, author = {Wei, Liang and Capretz, Luiz Fernando}, month = jun, year = {2021}, note = {Journal Abbreviation: EASE 2021}, keywords = {RSSE, Recommender Engine, Project Management, Recommendation System, Recommendation System in Software Engineering}, pages = {412--417}, }
@article{wischenbart_engaging_2021, title = {Engaging end-user driven recommender systems: personalization through web augmentation}, volume = {80}, issn = {1573-7721}, shorttitle = {Engaging end-user driven recommender systems}, url = {https://doi.org/10.1007/s11042-020-09803-8}, doi = {10.1007/s11042-020-09803-8}, abstract = {In the past decades recommender systems have become a powerful tool to improve personalization on the Web. Yet, many popular websites lack such functionality, its implementation usually requires certain technical skills, and, above all, its introduction is beyond the scope and control of end-users. To alleviate these problems, this paper presents a novel tool to empower end-users without programming skills, without any involvement of website providers, to embed personalized recommendations of items into arbitrary websites on client-side. For this we have developed a generic meta-model to capture recommender system configuration parameters in general as well as in a web augmentation context. Thereupon, we have implemented a wizard in the form of an easy-to-use browser plug-in, allowing the generation of so-called user scripts, which are executed in the browser to engage collaborative filtering functionality from a provided external rest service. We discuss functionality and limitations of the approach, and in a study with end-users we assess the usability and show its suitability for combining recommender systems with web augmentation techniques, aiming to empower end-users to implement controllable recommender applications for a more personalized browsing experience.}, language = {en}, number = {5}, urldate = {2025-08-01}, journal = {Multimedia Tools and Applications}, author = {Wischenbart, Martin and Firmenich, Sergio and Rossi, Gustavo and Bosetti, Gabriela and Kapsammer, Elisabeth}, month = feb, year = {2021}, pages = {6785--6809}, }
@mastersthesis{rete_multi-armed_2021, address = {Dublin, Ireland}, title = {Multi-{Armed} {Bandit} algorithm for news recommendation systems}, language = {en}, school = {Trinity College Dublin}, author = {Rete, Catalina}, month = may, year = {2021}, keywords = {⛔ No DOI found}, }
@inproceedings{musto_fairness_2021, title = {Fairness and {Popularity} {Bias} in {Recommender} {Systems}: an {Empirical} {Evaluation}}, url = {https://ceur-ws.org/Vol-3078/paper-69.pdf}, abstract = {In this paper, we present the results of an empirical evaluation investigating how recommendation algorithms are affected by popularity bias. Popularity bias makes more popular items to be recommended more frequently than less popular ones, thus it is one of the most relevant issues that limits the fairness of recommender systems. In particular, we define an experimental protocol based on two state-of-theart datasets containing users’ preferences on movies and books and three different recommendation paradigms, i.e., collaborative filtering, content-based filtering and graph-based algorithms. In order to evaluate the overall fairness of the recommendations we use well-known metrics such as Catalogue Coverage, Gini Index and Group Average Popularity (ΔGAP). The goal of this paper is: (i) to provide a clear picture of how recommendation techniques are affected by popularity bias; (ii) to trigger further research in the area aimed to introduce methods to mitigate or reduce biases in order to provide fairer recommendations.}, language = {en}, booktitle = {{AIxIA} 2021 {Discussion} {Papers}, 20th {International} {Conference} {Italian} {Association} for {Artificial} {Intelligence}}, author = {Musto, Cataldo and Lops, Pasquale and Semeraro, Giovanni}, year = {2021}, keywords = {⛔ No DOI found}, }
@mastersthesis{vanhaesebroeck_music_2020, address = {Belgium}, title = {Music recommendation using genetic programming}, url = {https://libstore.ugent.be/fulltxt/RUG01/002/945/760/RUG01-002945760_2021_0001_AC.pdf}, urldate = {2025-05-30}, school = {Ghent University}, author = {Vanhaesebroeck, Robbe}, year = {2020}, }
@mastersthesis{da_silva_user-specific_2020, address = {Portugal}, title = {User-{Specific} {Bicluster}-{Based} {Collaborative} {Filtering}}, copyright = {Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.}, url = {https://www.proquest.com/docview/2652593247/abstract/29EEDE32E67A4219PQ/1}, abstract = {Os sistemas de recomendação são um conjunto de técnicas e software que têm como objetivo sugerir itens a um determinado utilizador. Sugestões essas que têm como objetivo ajudar os utilizadores durante a tomada de decisão. O processo para uma tomada de decisão pode ser difícil, especialmente quando existe um enorme número de opções para escolher. Grandes empresas tiram partido dos sistemas de recomendação para melhorar o seu serviço e aumentar as suas receitas. Um exemplo é a plataforma de streaming Netflix que, utilizando um sistema de recomendação, personaliza os filmes ou séries destacados para cada cliente. As recomendações personalizadas normalmente têm como base os dados que as empresas recolhem dos utilizadores, que vão desde reações explícitas, por exemplo através avaliações do utilizador a produtos, a reações implícitas, examinando a forma como o utilizador interage com o sistema. Uma das abordagens mais populares dos sistemas de recomendação é a filtragem colaborativa. Os métodos baseados em filtragem colaborativa produzem recomendações personalizadas de itens, tendo por base padrões encontrados em dados de uso ou avaliações anteriores. Os modelos de filtragem colaborativa normalmente usam uma simples matriz de dados, conhecida como matriz de interação U-I, que contém as avaliações que os utilizadores deram aos itens do sistema. Explorando os dados da matriz U-I, a filtragem colaborativa assume que, se um determinado utilizador teve as mesmas preferências que outro utilizador no passado, é provável que também venha a ter no futuro. Desta forma, os modelos de filtragem colaborativa têm como objetivo recomendar uma lista de N itens a um utilizador (denominado utilizador ativo), ou prever o rating que esse utilizador iria dar a um item que ainda não avaliou. Na literatura, os métodos de filtragem colaborativa são divididos em duas classes: os baseados em memórias e os baseados em modelos. Os algoritmos baseados em memória, também conhecidos como algoritmos de vizinhança, usam toda a matriz U-I para realizar as tarefas de recomendação. Os dois principais métodos são conhecidos como “User-based” e “Item-based”. O User-based tenta encontrar utilizadores com preferências parecidas ao utilizador a que se pretende fazer recomendações e usa os dados dessa vizinhança de utilizadores similares para fazer as previsões ou recomendações. Por outro lado, os algoritmos Item-based utilizam os itens já avaliados pelo utilizador ativo, calculam a similaridade entre esses itens e o item que se quer avaliar, construindo assim uma vizinhança de itens. A partir dessa vizinhança de itens, prevê-se uma futura avaliação do utilizador a esse mesmo item. Apesar de os algoritmos de vizinhança obterem bom resultados de previsão e recomendação, apresentam duas grandes debilidades que limitam o seu uso em ambientes de recomendação do mundo real. Os dados de recomendação são normalmente de grandes dimensões e esparsos, isto é, com muitos valores em falta. 
Dada a complexidade resultante do facto de terem de comparar todos os utilizadores ou itens entre si, o que se traduz em n 2 comparações, torna-se impraticável o uso de algoritmos deste género em sistemas com grande quantidade de users e itens. Além disso, o facto de haver muitos valores em falta, faz que seja recorrente alguns utilizadores/itens terem pequenas vizinhanças. Para tentar lidar com as fraquezas dos algoritmos baseados em memórias, surgiram os algoritmos baseados em modelos. Estas abordagens utilizam modelos que aprendem com os dados e reconhecem padrões para realizar as tarefas de filtragem colaborativa. Técnicas de redução de dimensionalidade como “Singular Value Decomposition” e “Latent Semantic Analysis” são agora as abordagens standard para reduzir a natureza esparsa da matriz de interação. Existem ainda abordagens baseadas em aprendizagem automática, como redes bayesianas, agrupamento de dados, entre outras. Estes modelos de redução de dimensionalidade, apesar de perderem informação que geralmente resulta em piores resultados em termos de previsão/recomendação, conseguem lidar com o problema da escalabilidade apresentado pelos modelos baseados em memória. Alternate abstract: Collaborative Filtering is one of the most popular and successful approaches for Recommender Systems. However, some challenges limit the effectiveness of Collaborative Filtering approaches when dealing with recommendation data, mainly due to the vast amounts of data and their sparse nature. In order to improve the scalability and performance of Collaborative Filtering approaches, several authors proposed successful approaches combining Collaborative Filtering with clustering techniques. In this work, we study the effectiveness of biclustering, an advanced clustering technique that groups rows and columns simultaneously, in Collaborative Filtering. When applied to the classic U-I interaction matrices, biclustering considers the duality relations between users and items, creating clusters of users who are similar under a particular group of items. We propose USBCF, a novel biclustering-based Collaborative Filtering approach that creates user specific models to improve the scalability of traditional CF approaches. Using a realworld dataset, we conduct a set of experiments to objectively evaluate the performance of the proposed approach, comparing it against baseline and state-of-the-art Collaborative Filtering methods. Our results show that the proposed approach can successfully suppress the main limitation of the previously proposed state-of-the-art biclustering-based Collaborative Filtering (BBCF) since BBCF can only output predictions for a small subset of the system users and item (lack of coverage). Moreover, USBCF produces rating predictions with quality comparable to the state-of-the-art approaches.}, language = {English}, urldate = {2025-05-30}, school = {Universidade de Lisboa (Portugal)}, author = {da Silva, Miguel Miranda Garção}, year = {2020}, note = {ISBN: 9798209925156}, }
Original LensKit (Java)
If you publish research that uses the old Java version of LensKit, cite:
BibTeX
@INPROCEEDINGS{LensKit,
title = "Rethinking the Recommender Research Ecosystem: Reproducibility, Openness, and {LensKit}",
booktitle = "Proceedings of the Fifth {ACM} Conference on Recommender Systems",
author = "Ekstrand, Michael D and Ludwig, Michael and Konstan, Joseph A and Riedl, John T",
publisher = "ACM",
pages = "133--140",
series = "RecSys '11",
year = 2011,
url = "http://doi.acm.org/10.1145/2043932.2043958",
conference = "RecSys '11",
doi = "10.1145/2043932.2043958"
}

<script src="https://bibbase.org/show?bib=https%3A%2F%2Fapi.zotero.org%2Fusers%2F6655%2Fcollections%2FTJPPJ92X%2Fitems%3Fkey%3DVFvZhZXIoHNBbzoLZ1IM2zgf%26format%3Dbibtex%26limit%3D100&jsonp=1&jsonp=1"></script>
<?php
$contents = file_get_contents("https://bibbase.org/show?bib=https%3A%2F%2Fapi.zotero.org%2Fusers%2F6655%2Fcollections%2FTJPPJ92X%2Fitems%3Fkey%3DVFvZhZXIoHNBbzoLZ1IM2zgf%26format%3Dbibtex%26limit%3D100&jsonp=1");
print_r($contents);
?>
<iframe src="https://bibbase.org/show?bib=https%3A%2F%2Fapi.zotero.org%2Fusers%2F6655%2Fcollections%2FTJPPJ92X%2Fitems%3Fkey%3DVFvZhZXIoHNBbzoLZ1IM2zgf%26format%3Dbibtex%26limit%3D100&jsonp=1"></iframe>
@unpublished{aridor_economics_2022, title = {The {Economics} of {Recommender} {Systems}: {Evidence} from a {Field} {Experiment} on {MovieLens}}, url = {http://arxiv.org/abs/2211.14219}, abstract = {We conduct a field experiment on a movie-recommendation platform to identify if and how recommendations affect consumption. We use within-consumer randomization at the good level and elicit beliefs about unconsumed goods to disentangle exposure from informational effects. We find recommendations increase consumption beyond its role in exposing goods to consumers. We provide support for an informational mechanism: recommendations affect consumers' beliefs, which in turn explain consumption. Recommendations reduce uncertainty about goods consumers are most uncertain about and induce information acquisition. Our results highlight the importance of recommender systems' informational role when considering policies targeting these systems in online marketplaces.}, author = {Aridor, Guy and Goncalves, Duarte and Kluver, Daniel and Kong, Ruoyan and Konstan, Joseph}, month = nov, year = {2022}, note = {ISBN: 2211.14219 Publication Title: arXiv [econ.GN]}, }
@inproceedings{wei_recommender_2021, address = {New York, NY, USA}, title = {Recommender {Systems} for {Software} {Project} {Managers}}, url = {https://doi.org/10.1145/3463274.3463951}, doi = {10.1145/3463274.3463951}, abstract = {The design of recommendation systems is based on complex information processing and big data interaction. This personalized view has evolved into a hot area in the past decade, where applications might have been proved to help for solving problem in the software development field. Therefore, with the evolvement of Recommendation System in Software Engineering (RSSE), the coordination of software projects with their stakeholders is improving. This experiment examines four open source recommender systems and implemented a customized recommender engine with two industrial-oriented packages: Lenskit and Mahout. Each of the main functions was examined and issues were identified during the experiment.}, urldate = {2021-09-14}, booktitle = {{EASE} 2021}, publisher = {Association for Computing Machinery}, author = {Wei, Liang and Capretz, Luiz Fernando}, month = jun, year = {2021}, note = {Journal Abbreviation: EASE 2021}, keywords = {RSSE, Recommender Engine, Project Management, Recommendation System, Recommendation System in Software Engineering}, pages = {412--417}, }
@article{wischenbart_engaging_2021, title = {Engaging end-user driven recommender systems: personalization through web augmentation}, volume = {80}, issn = {1573-7721}, shorttitle = {Engaging end-user driven recommender systems}, url = {https://doi.org/10.1007/s11042-020-09803-8}, doi = {10.1007/s11042-020-09803-8}, abstract = {In the past decades recommender systems have become a powerful tool to improve personalization on the Web. Yet, many popular websites lack such functionality, its implementation usually requires certain technical skills, and, above all, its introduction is beyond the scope and control of end-users. To alleviate these problems, this paper presents a novel tool to empower end-users without programming skills, without any involvement of website providers, to embed personalized recommendations of items into arbitrary websites on client-side. For this we have developed a generic meta-model to capture recommender system configuration parameters in general as well as in a web augmentation context. Thereupon, we have implemented a wizard in the form of an easy-to-use browser plug-in, allowing the generation of so-called user scripts, which are executed in the browser to engage collaborative filtering functionality from a provided external rest service. We discuss functionality and limitations of the approach, and in a study with end-users we assess the usability and show its suitability for combining recommender systems with web augmentation techniques, aiming to empower end-users to implement controllable recommender applications for a more personalized browsing experience.}, language = {en}, number = {5}, urldate = {2025-08-01}, journal = {Multimedia Tools and Applications}, author = {Wischenbart, Martin and Firmenich, Sergio and Rossi, Gustavo and Bosetti, Gabriela and Kapsammer, Elisabeth}, month = feb, year = {2021}, pages = {6785--6809}, }
@article{ibrahim_hybrid_2021, title = {Hybrid {Recommender} for {Research} {Papers} and {Articles}}, volume = {10}, url = {http://article.ijoiis.com/pdf/10.11648.j.ijiis.20211002.11.pdf}, abstract = {… GroupLens called LensKit , along with set of tools for such system was used to implement Collaborative filtering algorithm. This research uses only the LensKit -core and LensKit -data-structures modules to implement this section of the algorithm …}, number = {2}, journal = {Int. J. Intell. Inf. Database Syst.}, author = {Ibrahim, Alhassan Jamilu and Zira, Peter and Abdulganiyyi, Nuraini}, year = {2021}, note = {Publisher: Science Publishing Group}, pages = {9}, }
@inproceedings{zhou_privacy_2021, title = {Privacy and performance in recommender systems: {Exploration} of potential influence of {CCPA}}, url = {http://2021.cswimworkshop.org/wp-content/uploads/2021/06/cswim2021_paper_80.pdf}, urldate = {2021-07-12}, author = {Zhou, Meizi and Song, Yicheng and Adomavicius, Gediminas}, year = {2021}, }
@unpublished{bellogin_improving_2021, title = {Improving {Accountability} in {Recommender} {Systems} {Research} {Through} {Reproducibility}}, url = {http://arxiv.org/abs/2102.00482}, abstract = {Reproducibility is a key requirement for scientific progress. It allows the reproduction of the works of others, and, as a consequence, to fully trust the reported claims and results. In this work, we argue that, by facilitating reproducibility of recommender systems experimentation, we indirectly address the issues of accountability and transparency in recommender systems research from the perspectives of practitioners, designers, and engineers aiming to assess the capabilities of published research works. These issues have become increasingly prevalent in recent literature. Reasons for this include societal movements around intelligent systems and artificial intelligence striving towards fair and objective use of human behavioral data (as in Machine Learning, Information Retrieval, or Human-Computer Interaction). Society has grown to expect explanations and transparency standards regarding the underlying algorithms making automated decisions for and around us. This work surveys existing definitions of these concepts, and proposes a coherent terminology for recommender systems research, with the goal to connect reproducibility to accountability. We achieve this by introducing several guidelines and steps that lead to reproducible and, hence, accountable experimental workflows and research. We additionally analyze several instantiations of recommender system implementations available in the literature, and discuss the extent to which they fit in the introduced framework. With this work, we aim to shed light on this important problem, and facilitate progress in the field by increasing the accountability of research.}, author = {Bellogín, Alejandro and Said, Alan}, month = jan, year = {2021}, note = {ISBN: 2102.00482 Publication Title: arXiv [cs.IR]}, }
@article{cheng_understanding_2020, title = {Understanding the {Impact} of {Individual} {Users}’ {Rating} {Characteristics} on the {Predictive} {Accuracy} of {Recommender} {Systems}}, volume = {32}, issn = {1091-9856}, url = {https://doi.org/10.1287/ijoc.2018.0882}, doi = {10.1287/ijoc.2018.0882}, abstract = {In this study, we investigate how individual users’ rating characteristics affect the user-level performance of recommendation algorithms. We measure users’ rating characteristics from three perspectives: rating value, rating structure, and neighborhood network embeddedness. We study how these three categories of measures influence the predictive accuracy of popular recommendation algorithms for each user. Our experiments use five real-world data sets with varying characteristics. For each individual user, we estimate the predictive accuracy of three recommendation algorithms. We then apply regression-based models to uncover the relationships between rating characteristics and recommendation performance at the individual user level. Our experimental results show consistent and significant effects of several rating measures on recommendation accuracy. Understanding how rating characteristics affect the recommendation performance at the individual user level has practical implications for the design of recommender systems.}, number = {2}, journal = {INFORMS J. Comput.}, author = {Cheng, Xiaoye and Zhang, Jingjing and Yan, Lu (lucy)}, month = apr, year = {2020}, note = {Publisher: INFORMS}, pages = {303--320}, }
@article{kotkov_how_2020, title = {How does serendipity affect diversity in recommender systems? {A} serendipity-oriented greedy algorithm}, volume = {102}, issn = {0144-3097}, url = {http://link.springer.com/10.1007/s00607-018-0687-5}, doi = {10.1007/s00607-018-0687-5}, abstract = {Most recommender systems suggest items that are popular among all users and similar to items a user usually consumes. As a result, the user receives recommendations that she/he is already familiar with or would find anyway, leading to low satisfaction. To overcome this problem, a recommender system should suggest novel, relevant and unexpected i.e., serendipitous items. In this paper, we propose a serendipity-oriented, reranking algorithm called a serendipity-oriented greedy (SOG) algorithm, which improves serendipity of recommendations through feature diversification and helps overcome the overspecialization problem. To evaluate our algorithm, we employed the only publicly available dataset containing user feedback regarding serendipity. We compared our SOG algorithm with topic diversification, popularity baseline, singular value decomposition, serendipitous personalized ranking and Zheng’s algorithms relying on the above dataset. SOG outperforms other algorithms in terms of serendipity and diversity. It also outperforms serendipity-oriented algorithms in terms of accuracy, but underperforms accuracy-oriented algorithms in terms of accuracy. We found that the increase of diversity can hurt accuracy and harm or improve serendipity depending on the size of diversity increase.}, number = {2}, journal = {Computing}, author = {Kotkov, Denis and Veijalainen, Jari and Wang, Shuaiqiang}, month = feb, year = {2020}, pages = {393--411}, }
@phdthesis{noffsinger_predictive_2020, address = {Ann Arbor, United States}, title = {Predictive {Accuracy} of {Recommender} {Algorithms}}, url = {https://libproxy.boisestate.edu/login?url=https://www-proquest-com.libproxy.boisestate.edu/dissertations-theses/predictive-accuracy-recommender-algorithms/docview/2466761384/se-2}, abstract = {Recommender systems present a customized list of items based upon user or item characteristics with the objective of reducing a large number of possible choices to a smaller ranked set most likely to appeal to the user. A variety of algorithms for recommender systems have been developed and refined including applications of deep learning neural networks. Recent research reports point to a need to perform carefully controlled experiments to gain insights about the relative accuracy of different recommender algorithms, because studies evaluating different methods have not used a common set of benchmark data sets, baseline models, and evaluation metrics.The dissertation used publicly available sources of ratings data with a suite of three conventional recommender algorithms and two deep learning (DL) algorithms in controlled experiments to assess their comparative accuracy. Results for the non-DL algorithms conformed well to published results and benchmarks. The two DL algorithms did not perform as well and illuminated known challenges implementing DL recommender algorithms as reported in the literature. Model overfitting is discussed as a potential explanation for the weaker performance of the DL algorithms and several regularization strategies are reviewed as possible approaches to improve predictive error. Findings justify the need for further research in the use of deep learning models for recommender systems.}, school = {Nova Southeastern University}, author = {Noffsinger, William B}, collaborator = {Mukherjee, Sumitra}, year = {2020}, note = {Publication Title: Information Systems (DISS)}, }
@article{gazdar_new_2020, title = {A new similarity measure for collaborative filtering based recommender systems}, volume = {188}, issn = {0950-7051}, url = {http://www.sciencedirect.com/science/article/pii/S0950705119304484}, doi = {10.1016/j.knosys.2019.105058}, abstract = {The objective of a recommender system is to provide customers with personalized recommendations while selecting an item among a set of products (movies, books, etc.). The collaborative filtering is the most used technique for recommender systems. One of the main components of a recommender system based on the collaborative filtering technique, is the similarity measure used to determine the set of users having the same behavior with regard to the selected items. Several similarity functions have been proposed, with different performances in terms of accuracy and quality of recommendations. In this paper, we propose a new simple and efficient similarity measure. Its mathematical expression is determined through the following paper contributions: 1) transforming some intuitive and qualitative conditions, that should be satisfied by the similarity measure, into relevant mathematical equations namely: the integral equation, the linear system of differential equations and a non-linear system and 2) resolving the equations to achieve the kernel function of the similarity measure. The extensive experimental study driven on a benchmark datasets shows that the proposed similarity measure is very competitive, especially in terms of accuracy, with regards to some representative similarity measures of the literature.}, journal = {Knowledge-Based Systems}, author = {Gazdar, Achraf and Hidri, Lotfi}, month = jan, year = {2020}, keywords = {Collaborative filtering, Neighborhood based CF, Recommendation systems, Similarity measure}, pages = {105058}, }
@inproceedings{polychronou_machine_2020, title = {Machine {Learning} {Algorithms} for {Food} {Intelligence}: {Towards} a {Method} for {More} {Accurate} {Predictions}}, url = {http://dx.doi.org/10.1007/978-3-030-39815-6_16}, doi = {10.1007/978-3-030-39815-6_16}, abstract = {It is evident that machine learning algorithms are widely impacting industrial applications and platforms. Beyond typical research experimentation scenarios, there is a need for companies that wish to enhance their online data and analytics solutions to incorporate ways in which they can select, experiment, benchmark, parameterise and choose the version of a machine learning algorithm that seems to be most appropriate for their specific application context. In this paper, we describe such a need for a big data platform that supports food data analytics and intelligence. More specifically, we introduce Agroknow’s big data platform and identify the need to extend it with a flexible and interactive experimentation environment where different machine learning algorithms can be tested using a variation of synthetic and real data. A typical usage scenario is described, based on our need to experiment with various machine learning algorithms to support price prediction for food products and ingredients. The initial requirements for an experimentation environment are also introduced.}, publisher = {Springer International Publishing}, author = {Polychronou, Ioanna and Katsivelis, Panagis and Papakonstantinou, Mihalis and Stoitsis, Giannis and Manouselis, Nikos}, year = {2020}, pages = {165--172}, }
@article{asenova_personalized_2019, title = {Personalized {Micro}-{Service} {Recommendation} {System} for {Online} {News}}, volume = {160}, issn = {1877-0509}, url = {http://www.sciencedirect.com/science/article/pii/S1877050919317399}, doi = {10.1016/j.procs.2019.11.039}, abstract = {In the era of artificial intelligence and high technology advance our life is dependent on them in every aspect. The dynamic environment forces us to plan our time consciously and every minute is valuable. To help individuals and corporations see information that is only relevant to them, recommendation systems are in place. Popular platforms such as Amazon, Ebay, Netflix, and YouTube make use of advanced recommendation systems to better serve the needs of their users. This research paper gives insight into building a microservice recommendation system for online news. Research in recommendation systems is mainly focused on improving user’s experience based on personalization information, such as preferences and searching history. To determine the initial preferences of a user, an initial menu of topics/themes is provided for the user to choose from. In order to reflect the user's news search interests as precisely as possible, all of their interactions are thoroughly recorded and analyzed in depth, based on advanced machine learning techniques, when adjusting the news topics the user is interested in. Based on the aforementioned approach, a personalized recommendation system for online news has been developed. Existing techniques have been researched and evaluated to aid the decision about picking the best approach for the software to be implemented. Frameworks/technologies used for the development are Java 8, Spring boot, Spring MVC, Maven and MongoDB.}, journal = {Procedia Comput. Sci.}, author = {Asenova, Marchela and Chrysoulas, Christos}, month = jan, year = {2019}, keywords = {TF-IDF, collaborative filtering, cosine similarity, recommendation engine, recommendation phases}, pages = {610--615}, }
@inproceedings{shriver_evaluating_2019, title = {Evaluating {Recommender} {System} {Stability} with {Influence}-{Guided} {Fuzzing}}, url = {https://www.comp.nus.edu.sg/~david/Publications/aaai2019-preprint.pdf}, abstract = {Recommender systems help users to find products or services they may like when lacking personal experience or facing an overwhelming set of choices. Since unstable recommendations can lead to distrust, loss of profits, and a poor user experience, it is important to test recommender system stability. In this work, we present an approach based on inferred models of influence that underlie recommender systems to guide the generation of dataset modifications to assess a recommender's stability. We implement our approach …}, publisher = {AAAI}, author = {Shriver, David and Elbaum, Sebastian and Dwyer, Matthew B and Rosenblum, David S}, year = {2019}, }
@inproceedings{karpus_things_2019, title = {Things you might not know about the k-{Nearest} neighbors algorithm}, url = {https://www.researchgate.net/profile/Adam_Przybylek/publication/336235570_Things_You_Might_Not_Know_about_the_k-Nearest_Neighbors_Algorithm/links/5daf2307a6fdccc99d92bf9f/Things-You-Might-Not-Know-about-the-k-Nearest-Neighbors-Algorithm.pdf}, author = {Karpus, Aleksandra and Raczyńska, M and Przybyłek, A}, year = {2019}, }
@inproceedings{ekstrand_all_2018, series = {Proceedings of {Machine} {Learning} {Research}}, title = {All the cool kids, how do they fit in?: popularity and demographic biases in recommender evaluation and effectiveness}, volume = {81}, url = {https://proceedings.mlr.press/v81/ekstrand18b.html}, abstract = {In the research literature, evaluations of recommender system effectiveness typically report results over a given data set, providing an aggregate measure of effectiveness over each instance (e.g. user) in the data set. Recent advances in information retrieval evaluation, however, demonstrate the importance of considering the distribution of effectiveness across diverse groups of varying sizes. For example, do users of different ages or genders obtain similar utility from the system, particularly if their group is a relatively small subset of the user base? We apply this consideration to recommender systems, using offline evaluation and a utility-based metric of recommendation effectiveness to explore whether different user demographic groups experience similar recommendation accuracy. We find demographic differences in measured recommender effectiveness across two data sets containing different types of feedback in different domains; these differences sometimes, but not always, correlate with the size of the user group in question. Demographic effects also have a complex—and likely detrimental—interaction with popularity bias, a known deficiency of recommender evaluation. These results demonstrate the need for recommender system evaluation protocols that explicitly quantify the degree to which the system is meeting the information needs of all its users, as well as the need for researchers and operators to move beyond naïve evaluations that favor the needs of larger subsets of the user population while ignoring smaller subsets.}, booktitle = {Proceedings of the 1st {Conference} on {Fairness}, {Accountability} and {Transparency}}, publisher = {PMLR}, author = {Ekstrand, Michael D and Tian, Mucun and Azpiazu, Ion Madrazo and Ekstrand, Jennifer D and Anuyah, Oghenemaro and McNeill, David and Pera, Maria Soledad}, editor = {Friedler, Sorelle A and Wilson, Christo}, year = {2018}, note = {Journal Abbreviation: Proceedings of Machine Learning Research}, pages = {172--186}, }
@inproceedings{ekstrand_exploring_2018, address = {New York, NY, USA}, title = {Exploring author gender in book rating and recommendation}, url = {https://dl.acm.org/doi/10.1145/3240323.3240373}, doi = {10.1145/3240323.3240373}, abstract = {Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of the patterns in rating datasets reflect important real-world differences between the various users and items in the data; other patterns may be irrelevant or possibly undesirable for social or ethical reasons, particularly if they reflect undesired discrimination, such as gender or ethnic discrimination in publishing. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to a dimension of social concern, namely content creator gender. Using publicly-available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution.}, publisher = {ACM}, author = {Ekstrand, Michael D and Tian, Mucun and Kazi, Mohammed R Imran and Mehrpouyan, Hoda and Kluver, Daniel}, month = sep, year = {2018}, }
@inproceedings{dragovic_recommendation_2018, title = {From recommendation to curation: when the system becomes your personal docent}, url = {http://ceur-ws.org/Vol-2225/paper6.pdf}, abstract = {Curation is the act of selecting, organizing, and presenting content. Some applications emulate this process by turning users into curators, while others use recommenders to select items, seldom achieving the focus or selectivity of human curators. We bridge this gap with a …}, author = {Dragovic, Nevena and Azpiazu, Ion Madrazo and Pera, Maria Soledad}, month = oct, year = {2018}, pages = {37--44}, }
@article{cami_user_2018, title = {User preferences modeling using dirichlet process mixture model for a content-based recommender system}, issn = {0950-7051}, url = {http://www.sciencedirect.com/science/article/pii/S0950705118304799}, doi = {10.1016/j.knosys.2018.09.028}, abstract = {Recommender systems have been developed to assist users in retrieving relevant resources. Collaborative and content-based filtering are two basic approaches that are used in recommender systems. The former employs the feedback of users with similar interests, while the latter is based on the feature of the selected resources by each user. Recommender systems can consider users’ behavior to more accurately estimate their preferences via a list of recommendations. However, the existing approaches rarely consider both interests and preferences of the users. Also, the dynamic nature of user behavior poses an additional challenge for recommender systems. In this paper, we consider the interactions of each individual user, and analyze them to propose a user model and capture user’s interests. We construct the user model based on a Bayesian nonparametric framework, called the Dirichlet Process Mixture Model. The proposed model evolves following the dynamic nature of user behavior to adapt both the user interests and preferences. We implemented the proposed model and evaluated it using both the MovieLens dataset, and a real-world dataset that contains news tweets from five news channels (New York Times, BBC, CNN, Reuters and Associated Press). The experimental results and comparisons with several recently developed approaches show the superiority in accuracy of the proposed approach, and its ability to adapt with user behavior over time.}, journal = {Knowledge-Based Systems}, author = {Cami, Bagher Rahimpour and Hassanpour, Hamid and Mashayekhi, Hoda}, month = sep, year = {2018}, keywords = {Temporal content-based recommender systems, User behavior modeling, User preferences modeling}, }
@inproceedings{carvalho_fair_2018, title = {{FAiR}: {A} {Framework} for {Analyses} and {Evaluations} on {Recommender} {Systems}}, url = {http://dx.doi.org/10.1007/978-3-319-95168-3_26}, doi = {10.1007/978-3-319-95168-3_26}, abstract = {Recommender systems (RSs) have become essential tools in e-commerce applications, helping users in the decision-making process. Evaluation on these tools is, however, a major divergence point nowadays, since there is no consensus regarding which metrics are necessary to consolidate new RSs. For this reason, distinct frameworks have been developed to ease the deployment of RSs in research and/or production environments. In the present work, we perform an extensive study of the most popular evaluation metrics, organizing them into three groups: Effectiveness-based, Complementary Dimensions of Quality and Domain Profiling. Further, we consolidate a framework named FAiR to help researchers in evaluating their RSs using these metrics, besides identifying the characteristics of data collections that may intrinsically affect RSs performance. FAiR is compatible with the output format of the main existing RSs libraries (i.e., MyMediaLite and LensKit).}, publisher = {Springer International Publishing}, author = {Carvalho, Diego and Silva, Nícollas and Silveira, Thiago and Mourão, Fernando and Pereira, Adriano and Dias, Diego and Rocha, Leonardo}, year = {2018}, pages = {383--397}, }
@inproceedings{coba_replicating_2018, address = {New York, NY, USA}, title = {Replicating and {Improving} {Top}-{N} {Recommendations} in {Open} {Source} {Packages}}, url = {http://doi.acm.org/10.1145/3227609.3227671}, doi = {10.1145/3227609.3227671}, booktitle = {{WIMS} '18}, publisher = {ACM}, author = {Coba, Ludovik and Symeonidis, Panagiotis and Zanker, Markus}, year = {2018}, note = {Journal Abbreviation: WIMS '18}, keywords = {Collaborative Filtering, Recommendation algorithms, evaluation}, pages = {40:1--40:7}, }
@article{yang_improving_2018, title = {Improving {Existing} {Collaborative} {Filtering} {Recommendations} via {Serendipity}-{Based} {Algorithm}}, volume = {20}, issn = {1520-9210}, url = {http://dx.doi.org/10.1109/TMM.2017.2779043}, doi = {10.1109/TMM.2017.2779043}, abstract = {In this paper, we study how to address the sparsity, accuracy and serendipity issues of top-N recommendation with collaborative filtering (CF). Existing studies commonly use rated items (which form only a small section in a rating matrix) or import some additional information (e.g., details about the items and users) to improve the performance of CF. Unlike these methods, we propose a novel notion towards a huge amount of unrated items: serendipity item. By utilizing serendipity items, we propose concise satisfaction and interest injection (CSII), a method that can effectively find interesting, satisfying, and serendipitous items in unrated items. By preventing uninteresting and unsatisfying items to be recommended as top-N items, this concise-but-novel method improves accuracy and recommendation quality (especially serendipity) substantially. Meanwhile, it can address the sparsity and cold-start issues by enriching the rating matrix in CF without additional information. As our method tackles rating matrix before recommendation procedure, it can be applied to most existing CF methods, such as item-based CF, user-based CF and matrix factorization-based CF. Through comprehensive experiments using abundant real-world datasets with LensKit implementation, we successfully demonstrate that our solution improves the performance of existing CF methods consistently and universally. Moreover, comparing with baseline methods, CSII can extract uninteresting items more carefully and cautiously, avoiding potential items inferred by mistake.}, number = {7}, journal = {IEEE Trans. Multimedia}, author = {Yang, Y and Xu, Y and Wang, E and Han, J and Yu, Z}, month = jul, year = {2018}, keywords = {CF methods, CSII, Collaboration, Collaborative filtering, Computer science, Data mining, Lifting equipment, Multimedia communication, Recommender systems, cold-start issues, collaborative filtering, collaborative filtering recommendations, concise satisfaction and interest injection, item-based CF, matrix decomposition, matrix factorization, matrix factorization-based CF, rating matrix, recommendation quality, recommender systems, serendipitous recommendation, serendipity item, top-N items, top-N recommendation, unrated items, user-based CF}, pages = {1888--1900}, }
@mastersthesis{shriver_assessing_2018, title = {Assessing the {Quality} and {Stability} of {Recommender} {Systems}}, url = {https://digitalcommons.unl.edu/computerscidiss/147}, abstract = {Recommender systems help users to find products they may like when lacking personal experience or facing an overwhelmingly large set of items. However, assessing the quality and stability of recommender systems can present challenges for developers. First, traditional accuracy metrics, such as precision and recall, for validating the quality of recommendations, offer only a coarse, one-dimensional view of the system performance. Second, assessing the stability of a recommender systems requires generating new data and retraining a system, which is expensive. In this work, we present two new approaches for assessing the quality and stability of recommender systems to address these challenges. We first present a general and extensible approach for assessing the quality of the behavior of a recommender system using logical property templates. The approach is general in that it defines recommendation systems in terms of sets of rankings, ratings, users, and items on which property templates are defined. It is extensible in that these property templates define a space of properties that can be instantiated and parameterized to characterize a recommendation system. We study the application of the approach to several recommendation systems. Our findings demonstrate the potential of these properties, illustrating the insights they can provide about the different algorithms and evolving datasets. We also present an approach for influence-guided fuzz testing of recommender system stability. We infer influence models for aspects of a dataset, such as users or items, from the recommendations produced by a recommender system and its training data. We define dataset fuzzing heuristics that use these influence models for generating modifications to an original dataset and we present a test oracle based on a threshold of acceptable instability. We implement our approach and evaluate it on several recommender algorithms using the MovieLens dataset and we find that influence-guided fuzzing can effectively find small sets of modifications that cause significantly more instability than random approaches. Adviser: Sebastian Elbaum}, urldate = {2018-05-08}, school = {University of Nebraska - Lincoln}, author = {Shriver, David}, collaborator = {Elbaum, Sebastian}, year = {2018}, note = {Publication Title: Computer Science and Engineering}, }
@book{kotkov_serendipity_2018, title = {Serendipity in recommender systems}, isbn = {978-951-39-7438-1}, url = {https://jyx.jyu.fi/handle/123456789/58207}, abstract = {The number of goods and services (such as accommodation or music streaming) offered by e-commerce websites does not allow users to examine all the available options in a reasonable amount of time. Recommender systems are auxiliary systems designed to help users find interesting goods or services (items) on a website when the number of available items is overwhelming. Traditionally, recommender systems have been optimized for accuracy, which indicates how often a user consumed the items recommended by system. To increase accuracy, recommender systems often suggest items that are popular and suitably similar to items these users have consumed in the past. As a result, users often lose interest in using these systems, as they either know about the recommended items already or can easily find these items themselves. One way to increase user satisfaction and user retention is to suggest serendipitous items. These items are items that users would not find themselves or even look for, but would enjoy consuming. Serendipity in recommender systems has not been thoroughly investigated. There is not even a consensus on the concept’s definition. In this dissertation, serendipitous items are defined as relevant, novel and unexpected to a user. In this dissertation, we (a) review different definitions of the concept and evaluate them in a user study, (b) assess the proportion of serendipitous items in a typical recommender system, (c) review ways to measure and improve serendipity, (d) investigate serendipity in cross-domain recommender systems (systems that take advantage of multiple domains, such as movies, songs and books) and (e) discuss challenges and future directions concerning this topic. We applied a Design Science methodology as the framework for this study and developed four artifacts: (1) a collection of eight variations of serendipity definition, (2) a measure of the serendipity of suggested items, (3) an algorithm that generates serendipitous suggestions, (4) a dataset of user feedback regarding serendipitous movies in the recommender system MovieLens. These artifacts are evaluated using suitable methods and communicated through publications.}, urldate = {2018-07-06}, publisher = {University of Jyväskylä}, author = {Kotkov, Denis}, year = {2018}, }
@article{de_pessemier_heart_2018, title = {Heart rate monitoring, activity recognition, and recommendation for e-coaching}, issn = {1380-7501}, url = {https://link.springer.com/article/10.1007/s11042-018-5640-2}, doi = {10.1007/s11042-018-5640-2}, abstract = {Equipped with hardware, such as accelerometer and heart rate sensor, wearables enable measuring physical activities and heart rate. However, the accuracy of these heart rate measurements is still unclear and the coupling with activity recognition is often missing in health apps. This study evaluates heart rate monitoring with four different device types: a specialized sports device with chest strap, a fitness tracker, a smart watch, and a smartphone using photoplethysmography. In a state of rest, similar measurement results are obtained with the four devices. During physical activities, the fitness tracker, smart watch, and smartphone measure sudden variations in heart rate with a delay, due to movements of the wrist. Moreover, this study showed that physical activities, such as squats and dumbbell curl, can be recognized with fitness trackers. By combining heart rate monitoring and activity recognition, personal suggestions for physical activities are generated using a tag-based recommender and rule-based filter.}, urldate = {2018-02-08}, journal = {Multimed. Tools Appl.}, author = {De Pessemier, Toon and Martens, Luc}, month = jan, year = {2018}, note = {Publisher: Springer US}, pages = {1--18}, }
@inproceedings{ekstrand_sturgeon_2017, series = {{FLAIRS} 30}, title = {Sturgeon and the {Cool} {Kids}: {Problems} with {Top}-{N} {Recommender} {Evaluation}}, url = {https://aaai.org/papers/639-flairs-2017-15534/}, abstract = {Top-N evaluation of recommender systems, typically carried out using metrics from information retrieval or machine learning, has several challenges. Two of these challenges are popularity bias, where the evaluation intrinsically favors algorithms that recommend popular items, and misclassified decoys, where items for which no user relevance is known are actually relevant to the user, but the evaluation is unaware and penalizes the recommender for suggesting them. One strategy for mitigating the misclassified decoy problem is the one-plus-random evaluation strategy and its generalization, which we call random decoys. In this work, we explore the random decoy strategy through both a theoretical treatment and an empirical study, but find little evidence to guide its tuning and show that it has complex and deleterious interactions with popularity bias.}, booktitle = {Proceedings of the 30th {Florida} {Artificial} {Intelligence} {Research} {Society} {Conference}}, publisher = {AAAI Press}, author = {Ekstrand, Michael D and Mahant, Vaibhav}, month = may, year = {2017}, }
@inproceedings{channamsetty_recommender_2017, title = {Recommender response to diversity and popularity bias in user profiles}, url = {https://aaai.org/papers/657-flairs-2017-15524/}, abstract = {Recommender system evaluation usually focuses on the overall effectiveness of the algorithms, either in terms of measurable accuracy or ability to deliver user satisfaction or improve business metrics. When additional factors are considered, such as the diversity or novelty of the recommendations, the focus typically remains on the algorithm’s overall performance. We examine the relationship of the recommender’s output characteristics – accuracy, popularity (as an inverse of novelty), and diversity – to characteristics of the user’s rating profile. The aims of this analysis are twofold: (1) to probe the conditions under which common algorithms produce more or less diverse or popular recommendations, and (2) to determine if these personalized recommender algorithms reflect a user’s preference for diversity or novelty. We trained recommenders on the MovieLens data and looked for correlation between the user profile and the recommender’s output for both diversity and popularity bias using different metrics. We find that the diversity and popularity of movies in users’ profiles has little impact on the recommendations they receive.}, urldate = {2017-05-29}, booktitle = {Proceedings of the 30th {Florida} artificial intelligence research society conference}, publisher = {AAAI Press}, author = {Channamsetty, Sushma and Ekstrand, Michael D}, month = may, year = {2017}, }
@inproceedings{sardianos_scaling_2017, title = {Scaling {Collaborative} {Filtering} to {Large}-{Scale} {Bipartite} {Rating} {Graphs} {Using} {Lenskit} and {Spark}}, url = {http://dx.doi.org/10.1109/BigDataService.2017.28}, doi = {10.1109/BigDataService.2017.28}, abstract = {Popular social networking applications such as Facebook, Twitter, Friendster, etc. generate very large graphs with different characteristics. These social networks are huge, comprising millions of nodes and edges that push existing graph mining algorithms and architectures to their limits. In product-rating graphs, users connect with each other and rate items in tandem. In such bipartite graphs users and items are the nodes and ratings are the edges and collaborative filtering algorithms use the edge information (i.e. user ratings for items) in order to suggest items of potential interest to users. Existing algorithms can hardly scale up to the size of the entire graph and require unlimited resources to finish. This work employs a machine learning method for predicting the performance of Collaborative Filtering algorithms using the structural features of the bipartite graphs. Using a fast graph partitioning algorithm and information from the user friendship graph, the original bipartite graph is partitioned into different schemes (i.e. sets of smaller bipartite graphs). The schemes are evaluated against the predicted performance of the Collaborative Filtering algorithm and the best partitioning scheme is employed for generating the recommendations. As a result, the Collaborative Filtering algorithms are applied to smaller bipartite graphs, using limited resources and allowing the problem to scale or be parallelized. Tests on a large, real-life, rating graph, show that the proposed method allows the collaborative filtering algorithms to run in parallel and complete using limited resources.}, author = {Sardianos, C and Varlamis, I and Eirinaki, M}, month = apr, year = {2017}, keywords = {Bipartite graph, Collaboration, Collaborative Filtering, Graph Metrics, Graph Partitioning, Lenskit, Machine learning algorithms, Partitioning algorithms, Prediction algorithms, Recommender Systems, Recommender systems, Social Networks, Social network services, Spark, bipartite graphs, collaborative filtering, collaborative filtering algorithms, data mining, fast graph partitioning algorithm, graph theory, large-scale bipartite rating graphs, learning (artificial intelligence), machine learning, product-rating graphs, social networking (online), social networking applications, structural features, user-friendship graph}, pages = {70--79}, }
@article{papadakis_scor_2017, title = {{SCoR}: {A} {Synthetic} {Coordinate} based {Recommender} system}, volume = {79}, issn = {0957-4174}, url = {http://www.sciencedirect.com/science/article/pii/S0957417417301070}, doi = {10.1016/j.eswa.2017.02.025}, abstract = {Recommender systems try to predict the preferences of users for specific items, based on an analysis of previous consumer preferences. In this paper, we propose SCoR, a Synthetic Coordinate based Recommendation system which is shown to outperform the most popular algorithmic techniques in the field, approaches like matrix factorization and collaborative filtering. SCoR assigns synthetic coordinates to nodes (users and items), so that the distance between a user and an item provides an accurate prediction of the user’s preference for that item. The proposed framework has several benefits. It is parameter free, thus requiring no fine tuning to achieve high performance, and is more resistant to the cold-start problem compared to other algorithms. Furthermore, it provides important annotations of the dataset, such as the physical detection of users and items with common and unique characteristics as well as the identification of outliers. SCoR is compared against nine other state-of-the-art recommender systems, seven of them based on the well known matrix factorization and two on collaborative filtering. The comparison is performed against four real datasets, including a brief version of the dataset used in the well known Netflix challenge. The extensive experiments prove that SCoR outperforms previous techniques while demonstrating its improved stability and high performance.}, journal = {Expert Syst. Appl.}, author = {Papadakis, Harris and Panagiotakis, Costas and Fragopoulou, Paraskevi}, month = aug, year = {2017}, keywords = {Graph, Matrix factorization, Netflix, Recommender systems, Synthetic coordinates, Vivaldi}, pages = {8--19}, }
@mastersthesis{solvang_video_2017, title = {Video {Recommendation} {Systems}: {Finding} a {Suitable} {Recommendation} {Approach} for an {Application} {Without} {Sufficient} {Data}}, url = {http://hdl.handle.net/10852/59239}, author = {Solvang, Marius Lørstad}, year = {2017}, }
@article{pera_recommending_2017, title = {Recommending books to be exchanged online in the absence of wish lists}, issn = {2330-1643}, url = {http://dx.doi.org/10.1002/asi.23978}, doi = {10.1002/asi.23978}, abstract = {An online exchange system is a web service that allows communities to trade items without the burden of manually selecting them, which saves users' time and effort. Even though online book-exchange systems have been developed, their services can further be improved by reducing the workload imposed on their users. To accomplish this task, we propose a recommendation-based book exchange system, called EasyEx, which identifies potential exchanges for a user solely based on a list of items the user is willing to part with. EasyEx is a novel and unique book-exchange system because unlike existing online exchange systems, it does not require a user to create and maintain a wish list, which is a list of items the user would like to receive as part of the exchange. Instead, EasyEx directly suggests items to users to increase serendipity and as a result expose them to items which may be unfamiliar, but appealing, to them. In identifying books to be exchanged, EasyEx employs known recommendation strategies, that is, personalized mean and matrix factorization, to predict book ratings, which are treated as the degrees of appeal to a user on recommended books. Furthermore, EasyEx incorporates OptaPlanner, which solves constraint satisfaction problems efficiently, as part of the recommendation-based exchange process to create exchange cycles. Experimental results have verified that EasyEx offers users recommended books that satisfy the users' interests and contributes to the item-exchange mechanism with a new design methodology.}, journal = {Journal of the Association for Information Science and Technology}, author = {Pera, Maria Soledad and Ng, Yiu-Kai}, month = nov, year = {2017}, }
@inproceedings{coba_rrecsys_2016, title = {rrecsys: {An} {R}-package for {Prototyping} {Recommendation} {Algorithms}}, url = {https://pdfs.semanticscholar.org/1856/b9e4c19a8ed34c3041911e43c0f3f9e1baa5.pdf}, abstract = {We introduce rrecsys, an open source extension package in R for rapid prototyping and intuitive assessment of recommender system algorithms. As the only currently available R package for recommender algorithms (recommenderlab) did not}, author = {Çoba, Ludovik and Zanker, Markus}, year = {2016}, keywords = {toolkit}, }
@phdthesis{saha_multi-objective_2016, title = {A {Multi}-objective {Autotuning} {Framework} {For} {The} {Java} {Virtual} {Machine}}, url = {https://digital.library.txstate.edu/handle/10877/6096}, abstract = {Due to inherent limitations in performance, Java was not considered a suitable platform for scalable high-performance computing (HPC) for a long time. The scenario is changing because of the development of frameworks like Hadoop, Spark and Fast-MPJ. In spite of the increase in usage, achieving high performance with Java is not trivial. High performance in Java relies on libraries providing explicit threads or relying on runnable-like interfaces for distributed programming. In this thesis, we develop an autotuning framework for JVM that manages multiple objective functions including execution time, power consumption, energy and performance-per-watt. The framework searches the combined space of JIT optimization sequences and different classes of JVM runtime parameters. To discover good configurations more quickly, the framework implements novel heuristic search algorithms. To reduce the size of the search space, machine-learning based pruning techniques are used. Evaluation on recommender system workloads shows that significant improvements in both performance and power can be gained by fine-tuning JVM runtime parameters.}, urldate = {2016-07-05}, school = {Texas State University}, author = {Saha, Shuvabrata}, month = apr, year = {2016}, }
@inproceedings{colucci_evaluating_2016, address = {New York, NY, USA}, title = {Evaluating {Item}-{Item} {Similarity} {Algorithms} for {Movies}}, url = {http://doi.acm.org/10.1145/2851581.2892362}, doi = {10.1145/2851581.2892362}, booktitle = {{CHI} {EA} '16}, publisher = {ACM}, author = {Colucci, Lucas and Doshi, Prachi and Lee, Kun-Lin and Liang, Jiajie and Lin, Yin and Vashishtha, Ishan and Zhang, Jia and Jude, Alvin}, year = {2016}, note = {Journal Abbreviation: CHI EA '16}, keywords = {algorithm evaluation, item-item similarity, recommender systems}, pages = {2141--2147}, }
@inproceedings{kharrat_recommendation_2016, title = {Recommendation system based contextual analysis of {Facebook} comment}, url = {http://dx.doi.org/10.1109/AICCSA.2016.7945792}, doi = {10.1109/AICCSA.2016.7945792}, abstract = {This paper presents a new recommendation algorithm based on contextual analysis and new measurements. Social Network is one of the most popular Web 2.0 applications and related services, like Facebook, have evolved into a practical means for sharing opinions. Consequently, Social Network web sites have since become rich data sources for opinion mining. This paper proposes to introduce external resource from comments posted by users to predict recommendation and relieve the cold start problem. The novelty of the proposed approach is that posts are not simply characterized by an opinion score, as is the case with machine learning-based classifiers, but instead receive an opinion grade for each distinct notion in the post. Our approach has been implemented with Java and the Lenskit framework; the study we have conducted on a movie dataset has shown competitive results. We compared our algorithm to SVD and Slope One algorithms. We have obtained an improvement of 8\% in precision and recall as well as an improvement of 16\% in RMSE and nDCG.}, author = {Kharrat, F Ben and Elkhleifi, A and Faiz, R}, month = nov, year = {2016}, keywords = {Algorithm design and analysis, Classification algorithms, Collaboration, Collaborative filtering, Facebook, Motion pictures, Recommendation system, Recommender systems, Social network, User cold start, User profile}, pages = {1--6}, }
@phdthesis{salam_patrous_evaluating_2016, address = {Stockholm, Sweden}, title = {Evaluating {Prediction} {Accuracy} for {Collaborative} {Filtering} {Algorithms} in {Recommender} {Systems}}, url = {http://kth.diva-portal.org/smash/record.jsf?aq2=%5B%5B%5D%5D&c=1&af=%5B%5D&searchType=LIST_LATEST&query=&language=en&pid=diva2%3A927356&aq=%5B%5B%5D%5D&sf=all&aqe=%5B%5D&sortOrder=author_sort_asc&onlyFullText=false&noOfRows=50&dswid=-7195}, abstract = {Recommender systems are a relatively new technology that is commonly used by e-commerce websites and streaming services among others, to predict user opinion about products. This report studies two ...}, urldate = {2016-06-13}, school = {KTH Royal Institute of Technology}, author = {Salam Patrous, Ziad and Najafi, Safir}, year = {2016}, }
@phdthesis{chang_leveraging_2016, address = {Minneapolis, MN, USA}, title = {Leveraging {Collective} {Intelligence} in {Recommender} {System}}, url = {http://hdl.handle.net/11299/182725}, abstract = {Recommender systems, since their introduction 20 years ago, have been widely deployed in web services to alleviate user information overload. Driven by business objectives of their applications, the focus of recommender systems has shifted from accurately modeling and predicting user preferences to offering good personalized user experience. The later is difficult because there are many factors, e.g., tenure of a user, context of recommendation and transparency of recommender system, that affect users' perception of recommendations. Many of these factors are subjective and not easily quantifiable, posing challenges to recommender algorithms. When pure algorithmic solutions are at their limits in providing good user experience in recommender systems, we turn to the collective intelligence of human and computer. Computer and human are complementary to each other: computers are fast at computation and data processing and have accurate memory; humans are capable of complex reasoning, being creative and relating to other humans. In fact, such close collaborations between human and computer have precedent: after chess master Garry Kasparov lost to IBM computer ``Deep Blue'', he invited a new form of chess --- advanced chess, in which human player and a computer program teams up against such pairs. In this thesis, we leverage the collective intelligence of human and computer to tackle several challenges in recommender systems and demonstrate designs of such hybrid systems. We make contributions to the following aspects of recommender systems: providing better new user experience, enhancing topic modeling component for items, composing better recommendation sets and generating personalized natural language explanations. These four applications demonstrate different ways of designing systems with collective intelligence, applicable to domains other than recommender systems. We believe the collective intelligence of human and computer can power more intelligent, user friendly and creative systems, worthy of continuous research effort in future.}, urldate = {2016-11-01}, school = {University of Minnesota}, author = {Chang, Shuo}, month = aug, year = {2016}, }
@article{pessemier_hybrid_2016, title = {Hybrid group recommendations for a travel service}, volume = {75}, issn = {1380-7501}, url = {http://link.springer.com/article/10.1007/s11042-016-3265-x}, doi = {10.1007/s11042-016-3265-x}, abstract = {Recommendation techniques have proven their usefulness as a tool to cope with the information overload problem in many classical domains such as movies, books, and music. Additional challenges for recommender systems emerge in the domain of tourism such as acquiring metadata and feedback, the sparsity of the rating matrix, user constraints, and the fact that traveling is often a group activity. This paper proposes a recommender system that offers personalized recommendations for travel destinations to individuals and groups. These recommendations are based on the users’ rating profile, personal interests, and specific demands for their next destination. The recommendation algorithm is a hybrid approach combining a content-based, collaborative filtering, and knowledge-based solution. For groups of users, such as families or friends, individual recommendations are aggregated into group recommendations, with an additional opportunity for users to give feedback on these group recommendations. A group of test users evaluated the recommender system using a prototype web application. The results prove the usefulness of individual and group recommendations and show that users prefer the hybrid algorithm over each individual technique. This paper demonstrates the added value of various recommendation algorithms in terms of different quality aspects, compared to an unpersonalized list of the most-popular destinations.}, number = {5}, urldate = {2016-03-11}, journal = {Multimed. Tools Appl.}, author = {Pessemier, Toon De and Dhondt, Jeroen and Martens, Luc}, month = jan, year = {2016}, pages = {1--25}, }
@phdthesis{nguyen_enhancing_2016, address = {Minneapolis, MN, USA}, title = {Enhancing {User} {Experience} {With} {Recommender} {Systems} {Beyond} {Prediction} {Accuracies}}, url = {http://hdl.handle.net/11299/182780}, abstract = {In this dissertation, we examine how to improve the user experience with recommender systems beyond prediction accuracy. We focus on the following aspects of the user experience. In chapter 3 we examine if a recommender system exposes users to less diverse contents over time. In chapter 4 we look at the relationships between user personality and user preferences for recommendation diversity, popularity, and serendipity. In chapter 5 we investigate the relations between the self-reported user satisfaction and the three recommendation properties with the inferred user recommendation consumption. In chapter 6 we look at four different rating interfaces and evaluated how these interfaces affected the user rating experience. We find that over time a recommender system exposes users to less-diverse contents and that users rate less-diverse items. However, users who took recommendations were exposed to more diverse recommendations than those who did not. Furthermore, users with different personalities have different preferences for recommendation diversity, popularity, and serendipity (e.g. some users prefer more diverse recommendations, while others prefer similar ones). We also find that user satisfaction with recommendation popularity and serendipity measured with survey questions strongly relate to user recommendation consumption inferred with logged data. We then propose a way to get better signals about user preferences and help users rate items in the recommendation systems more consistently. That is, providing exemplars to users at the time they rate the items improved the consistency of users’ ratings. Our results suggest several ways recommender system practitioners and researchers can enrich the user experience. For example, by integrating users’ personality into recommendation frameworks, we can help recommender systems deliver recommendations with the preferred levels of diversity, popularity, and serendipity to individual users. We can also facilitate the rating process by integrating a set of proven rating-support techniques into the systems’ interfaces.}, urldate = {2016-11-01}, school = {University of Minnesota}, author = {Nguyen, Tien}, month = aug, year = {2016}, }
@misc{noauthor_machine_2016, title = {Machine ‘{Unlearning}’ {Technique} {Wipes} {Out} {Unwanted} {Data} {Quickly} and {Completely}}, url = {http://www.scientificcomputing.com/news/2016/03/machine-unlearning-technique-wipes-out-unwanted-data-quickly-and-completely}, abstract = {Cao and Yang believe that easy adoption of forgetting systems will be increasingly in demand. The pair has developed a way to do it faster and more effectively than what is currently available. Their concept, called "machine unlearning," is so promising that the duo have been awarded a four-year, \$1.2 million National Science Foundation grant — split between Lehigh and Columbia — to develop the approach.}, urldate = {2016-03-16}, month = mar, year = {2016}, }
@article{MovieLens, title = {The {MovieLens} datasets: history and context}, volume = {5}, issn = {2160-6455}, url = {http://doi.acm.org/10.1145/2827872}, doi = {10.1145/2827872}, abstract = {The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.}, number = {4}, urldate = {2016-03-11}, journal = {ACM Transactions on Interactive Intelligent Systems}, author = {Harper, F Maxwell and Konstan, Joseph A}, month = dec, year = {2015}, keywords = {dataset}, pages = {19:1--19:19}, }
@inproceedings{harper_putting_2015, address = {New York, NY, USA}, title = {Putting {Users} in {Control} of {Their} {Recommendations}}, url = {http://doi.acm.org/10.1145/2792838.2800179}, doi = {10.1145/2792838.2800179}, abstract = {The essence of a recommender system is that it can recommend items personalized to the preferences of an individual user. But typically users are given no explicit control over this personalization, and are instead left guessing about how their actions affect the resulting recommendations. We hypothesize that any recommender algorithm will better fit some users' expectations than others, leaving opportunities for improvement. To address this challenge, we study a recommender that puts some control in the hands of users. Specifically, we build and evaluate a system that incorporates user-tuned popularity and recency modifiers, allowing users to express concepts like "show more popular items". We find that users who are given these controls evaluate the resulting recommendations much more positively. Further, we find that users diverge in their preferred settings, confirming the importance of giving control to users.}, urldate = {2015-09-19}, booktitle = {{RecSys} '15}, publisher = {ACM}, author = {Harper, F Maxwell and Xu, Funing and Kaur, Harmanpreet and Condiff, Kyle and Chang, Shuo and Terveen, Loren}, year = {2015}, note = {Journal Abbreviation: RecSys '15}, pages = {3--10}, }
@inproceedings{chang_using_2015, address = {New York, NY, USA}, title = {Using {Groups} of {Items} for {Preference} {Elicitation} in {Recommender} {Systems}}, url = {http://doi.acm.org/10.1145/2675133.2675210}, doi = {10.1145/2675133.2675210}, abstract = {To achieve high quality initial personalization, recommender systems must provide an efficient and effective process for new users to express their preferences. We propose that this goal is best served not by the classical method where users begin by expressing preferences for individual items - this process is an inefficient way to convert a user's effort into improved personalization. Rather, we propose that new users can begin by expressing their preferences for groups of items. We test this idea by designing and evaluating an interactive process where users express preferences across groups of items that are automatically generated by clustering algorithms. We contribute a strategy for recommending items based on these preferences that is generalizable to any collaborative filtering-based system. We evaluate our process with both offline simulation methods and an online user experiment. We find that, as compared with a baseline rate-15-items interface, (a) users are able to complete the preference elicitation process in less than half the time, and (b) users are more satisfied with the resulting recommended items. Our evaluation reveals several advantages and other trade-offs involved in moving from item-based preference elicitation to group-based preference elicitation.}, urldate = {2015-09-19}, booktitle = {{CSCW} '15}, publisher = {ACM}, author = {Chang, Shuo and Harper, F Maxwell and Terveen, Loren}, year = {2015}, note = {Journal Abbreviation: CSCW '15}, pages = {1258--1269}, }
@inproceedings{magnuson_event_2015, address = {New York, NY, USA}, title = {Event {Recommendation} {Using} {Twitter} {Activity}}, url = {http://doi.acm.org/10.1145/2792838.2796556}, doi = {10.1145/2792838.2796556}, abstract = {User interactions with Twitter (social network) frequently take place on mobile devices - a user base that it strongly caters to. As much of Twitter's traffic comes with geo-tagging information associated with it, it is a natural platform for geographic recommendations. This paper proposes an event recommender system for Twitter users, which identifies twitter activity co-located with previous events, and uses it to drive geographic recommendations via item-based collaborative filtering.}, urldate = {2015-09-19}, booktitle = {{RecSys} '15}, publisher = {ACM}, author = {Magnuson, Axel and Dialani, Vijay and Mallela, Deepa}, year = {2015}, note = {Journal Abbreviation: RecSys '15}, pages = {331--332}, }
@article{ghoshal_recommendations_2015, title = {Recommendations {Using} {Information} from {Multiple} {Association} {Rules}: {A} {Probabilistic} {Approach}}, volume = {26}, issn = {1047-7047}, url = {http://pubsonline.informs.org/doi/abs/10.1287/isre.2015.0583}, doi = {10.1287/isre.2015.0583}, abstract = {Business analytics has evolved from being a novelty used by a select few to an accepted facet of conducting business. Recommender systems form a critical component of the business analytics toolkit and, by enabling firms to effectively target customers with products and services, are helping alter the e-commerce landscape. A variety of methods exist for providing recommendations, with collaborative filtering, matrix factorization, and association-rule-based methods being the most common. In this paper, we propose a method to improve the quality of recommendations made using association rules. This is accomplished by combining rules when possible and stands apart from existing rule-combination methods in that it is strongly grounded in probability theory. Combining rules requires the identification of the best combination of rules from the many combinations that might exist, and we use a maximum-likelihood framework to compare alternative combinations. Because it is impractical to apply the maximum likelihood framework directly in real time, we show that this problem can equivalently be represented as a set partitioning problem by translating it into an information theoretic context—the best solution corresponds to the set of rules that leads to the highest sum of mutual information associated with the rules. Through a variety of experiments that evaluate the quality of recommendations made using the proposed approach, we show that (i) a greedy heuristic used to solve the maximum likelihood estimation problem is very effective, providing results comparable to those from using the optimal set partitioning solution; (ii) the recommendations made by our approach are more accurate than those made by a variety of state-of-the-art benchmarks, including collaborative filtering and matrix factorization; and (iii) the recommendations can be made in a fraction of a second on a desktop computer, making it practical to use in real-world applications.}, number = {3}, urldate = {2015-09-19}, journal = {Information Systems Research}, author = {Ghoshal, Abhijeet and Menon, Syam and Sarkar, Sumit}, month = jul, year = {2015}, pages = {532--551}, }
@inproceedings{wischenbart_recommender_2015, title = {Recommender {Systems} for the {People} — {Enhancing} {Personalization} in {Web} {Augmentation}}, author = {Wischenbart, Martin and Firmenich, Sergio and Rossi, Gustavo and Wimmer, Manuel}, month = sep, year = {2015}, }
@incollection{chowdhury_boostmf_2015, series = {Lecture {Notes} in {Computer} {Science}}, title = {{BoostMF}: {Boosted} {Matrix} {Factorisation} for {Collaborative} {Ranking}}, isbn = {978-3-319-23524-0}, url = {http://link.springer.com/chapter/10.1007/978-3-319-23525-7_1}, urldate = {2015-09-19}, booktitle = {Machine {Learning} and {Knowledge} {Discovery} in {Databases}}, publisher = {Springer International Publishing}, author = {Chowdhury, Nipa and Cai, Xiongcai and Luo, Cheng}, editor = {Appice, Annalisa and Rodrigues, Pedro Pereira and Costa, Vítor Santos and Gama, João and Jorge, Alípio and Soares, Carlos}, month = sep, year = {2015}, pages = {3--18}, }
@incollection{kille_stream-based_2015, series = {Lecture {Notes} in {Computer} {Science}}, title = {Stream-{Based} {Recommendations}: {Online} and {Offline} {Evaluation} as a {Service}}, isbn = {978-3-319-24026-8}, url = {http://link.springer.com/chapter/10.1007/978-3-319-24027-5_48}, urldate = {2015-09-19}, booktitle = {Experimental {IR} {Meets} {Multilinguality}, {Multimodality}, and {Interaction}}, publisher = {Springer International Publishing}, author = {Kille, Benjamin and Lommatzsch, Andreas and Turrin, Roberto and Serény, András and Larson, Martha and Brodt, Torben and Seiler, Jonas and Hopfgartner, Frank}, editor = {Mothe, Josiane and Savoy, Jacques and Kamps, Jaap and Pinel-Sauvagnat, Karen and Jones, Gareth J F and SanJuan, Eric and Cappellato, Linda and Ferro, Nicola}, year = {2015}, pages = {497--517}, }
@phdthesis{ek_recommender_2015, address = {Gothenburg, Sweden}, title = {Recommender {Systems}; {Contextual} {Multi}-{Armed} {Bandit} {Algorithms} for the purpose of targeted advertisement within e-commerce}, url = {http://publications.lib.chalmers.se/records/fulltext/219662/219662.pdf}, school = {Chalmers University of Technology}, author = {Ek, Frederik and Stigsson, Robert}, year = {2015}, }
@article{christou_amore_2015, title = {{AMORE}: design and implementation of a commercial-strength parallel hybrid movie recommendation engine}, issn = {0219-1377}, url = {http://link.springer.com.libproxy.txstate.edu/article/10.1007/s10115-015-0866-z}, doi = {10.1007/s10115-015-0866-z}, urldate = {2015-09-19}, journal = {Knowl. Inf. Syst.}, author = {Christou, Ioannis T and Amolochitis, Emmanouil and Tan, Zheng-Hua}, month = aug, year = {2015}, pages = {1--26}, }
@inproceedings{cao_towards_2015, title = {Towards {Making} {Systems} {Forget} with {Machine} {Unlearning}}, url = {http://www.ieee-security.org/TC/SP2015/papers-archived/6949a463.pdf}, abstract = {Today’s systems produce a wealth of data every day, and the data further generates more data, i.e., the derived data, forming into a complex data propagation network, defined as the data’s lineage. There are many reasons for users and administrators to forget certain data including the data’s lineage. From the privacy perspective, a system may leak private information of certain users, and those users unhappy about privacy leaks naturally want to forget their data and its lineage. From the security perspective, an anomaly detection system can be polluted by adversaries through injecting manually crafted data into the training set. Therefore, we envision forgetting systems, capable of completely forgetting certain data and its lineage. In this paper, we focus on making learning systems forget, the process of which is defined as machine unlearning or unlearning. To perform unlearning upon learning system, we present general unlearning criteria, i.e., converting a learning system or part of it into a summation form of statistical query learning model, and updating all the summations to achieve unlearning. Then, we integrate our unlearning criteria into an unlearning architecture that interacts with all the components of a learning system, such as sample clustering and feature selection. To demonstrate our unlearning criteria and architecture, we select four real-world learning systems, including an item-item recommendation system, an online social network spam filter, and a malware detection system. These systems are first exposed to an adversarial environment, e.g., if the system is potentially vulnerable to training data pollution, we first pollute the training data set and show that the detection rate drops significantly. Then, we apply our unlearning technique upon those affected systems, either polluted or leaking private information. Our results show that after unlearning, the detection rate of a polluted system increases back to the one before pollution, and a system leaking a particular user’s private information completely forgets that information.}, publisher = {IEEE}, author = {Cao, Yinzhi and Yang, Junfeng}, month = may, year = {2015}, }
@inproceedings{elkhelifi_recommendation_2015, title = {Recommendation {Systems} {Based} on {Online} {User}'s {Action}}, url = {http://dx.doi.org/10.1109/CIT/IUCC/DASC/PICOM.2015.69}, doi = {10.1109/CIT/IUCC/DASC/PICOM.2015.69}, abstract = {In this paper, we propose a new recommender algorithm based on multi-dimensional users behavior and new measurements. It's used in the framework of our recommender system that use knowledge discovery techniques to the problem of making product recommendations during a live user interaction. Most of Collaborative filtering algorithms based on user's rating or similar item that other users bought, we propose to combine all user's action to predict recommendation. These systems are achieving widespread success in E-tourism nowadays. We evaluate our algorithm on tourism dataset. Evaluations have shown good results. We compared our algorithm to Slope One and Weight Slope One. We obtained an improvement of 5\% in precision and recall. And an improvement of 12\% in RMSE and nDCG.}, author = {Elkhelifi, A and Kharrat, F Ben and Faiz, R}, month = oct, year = {2015}, pages = {485--490}, }
@inproceedings{dragovic_exploiting_2015, title = {Exploiting {Reviews} to {Guide} {Users}’ {Selections}}, url = {http://ceur-ws.org/Vol-1441/recsys2015_poster7.pdf}, urldate = {2017-03-01}, author = {Dragovic, Nevena and Pera, Maria Soledad}, year = {2015}, }
@inproceedings{larrain_towards_2015, title = {Towards {Improving} {Top}-{N} {Recommendation} by {Generalization} of {SLIM}}, url = {http://ceur-ws.org/Vol-1441/recsys2015_poster22.pdf}, author = {Larraín, Santiago and Parra, Denis and Soto, Alvaro}, year = {2015}, }
@phdthesis{dhondt_hybrid_2015, address = {Gent, Belgium}, title = {A hybrid group recommender system for travel destinations}, school = {University of Gent}, author = {Dhondt, Jeroen}, month = may, year = {2015}, }
@inproceedings{ekstrand_user_2014, address = {New York, NY, USA}, title = {User perception of differences in movie recommendation algorithms}, url = {http://dx.doi.org/10.1145/2645710.2645737}, doi = {10.1145/2645710.2645737}, abstract = {Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithm implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations. In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.}, booktitle = {Proceedings of the {Eighth} {ACM} {Conference} on {Recommender} {Systems}}, publisher = {ACM}, author = {Ekstrand, Michael D and Harper, F Maxwell and Willemsen, Martijn C and Konstan, Joseph A}, month = oct, year = {2014}, note = {Journal Abbreviation: RecSys '14}, pages = {161--168}, }
@inproceedings{said_comparative_2014, address = {New York, NY, USA}, title = {Comparative {Recommender} {System} {Evaluation}: {Benchmarking} {Recommendation} {Frameworks}}, url = {http://dx.doi.org/10.1145/2645710.2645746}, doi = {10.1145/2645710.2645746}, abstract = {Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithmic implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations. In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.}, urldate = {2017-02-03}, booktitle = {{RecSys} '14}, publisher = {ACM Press}, author = {Said, Alan and Bellogin, Alejandro}, month = oct, year = {2014}, note = {Journal Abbreviation: RecSys '14}, keywords = {toolkit}, pages = {129--136}, }
@phdthesis{ekstrand_towards_2014, address = {Minneapolis, MN}, title = {Towards {Recommender} {Engineering}: {Tools} and {Experiments} in {Recommender} {Differences}}, url = {http://hdl.handle.net/11299/165307}, abstract = {Since the introduction of their modern form 20 years ago, recommender systems have proven a valuable tool for help users manage information overload. Two decades of research have produced many algorithms for computing recommendations, mechanisms for evaluating their effectiveness, and user interfaces and experiences to embody them. It has also been found that the outputs of different recommendation algorithms differ in user-perceptible ways that affect their suitability to different tasks and information needs. However, there has been little work to systematically map out the space of algorithms and the characteristics they exhibit that makes them more or less effective in different applications. As a result, developers of recommender systems must experiment, conducting basic science on each application and its users to determine the approach(es) that will meet their needs. This thesis presents our work towards recommender engineering: the design of recommender systems from well-understood principles of user needs, domain properties, and algorithm behaviors. This will reduce the experimentation required for each new recommender application, allowing developers to design recommender systems that are likely to be effective for their particular application. To that end, we make four contributions: the LensKit toolkit for conducting experiments on a wide variety of recommender algorithms and data sets under different experimental conditions (offline experiments with diverse metrics, online user studies, and the ability to grow to support additional methodologies), along with new developments in object-oriented software configuration to support this toolkit; experiments on the configuration options of widely-used algorithms to provide guidance on tuning and configuring them; an offline experiment on the differences in the errors made by different algorithms; and a user study on the user-perceptible differences between lists of movie recommendations produced by three common recommender algorithms. Much research is needed to fully realize the vision of recommender engineering in the coming years; it is our hope that LensKit will prove a valuable foundation for much of this work, and our experiments represent a small piece of the kinds of studies that must be carried out, replicated, and validated to enable recommender systems to be engineered.}, school = {University of Minnesota}, author = {Ekstrand, Michael D}, collaborator = {Konstan, Joseph A}, month = jul, year = {2014}, note = {Publication Title: Computer Science and Engineering Volume: Ph.D}, }
@inproceedings{konstan_teaching_2014, address = {New York, NY, USA}, title = {Teaching {Recommender} {Systems} at {Large} {Scale}: {Evaluation} and {Lessons} {Learned} from a {Hybrid} {MOOC}}, url = {http://doi.acm.org/10.1145/2556325.2566244}, doi = {10.1145/2556325.2566244}, abstract = {In Fall 2013 we offered an open online Introduction to Recommender Systems through Coursera, while simultaneously offering a for-credit version of the course on-campus using the Coursera platform and a flipped classroom instruction model. As the goal of offering this course was to experiment with this type of instruction, we performed extensive evaluation including surveys of demographics, self-assessed skills, and learning intent; we also designed a knowledge-assessment tool specifically for the subject matter in this course, administering it before and after the course to measure learning. We also tracked students through the course, including separating out students enrolled for credit from those enrolled only for the free, open course. This article reports on our findings.}, urldate = {2014-03-19}, booktitle = {L@{S} '14}, publisher = {ACM}, author = {Konstan, Joseph A and Walker, J D and Brooks, D Christopher and Brown, Keith and Ekstrand, Michael D}, month = mar, year = {2014}, note = {Journal Abbreviation: L@S '14}, pages = {61--70}, }
@inproceedings{kluver_evaluating_2014, title = {Evaluating {Recommender} {Behavior} for {New} {Users}}, url = {http://dx.doi.org/10.1145/2645710.2645742}, doi = {10.1145/2645710.2645742}, booktitle = {{RecSys} '14}, publisher = {ACM}, author = {Kluver, Daniel and Konstan, Joseph A}, month = oct, year = {2014}, }
@article{de_nart_personalized_2014, title = {A {Personalized} {Concept}-driven {Recommender} {System} for {Scientific} {Libraries}}, volume = {38}, issn = {1877-0509}, url = {http://www.sciencedirect.com/science/article/pii/S1877050914013751}, doi = {10.1016/j.procs.2014.10.015}, abstract = {Recommender Systems can greatly enhance the exploitation of large digital libraries; however, in order to achieve good accuracy with collaborative recommenders some domain assumptions must be met, such as having a large number of users sharing similar interests over time. Such assumptions may not hold in digital libraries, where users are structured in relatively small groups of experts whose interests may change in unpredictable ways: this is the case of scientific and technical documents archives. Moreover, when recommending documents, users often expect insights on the recommended content as well as a detailed explanation of why the system has selected it, which cannot be provided by collaborative techniques. In this paper we consider the domain of scientific publications repositories and propose a content-based recommender based upon a graph representation of concepts built up by linked keyphrases. This recommender is coupled with a keyphrase extraction system able to generate meaningful metadata for the documents, which are the basis for providing helpful and explainable recommendations.}, urldate = {2015-09-23}, journal = {Procedia Comput. Sci.}, author = {De Nart, D and Tasso, C}, year = {2014}, pages = {84--91}, }
@inproceedings{nguyen_improving_2014, address = {New York, NY, USA}, title = {Improving {Recommender} {Systems}: {User} {Roles} and {Lifecycles}}, url = {http://doi.acm.org/10.1145/2645710.2653363}, doi = {10.1145/2645710.2653363}, abstract = {In the era of big data, it is usually agreed that the more data we have, the better results we can get. However, for some domains that heavily depend on user inputs (such as recommender systems), the performance evaluation metrics are sensitive to the amount of noise introduced by users. Such noise can be from users who only wanted to explore the systems, and thus did not spend efforts to provide accurate inputs. Noise can also be introduced by the methods of collecting user ratings. In my dissertation, I study how user data can affect prediction accuracies and performances of recommendation algorithms. To that end, I investigate how the data collection methods and the life cycles of users affect the prediction accuracies and the performance of recommendation algorithms.}, urldate = {2015-09-23}, booktitle = {{RecSys} '14}, publisher = {ACM}, author = {Nguyen, Tien T}, year = {2014}, note = {Journal Abbreviation: RecSys '14}, pages = {417--420}, }
@inproceedings{zhao_privacy-aware_2014, address = {ICST, Brussels, Belgium, Belgium}, title = {Privacy-aware {Location} {Privacy} {Preference} {Recommendations}}, url = {http://dx.doi.org/10.4108/icst.mobiquitous.2014.258017}, doi = {10.4108/icst.mobiquitous.2014.258017}, abstract = {Location-Based Services have become increasingly popular due to the prevalence of smart devices and location-sharing applications such as Facebook and Foursquare. The protection of people's sensitive location data in such applications is an important requirement. Conventional location privacy protection methods, however, such as manually defining privacy rules or asking users to make decisions each time they enter a new location may be overly complex, intrusive or unwieldy. An alternative is to use machine learning to predict people's privacy preferences and automatically configure settings. Model-based machine learning classifiers may be too computationally complex to be used in real-world applications, or suffer from poor performance when training data are insufficient. In this paper we propose a location-privacy recommender that can provide people with recommendations of appropriate location privacy settings through user-user collaborative filtering. Using a real-world location-sharing dataset, we show that the prediction accuracy of our scheme (73.08\%) is similar to the best performance of model-based classifiers (75.30\%), and at the same time causes fewer privacy leaks (11.75\% vs 12.70\%). Our scheme further outperforms model-based classifiers when there are insufficient training data. Since privacy preferences are innately private, we make our recommender privacy-aware by obfuscating people's preferences. Our results show that obfuscation leads to a minimal loss of prediction accuracy (0.76\%).}, urldate = {2015-09-23}, booktitle = {{MOBIQUITOUS} '14}, publisher = {ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)}, author = {Zhao, Yuchen and Ye, Juan and Henderson, Tristan}, year = {2014}, note = {Journal Abbreviation: MOBIQUITOUS '14}, pages = {120--129}, }
@article{amolochitis_implementing_2014, title = {Implementing a {Commercial}-{Strength} {Parallel} {Hybrid} {Movie} {Recommendation} {Engine}}, volume = {29}, issn = {1541-1672}, url = {http://dx.doi.org/10.1109/MIS.2014.23}, doi = {10.1109/MIS.2014.23}, abstract = {AMORE is a hybrid recommendation system that provides movie recommendations for a major triple-play services provider in Greece. Combined with our own implementations of several user-, item-, and content-based recommendation algorithms, AMORE significantly outperforms other state-of-the-art implementations both in solution quality and response time. AMORE currently serves daily recommendation requests for all active subscribers of the provider's video-on-demand services and has contributed to an increase of rental profits and customer retention.}, number = {2}, journal = {IEEE Intell. Syst.}, author = {Amolochitis, E and Christou, I T and Tan, Zheng-Hua}, month = mar, year = {2014}, pages = {92--96}, }
@inproceedings{nguyen_rating_2013, address = {New York, NY, USA}, title = {Rating {Support} {Interfaces} to {Improve} {User} {Experience} and {Recommender} {Accuracy}}, url = {http://doi.acm.org/10.1145/2507157.2507188}, doi = {10.1145/2507157.2507188}, abstract = {One of the challenges for recommender systems is that users struggle to accurately map their internal preferences to external measures of quality such as ratings. We study two methods for supporting the mapping process: (i) reminding the user of characteristics of items by providing personalized tags and (ii) relating rating decisions to prior rating decisions using exemplars. In our study, we introduce interfaces that provide these methods of support. We also present a set of methodologies to evaluate the efficacy of the new interfaces via a user experiment. Our results suggest that presenting exemplars during the rating process helps users rate more consistently, and increases the quality of the data.}, urldate = {2014-04-28}, booktitle = {{RecSys} '13}, publisher = {ACM}, author = {Nguyen, Tien T and Kluver, Daniel and Wang, Ting-Yu and Hui, Pik-Mai and Ekstrand, Michael D and Willemsen, Martijn C and Riedl, John}, year = {2013}, note = {Journal Abbreviation: RecSys '13}, pages = {149--156}, }
@article{benjamin_heitmann_technical_2013, title = {Technical {Report} on evaluation of recommendations generated by spreading activation}, url = {http://www.researchgate.net/publication/237020679_Technical_Report_on_evaluation_of_recommendations_generated_by_spreading_activation}, author = {Heitmann, Benjamin and Hayes, Conor}, year = {2013}, }
@inproceedings{ekstrand_when_2012, address = {New York, NY, USA}, title = {When recommenders fail: predicting recommender failure for algorithm selection and combination}, url = {http://doi.acm.org/10.1145/2365952.2366002}, doi = {10.1145/2365952.2366002}, abstract = {Hybrid recommender systems --- systems using multiple algorithms together to improve recommendation quality --- have been well-known for many years and have shown good performance in recent demonstrations such as the NetFlix Prize. Modern hybridization techniques, such as feature-weighted linear stacking, take advantage of the hypothesis that the relative performance of recommenders varies by circumstance and attempt to optimize each item score to maximize the strengths of the component recommenders. Less attention, however, has been paid to understanding what these strengths and failure modes are. Understanding what causes particular recommenders to fail will facilitate better selection of the component recommenders for future hybrid systems and a better understanding of how individual recommender personalities can be harnessed to improve the recommender user experience. We present an analysis of the predictions made by several well-known recommender algorithms on the MovieLens 10M data set, showing that for many cases in which one algorithm fails, there is another that will correctly predict the rating.}, urldate = {2012-12-13}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Ekstrand, Michael D and Riedl, John T}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {233--236}, }
@inproceedings{kluver_how_2012, address = {New York, NY, USA}, title = {How many bits per rating?}, url = {http://doi.acm.org/10.1145/2365952.2365974}, doi = {10.1145/2365952.2365974}, abstract = {Most recommender systems assume user ratings accurately represent user preferences. However, prior research shows that user ratings are imperfect and noisy. Moreover, this noise limits the measurable predictive power of any recommender system. We propose an information theoretic framework for quantifying the preference information contained in ratings and predictions. We computationally explore the properties of our model and apply our framework to estimate the efficiency of different rating scales for real world datasets. We then estimate how the amount of information predictions give to users is related to the scale ratings are collected on. Our findings suggest a tradeoff in rating scale granularity: while previous research indicates that coarse scales (such as thumbs up / thumbs down) take less time, we find that ratings with these scales provide less predictive value to users. We introduce a new measure, preference bits per second, to quantitatively reconcile this tradeoff.}, urldate = {2013-09-12}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Kluver, Daniel and Nguyen, Tien T and Ekstrand, Michael and Sen, Shilad and Riedl, John}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {99--106}, }
@inproceedings{schelter_scalable_2012, address = {New York, NY, USA}, title = {Scalable {Similarity}-based {Neighborhood} {Methods} with {MapReduce}}, url = {http://doi.acm.org/10.1145/2365952.2365984}, doi = {10.1145/2365952.2365984}, abstract = {Similarity-based neighborhood methods, a simple and popular approach to collaborative filtering, infer their predictions by finding users with similar taste or items that have been similarly rated. If the number of users grows to millions, the standard approach of sequentially examining each item and looking at all interacting users does not scale. To solve this problem, we develop a MapReduce algorithm for the pairwise item comparison and top-N recommendation problem that scales linearly with respect to a growing number of users. This parallel algorithm is able to work on partitioned data and is general in that it supports a wide range of similarity measures. We evaluate our algorithm on a large dataset consisting of 700 million song ratings from Yahoo! Music.}, urldate = {2015-09-23}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Schelter, Sebastian and Boden, Christoph and Markl, Volker}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {163--170}, }
@article{guimera_predicting_2012, title = {Predicting {Human} {Preferences} {Using} the {Block} {Structure} of {Complex} {Social} {Networks}}, volume = {7}, url = {http://dx.doi.org/10.1371/journal.pone.0044620}, doi = {10.1371/journal.pone.0044620}, abstract = {With ever-increasing available data, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. Here, we propose a novel approach based on stochastic block models, which have been developed by sociologists as plausible models of complex networks of social interactions. Our model is in the spirit of predicting individuals' preferences based on the preferences of others but, rather than fitting a particular model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with major relative improvements between 38\% and 99\% over industry-level algorithms. Besides, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences, and enabling the analysis of the characteristics of those groups.}, number = {9}, urldate = {2014-10-04}, journal = {PLoS One}, author = {Guimerà, Roger and Llorente, Alejandro and Moro, Esteban and Sales-Pardo, Marta}, month = sep, year = {2012}, pages = {e44620}, }