Research with LensKit
LensKit is intended to be particularly useful in recommender systems research.
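For example, a complete offline experiment — splitting ratings into train and test sets, fitting an algorithm, and evaluating top-N recommendations — fits in a short script. The following is a minimal sketch assuming the classic LKPY module layout (lenskit.crossfold, lenskit.algorithms, lenskit.batch, and lenskit.topn); newer releases reorganize this into a pipeline-based API, and the data file path and parameter choices here are illustrative, so consult the documentation for your version.

import pandas as pd

from lenskit import batch, topn
from lenskit.algorithms import Recommender
from lenskit.algorithms.als import ImplicitMF
from lenskit.crossfold import sample_users, SampleN

# Load ratings as a frame with user, item, and rating columns
# (the MovieLens 100K path is illustrative -- any ratings frame works).
ratings = pd.read_csv('ml-100k/u.data', sep='\t',
                      names=['user', 'item', 'rating', 'timestamp'])

# Hold out 5 ratings from each of 200 sampled test users.
train, test = next(sample_users(ratings, 1, 200, SampleN(5)))

# Fit an implicit-feedback matrix factorization model, adapted so it
# can produce top-N recommendation lists.
algo = Recommender.adapt(ImplicitMF(50))
algo.fit(train)

# Recommend 10 items per test user and score the lists with nDCG.
recs = batch.recommend(algo, test['user'].unique(), 10)
rla = topn.RecListAnalysis()
rla.add_metric(topn.ndcg)
print(rla.compute(recs, test)['ndcg'].mean())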
If you use LensKit in published research, cite:
BibTeX
@inproceedings{LKPY,
  title = {LensKit for Python: Next-Generation Software for Recommender Systems Experiments},
  booktitle = {Proceedings of the 29th ACM International Conference on Information and Knowledge Management},
  DOI = {10.1145/3340531.3412778},
  author = {Ekstrand, Michael D.},
  year = {2020},
  month = {Oct},
  extra = {arXiv:1809.03125}
}
We would appreciate it if you sent a copy of your published paper to ekstrand@acm.org, so we can know where LensKit is being used and add it to this list. Following is a list of papers that have used the Python version of LensKit; we maintain a separate list of ones using the Java version.

Robert Dilworth. 2025. Privacy Preservation through Practical Machine Unlearning. arXiv:2502.10635. DOI: 10.48550/arXiv.2502.10635.
Manel Slokom, Savvina Daniil, and Laura Hollink. 2025. How to Diversify any Personalized Recommender? In Advances in Information Retrieval. Springer Nature Switzerland, 307–323. DOI: 10.1007/978-3-031-88717-8_23.
Lior Lansman. 2025. Using emotion diversification based on movie reviews to improve the user experience of movie recommender systems. Ph.D. dissertation. https://www.proquest.com/docview/3196617694
Alberto Carlo Maria Mancino, Salvatore Bufi, Angela Di Fazio, Antonio Ferrara, Daniele Malitesta, Claudio Pomo, and Tommaso Di Noia. 2025. DataRec: A Python Library for Standardized and Reproducible Data Management in Recommender Systems. arXiv:2410.22972. DOI: 10.48550/arXiv.2410.22972.
Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia Liem, Jacco van Ossenbruggen, and Laura Hollink. 2025. On the challenges of studying bias in Recommender Systems: The effect of data characteristics and algorithm configuration. Information Retrieval Research 1(1), 3–27. DOI: 10.54195/irrj.19607.
Ardalan Arabzadeh. 2025. Optimal Dataset Size for Recommender Systems: Evaluating Algorithms' Performance via Downsampling. Master's thesis, University of Siegen. arXiv:2502.08845.
Ayoub Akhadam, Oumayma Kbibchi, Loubna Mekouar, and Youssef Iraqi. 2025. A Comparative Evaluation of Recommender Systems Tools. IEEE Access 13, 29493–29522. DOI: 10.1109/ACCESS.2025.3541014.
Mark D. Smucker and Houmaan Chamani. 2025. Extending MovieLens-32M to Provide New Evaluation Objectives. arXiv:2504.01863. DOI: 10.48550/arXiv.2504.01863.
Fernando Diaz, Michael D. Ekstrand, and Bhaskar Mitra. 2025. Recall, Robustness, and Lexicographic Evaluation. ACM Transactions on Recommender Systems (just accepted). DOI: 10.1145/3728373.
Lorenzo Dalla Corte Danesi. 2024. Um estudo sobre bibliotecas para sistemas de recomendação em Python [A study of libraries for recommender systems in Python]. Thesis, Universidade Federal de Santa Maria. http://repositorio.ufsm.br/handle/1/33964
Ramon Lopes, Rodrigo Alves, Antoine Ledent, Rodrygo L. T. Santos, and Marius Kloft. 2024. Recommendations with minimum exposure guarantees: a post-processing framework. Expert Systems with Applications 236, 121164. DOI: 10.1016/j.eswa.2023.121164.
Houmaan Chamani. 2024. A Test Collection for Offline Evaluation of Recommender Systems. Thesis, University of Waterloo. https://hdl.handle.net/10012/21175
Moritz Baumgart, Lukas Wegmeth, Tobias Vente, and Joeran Beel. 2024. e-Fold Cross-Validation for Recommender-System Evaluation. In First International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood). https://isg.beel.org/pubs/2024-e-folds-recsys-baumgart.pdf
Royal Pathak. 2024. Advancing Misinformation Awareness in Recommender Systems for Social Media Information Integrity. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM '24), 5471–5474. DOI: 10.1145/3627673.3680259.
Ardalan Arabzadeh, Tobias Vente, and Joeran Beel. 2024. Green Recommender Systems: Optimizing Dataset Size for Energy-Efficient Algorithm Performance. Presented at the International Workshop on Recommender Systems for Sustainability and Social Good (RecSoGood). arXiv:2410.09359. DOI: 10.48550/arXiv.2410.09359.
San Cunha da Silva. 2024. Aprimorando a instalação e a configuração de experimentos do RecSysExp [Improving the installation and experiment configuration of RecSysExp]. Bachelor thesis, Universidade Federal de Ouro Preto. http://www.monografias.ufop.br/handle/35400000/6571
Elise Stijger. 2024. Active learning in recommender systems for predicting vulnerabilities in software. Master's thesis, Utrecht University. https://studenttheses.uu.nl/handle/20.500.12932/45783
Tobias Vente and Joeran Beel. 2024. The Potential of AutoML for Recommender Systems. arXiv:2402.04453. DOI: 10.48550/arXiv.2402.04453.
Savvina Daniil, Manel Slokom, Mirjam Cuper, Cynthia C. S. Liem, Jacco van Ossenbruggen, and Laura Hollink. 2024. On the challenges of studying bias in recommender systems: a UserKNN case study. Presented at FAccTRec 2024. arXiv:2409.08046. DOI: 10.48550/arXiv.2409.08046.
Lukas Wegmeth, Tobias Vente, and Joeran Beel. 2024. Recommender systems algorithm selection for ranking prediction on implicit feedback datasets. In RecSys '24 Late-Breaking Results. arXiv:2409.05461. DOI: 10.1145/3640457.3691718.
Lien Michiels. 2024. Methodologies to evaluate recommender systems. Ph.D. dissertation, University of Antwerp. DOI: 10.63028/10067/2080040151162165141.
Andres Ferraro, Michael D. Ekstrand, and Christine Bauer. 2024. It's not you, it's me: the impact of choice models and ranking strategies on gender imbalance in music recommendation. In Proceedings of the 18th ACM Conference on Recommender Systems. DOI: 10.1145/3640457.3688163.
Amifa Raj and Michael D. Ekstrand. 2024. Towards optimizing ranking in grid-layout for provider-side fairness. In Proceedings of the 46th European Conference on Information Retrieval (LNCS 14612). Springer, 90–105. DOI: 10.1007/978-3-031-56069-9_7.
Michael D. Ekstrand, Ben Carterette, and Fernando Diaz. 2024. Distributionally-informed recommender system evaluation. ACM Transactions on Recommender Systems 2(1), 6:1–27. DOI: 10.1145/3613455.
Tobias Vente, Lukas Wegmeth, Alan Said, and Joeran Beel. 2024. From Clicks to Carbon: The Environmental Toll of Recommender Systems. In Proceedings of the 18th ACM Conference on Recommender Systems. arXiv:2408.08203. DOI: 10.1145/3640457.3688074.
Jan Malte Lichtenberg, Alexander Buchholz, and Pola Schwöbel. 2024. Large Language Models as Recommender Systems: A Study of Popularity Bias. arXiv:2406.01285.
M. Slokom. 2024. Towards Purpose-aware Privacy-Preserving Techniques for Predictive Applications. Dissertation, TU Delft. DOI: 10.4233/uuid:4db4a67e-3e4f-4c94-b3e0-1eb8cd1765cb.
@inproceedings{dolog_impact_2024, address = {New York, NY, USA}, series = {{WWW} '24}, title = {The {Impact} of {Cluster} {Centroid} and {Text} {Review} {Embeddings} on {Recommendation} {Methods}}, isbn = {9798400701726}, url = {https://dl.acm.org/doi/10.1145/3589335.3651570}, doi = {10.1145/3589335.3651570}, abstract = {Recommendation systems often neglect global patterns that can be provided by clusters of similar items or even additional information such as text. Therefore, we study the impact of integrating clustering embeddings, review embeddings, and their combinations with embeddings obtained by a recommender system. Our work assesses the performance of this approach across various state-of-the-art recommender system algorithms. Our study highlights the improvement of recommendation performance through clustering, particularly evident when combined with review embeddings, and the enhanced performance of neural methods when incorporating review embeddings.}, urldate = {2024-08-15}, booktitle = {Companion {Proceedings} of the {ACM} on {Web} {Conference} 2024}, publisher = {Association for Computing Machinery}, author = {Dolog, Peter and Sadikaj, Ylli and Velaj, Yllka and Stephan, Andreas and Roth, Benjamin and Plant, Claudia}, month = may, year = {2024}, pages = {589--592}, }
@misc{lizenberger_rethinking_2024, title = {Rethinking {Recommender} {Systems}: {Cluster}-based {Algorithm} {Selection}}, shorttitle = {Rethinking {Recommender} {Systems}}, url = {http://arxiv.org/abs/2405.18011}, doi = {10.48550/arXiv.2405.18011}, abstract = {Cluster-based algorithm selection deals with selecting recommendation algorithms on clusters of users to obtain performance gains. No studies have been attempted for many combinations of clustering approaches and recommendation algorithms. We want to show that clustering users prior to algorithm selection increases the performance of recommendation algorithms. Our study covers eight datasets, four clustering approaches, and eight recommendation algorithms. We select the best performing recommendation algorithm for each cluster. Our work shows that cluster-based algorithm selection is an effective technique for optimizing recommendation algorithm performance. For five out of eight datasets, we report an increase in nDCG@10 between 19.28\% (0.032) and 360.38\% (0.191) compared to algorithm selection without prior clustering.}, urldate = {2024-08-15}, publisher = {arXiv}, author = {Lizenberger, Andreas and Pfeifer, Ferdinand and Polewka, Bastian}, month = may, year = {2024}, note = {arXiv:2405.18011 [cs]}, }
@article{honda_anonymity-aware_2024, title = {Anonymity-{Aware} {Framework} for {Designing} {Recommender} {Systems}}, volume = {19}, copyright = {© 2024 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.}, issn = {1931-4981}, url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/tee.24093}, doi = {10.1002/tee.24093}, abstract = {Due to increasing secondary use of data, recommender systems using anonymized data are in demand. However, implementing a recommender system requires complicated data processing and programming, and the relationship between anonymization level and recommendation quality has not been investigated. Therefore, this study proposes a framework that facilitates the development of recommender systems. Additionally, a method is proposed for quantitatively evaluating recommendation quality when the anonymization level is varied. The proposed method promotes data utilization by recommender systems and determination of compensation for providing data based on anonymization level. © 2024 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.}, language = {en}, number = {9}, urldate = {2024-08-15}, journal = {IEEJ Transactions on Electrical and Electronic Engineering}, author = {Honda, Moena and Nishi, Hiroaki}, year = {2024}, note = {\_eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/tee.24093}, pages = {1455--1464}, }
@inproceedings{pathak_analyzing_2024, address = {New York, NY, USA}, series = {{UMAP} {Adjunct} '24}, title = {Analyzing the {Interplay} between {Diversity} of {News} {Recommendations} and {Misinformation} {Spread} in {Social} {Media}}, isbn = {9798400704666}, url = {https://dl.acm.org/doi/10.1145/3631700.3664870}, doi = {10.1145/3631700.3664870}, abstract = {Recommender systems play a crucial role in social media platforms, especially in the context of news, by assisting users in discovering relevant news. However, these systems can inadvertently contribute to increased personalization, and the formation of filter bubbles and echo chambers, thereby aiding in the propagation of fake news or misinformation. This study specifically focuses on examining the tradeoffs between the diversity of news recommendations and the dissemination of misinformation on social media. We evaluated classical recommender algorithms on two Twitter (now X) datasets to assess the diversity of top-10 recommendation lists and simulated the propagation of recommended misinformation within the user network to analyze the impact of diversity on misinformation spread. The research findings indicate that an increase in news recommendation diversity indeed contributes to mitigating the propagation of misinformation. Additionally, collaborative and content-based recommender systems provide more diversity in comparison to popularity and network-based systems, resulting in less misinformation propagation. Our study underscores the crucial role of diversity recommendations in mitigating misinformation propagation, offering valuable insights for designing misinformation-aware recommender systems and diversity-based misinformation intervention.}, urldate = {2024-08-15}, booktitle = {Adjunct {Proceedings} of the 32nd {ACM} {Conference} on {User} {Modeling}, {Adaptation} and {Personalization}}, publisher = {Association for Computing Machinery}, author = {Pathak, Royal and Spezzano, Francesca}, month = jun, year = {2024}, pages = {80--85}, }
@article{koeser_missing_2024, title = {Missing {Data}, {Speculative} {Reading}}, volume = {9}, url = {https://culturalanalytics.org/article/116926-missing-data-speculative-reading}, doi = {10.22148/001c.116926}, abstract = {In this article we use an approach we term “speculative reading” to explore gaps in Sylvia Beach’s lending library records and the *Shakespeare and Company Project* datasets. We recast the problem of missing data as an opportunity and use a combination of time series forecasting, evolutionary models, and recommendation systems to estimate the extent of missing information and speculatively fill in some gaps. We conclude that the datasets include ninety-three percent of membership activity, ninety-six percent of members, and sixty-four percent to seventy-six percent of the books despite only including twenty-six percent of the borrowing activity. We then treat Ernest Hemingway as a test case for speculative reading: based on Hemingway’s known borrowing and all documented borrowing activity, we generate a list of books he might have borrowed during the years his borrowing is not documented; we then verify and interpret our list against the substantial scholarly record of the books he read and owned.}, language = {en}, number = {2}, urldate = {2024-08-15}, journal = {Journal of Cultural Analytics}, author = {Koeser, Rebecca Sutton and LeBlanc, Zoe}, month = may, year = {2024}, }
@misc{schmidt_evaluating_2024, title = {Evaluating the performance-deviation of {itemKNN} in {RecBole} and {LensKit}}, url = {http://arxiv.org/abs/2407.13531}, abstract = {This study evaluates the performance variations of item-based k-Nearest Neighbors (ItemKNN) algorithms implemented in the recommender system libraries, RecBole and LensKit. By using four datasets (Anime, Modcloth, ML-100K, and ML-1M), we explore the efficiency, accuracy, and scalability of each library’s implementation of ItemKNN. The study involves replicating and reproducing experiments to ensure the reliability of results. We are using key metrics such as normalized discounted cumulative gain (nDCG), precision, and recall to evaluate performance with our main focus on nDCG. Our initial findings indicate that RecBole is more performant than LensKit on two out of three metrics. It achieved a 18\% higher nDCG, a 14\% higher Precision and a 35\% lower Recall. To ensure a fair comparison, we adjusted LensKit’s nDCG calculation implementation to match RecBole’s approach. After aligning the nDCG calculations implementation, the performance of the two libraries became more comparable. Using implicit feedback, LensKit achieved an nDCG value of 0.2540, whereas RecBole attained a value of 0.2674. Further analysis revealed that the deviations were caused by differences in the implementation of the similarity matrix calculation. Our findings show that RecBole’s implementation outperforms the LensKit algorithm on three out of our four datasets. Following the implementation of a similarity matrix calculation, where only the top K similar items for each item are retained (a method already incorporated in RecBole’s ItemKNN), we observed nearly identical nDCG values across all four of our datasets. For example, Lenskit achieved an nDCG value of 0.2586 for the ML-1M dataset with a random seed set to 42. Similarly, RecBole attained the same nDCG value of 0.2586 under identical conditions. Using the original implementation of LensKit’s ItemKNN, a higher nDCG value was obtained only on the ModCloth data set.}, language = {en}, urldate = {2024-08-15}, publisher = {arXiv}, author = {Schmidt, Michael and Nitschke, Jannik and Prinz, Tim}, month = jul, year = {2024}, note = {arXiv:2407.13531 [cs]}, }
@inproceedings{ihemelandu_multiple_2024, series = {{LNCS}}, title = {Multiple testing for {IR} and recommendation system experiments}, volume = {14610}, url = {https://md.ekstrandom.net/pubs/ecir-mcp}, doi = {10.1007/978-3-031-56063-7_37}, abstract = {While there has been significant research on statistical techniques for comparing two information retrieval (IR) systems, many IR experiments test more than two systems. This can lead to inflated false discoveries due to the multiple-comparison problem (MCP). A few IR studies have investigated multiple comparison procedures; these studies mostly use TREC data and control the familywise error rate. In this study, we extend their investigation to include recommendation system evaluation data as well as multiple comparison procedures that control for False Discovery Rate (FDR).}, language = {en}, urldate = {2024-01-04}, booktitle = {Proceedings of the 46th {European} {Conference} on {Information} {Retrieval}}, publisher = {Springer}, author = {Ihemelandu, Ngozi and Ekstrand, Michael D.}, month = mar, year = {2024}, pages = {449--457}, }
@inproceedings{wegmeth_revealing_2024, title = {Revealing the {Hidden} {Impact} of {Top}-{N} {Metrics} on {Optimization} in {Recommender} {Systems}}, isbn = {978-3-031-56027-9}, doi = {10.1007/978-3-031-56027-9_9}, abstract = {The hyperparameters of recommender systems for top-n predictions are typically optimized to enhance the predictive performance of algorithms. Thereby, the optimization algorithm, e.g., grid search or random search, searches for the best hyperparameter configuration according to an optimization-target metric, like nDCG or Precision. In contrast, the optimized algorithm, e.g., Alternating Least Squares Matrix Factorization or Bayesian Personalized Ranking, internally optimizes a different loss function during training, like squared error or cross-entropy. To tackle this discrepancy, recent work focused on generating loss functions better suited for recommender systems. Yet, when evaluating an algorithm using a top-n metric during optimization, another discrepancy between the optimization-target metric and the training loss has so far been ignored. During optimization, the top-n items are selected for computing a top-n metric; ignoring that the top-n items are selected from the recommendations of a model trained with an entirely different loss function. Item recommendations suitable for optimization-target metrics could be outside the top-n recommended items; hiddenly impacting the optimization performance. Therefore, we were motivated to analyze whether the top-n items are optimal for optimization-target top-n metrics. In pursuit of an answer, we exhaustively evaluate the predictive performance of 250 selection strategies besides selecting the top-n. We extensively evaluate each selection strategy over twelve implicit feedback and eight explicit feedback data sets with eleven recommender systems algorithms. Our results show that there exist selection strategies other than top-n that increase predictive performance for various algorithms and recommendation domains. However, the performance of the top ∼43\% of selection strategies is not significantly different. We discuss the impact of our findings on optimization and re-ranking in recommender systems and feasible solutions. The implementation of our study is publicly available.}, language = {en}, booktitle = {Advances in {Information} {Retrieval}}, publisher = {Springer Nature Switzerland}, author = {Wegmeth, Lukas and Vente, Tobias and Purucker, Lennart}, editor = {Goharian, Nazli and Tonellotto, Nicola and He, Yulan and Lipani, Aldo and McDonald, Graham and Macdonald, Craig and Ounis, Iadh}, year = {2024}, pages = {140--156}, }
@inproceedings{pathak_empirical_2024, series = {{LNCS}}, title = {An {Empirical} {Analysis} of {Intervention} {Strategies}’ {Effectiveness} for {Countering} {Misinformation} {Amplification} by {Recommendation} {Algorithms}}, volume = {14611}, isbn = {978-3-031-56066-8}, doi = {10.1007/978-3-031-56066-8_23}, abstract = {Social network platforms connect people worldwide, facilitating communication, information sharing, and personal/professional networking. They use recommendation algorithms to personalize content and enhance user experiences. However, these algorithms can unintentionally amplify misinformation by prioritizing engagement over accuracy. For instance, recent works suggest that popularity-based and network-based recommendation algorithms contribute the most to misinformation diffusion. In our study, we present an exploration on two Twitter datasets to understand the impact of intervention techniques on combating misinformation amplification initiated by recommendation algorithms. We simulate various scenarios and evaluate the effectiveness of intervention strategies in social sciences such as Virality Circuit Breakers and accuracy nudges. Our findings highlight that these intervention strategies are generally successful when applied on top of collaborative filtering and content-based recommendation algorithms, while having different levels of effectiveness depending on the number of users keen to spread fake news present in the dataset.}, language = {en}, booktitle = {Advances in {Information} {Retrieval}}, publisher = {Springer Nature Switzerland}, author = {Pathak, Royal and Spezzano, Francesca}, editor = {Goharian, Nazli and Tonellotto, Nicola and He, Yulan and Lipani, Aldo and McDonald, Graham and Macdonald, Craig and Ounis, Iadh}, year = {2024}, pages = {285--301}, }
@inproceedings{ihemelandu_candidate_2023, title = {Candidate set sampling for evaluating top-{N} recommendation}, url = {https://doi.org/10.1109/WI-IAT59888.2023.00018}, doi = {10.1109/WI-IAT59888.2023.00018}, abstract = {The strategy for selecting candidate sets -- the set of items that the recommendation system is expected to rank for each user -- is an important decision in carrying out an offline top-\$N\$ recommender system evaluation. The set of candidates is composed of the union of the user's test items and an arbitrary number of non-relevant items that we refer to as decoys. Previous studies have aimed to understand the effect of different candidate set sizes and selection strategies on evaluation. In this paper, we extend this knowledge by studying the specific interaction of candidate set selection strategies with popularity bias, and use simulation to assess whether sampled candidate sets result in metric estimates that are less biased with respect to the true metric values under complete data that is typically unavailable in ordinary experiments.}, urldate = {2023-11-08}, booktitle = {Proceedings of the 22nd {IEEE}/{WIC} international conference on web intelligence and intelligent agent technology}, author = {Ihemelandu, Ngozi and Ekstrand, Michael D.}, month = oct, year = {2023}, note = {arXiv:2309.11723 [cs]}, keywords = {Computer Science - Information Retrieval}, pages = {88--94}, }
@article{wang_modeling_2023, title = {Modeling uncertainty to improve personalized recommendations via {Bayesian} deep learning}, volume = {16}, issn = {2364-4168}, url = {https://doi.org/10.1007/s41060-020-00241-1}, doi = {10.1007/s41060-020-00241-1}, abstract = {Modeling uncertainty has been a major challenge in developing Machine Learning solutions to solve real world problems in various domains. In Recommender Systems, a typical usage of uncertainty is to balance exploration and exploitation, where the uncertainty helps to guide the selection of new options in exploration. Recent advances in combining Bayesian methods with deep learning enable us to express uncertain status in deep learning models. In this paper, we investigate an approach based on Bayesian deep learning to improve personalized recommendations. We first build deep learning architectures to learn useful representation of user and item inputs for predicting their interactions. We then explore multiple embedding components to accommodate different types of user and item inputs. Based on Bayesian deep learning techniques, a key novelty of our approach is to capture the uncertainty associated with the model output and further utilize it to boost exploration in the context of Recommender Systems. We test the proposed approach in both a Collaborative Filtering and a simulated online recommendation setting. Experimental results on publicly available benchmarks demonstrate the benefits of our approach in improving the recommendation performance.}, language = {en}, number = {2}, urldate = {2024-03-17}, journal = {International Journal of Data Science and Analytics}, author = {Wang, Xin and Kadıoğlu, Serdar}, month = aug, year = {2023}, pages = {191--201}, }
@inproceedings{wegmeth_effect_2023, title = {The effect of random seeds for data splitting on recommendation accuracy}, abstract = {The evaluation of recommender system algorithms depends on randomness, e.g., during randomly splitting data into training and testing data. We suspect that failing to account for randomness in this scenario may lead to misrepresenting the predictive accuracy of recommendation algorithms. To understand the community’s view of the importance of randomness, we conducted a paper study on 39 full papers published at the ACM RecSys 2022 conference. We found that the authors of 26 papers used some variation of a holdout split that requires a random seed. However, only five papers explicitly repeated experiments and averaged their results over different random seeds. This potentially problematic research practice motivated us to analyze the effect of data split random seeds on recommendation accuracy. Therefore, we train three common algorithms on nine public data sets with 20 data split random seeds, evaluate them on two ranking metrics with three different ranking cutoff values 𝑘, and compare the results. In the extreme case with 𝑘 = 1, we show that depending on the data split random seed, the accuracy with traditional recommendation algorithms deviates by up to ∼6.3\% from the mean accuracy achieved on the data set. Hence, we show that an algorithm may significantly over- or under-perform when maliciously or negligently selecting a random seed for splitting the data. To showcase a mitigation strategy and better research practice, we compare holdout to cross-validation and show that, again, for 𝑘 = 1, the accuracy of algorithms evaluated with cross-validation deviates only up to ∼2.3\% from the mean accuracy achieved on the data set. Furthermore, we found that the deviation becomes smaller the higher the value of 𝑘 for both holdout and cross-validation.}, language = {en}, booktitle = {Perspectives on the {Evaluation} of {Recommender} {Systems} {Workshop} ({PERSPECTIVES} 2023)}, author = {Wegmeth, Lukas and Vente, Tobias and Purucker, Lennart and Beel, Joeran}, month = sep, year = {2023}, keywords = {to-read}, }
@inproceedings{vente_introducing_2023, address = {New York, NY, USA}, series = {{RecSys} '23}, title = {Introducing {LensKit}-{Auto}, an experimental automated recommender system ({AutoRecSys}) toolkit}, isbn = {9798400702419}, url = {https://dl.acm.org/doi/10.1145/3604915.3610656}, doi = {10.1145/3604915.3610656}, abstract = {LensKit is one of the first and most popular Recommender System libraries. While LensKit offers a wide variety of features, it does not include any optimization strategies or guidelines on how to select and tune LensKit algorithms. LensKit developers have to manually include third-party libraries into their experimental setup or implement optimization strategies by hand to optimize hyperparameters. We found that 63.6\% (21 out of 33) of papers using LensKit algorithms for their experiments did not select algorithms or tune hyperparameters. Non-optimized models represent poor baselines and produce less meaningful research results. This demo introduces LensKit-Auto. LensKit-Auto automates the entire Recommender System pipeline and enables LensKit developers to automatically select, optimize, and ensemble LensKit algorithms.}, urldate = {2023-09-18}, booktitle = {Proceedings of the 17th {ACM} {Conference} on {Recommender} {Systems}}, publisher = {Association for Computing Machinery}, author = {Vente, Tobias and Ekstrand, Michael and Beel, Joeran}, month = sep, year = {2023}, keywords = {Algorithm Selection, AutoRecSys, Automated Recommender Systems, CASH, Hyperparameter Optimization, Recommender Systems}, pages = {1212--1216}, }
@misc{li_mitigating_2023, title = {Mitigating mainstream bias in recommendation via cost-sensitive learning}, url = {http://arxiv.org/abs/2307.13632}, doi = {10.1145/3578337.3605134}, abstract = {Mainstream bias, where some users receive poor recommendations because their preferences are uncommon or simply because they are less active, is an important aspect to consider regarding fairness in recommender systems. Existing methods to mitigate mainstream bias do not explicitly model the importance of these non-mainstream users or, when they do, it is in a way that is not necessarily compatible with the data and recommendation model at hand. In contrast, we use the recommendation utility as a more generic and implicit proxy to quantify mainstreamness, and propose a simple user-weighting approach to incorporate it into the training process while taking the cost of potential recommendation errors into account. We provide extensive experimental results showing that quantifying mainstreamness via utility is better able at identifying non-mainstream users, and that they are indeed better served when training the model in a cost-sensitive way. This is achieved with negligible or no loss in overall recommendation accuracy, meaning that the models learn a better balance across users. In addition, we show that research of this kind, which evaluates recommendation quality at the individual user level, may not be reliable if not using enough interactions when assessing model performance.}, urldate = {2023-07-29}, author = {Li, Roger Zhe and Urbano, Julián and Hanjalic, Alan}, month = jul, year = {2023}, note = {arXiv:2307.13632 [cs]}, keywords = {Computer Science - Information Retrieval}, }
@inproceedings{ihemelandu_inference_2023, address = {New York, NY, USA}, series = {{SIGIR} '23}, title = {Inference at scale: significance testing for large search and recommendation experiments}, copyright = {All rights reserved}, isbn = {978-1-4503-9408-6}, shorttitle = {Inference at scale}, url = {https://dl.acm.org/doi/10.1145/3539618.3592004}, doi = {10.1145/3539618.3592004}, abstract = {A number of information retrieval studies have been done to assess which statistical techniques are appropriate for comparing systems. However, these studies are focused on TREC-style experiments, which typically have fewer than 100 topics. There is no similar line of work for large search and recommendation experiments; such studies typically have thousands of topics or users and much sparser relevance judgements, so it is not clear if recommendations for analyzing traditional TREC experiments apply to these settings. In this paper, we empirically study the behavior of significance tests with large search and recommendation evaluation data. Our results show that the Wilcoxon and Sign tests show significantly higher Type-1 error rates for large sample sizes than the bootstrap, randomization and t-tests, which were more consistent with the expected error rate. While the statistical tests displayed differences in their power for smaller sample sizes, they showed no difference in their power for large sample sizes. We recommend the sign and Wilcoxon tests should not be used to analyze large scale evaluation results. Our result demonstrate that with Top-N recommendation and large search evaluation data, most tests would have a 100\% chance of finding statistically significant results. Therefore, the effect size should be used to determine practical or scientific significance.}, urldate = {2023-07-23}, booktitle = {Proceedings of the 46th {International} {ACM} {SIGIR} {Conference} on {Research} and {Development} in {Information} {Retrieval}}, publisher = {Association for Computing Machinery}, author = {Ihemelandu, Ngozi and Ekstrand, Michael D.}, month = jul, year = {2023}, keywords = {evaluation, statistical inference}, pages = {2087--2091}, }
@inproceedings{raj_measuring_2022, title = {Measuring fairness in ranked results: an analytical and empirical comparison}, url = {https://md.ekstrandom.net/pubs/fair-ranking}, doi = {10.1145/3477495.3532018}, abstract = {Information access systems, such as search and recommender systems, often use ranked lists to present results believed to be relevant to the user's information need. Evaluating these lists for their fairness along with other traditional metrics provides a more complete understanding of an information access system's behavior beyond accuracy or utility constructs. To measure the (un)fairness of rankings, particularly with respect to the protected group(s) of producers or providers, several metrics have been proposed in the last several years. However, an empirical and comparative analyses of these metrics showing the applicability to specific scenario or real data, conceptual similarities, and differences is still lacking. We aim to bridge the gap between theoretical and practical application of these metrics. In this paper we describe several fair ranking metrics from the existing literature in a common notation, enabling direct comparison of their approaches and assumptions, and empirically compare them on the same experimental setup and data sets in the context of three information access tasks. We also provide a sensitivity analysis to assess the impact of the design choices and parameter settings that go in to these metrics and point to additional work needed to improve fairness measurement.}, booktitle = {Proceedings of the 45th {International} {ACM} {SIGIR} {Conference} on {Research} and {Development} in {Information} {Retrieval}}, publisher = {ACM}, author = {Raj, Amifa and Ekstrand, Michael D}, month = jul, year = {2022}, pages = {726--736}, }
@article{ekstrand_exploring_2021, title = {Exploring author gender in book rating and recommendation}, volume = {31}, issn = {0924-1868}, url = {https://md.ekstrandom.net/pubs/bag-extended}, doi = {10.1007/s11257-020-09284-2}, abstract = {Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of the patterns in rating datasets reflect important real-world differences between the various users and items in the data; other patterns may be irrelevant or possibly undesirable for social or ethical reasons, particularly if they reflect undesired discrimination, such as discrimination in publishing or purchasing against authors who are women or ethnic minorities. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to a dimension of social concern, namely content creator gender. Using publicly-available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution.}, number = {3}, urldate = {2020-06-05}, journal = {User Modeling and User-Adapted Interaction}, author = {Ekstrand, Michael D and Kluver, Daniel}, month = jul, year = {2021}, pages = {377--420}, }
@mastersthesis{vanhaesebroeck_music_2020, address = {Belgium}, title = {Music recommendation using genetic programming}, url = {https://libstore.ugent.be/fulltxt/RUG01/002/945/760/RUG01-002945760_2021_0001_AC.pdf}, urldate = {2025-05-30}, school = {Ghent University}, author = {Vanhaesebroeck, Robbe}, year = {2020}, }
@mastersthesis{da_silva_user-specific_2020, address = {Portugal}, title = {User-{Specific} {Bicluster}-{Based} {Collaborative} {Filtering}}, copyright = {Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.}, url = {https://www.proquest.com/docview/2652593247/abstract/29EEDE32E67A4219PQ/1}, abstract = {Recommender systems are a set of techniques and software whose goal is to suggest items to a given user, suggestions that are intended to help users during decision-making. Making a decision can be difficult, especially when there is an enormous number of options to choose from. Large companies take advantage of recommender systems to improve their service and increase their revenue. One example is the streaming platform Netflix, which uses a recommender system to personalize the films or series highlighted for each customer. Personalized recommendations are usually based on the data that companies collect about users, ranging from explicit reactions, for example through user ratings of products, to implicit reactions, gathered by examining how the user interacts with the system. One of the most popular approaches to recommender systems is collaborative filtering. Collaborative filtering methods produce personalized item recommendations based on patterns found in usage data or past ratings. Collaborative filtering models typically use a simple data matrix, known as the U-I interaction matrix, which contains the ratings that users have given to the items in the system. Exploiting the data in the U-I matrix, collaborative filtering assumes that if a given user had the same preferences as another user in the past, they are likely to share them in the future as well. Collaborative filtering models thus aim to recommend a list of N items to a user (called the active user), or to predict the rating that user would give to an item they have not yet rated. In the literature, collaborative filtering methods are divided into two classes: memory-based and model-based. Memory-based algorithms, also known as neighborhood algorithms, use the entire U-I matrix to perform recommendation tasks. The two main methods are known as "user-based" and "item-based". User-based methods try to find users with preferences similar to those of the user receiving recommendations, and use the data from that neighborhood of similar users to make predictions or recommendations. Item-based algorithms, on the other hand, take the items already rated by the active user, compute the similarity between those items and the item to be evaluated, and thus build a neighborhood of items; from that neighborhood, the user's future rating of the item is predicted. Although neighborhood algorithms achieve good prediction and recommendation results, they have two major weaknesses that limit their use in real-world recommendation settings. Recommendation data are usually high-dimensional and sparse, that is, with many missing values. Given the complexity that results from having to compare all users or items against each other, which translates into n² comparisons, algorithms of this kind become impractical in systems with large numbers of users and items. Moreover, because many values are missing, some users/items frequently end up with small neighborhoods. Model-based algorithms emerged as an attempt to deal with the weaknesses of memory-based algorithms. These approaches use models that learn from the data and recognize patterns in order to perform collaborative filtering tasks. Dimensionality-reduction techniques such as Singular Value Decomposition and Latent Semantic Analysis are now the standard approaches for reducing the sparse nature of the interaction matrix. There are also approaches based on machine learning, such as Bayesian networks, clustering, and others. Although these dimensionality-reduction models lose information, which generally leads to worse prediction/recommendation results, they can handle the scalability problem exhibited by memory-based models. Alternate abstract: Collaborative Filtering is one of the most popular and successful approaches for Recommender Systems. However, some challenges limit the effectiveness of Collaborative Filtering approaches when dealing with recommendation data, mainly due to the vast amounts of data and their sparse nature. In order to improve the scalability and performance of Collaborative Filtering approaches, several authors proposed successful approaches combining Collaborative Filtering with clustering techniques. In this work, we study the effectiveness of biclustering, an advanced clustering technique that groups rows and columns simultaneously, in Collaborative Filtering. When applied to the classic U-I interaction matrices, biclustering considers the duality relations between users and items, creating clusters of users who are similar under a particular group of items. We propose USBCF, a novel biclustering-based Collaborative Filtering approach that creates user specific models to improve the scalability of traditional CF approaches. Using a real-world dataset, we conduct a set of experiments to objectively evaluate the performance of the proposed approach, comparing it against baseline and state-of-the-art Collaborative Filtering methods. Our results show that the proposed approach can successfully suppress the main limitation of the previously proposed state-of-the-art biclustering-based Collaborative Filtering (BBCF) since BBCF can only output predictions for a small subset of the system users and items (lack of coverage). Moreover, USBCF produces rating predictions with quality comparable to the state-of-the-art approaches.}, language = {English}, urldate = {2025-05-30}, school = {Universidade de Lisboa (Portugal)}, author = {da Silva, Miguel Miranda Garção}, year = {2020}, note = {ISBN: 9798209925156}, }
@inproceedings{diaz_evaluating_2020, series = {{CIKM} '20}, title = {Evaluating stochastic rankings with expected exposure}, url = {http://arxiv.org/abs/2004.13157}, doi = {10.1145/3340531.3411962}, abstract = {We introduce the concept of expected exposure as the average attention ranked items receive from users over repeated samples of the same query. Furthermore, we advocate for the adoption of the principle of equal expected exposure: given a fixed information need, no item receive more or less expected exposure compared to any other item of the same relevance grade. We argue that this principle is desirable for many retrieval objectives and scenarios, including topical diversity and fair ranking. Leveraging user models from existing retrieval metrics, we propose a general evaluation methodology based on expected exposure and draw connections to related metrics in information retrieval evaluation. Importantly, this methodology relaxes classic information retrieval assumptions, allowing a system, in response to a query, to produce a distribution over rankings instead of a single fixed ranking. We study the behavior of the expected exposure metric and stochastic rankers across a variety of information access conditions, including ad hoc retrieval and recommendation. We believe that measuring and optimizing expected exposure metrics using randomization opens a new area for retrieval algorithm development and progress.}, booktitle = {Proceedings of the 29th {ACM} {International} {Conference} on {Information} and {Knowledge} {Management}}, publisher = {ACM}, author = {Diaz, Fernando and Mitra, Bhaskar and Ekstrand, Michael D and Biega, Asia J and Carterette, Ben}, month = oct, year = {2020}, }
@inproceedings{raj_comparing_2020, title = {Comparing fair ranking metrics}, url = {http://arxiv.org/abs/2009.01311}, abstract = {Ranking is a fundamental aspect of recommender systems. However, ranked outputs can be susceptible to various biases; some of these may cause disadvantages to members of protected groups. Several metrics have been proposed to quantify the (un)fairness of rankings, but there has not been to date any direct comparison of these metrics. This complicates deciding what fairness metrics are applicable for specific scenarios, and assessing the extent to which metrics agree or disagree. In this paper, we describe several fair ranking metrics in a common notation, enabling direct comparison of their approaches and assumptions, and empirically compare them on the same experimental setup and data set. Our work provides a direct comparative analysis identifying similarities and differences of fair ranking metrics selected for our work.}, author = {Raj, Amifa and Wood, Connor and Montoly, Ananda and Ekstrand, Michael D}, month = sep, year = {2020}, }
Original LensKit (Java)
If you publish research that uses the old Java version of LensKit, cite:
BibTeX
@INPROCEEDINGS{LensKit,
title = "Rethinking the Recommender Research Ecosystem: Reproducibility, Openness, and {LensKit}",
booktitle = "Proceedings of the Fifth {ACM} Conference on Recommender Systems",
author = "Ekstrand, Michael D and Ludwig, Michael and Konstan, Joseph A and Riedl, John T",
publisher = "ACM",
pages = "133--140",
series = "RecSys '11",
year = 2011,
url = "http://doi.acm.org/10.1145/2043932.2043958",
conference = "RecSys '11",
doi = "10.1145/2043932.2043958"
}

@unpublished{aridor_economics_2022, title = {The {Economics} of {Recommender} {Systems}: {Evidence} from a {Field} {Experiment} on {MovieLens}}, url = {http://arxiv.org/abs/2211.14219}, abstract = {We conduct a field experiment on a movie-recommendation platform to identify if and how recommendations affect consumption. We use within-consumer randomization at the good level and elicit beliefs about unconsumed goods to disentangle exposure from informational effects. We find recommendations increase consumption beyond its role in exposing goods to consumers. We provide support for an informational mechanism: recommendations affect consumers' beliefs, which in turn explain consumption. Recommendations reduce uncertainty about goods consumers are most uncertain about and induce information acquisition. Our results highlight the importance of recommender systems' informational role when considering policies targeting these systems in online marketplaces.}, author = {Aridor, Guy and Goncalves, Duarte and Kluver, Daniel and Kong, Ruoyan and Konstan, Joseph}, month = nov, year = {2022}, note = {arXiv:2211.14219 [econ.GN]}, }
@article{ibrahim_hybrid_2021, title = {Hybrid {Recommender} for {Research} {Papers} and {Articles}}, volume = {10}, url = {http://article.ijoiis.com/pdf/10.11648.j.ijiis.20211002.11.pdf}, abstract = {… GroupLens called LensKit, along with set of tools for such system was used to implement Collaborative filtering algorithm. This research uses only the LensKit-core and LensKit-data-structures modules to implement this section of the algorithm …}, number = {2}, journal = {Int. J. Intell. Inf. Database Syst.}, author = {Ibrahim, Alhassan Jamilu and Zira, Peter and Abdulganiyyi, Nuraini}, year = {2021}, note = {Publisher: Science Publishing Group}, pages = {9}, }
@inproceedings{wei_recommender_2021, address = {New York, NY, USA}, title = {Recommender {Systems} for {Software} {Project} {Managers}}, url = {https://doi.org/10.1145/3463274.3463951}, doi = {10.1145/3463274.3463951}, abstract = {The design of recommendation systems is based on complex information processing and big data interaction. This personalized view has evolved into a hot area in the past decade, where applications might have been proved to help for solving problem in the software development field. Therefore, with the evolvement of Recommendation System in Software Engineering (RSSE), the coordination of software projects with their stakeholders is improving. This experiment examines four open source recommender systems and implemented a customized recommender engine with two industrial-oriented packages: Lenskit and Mahout. Each of the main functions was examined and issues were identified during the experiment.}, urldate = {2021-09-14}, booktitle = {{EASE} 2021}, publisher = {Association for Computing Machinery}, author = {Wei, Liang and Capretz, Luiz Fernando}, month = jun, year = {2021}, note = {Journal Abbreviation: EASE 2021}, keywords = {RSSE, Recommender Engine, Project Management, Recommendation System, Recommendation System in Software Engineering}, pages = {412--417}, }
@inproceedings{zhou_privacy_2021, title = {Privacy and performance in recommender systems: {Exploration} of potential influence of {CCPA}}, url = {http://2021.cswimworkshop.org/wp-content/uploads/2021/06/cswim2021_paper_80.pdf}, urldate = {2021-07-12}, author = {Zhou, Meizi and Song, Yicheng and Adomavicius, Gediminas}, year = {2021}, }
@article{wischenbart_engaging_2021, title = {Engaging end-user driven recommender systems: personalization through web augmentation}, volume = {80}, issn = {1380-7501}, url = {https://doi.org/10.1007/s11042-020-09803-8}, doi = {10.1007/s11042-020-09803-8}, abstract = {In the past decades recommender systems have become a powerful tool to improve personalization on the Web. Yet, many popular websites lack such functionality, its implementation usually requires certain technical skills, and, above all, its introduction is beyond the scope and control of end-users. To alleviate these problems, this paper presents a novel tool to empower end-users without programming skills, without any involvement of website providers, to embed personalized recommendations of items into arbitrary websites on client-side. For this we have developed a generic meta-model to capture recommender system configuration parameters in general as well as in a web augmentation context. Thereupon, we have implemented a wizard in the form of an easy-to-use browser plug-in, allowing the generation of so-called user scripts, which are executed in the browser to engage collaborative filtering functionality from a provided external rest service. We discuss functionality and limitations of the approach, and in a study with end-users we assess the usability and show its suitability for combining recommender systems with web augmentation techniques, aiming to empower end-users to implement controllable recommender applications for a more personalized browsing experience.}, number = {5}, journal = {Multimed. Tools Appl.}, author = {Wischenbart, Martin and Firmenich, Sergio and Rossi, Gustavo and Bosetti, Gabriela and Kapsammer, Elisabeth}, month = feb, year = {2021}, pages = {6785--6809}, }
@unpublished{bellogin_improving_2021, title = {Improving {Accountability} in {Recommender} {Systems} {Research} {Through} {Reproducibility}}, url = {http://arxiv.org/abs/2102.00482}, abstract = {Reproducibility is a key requirement for scientific progress. It allows the reproduction of the works of others, and, as a consequence, to fully trust the reported claims and results. In this work, we argue that, by facilitating reproducibility of recommender systems experimentation, we indirectly address the issues of accountability and transparency in recommender systems research from the perspectives of practitioners, designers, and engineers aiming to assess the capabilities of published research works. These issues have become increasingly prevalent in recent literature. Reasons for this include societal movements around intelligent systems and artificial intelligence striving towards fair and objective use of human behavioral data (as in Machine Learning, Information Retrieval, or Human-Computer Interaction). Society has grown to expect explanations and transparency standards regarding the underlying algorithms making automated decisions for and around us. This work surveys existing definitions of these concepts, and proposes a coherent terminology for recommender systems research, with the goal to connect reproducibility to accountability. We achieve this by introducing several guidelines and steps that lead to reproducible and, hence, accountable experimental workflows and research. We additionally analyze several instantiations of recommender system implementations available in the literature, and discuss the extent to which they fit in the introduced framework. With this work, we aim to shed light on this important problem, and facilitate progress in the field by increasing the accountability of research.}, author = {Bellogín, Alejandro and Said, Alan}, month = jan, year = {2021}, note = {arXiv:2102.00482 [cs.IR]}, }
@article{cheng_understanding_2020, title = {Understanding the {Impact} of {Individual} {Users}’ {Rating} {Characteristics} on the {Predictive} {Accuracy} of {Recommender} {Systems}}, volume = {32}, issn = {1091-9856}, url = {https://doi.org/10.1287/ijoc.2018.0882}, doi = {10.1287/ijoc.2018.0882}, abstract = {In this study, we investigate how individual users’ rating characteristics affect the user-level performance of recommendation algorithms. We measure users’ rating characteristics from three perspectives: rating value, rating structure, and neighborhood network embeddedness. We study how these three categories of measures influence the predictive accuracy of popular recommendation algorithms for each user. Our experiments use five real-world data sets with varying characteristics. For each individual user, we estimate the predictive accuracy of three recommendation algorithms. We then apply regression-based models to uncover the relationships between rating characteristics and recommendation performance at the individual user level. Our experimental results show consistent and significant effects of several rating measures on recommendation accuracy. Understanding how rating characteristics affect the recommendation performance at the individual user level has practical implications for the design of recommender systems.}, number = {2}, journal = {INFORMS J. Comput.}, author = {Cheng, Xiaoye and Zhang, Jingjing and Yan, Lu (lucy)}, month = apr, year = {2020}, note = {Publisher: INFORMS}, pages = {303--320}, }
@article{kotkov_how_2020, title = {How does serendipity affect diversity in recommender systems? {A} serendipity-oriented greedy algorithm}, volume = {102}, issn = {0144-3097}, url = {http://link.springer.com/10.1007/s00607-018-0687-5}, doi = {10.1007/s00607-018-0687-5}, abstract = {Most recommender systems suggest items that are popular among all users and similar to items a user usually consumes. As a result, the user receives recommendations that she/he is already familiar with or would find anyway, leading to low satisfaction. To overcome this problem, a recommender system should suggest novel, relevant and unexpected i.e., serendipitous items. In this paper, we propose a serendipity-oriented, reranking algorithm called a serendipity-oriented greedy (SOG) algorithm, which improves serendipity of recommendations through feature diversification and helps overcome the overspecialization problem. To evaluate our algorithm, we employed the only publicly available dataset containing user feedback regarding serendipity. We compared our SOG algorithm with topic diversification, popularity baseline, singular value decomposition, serendipitous personalized ranking and Zheng’s algorithms relying on the above dataset. SOG outperforms other algorithms in terms of serendipity and diversity. It also outperforms serendipity-oriented algorithms in terms of accuracy, but underperforms accuracy-oriented algorithms in terms of accuracy. We found that the increase of diversity can hurt accuracy and harm or improve serendipity depending on the size of diversity increase.}, number = {2}, journal = {Computing}, author = {Kotkov, Denis and Veijalainen, Jari and Wang, Shuaiqiang}, month = feb, year = {2020}, pages = {393--411}, }
@phdthesis{noffsinger_predictive_2020, address = {Ann Arbor, United States}, title = {Predictive {Accuracy} of {Recommender} {Algorithms}}, url = {https://libproxy.boisestate.edu/login?url=https://www-proquest-com.libproxy.boisestate.edu/dissertations-theses/predictive-accuracy-recommender-algorithms/docview/2466761384/se-2}, abstract = {Recommender systems present a customized list of items based upon user or item characteristics with the objective of reducing a large number of possible choices to a smaller ranked set most likely to appeal to the user. A variety of algorithms for recommender systems have been developed and refined including applications of deep learning neural networks. Recent research reports point to a need to perform carefully controlled experiments to gain insights about the relative accuracy of different recommender algorithms, because studies evaluating different methods have not used a common set of benchmark data sets, baseline models, and evaluation metrics. The dissertation used publicly available sources of ratings data with a suite of three conventional recommender algorithms and two deep learning (DL) algorithms in controlled experiments to assess their comparative accuracy. Results for the non-DL algorithms conformed well to published results and benchmarks. The two DL algorithms did not perform as well and illuminated known challenges implementing DL recommender algorithms as reported in the literature. Model overfitting is discussed as a potential explanation for the weaker performance of the DL algorithms and several regularization strategies are reviewed as possible approaches to improve predictive error. Findings justify the need for further research in the use of deep learning models for recommender systems.}, school = {Nova Southeastern University}, author = {Noffsinger, William B}, collaborator = {Mukherjee, Sumitra}, year = {2020}, note = {Publication Title: Information Systems (DISS)}, }
@article{gazdar_new_2020, title = {A new similarity measure for collaborative filtering based recommender systems}, volume = {188}, issn = {0950-7051}, url = {http://www.sciencedirect.com/science/article/pii/S0950705119304484}, doi = {10.1016/j.knosys.2019.105058}, abstract = {The objective of a recommender system is to provide customers with personalized recommendations while selecting an item among a set of products (movies, books, etc.). The collaborative filtering is the most used technique for recommender systems. One of the main components of a recommender system based on the collaborative filtering technique, is the similarity measure used to determine the set of users having the same behavior with regard to the selected items. Several similarity functions have been proposed, with different performances in terms of accuracy and quality of recommendations. In this paper, we propose a new simple and efficient similarity measure. Its mathematical expression is determined through the following paper contributions: 1) transforming some intuitive and qualitative conditions, that should be satisfied by the similarity measure, into relevant mathematical equations namely: the integral equation, the linear system of differential equations and a non-linear system and 2) resolving the equations to achieve the kernel function of the similarity measure. The extensive experimental study driven on a benchmark datasets shows that the proposed similarity measure is very competitive, especially in terms of accuracy, with regards to some representative similarity measures of the literature.}, journal = {Knowledge-Based Systems}, author = {Gazdar, Achraf and Hidri, Lotfi}, month = jan, year = {2020}, keywords = {Collaborative filtering, Neighborhood based CF, Recommendation systems, Similarity measure}, pages = {105058}, }
@inproceedings{polychronou_machine_2020, title = {Machine {Learning} {Algorithms} for {Food} {Intelligence}: {Towards} a {Method} for {More} {Accurate} {Predictions}}, url = {http://dx.doi.org/10.1007/978-3-030-39815-6_16}, doi = {10.1007/978-3-030-39815-6_16}, abstract = {It is evident that machine learning algorithms are being widely impacting industrial applications and platforms. Beyond typical research experimentation scenarios, there is a need for companies that wish to enhance their online data and analytics solutions to incorporate ways in which they can select, experiment, benchmark, parameterise and choose the version of a machine learning algorithm that seems to be most appropriate for their specific application context. In this paper, we describe such a need for a big data platform that supports food data analytics and intelligence. More specifically, we introduce Agroknow’s big data platform and identify the need to extend it with a flexible and interactive experimentation environment where different machine learning algorithms can be tested using a variation of synthetic and real data. A typical usage scenario is described, based on our need to experiment with various machine learning algorithms to support price prediction for food products and ingredients. The initial requirements for an experimentation environment are also introduced.}, publisher = {Springer International Publishing}, author = {Polychronou, Ioanna and Katsivelis, Panagis and Papakonstantinou, Mihalis and Stoitsis, Giannis and Manouselis, Nikos}, year = {2020}, pages = {165--172}, }
@article{asenova_personalized_2019, title = {Personalized {Micro}-{Service} {Recommendation} {System} for {Online} {News}}, volume = {160}, issn = {1877-0509}, url = {http://www.sciencedirect.com/science/article/pii/S1877050919317399}, doi = {10.1016/j.procs.2019.11.039}, abstract = {In the era of artificial intelligence and high technology advance our life is dependent on them in every aspect. The dynamic environment forces us to plan our time with conscious and every minute is valuable. To help individuals and corporations see information that is only relevant to them, recommendation systems are in place. Popular platforms that such as Amazon, Ebay, Netflix, YouTube, make use of advanced recommendation systems to better serve the needed of their users. This research paper gives insight of building a microservice recommendation system for online news. Research in recommendation systems is mainly focused on improving user’s experience based mainly on personalization information, such as preferences, and searching history. To determine the initial preferences of a user an initial menu of topics/themes is provided for the user to choose from. In order to reflect as precise as possible the searching interests regarding news of user, all of his interactions are thoroughly recorded and in depth analyzed, based on advanced machine learning techniques, when adjusting the news topics, the user is interested for. Based on the aforementioned approach, a personalized recommendation system for online news has been developed. Existing techniques has been researched and evaluated to aid the decision about picking the best approach for the software to be implemented. Frameworks/technologies used for the development are Java 8, Spring boot, Spring MVC, Maven and MongoDB.}, journal = {Procedia Comput. Sci.}, author = {Asenova, Marchela and Chrysoulas, Christos}, month = jan, year = {2019}, keywords = {TF-IDF, collaborative filtering, cosine similarity, recommendation engine, recommendation phases}, pages = {610--615}, }
@inproceedings{shriver_evaluating_2019, title = {Evaluating {Recommender} {System} {Stability} with {Influence}-{Guided} {Fuzzing}}, url = {https://www.comp.nus.edu.sg/~david/Publications/aaai2019-preprint.pdf}, abstract = {Recommender systems help users to find products or services they may like when lacking personal experience or facing an overwhelming set of choices. Since unstable recommendations can lead to distrust, loss of profits, and a poor user experience, it is important to test recommender system stability. In this work, we present an approach based on inferred models of influence that underlie recommender systems to guide the generation of dataset modifications to assess a recommender's stability. We implement our approach …}, publisher = {AAAI}, author = {Shriver, David and Elbaum, Sebastian and Dwyer, Matthew B and Rosenblum, David S}, year = {2019}, }
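The stability notion tested above is simple to prototype outside the authors' tooling. The following Python sketch is illustrative only, not the paper's implementation; `recommend` is a hypothetical callable wrapping whatever system is under test. It perturbs a small fraction of ratings, re-queries the recommender, and reports the mean Jaccard overlap between each user's top-N lists before and after:

import numpy as np
import pandas as pd

def perturb(ratings: pd.DataFrame, frac=0.01, seed=0):
    # Nudge a small random fraction of ratings up or down by one star.
    rng = np.random.default_rng(seed)
    out = ratings.copy()
    idx = out.sample(frac=frac, random_state=seed).index
    out.loc[idx, "rating"] = (out.loc[idx, "rating"]
                              + rng.choice([-1, 1], size=len(idx))).clip(1, 5)
    return out

def stability_at_n(recommend, ratings, users, n=10, frac=0.01):
    # Mean Jaccard overlap of each user's top-n lists before/after perturbation.
    perturbed = perturb(ratings, frac)
    overlaps = []
    for u in users:
        a = set(recommend(ratings, u, n))
        b = set(recommend(perturbed, u, n))
        overlaps.append(len(a & b) / len(a | b) if (a | b) else 1.0)
    return float(np.mean(overlaps))

Random perturbation like this is the baseline the paper improves on: its inferred influence models pick the modifications most likely to destabilize the recommender.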
@inproceedings{karpus_things_2019, title = {Things you might not know about the k-{Nearest} neighbors algorithm}, url = {https://www.researchgate.net/profile/Adam_Przybylek/publication/336235570_Things_You_Might_Not_Know_about_the_k-Nearest_Neighbors_Algorithm/links/5daf2307a6fdccc99d92bf9f/Things-You-Might-Not-Know-about-the-k-Nearest-Neighbors-Algorithm.pdf}, author = {Karpus, Aleksandra and Raczyńska, M and Przybyłek, A}, year = {2019}, }
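Several papers in this list exercise k-nearest-neighbors collaborative filtering, so a compact reference implementation may be useful. This is a sketch of the textbook item-item cosine kNN scorer in pandas/numpy, not LensKit's optimized version, and it omits mean-centering and the other refinements this paper examines:

import numpy as np
import pandas as pd

def item_knn_scores(ratings: pd.DataFrame, user, k=20):
    # ratings has columns user, item, rating; returns predicted scores for unrated items.
    mat = ratings.pivot_table(index="user", columns="item", values="rating").fillna(0.0)
    norms = np.linalg.norm(mat.values, axis=0)
    norms[norms == 0] = 1.0  # guard against all-zero item columns
    sim = pd.DataFrame(mat.values.T @ mat.values / np.outer(norms, norms),
                       index=mat.columns, columns=mat.columns)
    u = mat.loc[user]
    rated = u[u > 0].index
    scores = {}
    for i in mat.columns.difference(rated):
        nbrs = sim.loc[i, rated].nlargest(k)  # k most similar items the user has rated
        denom = nbrs.abs().sum()
        if denom > 0:
            scores[i] = float((nbrs * u[nbrs.index]).sum() / denom)
    return pd.Series(scores).sort_values(ascending=False)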
@inproceedings{ekstrand_all_2018, series = {Proceedings of {Machine} {Learning} {Research}}, title = {All the cool kids, how do they fit in?: popularity and demographic biases in recommender evaluation and effectiveness}, volume = {81}, url = {https://proceedings.mlr.press/v81/ekstrand18b.html}, abstract = {In the research literature, evaluations of recommender system effectiveness typically report results over a given data set, providing an aggregate measure of effectiveness over each instance (e.g. user) in the data set. Recent advances in information retrieval evaluation, however, demonstrate the importance of considering the distribution of effectiveness across diverse groups of varying sizes. For example, do users of different ages or genders obtain similar utility from the system, particularly if their group is a relatively small subset of the user base? We apply this consideration to recommender systems, using offline evaluation and a utility-based metric of recommendation effectiveness to explore whether different user demographic groups experience similar recommendation accuracy. We find demographic differences in measured recommender effectiveness across two data sets containing different types of feedback in different domains; these differences sometimes, but not always, correlate with the size of the user group in question. Demographic effects also have a complex—and likely detrimental—interaction with popularity bias, a known deficiency of recommender evaluation. These results demonstrate the need for recommender system evaluation protocols that explicitly quantify the degree to which the system is meeting the information needs of all its users, as well as the need for researchers and operators to move beyond naïve evaluations that favor the needs of larger subsets of the user population while ignoring smaller subsets.}, booktitle = {Proceedings of the 1st {Conference} on {Fairness}, {Accountability} and {Transparency}}, publisher = {PMLR}, author = {Ekstrand, Michael D and Tian, Mucun and Azpiazu, Ion Madrazo and Ekstrand, Jennifer D and Anuyah, Oghenemaro and McNeill, David and Pera, Maria Soledad}, editor = {Friedler, Sorelle A and Wilson, Christo}, year = {2018}, note = {Journal Abbreviation: Proceedings of Machine Learning Research}, pages = {172--186}, }
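The evaluation pattern this paper argues for, aggregating a per-user effectiveness metric by demographic group rather than over the whole population, is easy to adopt. A minimal pandas sketch, with column names (user, ndcg, group) as illustrative assumptions:

import pandas as pd

def effectiveness_by_group(per_user_metric: pd.DataFrame, demographics: pd.DataFrame):
    # Join per-user scores with group labels, then aggregate per group.
    joined = per_user_metric.merge(demographics, on="user")
    return joined.groupby("group")["ndcg"].agg(["mean", "sem", "count"])

Reporting the standard error and group size alongside the mean makes it visible when a small group's apparent (under)performance is within noise.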
@inproceedings{ekstrand_exploring_2018, address = {New York, NY, USA}, title = {Exploring author gender in book rating and recommendation}, url = {https://dl.acm.org/doi/10.1145/3240323.3240373}, doi = {10.1145/3240323.3240373}, abstract = {Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of the patterns in rating datasets reflect important real-world differences between the various users and items in the data; other patterns may be irrelevant or possibly undesirable for social or ethical reasons, particularly if they reflect undesired discrimination, such as gender or ethnic discrimination in publishing. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to a dimension of social concern, namely content creator gender. Using publicly-available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution.}, publisher = {ACM}, author = {Ekstrand, Michael D and Tian, Mucun and Kazi, Mohammed R Imran and Mehrpouyan, Hoda and Kluver, Daniel}, month = sep, year = {2018}, }
@inproceedings{dragovic_recommendation_2018, title = {From recommendation to curation: when the system becomes your personal docent}, url = {http://ceur-ws.org/Vol-2225/paper6.pdf}, abstract = {Curation is the act of selecting, organizing, and presenting content. Some applications emulate this process by turning users into curators, while others use recommenders to select items, seldom achieving the focus or selectivity of human curators. We bridge this gap with a …}, author = {Dragovic, Nevena and Azpiazu, Ion Madrazo and Pera, Maria Soledad}, month = oct, year = {2018}, pages = {37--44}, }
@article{cami_user_2018, title = {User preferences modeling using dirichlet process mixture model for a content-based recommender system}, issn = {0950-7051}, url = {http://www.sciencedirect.com/science/article/pii/S0950705118304799}, doi = {10.1016/j.knosys.2018.09.028}, abstract = {Recommender systems have been developed to assist users in retrieving relevant resources. Collaborative and content-based filtering are two basic approaches that are used in recommender systems. The former employs the feedback of users with similar interests, while the latter is based on the feature of the selected resources by each user. Recommender systems can consider users’ behavior to more accurately estimate their preferences via a list of recommendations. However, the existing approaches rarely consider both interests and preferences of the users. Also, the dynamic nature of user behavior poses an additional challenge for recommender systems. In this paper, we consider the interactions of each individual user, and analyze them to propose a user model and capture user’s interests. We construct the user model based on a Bayesian nonparametric framework, called the Dirichlet Process Mixture Model. The proposed model evolves following the dynamic nature of user behavior to adapt both the user interests and preferences. We implemented the proposed model and evaluated it using both the MovieLens dataset, and a real-world dataset that contains news tweets from five news channels (New York Times, BBC, CNN, Reuters and Associated Press). The experimental results and comparisons with several recently developed approaches show the superiority in accuracy of the proposed approach, and its ability to adapt with user behavior over time.}, journal = {Knowledge-Based Systems}, author = {Cami, Bagher Rahimpour and Hassanpour, Hamid and Mashayekhi, Hoda}, month = sep, year = {2018}, keywords = {Temporal content-based recommender systems, User behavior modeling, User preferences modeling}, }
@inproceedings{carvalho_fair_2018, title = {{FAiR}: {A} {Framework} for {Analyses} and {Evaluations} on {Recommender} {Systems}}, url = {http://dx.doi.org/10.1007/978-3-319-95168-3_26}, doi = {10.1007/978-3-319-95168-3_26}, abstract = {Recommender systems (RSs) have become essential tools in e-commerce applications, helping users in the decision-making process. Evaluation on these tools is, however, a major divergence point nowadays, since there is no consensus regarding which metrics are necessary to consolidate new RSs. For this reason, distinct frameworks have been developed to ease the deployment of RSs in research and/or production environments. In the present work, we perform an extensive study of the most popular evaluation metrics, organizing them into three groups: Effectiveness-based, Complementary Dimensions of Quality and Domain Profiling. Further, we consolidate a framework named FAiR to help researchers in evaluating their RSs using these metrics, besides identifying the characteristics of data collections that may intrinsically affect RSs performance. FAiR is compatible with the output format of the main existing RSs libraries (i.e., MyMediaLite and LensKit).}, publisher = {Springer International Publishing}, author = {Carvalho, Diego and Silva, Nícollas and Silveira, Thiago and Mourão, Fernando and Pereira, Adriano and Dias, Diego and Rocha, Leonardo}, year = {2018}, pages = {383--397}, }
@inproceedings{coba_replicating_2018, address = {New York, NY, USA}, title = {Replicating and {Improving} {Top}-{N} {Recommendations} in {Open} {Source} {Packages}}, url = {http://doi.acm.org/10.1145/3227609.3227671}, doi = {10.1145/3227609.3227671}, booktitle = {{WIMS} '18}, publisher = {ACM}, author = {Coba, Ludovik and Symeonidis, Panagiotis and Zanker, Markus}, year = {2018}, note = {Journal Abbreviation: WIMS '18}, keywords = {Collaborative Filtering, Recommendation algorithms, evaluation}, pages = {40:1--40:7}, }
@article{yang_improving_2018, title = {Improving {Existing} {Collaborative} {Filtering} {Recommendations} via {Serendipity}-{Based} {Algorithm}}, volume = {20}, issn = {1520-9210}, url = {http://dx.doi.org/10.1109/TMM.2017.2779043}, doi = {10.1109/TMM.2017.2779043}, abstract = {In this paper, we study how to address the sparsity, accuracy and serendipity issues of top-N recommendation with collaborative filtering (CF). Existing studies commonly use rated items (which form only a small section in a rating matrix) or import some additional information (e.g., details about the items and users) to improve the performance of CF. Unlike these methods, we propose a novel notion towards a huge amount of unrated items: serendipity item. By utilizing serendipity items, we propose concise satisfaction and interest injection (CSII), a method that can effectively find interesting, satisfying, and serendipitous items in unrated items. By preventing uninteresting and unsatisfying items to be recommended as top-N items, this concise-but-novel method improves accuracy and recommendation quality (especially serendipity) substantially. Meanwhile, it can address the sparsity and cold-start issues by enriching the rating matrix in CF without additional information. As our method tackles rating matrix before recommendation procedure, it can be applied to most existing CF methods, such as item-based CF, user-based CF and matrix factorization-based CF. Through comprehensive experiments using abundant real-world datasets with LensKit implementation, we successfully demonstrate that our solution improves the performance of existing CF methods consistently and universally. Moreover, comparing with baseline methods, CSII can extract uninteresting items more carefully and cautiously, avoiding potential items inferred by mistake.}, number = {7}, journal = {IEEE Trans. Multimedia}, author = {Yang, Y and Xu, Y and Wang, E and Han, J and Yu, Z}, month = jul, year = {2018}, keywords = {CF methods, CSII, Collaboration, Collaborative filtering, Computer science, Data mining, Lifting equipment, Multimedia communication, Recommender systems, cold-start issues, collaborative filtering, collaborative filtering recommendations, concise satisfaction and interest injection, item-based CF, matrix decomposition, matrix factorization, matrix factorization-based CF, rating matrix, recommendation quality, recommender systems, serendipitous recommendation, serendipity item, top-N items, top-N recommendation, unrated items, user-based CF}, pages = {1888--1900}, }
@mastersthesis{shriver_assessing_2018, title = {Assessing the {Quality} and {Stability} of {Recommender} {Systems}}, url = {https://digitalcommons.unl.edu/computerscidiss/147}, abstract = {Recommender systems help users to find products they may like when lacking personal experience or facing an overwhelmingly large set of items. However, assessing the quality and stability of recommender systems can present challenges for developers. First, traditional accuracy metrics, such as precision and recall, for validating the quality of recommendations, offer only a coarse, one-dimensional view of the system performance. Second, assessing the stability of a recommender system requires generating new data and retraining a system, which is expensive. In this work, we present two new approaches for assessing the quality and stability of recommender systems to address these challenges. We first present a general and extensible approach for assessing the quality of the behavior of a recommender system using logical property templates. The approach is general in that it defines recommendation systems in terms of sets of rankings, ratings, users, and items on which property templates are defined. It is extensible in that these property templates define a space of properties that can be instantiated and parameterized to characterize a recommendation system. We study the application of the approach to several recommendation systems. Our findings demonstrate the potential of these properties, illustrating the insights they can provide about the different algorithms and evolving datasets. We also present an approach for influence-guided fuzz testing of recommender system stability. We infer influence models for aspects of a dataset, such as users or items, from the recommendations produced by a recommender system and its training data. We define dataset fuzzing heuristics that use these influence models for generating modifications to an original dataset and we present a test oracle based on a threshold of acceptable instability. We implement our approach and evaluate it on several recommender algorithms using the MovieLens dataset and we find that influence-guided fuzzing can effectively find small sets of modifications that cause significantly more instability than random approaches. Adviser: Sebastian Elbaum}, urldate = {2018-05-08}, school = {University of Nebraska - Lincoln}, author = {Shriver, David}, collaborator = {Elbaum, Sebastian}, year = {2018}, note = {Publication Title: Computer Science and Engineering}, }
@book{kotkov_serendipity_2018, title = {Serendipity in recommender systems}, isbn = {978-951-39-7438-1}, url = {https://jyx.jyu.fi/handle/123456789/58207}, abstract = {The number of goods and services (such as accommodation or music streaming) offered by e-commerce websites does not allow users to examine all the available options in a reasonable amount of time. Recommender systems are auxiliary systems designed to help users find interesting goods or services (items) on a website when the number of available items is overwhelming. Traditionally, recommender systems have been optimized for accuracy, which indicates how often a user consumed the items recommended by the system. To increase accuracy, recommender systems often suggest items that are popular and suitably similar to items these users have consumed in the past. As a result, users often lose interest in using these systems, as they either know about the recommended items already or can easily find these items themselves. One way to increase user satisfaction and user retention is to suggest serendipitous items. These items are items that users would not find themselves or even look for, but would enjoy consuming. Serendipity in recommender systems has not been thoroughly investigated. There is not even a consensus on the concept’s definition. In this dissertation, serendipitous items are defined as relevant, novel and unexpected to a user. In this dissertation, we (a) review different definitions of the concept and evaluate them in a user study, (b) assess the proportion of serendipitous items in a typical recommender system, (c) review ways to measure and improve serendipity, (d) investigate serendipity in cross-domain recommender systems (systems that take advantage of multiple domains, such as movies, songs and books) and (e) discuss challenges and future directions concerning this topic. We applied a Design Science methodology as the framework for this study and developed four artifacts: (1) a collection of eight variations of serendipity definition, (2) a measure of the serendipity of suggested items, (3) an algorithm that generates serendipitous suggestions, (4) a dataset of user feedback regarding serendipitous movies in the recommender system MovieLens. These artifacts are evaluated using suitable methods and communicated through publications.}, urldate = {2018-07-06}, publisher = {University of Jyväskylä}, author = {Kotkov, Denis}, year = {2018}, }
@article{de_pessemier_heart_2018, title = {Heart rate monitoring, activity recognition, and recommendation for e-coaching}, issn = {1380-7501}, url = {https://link.springer.com/article/10.1007/s11042-018-5640-2}, doi = {10.1007/s11042-018-5640-2}, abstract = {Equipped with hardware, such as accelerometer and heart rate sensor, wearables enable measuring physical activities and heart rate. However, the accuracy of these heart rate measurements is still unclear and the coupling with activity recognition is often missing in health apps. This study evaluates heart rate monitoring with four different device types: a specialized sports device with chest strap, a fitness tracker, a smart watch, and a smartphone using photoplethysmography. In a state of rest, similar measurement results are obtained with the four devices. During physical activities, the fitness tracker, smart watch, and smartphone measure sudden variations in heart rate with a delay, due to movements of the wrist. Moreover, this study showed that physical activities, such as squats and dumbbell curl, can be recognized with fitness trackers. By combining heart rate monitoring and activity recognition, personal suggestions for physical activities are generated using a tag-based recommender and rule-based filter.}, urldate = {2018-02-08}, journal = {Multimed. Tools Appl.}, author = {De Pessemier, Toon and Martens, Luc}, month = jan, year = {2018}, note = {Publisher: Springer US}, pages = {1--18}, }
@inproceedings{ekstrand_sturgeon_2017, series = {{FLAIRS} 30}, title = {Sturgeon and the {Cool} {Kids}: {Problems} with {Top}-{N} {Recommender} {Evaluation}}, url = {https://aaai.org/papers/639-flairs-2017-15534/}, abstract = {Top-N evaluation of recommender systems, typically carried out using metrics from information retrieval or machine learning, has several challenges. Two of these challenges are popularity bias, where the evaluation intrinsically favors algorithms that recommend popular items, and misclassified decoys, where items for which no user relevance is known are actually relevant to the user, but the evaluation is unaware and penalizes the recommender for suggesting them. One strategy for mitigating the misclassified decoy problem is the one-plus-random evaluation strategy and its generalization, which we call random decoys. In this work, we explore the random decoy strategy through both a theoretical treatment and an empirical study, but find little evidence to guide its tuning and show that it has complex and deleterious interactions with popularity bias.}, booktitle = {Proceedings of the 30th {Florida} {Artificial} {Intelligence} {Research} {Society} {Conference}}, publisher = {AAAI Press}, author = {Ekstrand, Michael D and Mahant, Vaibhav}, month = may, year = {2017}, }
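The one-plus-random protocol analyzed above fits in a few lines: rank each held-out relevant item against N randomly drawn unrated "decoy" items and count how often it lands in the top k. A sketch, assuming a hypothetical score(user, items) callable for the recommender under evaluation:

import numpy as np

def one_plus_random_hit_rate(score, test_pairs, all_items, user_rated,
                             n_decoys=100, k=10, seed=0):
    # test_pairs: iterable of (user, held_out_item); user_rated[u]: set of items u rated.
    rng = np.random.default_rng(seed)
    hits = total = 0
    for user, target in test_pairs:
        pool = np.array(list(set(all_items) - user_rated[user] - {target}))
        decoys = rng.choice(pool, size=n_decoys, replace=False)
        cands = np.append(decoys, target)  # target is the last candidate
        scores = score(user, cands)
        rank = int((scores > scores[-1]).sum()) + 1  # rank of the target among candidates
        hits += rank <= k
        total += 1
    return hits / total

The paper's caution applies to any use of this sketch: some decoys may be relevant but unrated (misclassified decoys), and the choice of n_decoys interacts with popularity bias in ways the authors found difficult to tune.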
@inproceedings{channamsetty_recommender_2017, title = {Recommender response to diversity and popularity bias in user profiles}, url = {https://aaai.org/papers/657-flairs-2017-15524/}, abstract = {Recommender system evaluation usually focuses on the overall effectiveness of the algorithms, either in terms of measurable accuracy or ability to deliver user satisfaction or improve business metrics. When additional factors are considered, such as the diversity or novelty of the recommendations, the focus typically remains on the algorithm’s overall performance. We examine the relationship of the recommender’s output characteristics – accuracy, popularity (as an inverse of novelty), and diversity – to characteristics of the user’s rating profile. The aims of this analysis are twofold: (1) to probe the conditions under which common algorithms produce more or less diverse or popular recommendations, and (2) to determine if these personalized recommender algorithms reflect a user’s preference for diversity or novelty. We trained recommenders on the MovieLens data and looked for correlation between the user profile and the recommender’s output for both diversity and popularity bias using different metrics. We find that the diversity and popularity of movies in users’ profiles has little impact on the recommendations they receive.}, urldate = {2017-05-29}, booktitle = {Proceedings of the 30th {Florida} artificial intelligence research society conference}, publisher = {AAAI Press}, author = {Channamsetty, Sushma and Ekstrand, Michael D}, month = may, year = {2017}, }
@inproceedings{sardianos_scaling_2017, title = {Scaling {Collaborative} {Filtering} to {Large}-{Scale} {Bipartite} {Rating} {Graphs} {Using} {Lenskit} and {Spark}}, url = {http://dx.doi.org/10.1109/BigDataService.2017.28}, doi = {10.1109/BigDataService.2017.28}, abstract = {Popular social networking applications such as Facebook, Twitter, Friendster, etc. generate very large graphs with different characteristics. These social networks are huge, comprising millions of nodes and edges that push existing graph mining algorithms and architectures to their limits. In product-rating graphs, users connect with each other and rate items in tandem. In such bipartite graphs users and items are the nodes and ratings are the edges and collaborative filtering algorithms use the edge information (i.e. user ratings for items) in order to suggest items of potential interest to users. Existing algorithms can hardly scale up to the size of the entire graph and require unlimited resources to finish. This work employs a machine learning method for predicting the performance of Collaborative Filtering algorithms using the structural features of the bipartite graphs. Using a fast graph partitioning algorithm and information from the user friendship graph, the original bipartite graph is partitioned into different schemes (i.e. sets of smaller bipartite graphs). The schemes are evaluated against the predicted performance of the Collaborative Filtering algorithm and the best partitioning scheme is employed for generating the recommendations. As a result, the Collaborative Filtering algorithms are applied to smaller bipartite graphs, using limited resources and allowing the problem to scale or be parallelized. Tests on a large, real-life, rating graph, show that the proposed method allows the collaborative filtering algorithms to run in parallel and complete using limited resources.}, author = {Sardianos, C and Varlamis, I and Eirinaki, M}, month = apr, year = {2017}, keywords = {Bipartite graph, Collaboration, Collaborative Filtering, Graph Metrics, Graph Partitioning, Lenskit, Machine learning algorithms, Partitioning algorithms, Prediction algorithms, Recommender Systems, Recommender systems, Social Networks, Social network services, Spark, bipartite graphs, collaborative filtering, collaborative filtering algorithms, data mining, fast graph partitioning algorithm, graph theory, large-scale bipartite rating graphs, learning (artificial intelligence), machine learning, product-rating graphs, social networking (online), social networking applications, structural features, user-friendship graph}, pages = {70--79}, }
@article{papadakis_scor_2017, title = {{SCoR}: {A} {Synthetic} {Coordinate} based {Recommender} system}, volume = {79}, issn = {0957-4174}, url = {http://www.sciencedirect.com/science/article/pii/S0957417417301070}, doi = {10.1016/j.eswa.2017.02.025}, abstract = {Recommender systems try to predict the preferences of users for specific items, based on an analysis of previous consumer preferences. In this paper, we propose SCoR, a Synthetic Coordinate based Recommendation system which is shown to outperform the most popular algorithmic techniques in the field, approaches like matrix factorization and collaborative filtering. SCoR assigns synthetic coordinates to nodes (users and items), so that the distance between a user and an item provides an accurate prediction of the user’s preference for that item. The proposed framework has several benefits. It is parameter free, thus requiring no fine tuning to achieve high performance, and is more resistant to the cold-start problem compared to other algorithms. Furthermore, it provides important annotations of the dataset, such as the physical detection of users and items with common and unique characteristics as well as the identification of outliers. SCoR is compared against nine other state-of-the-art recommender systems, seven of them based on the well-known matrix factorization and two on collaborative filtering. The comparison is performed against four real datasets, including a brief version of the dataset used in the well-known Netflix challenge. The extensive experiments prove that SCoR outperforms previous techniques while demonstrating its improved stability and high performance.}, journal = {Expert Syst. Appl.}, author = {Papadakis, Harris and Panagiotakis, Costas and Fragopoulou, Paraskevi}, month = aug, year = {2017}, keywords = {Graph, Matrix factorization, Netflix, Recommender systems, Synthetic coordinates, Vivaldi}, pages = {8--19}, }
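The core idea of SCoR, placing users and items in a Euclidean space so that distance predicts (inverse) preference, can be illustrated with a small gradient-descent loop. This is a reconstruction from the abstract for intuition only, not the authors' Vivaldi-based algorithm:

import numpy as np

def fit_coordinates(ratings, n_users, n_items, dim=8, epochs=20, lr=0.05, r_max=5.0, seed=0):
    # ratings: list of (user_idx, item_idx, rating); predict r_hat = r_max - ||p_u - q_i||.
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, dim))
    Q = rng.normal(scale=0.1, size=(n_items, dim))
    for _ in range(epochs):
        for u, i, r in ratings:
            diff = P[u] - Q[i]
            dist = np.linalg.norm(diff) + 1e-9
            err = (r_max - dist) - r        # prediction error
            grad = -err * diff / dist       # gradient of err**2 w.r.t. P[u], up to a factor of 2
            P[u] -= lr * grad
            Q[i] += lr * grad
    return P, Q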
@mastersthesis{solvang_video_2017, title = {Video {Recommendation} {Systems}: {Finding} a {Suitable} {Recommendation} {Approach} for an {Application} {Without} {Sufficient} {Data}}, url = {http://hdl.handle.net/10852/59239}, author = {Solvang, Marius Lørstad}, year = {2017}, }
@article{pera_recommending_2017, title = {Recommending books to be exchanged online in the absence of wish lists}, issn = {2330-1643}, url = {http://dx.doi.org/10.1002/asi.23978}, doi = {10.1002/asi.23978}, abstract = {An online exchange system is a web service that allows communities to trade items without the burden of manually selecting them, which saves users' time and effort. Even though online book-exchange systems have been developed, their services can further be improved by reducing the workload imposed on their users. To accomplish this task, we propose a recommendation-based book exchange system, called EasyEx, which identifies potential exchanges for a user solely based on a list of items the user is willing to part with. EasyEx is a novel and unique book-exchange system because unlike existing online exchange systems, it does not require a user to create and maintain a wish list, which is a list of items the user would like to receive as part of the exchange. Instead, EasyEx directly suggests items to users to increase serendipity and as a result expose them to items which may be unfamiliar, but appealing, to them. In identifying books to be exchanged, EasyEx employs known recommendation strategies, that is, personalized mean and matrix factorization, to predict book ratings, which are treated as the degrees of appeal to a user on recommended books. Furthermore, EasyEx incorporates OptaPlanner, which solves constraint satisfaction problems efficiently, as part of the recommendation-based exchange process to create exchange cycles. Experimental results have verified that EasyEx offers users recommended books that satisfy the users' interests and contributes to the item-exchange mechanism with a new design methodology.}, journal = {Journal of the Association for Information Science and Technology}, author = {Pera, Maria Soledad and Ng, Yiu-Kai}, month = nov, year = {2017}, }
@inproceedings{coba_rrecsys_2016, title = {rrecsys: {An} {R}-package for {Prototyping} {Recommendation} {Algorithms}}, url = {https://pdfs.semanticscholar.org/1856/b9e4c19a8ed34c3041911e43c0f3f9e1baa5.pdf}, abstract = {We introduce rrecsys, an open source extension package in R for rapid prototyping and intuitive assessment of recommender system algorithms. As the only currently available R package for recommender algorithms (recommenderlab) did not …}, author = {Çoba, Ludovik and Zanker, Markus}, year = {2016}, keywords = {toolkit}, }
@phdthesis{saha_multi-objective_2016, title = {A {Multi}-objective {Autotuning} {Framework} {For} {The} {Java} {Virtual} {Machine}}, url = {https://digital.library.txstate.edu/handle/10877/6096}, abstract = {Due to inherent limitations in performance, Java was not considered a suitable platform for scalable high-performance computing (HPC) for a long time. The scenario is changing because of the development of frameworks like Hadoop, Spark and Fast-MPJ. In spite of the increase in usage, achieving high performance with Java is not trivial. High performance in Java relies on libraries providing explicit threads or relying on runnable-like interfaces for distributed programming. In this thesis, we develop an autotuning framework for the JVM that manages multiple objective functions including execution time, power consumption, energy and performance-per-watt. The framework searches the combined space of JIT optimization sequences and different classes of JVM runtime parameters. To discover good configurations more quickly, the framework implements novel heuristic search algorithms. To reduce the size of the search space, machine-learning-based pruning techniques are used. Evaluation on recommender system workloads shows that significant improvements in both performance and power can be gained by fine-tuning JVM runtime parameters.}, urldate = {2016-07-05}, school = {Texas State University}, author = {Saha, Shuvabrata}, month = apr, year = {2016}, }
@inproceedings{colucci_evaluating_2016, address = {New York, NY, USA}, title = {Evaluating {Item}-{Item} {Similarity} {Algorithms} for {Movies}}, url = {http://doi.acm.org/10.1145/2851581.2892362}, doi = {10.1145/2851581.2892362}, booktitle = {{CHI} {EA} '16}, publisher = {ACM}, author = {Colucci, Lucas and Doshi, Prachi and Lee, Kun-Lin and Liang, Jiajie and Lin, Yin and Vashishtha, Ishan and Zhang, Jia and Jude, Alvin}, year = {2016}, note = {Journal Abbreviation: CHI EA '16}, keywords = {algorithm evaluation, item-item similarity, recommender systems}, pages = {2141--2147}, }
@inproceedings{kharrat_recommendation_2016, title = {Recommendation system based contextual analysis of {Facebook} comment}, url = {http://dx.doi.org/10.1109/AICCSA.2016.7945792}, doi = {10.1109/AICCSA.2016.7945792}, abstract = {This paper presents a new recommendation algorithm based on contextual analysis and new measurements. The social network is one of the most popular Web 2.0 applications, and related services, like Facebook, have evolved into a practical means for sharing opinions. Consequently, social network web sites have become rich data sources for opinion mining. This paper proposes to introduce an external resource, comments posted by users, to predict recommendations and relieve the cold-start problem. The novelty of the proposed approach is that posts are not simply characterized by an opinion score, as is the case with machine learning-based classifiers, but instead receive an opinion grade for each distinct notion in the post. Our approach has been implemented with Java and the Lenskit framework; the study we have conducted on a movie dataset has shown competitive results. We compared our algorithm to the SVD and Slope One algorithms. We obtained an improvement of 8\% in precision and recall as well as an improvement of 16\% in RMSE and nDCG.}, author = {Kharrat, F Ben and Elkhleifi, A and Faiz, R}, month = nov, year = {2016}, keywords = {Algorithm design and analysis, Classification algorithms, Collaboration, Collaborative filtering, Facebook, Motion pictures, Recommendation system, Recommender systems, Social network, User cold start, User profile}, pages = {1--6}, }
@phdthesis{salam_patrous_evaluating_2016, address = {Stockholm, Sweden}, title = {Evaluating {Prediction} {Accuracy} for {Collaborative} {Filtering} {Algorithms} in {Recommender} {Systems}}, url = {http://kth.diva-portal.org/smash/record.jsf?aq2=%5B%5B%5D%5D&c=1&af=%5B%5D&searchType=LIST_LATEST&query=&language=en&pid=diva2%3A927356&aq=%5B%5B%5D%5D&sf=all&aqe=%5B%5D&sortOrder=author_sort_asc&onlyFullText=false&noOfRows=50&dswid=-7195}, abstract = {Recommender systems are a relatively new technology that is commonly used by e-commerce websites and streaming services among others, to predict user opinion about products. This report studies two ...}, urldate = {2016-06-13}, school = {KTH Royal Institute of Technology}, author = {Salam Patrous, Ziad and Najafi, Safir}, year = {2016}, }
@phdthesis{chang_leveraging_2016, address = {Minneapolis, MN, USA}, title = {Leveraging {Collective} {Intelligence} in {Recommender} {System}}, url = {http://hdl.handle.net/11299/182725}, abstract = {Recommender systems, since their introduction 20 years ago, have been widely deployed in web services to alleviate user information overload. Driven by business objectives of their applications, the focus of recommender systems has shifted from accurately modeling and predicting user preferences to offering good personalized user experience. The latter is difficult because there are many factors, e.g., tenure of a user, context of recommendation and transparency of recommender system, that affect users' perception of recommendations. Many of these factors are subjective and not easily quantifiable, posing challenges to recommender algorithms. When pure algorithmic solutions are at their limits in providing good user experience in recommender systems, we turn to the collective intelligence of human and computer. Computers and humans are complementary to each other: computers are fast at computation and data processing and have accurate memory; humans are capable of complex reasoning, being creative and relating to other humans. In fact, such close collaborations between human and computer have precedent: after chess master Garry Kasparov lost to IBM computer ``Deep Blue'', he invented a new form of chess --- advanced chess, in which a human player and a computer program team up against other such pairs. In this thesis, we leverage the collective intelligence of human and computer to tackle several challenges in recommender systems and demonstrate designs of such hybrid systems. We make contributions to the following aspects of recommender systems: providing better new user experience, enhancing topic modeling component for items, composing better recommendation sets and generating personalized natural language explanations. These four applications demonstrate different ways of designing systems with collective intelligence, applicable to domains other than recommender systems. We believe the collective intelligence of human and computer can power more intelligent, user friendly and creative systems, worthy of continuous research effort in the future.}, urldate = {2016-11-01}, school = {University of Minnesota}, author = {Chang, Shuo}, month = aug, year = {2016}, }
@article{pessemier_hybrid_2016, title = {Hybrid group recommendations for a travel service}, volume = {75}, issn = {1380-7501}, url = {http://link.springer.com/article/10.1007/s11042-016-3265-x}, doi = {10.1007/s11042-016-3265-x}, abstract = {Recommendation techniques have proven their usefulness as a tool to cope with the information overload problem in many classical domains such as movies, books, and music. Additional challenges for recommender systems emerge in the domain of tourism such as acquiring metadata and feedback, the sparsity of the rating matrix, user constraints, and the fact that traveling is often a group activity. This paper proposes a recommender system that offers personalized recommendations for travel destinations to individuals and groups. These recommendations are based on the users’ rating profile, personal interests, and specific demands for their next destination. The recommendation algorithm is a hybrid approach combining a content-based, collaborative filtering, and knowledge-based solution. For groups of users, such as families or friends, individual recommendations are aggregated into group recommendations, with an additional opportunity for users to give feedback on these group recommendations. A group of test users evaluated the recommender system using a prototype web application. The results prove the usefulness of individual and group recommendations and show that users prefer the hybrid algorithm over each individual technique. This paper demonstrates the added value of various recommendation algorithms in terms of different quality aspects, compared to an unpersonalized list of the most-popular destinations.}, number = {5}, urldate = {2016-03-11}, journal = {Multimed. Tools Appl.}, author = {Pessemier, Toon De and Dhondt, Jeroen and Martens, Luc}, month = jan, year = {2016}, pages = {1--25}, }
@phdthesis{nguyen_enhancing_2016, address = {Minneapolis, MN, USA}, title = {Enhancing {User} {Experience} {With} {Recommender} {Systems} {Beyond} {Prediction} {Accuracies}}, url = {http://hdl.handle.net/11299/182780}, abstract = {In this dissertation, we examine how to improve the user experience with recommender systems beyond prediction accuracy. We focus on the following aspects of the user experience. In chapter 3 we examine if a recommender system exposes users to less diverse contents over time. In chapter 4 we look at the relationships between user personality and user preferences for recommendation diversity, popularity, and serendipity. In chapter 5 we investigate the relations between the self-reported user satisfaction and the three recommendation properties with the inferred user recommendation consumption. In chapter 6 we look at four different rating interfaces and evaluate how these interfaces affect the user rating experience. We find that over time a recommender system exposes users to less-diverse contents and that users rate less-diverse items. However, users who took recommendations were exposed to more diverse recommendations than those who did not. Furthermore, users with different personalities have different preferences for recommendation diversity, popularity, and serendipity (e.g. some users prefer more diverse recommendations, while others prefer similar ones). We also find that user satisfaction with recommendation popularity and serendipity measured with survey questions strongly relates to user recommendation consumption inferred with logged data. We then propose a way to get better signals about user preferences and help users rate items in the recommendation systems more consistently. That is, providing exemplars to users at the time they rate items improved the consistency of users’ ratings. Our results suggest several ways recommender system practitioners and researchers can enrich the user experience. For example, by integrating users’ personality into recommendation frameworks, we can help recommender systems deliver recommendations with the preferred levels of diversity, popularity, and serendipity to individual users. We can also facilitate the rating process by integrating a set of proven rating-support techniques into the systems’ interfaces.}, urldate = {2016-11-01}, school = {University of Minnesota}, author = {Nguyen, Tien}, month = aug, year = {2016}, }
@misc{noauthor_machine_2016, title = {Machine ‘{Unlearning}’ {Technique} {Wipes} {Out} {Unwanted} {Data} {Quickly} and {Completely}}, url = {http://www.scientificcomputing.com/news/2016/03/machine-unlearning-technique-wipes-out-unwanted-data-quickly-and-completely}, abstract = {Cao and Yang believe that easy adoption of forgetting systems will be increasingly in demand. The pair has developed a way to do it faster and more effectively than what is currently available. Their concept, called "machine unlearning," is so promising that the duo have been awarded a four-year, \$1.2 million National Science Foundation grant — split between Lehigh and Columbia — to develop the approach.}, urldate = {2016-03-16}, month = mar, year = {2016}, }
@article{harper_movielens_2015, title = {The {MovieLens} {Datasets}: {History} and {Context}}, volume = {5}, issn = {2160-6455}, url = {http://doi.acm.org/10.1145/2827872}, doi = {10.1145/2827872}, abstract = {The MovieLens datasets are widely used in education, research, and industry. They are downloaded hundreds of thousands of times each year, reflecting their use in popular press programming books, traditional and online courses, and software. These datasets are a product of member activity in the MovieLens movie recommendation system, an active research platform that has hosted many experiments since its launch in 1997. This article documents the history of MovieLens and the MovieLens datasets. We include a discussion of lessons learned from running a long-standing, live research platform from the perspective of a research organization. We document best practices and limitations of using the MovieLens datasets in new research.}, number = {4}, urldate = {2016-03-11}, journal = {ACM Transactions on Interactive Intelligent Systems}, author = {Harper, F Maxwell and Konstan, Joseph A}, month = dec, year = {2015}, keywords = {dataset}, pages = {19:1--19:19}, }
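Because so many papers in this list evaluate on MovieLens, a loading note may save readers a step: the datasets ship as plain delimited text, so a couple of pandas lines suffice. Shown for the ML-100K layout; file names, separators, and headers differ across releases:

import pandas as pd

# ML-100K stores ratings in tab-separated `u.data` with no header row.
ratings = pd.read_csv("ml-100k/u.data", sep="\t",
                      names=["user", "item", "rating", "timestamp"])
print(ratings["user"].nunique(), "users,", ratings["item"].nunique(), "items")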
@inproceedings{harper_putting_2015, address = {New York, NY, USA}, title = {Putting {Users} in {Control} of {Their} {Recommendations}}, url = {http://doi.acm.org/10.1145/2792838.2800179}, doi = {10.1145/2792838.2800179}, abstract = {The essence of a recommender system is that it can recommend items personalized to the preferences of an individual user. But typically users are given no explicit control over this personalization, and are instead left guessing about how their actions affect the resulting recommendations. We hypothesize that any recommender algorithm will better fit some users' expectations than others, leaving opportunities for improvement. To address this challenge, we study a recommender that puts some control in the hands of users. Specifically, we build and evaluate a system that incorporates user-tuned popularity and recency modifiers, allowing users to express concepts like "show more popular items". We find that users who are given these controls evaluate the resulting recommendations much more positively. Further, we find that users diverge in their preferred settings, confirming the importance of giving control to users.}, urldate = {2015-09-19}, booktitle = {{RecSys} '15}, publisher = {ACM}, author = {Harper, F Maxwell and Xu, Funing and Kaur, Harmanpreet and Condiff, Kyle and Chang, Shuo and Terveen, Loren}, year = {2015}, note = {Journal Abbreviation: RecSys '15}, pages = {3--10}, }
@inproceedings{chang_using_2015, address = {New York, NY, USA}, title = {Using {Groups} of {Items} for {Preference} {Elicitation} in {Recommender} {Systems}}, url = {http://doi.acm.org/10.1145/2675133.2675210}, doi = {10.1145/2675133.2675210}, abstract = {To achieve high quality initial personalization, recommender systems must provide an efficient and effective process for new users to express their preferences. We propose that this goal is best served not by the classical method where users begin by expressing preferences for individual items - this process is an inefficient way to convert a user's effort into improved personalization. Rather, we propose that new users can begin by expressing their preferences for groups of items. We test this idea by designing and evaluating an interactive process where users express preferences across groups of items that are automatically generated by clustering algorithms. We contribute a strategy for recommending items based on these preferences that is generalizable to any collaborative filtering-based system. We evaluate our process with both offline simulation methods and an online user experiment. We find that, as compared with a baseline rate-15-items interface, (a) users are able to complete the preference elicitation process in less than half the time, and (b) users are more satisfied with the resulting recommended items. Our evaluation reveals several advantages and other trade-offs involved in moving from item-based preference elicitation to group-based preference elicitation.}, urldate = {2015-09-19}, booktitle = {{CSCW} '15}, publisher = {ACM}, author = {Chang, Shuo and Harper, F Maxwell and Terveen, Loren}, year = {2015}, note = {Journal Abbreviation: CSCW '15}, pages = {1258--1269}, }
@inproceedings{magnuson_event_2015, address = {New York, NY, USA}, title = {Event {Recommendation} {Using} {Twitter} {Activity}}, url = {http://doi.acm.org/10.1145/2792838.2796556}, doi = {10.1145/2792838.2796556}, abstract = {User interactions with Twitter (social network) frequently take place on mobile devices - a user base that it strongly caters to. As much of Twitter's traffic comes with geo-tagging information associated with it, it is a natural platform for geographic recommendations. This paper proposes an event recommender system for Twitter users, which identifies twitter activity co-located with previous events, and uses it to drive geographic recommendations via item-based collaborative filtering.}, urldate = {2015-09-19}, booktitle = {{RecSys} '15}, publisher = {ACM}, author = {Magnuson, Axel and Dialani, Vijay and Mallela, Deepa}, year = {2015}, note = {Journal Abbreviation: RecSys '15}, pages = {331--332}, }
@article{ghoshal_recommendations_2015, title = {Recommendations {Using} {Information} from {Multiple} {Association} {Rules}: {A} {Probabilistic} {Approach}}, volume = {26}, issn = {1047-7047}, url = {http://pubsonline.informs.org/doi/abs/10.1287/isre.2015.0583}, doi = {10.1287/isre.2015.0583}, abstract = {Business analytics has evolved from being a novelty used by a select few to an accepted facet of conducting business. Recommender systems form a critical component of the business analytics toolkit and, by enabling firms to effectively target customers with products and services, are helping alter the e-commerce landscape. A variety of methods exist for providing recommendations, with collaborative filtering, matrix factorization, and association-rule-based methods being the most common. In this paper, we propose a method to improve the quality of recommendations made using association rules. This is accomplished by combining rules when possible and stands apart from existing rule-combination methods in that it is strongly grounded in probability theory. Combining rules requires the identification of the best combination of rules from the many combinations that might exist, and we use a maximum-likelihood framework to compare alternative combinations. Because it is impractical to apply the maximum likelihood framework directly in real time, we show that this problem can equivalently be represented as a set partitioning problem by translating it into an information theoretic context—the best solution corresponds to the set of rules that leads to the highest sum of mutual information associated with the rules. Through a variety of experiments that evaluate the quality of recommendations made using the proposed approach, we show that (i) a greedy heuristic used to solve the maximum likelihood estimation problem is very effective, providing results comparable to those from using the optimal set partitioning solution; (ii) the recommendations made by our approach are more accurate than those made by a variety of state-of-the-art benchmarks, including collaborative filtering and matrix factorization; and (iii) the recommendations can be made in a fraction of a second on a desktop computer, making it practical to use in real-world applications.}, number = {3}, urldate = {2015-09-19}, journal = {Information Systems Research}, author = {Ghoshal, Abhijeet and Menon, Syam and Sarkar, Sumit}, month = jul, year = {2015}, pages = {532--551}, }
@inproceedings{wischenbart_recommender_2015, title = {Recommender {Systems} for the {People} — {Enhancing} {Personalization} in {Web} {Augmentation}}, author = {Wischenbart, Martin and Firmenich, Sergio and Rossi, Gustavo and Wimmer, Manuel}, month = sep, year = {2015}, }
@incollection{chowdhury_boostmf_2015, series = {Lecture {Notes} in {Computer} {Science}}, title = {{BoostMF}: {Boosted} {Matrix} {Factorisation} for {Collaborative} {Ranking}}, isbn = {978-3-319-23524-0}, url = {http://link.springer.com/chapter/10.1007/978-3-319-23525-7_1}, urldate = {2015-09-19}, booktitle = {Machine {Learning} and {Knowledge} {Discovery} in {Databases}}, publisher = {Springer International Publishing}, author = {Chowdhury, Nipa and Cai, Xiongcai and Luo, Cheng}, editor = {Appice, Annalisa and Rodrigues, Pedro Pereira and Costa, Vítor Santos and Gama, João and Jorge, Alípio and Soares, Carlos}, month = sep, year = {2015}, pages = {3--18}, }
@incollection{kille_stream-based_2015, series = {Lecture {Notes} in {Computer} {Science}}, title = {Stream-{Based} {Recommendations}: {Online} and {Offline} {Evaluation} as a {Service}}, isbn = {978-3-319-24026-8}, url = {http://link.springer.com/chapter/10.1007/978-3-319-24027-5_48}, urldate = {2015-09-19}, booktitle = {Experimental {IR} {Meets} {Multilinguality}, {Multimodality}, and {Interaction}}, publisher = {Springer International Publishing}, author = {Kille, Benjamin and Lommatzsch, Andreas and Turrin, Roberto and Serény, András and Larson, Martha and Brodt, Torben and Seiler, Jonas and Hopfgartner, Frank}, editor = {Mothe, Josiane and Savoy, Jacques and Kamps, Jaap and Pinel-Sauvagnat, Karen and Jones, Gareth J F and SanJuan, Eric and Cappellato, Linda and Ferro, Nicola}, year = {2015}, pages = {497--517}, }
@phdthesis{ek_recommender_2015, address = {Gothenburg, Sweden}, title = {Recommender {Systems}; {Contextual} {Multi}-{Armed} {Bandit} {Algorithms} for the purpose of targeted advertisement within e-commerce}, url = {http://publications.lib.chalmers.se/records/fulltext/219662/219662.pdf}, school = {Chalmers University of Technology}, author = {Ek, Frederik and Stigsson, Robert}, year = {2015}, }
@article{christou_amore_2015, title = {{AMORE}: design and implementation of a commercial-strength parallel hybrid movie recommendation engine}, issn = {0219-1377}, url = {http://link.springer.com.libproxy.txstate.edu/article/10.1007/s10115-015-0866-z}, doi = {10.1007/s10115-015-0866-z}, urldate = {2015-09-19}, journal = {Knowl. Inf. Syst.}, author = {Christou, Ioannis T and Amolochitis, Emmanouil and Tan, Zheng-Hua}, month = aug, year = {2015}, pages = {1--26}, }
@inproceedings{cao_towards_2015, title = {Towards {Making} {Systems} {Forget} with {Machine} {Unlearning}}, url = {http://www.ieee-security.org/TC/SP2015/papers-archived/6949a463.pdf}, abstract = {Today’s systems produce a wealth of data every day, and the data further generates more data, i.e., the derived data, forming a complex data propagation network, defined as the data’s lineage. There are many reasons for users and administrators to forget certain data including the data’s lineage. From the privacy perspective, a system may leak private information of certain users, and those users unhappy about privacy leaks naturally want to forget their data and its lineage. From the security perspective, an anomaly detection system can be polluted by adversaries through injecting manually crafted data into the training set. Therefore, we envision forgetting systems, capable of completely forgetting certain data and its lineage. In this paper, we focus on making learning systems forget, the process of which is defined as machine unlearning or unlearning. To perform unlearning upon a learning system, we present general unlearning criteria, i.e., converting a learning system or part of it into a summation form of statistical query learning model, and updating all the summations to achieve unlearning. Then, we integrate our unlearning criteria into an unlearning architecture that interacts with all the components of a learning system, such as sample clustering and feature selection. To demonstrate our unlearning criteria and architecture, we select four real-world learning systems, including an item-item recommendation system, an online social network spam filter, and a malware detection system. These systems are first exposed to an adversarial environment, e.g., if the system is potentially vulnerable to training data pollution, we first pollute the training data set and show that the detection rate drops significantly. Then, we apply our unlearning technique upon those affected systems, either polluted or leaking private information. Our results show that after unlearning, the detection rate of a polluted system increases back to the one before pollution, and a system leaking a particular user’s private information completely forgets that information.}, publisher = {IEEE}, author = {Cao, Yinzhi and Yang, Junfeng}, month = may, year = {2015}, }
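The summation-form criterion in this paper is concrete enough to sketch. If a learner's state is a sum of per-example statistics, forgetting a user's data is exact subtraction rather than retraining. A toy illustration using item co-occurrence counts (the raw material of simple item-item recommenders), not the paper's full architecture:

from collections import Counter
from itertools import combinations

class CooccurrenceModel:
    # Item co-occurrence counts kept in summation form, so unlearning is exact.

    def __init__(self):
        self.counts = Counter()

    def learn(self, basket):
        # basket: the items one user consumed; each pair's count is a running sum
        for pair in combinations(sorted(set(basket)), 2):
            self.counts[pair] += 1

    def unlearn(self, basket):
        # Exact unlearning: subtract the same user's contribution from every sum.
        for pair in combinations(sorted(set(basket)), 2):
            self.counts[pair] -= 1
            if self.counts[pair] == 0:
                del self.counts[pair]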
@inproceedings{elkhelifi_recommendation_2015, title = {Recommendation {Systems} {Based} on {Online} {User}'s {Action}}, url = {http://dx.doi.org/10.1109/CIT/IUCC/DASC/PICOM.2015.69}, doi = {10.1109/CIT/IUCC/DASC/PICOM.2015.69}, abstract = {In this paper, we propose a new recommender algorithm based on multi-dimensional user behavior and new measurements. It is used in the framework of our recommender system, which applies knowledge discovery techniques to the problem of making product recommendations during a live user interaction. While most collaborative filtering algorithms are based on users' ratings or on similar items that other users bought, we propose to combine all user actions to predict recommendations. These systems are achieving widespread success in e-tourism nowadays. We evaluate our algorithm on a tourism dataset. Evaluations have shown good results. We compared our algorithm to Slope One and Weighted Slope One, and obtained an improvement of 5\% in precision and recall, and an improvement of 12\% in RMSE and nDCG.}, author = {Elkhelifi, A and Kharrat, F Ben and Faiz, R}, month = oct, year = {2015}, pages = {485--490}, }
@inproceedings{dragovic_exploiting_2015, title = {Exploiting {Reviews} to {Guide} {Users}’ {Selections}}, url = {http://ceur-ws.org/Vol-1441/recsys2015_poster7.pdf}, urldate = {2017-03-01}, author = {Dragovic, Nevena and Pera, Maria Soledad}, year = {2015}, }
@inproceedings{larrain_towards_2015, title = {Towards {Improving} {Top}-{N} {Recommendation} by {Generalization} of {SLIM}}, url = {http://ceur-ws.org/Vol-1441/recsys2015_poster22.pdf}, author = {Larraín, Santiago and Parra, Denis and Soto, Alvaro}, year = {2015}, }
@phdthesis{dhondt_hybrid_2015, address = {Gent, Belgium}, title = {A hybrid group recommender system for travel destinations}, school = {University of Gent}, author = {Dhondt, Jeroen}, month = may, year = {2015}, }
@inproceedings{ekstrand_user_2014, address = {New York, NY, USA}, title = {User perception of differences in movie recommendation algorithms}, url = {http://dx.doi.org/10.1145/2645710.2645737}, doi = {10.1145/2645710.2645737}, booktitle = {Proceedings of the {Eighth} {ACM} {Conference} on {Recommender} {Systems}}, publisher = {ACM}, author = {Ekstrand, Michael D and Harper, F Maxwell and Willemsen, Martijn C and Konstan, Joseph A}, month = oct, year = {2014}, note = {Journal Abbreviation: RecSys '14}, pages = {161--168}, }
@inproceedings{said_comparative_2014, address = {New York, NY, USA}, title = {Comparative {Recommender} {System} {Evaluation}: {Benchmarking} {Recommendation} {Frameworks}}, url = {http://dx.doi.org/10.1145/2645710.2645746}, doi = {10.1145/2645710.2645746}, abstract = {Recommender systems research is often based on comparisons of predictive accuracy: the better the evaluation scores, the better the recommender. However, it is difficult to compare results from different recommender systems due to the many options in design and implementation of an evaluation strategy. Additionally, algorithmic implementations can diverge from the standard formulation due to manual tuning and modifications that work better in some situations. In this work we compare common recommendation algorithms as implemented in three popular recommendation frameworks. To provide a fair comparison, we have complete control of the evaluation dimensions being benchmarked: dataset, data splitting, evaluation strategies, and metrics. We also include results using the internal evaluation mechanisms of these frameworks. Our analysis points to large differences in recommendation accuracy across frameworks and strategies, i.e. the same baselines may perform orders of magnitude better or worse across frameworks. Our results show the necessity of clear guidelines when reporting evaluation of recommender systems to ensure reproducibility and comparison of results.}, urldate = {2017-02-03}, booktitle = {{RecSys} '14}, publisher = {ACM Press}, author = {Said, Alan and Bellogin, Alejandro}, month = oct, year = {2014}, note = {Journal Abbreviation: RecSys '14}, keywords = {toolkit}, pages = {129--136}, }
@phdthesis{ekstrand_towards_2014, address = {Minneapolis, MN}, title = {Towards {Recommender} {Engineering}: {Tools} and {Experiments} in {Recommender} {Differences}}, url = {http://hdl.handle.net/11299/165307}, abstract = {Since the introduction of their modern form 20 years ago, recommender systems have proven a valuable tool for helping users manage information overload. Two decades of research have produced many algorithms for computing recommendations, mechanisms for evaluating their effectiveness, and user interfaces and experiences to embody them. It has also been found that the outputs of different recommendation algorithms differ in user-perceptible ways that affect their suitability to different tasks and information needs. However, there has been little work to systematically map out the space of algorithms and the characteristics they exhibit that make them more or less effective in different applications. As a result, developers of recommender systems must experiment, conducting basic science on each application and its users to determine the approach(es) that will meet their needs. This thesis presents our work towards recommender engineering: the design of recommender systems from well-understood principles of user needs, domain properties, and algorithm behaviors. This will reduce the experimentation required for each new recommender application, allowing developers to design recommender systems that are likely to be effective for their particular application. To that end, we make four contributions: the LensKit toolkit for conducting experiments on a wide variety of recommender algorithms and data sets under different experimental conditions (offline experiments with diverse metrics, online user studies, and the ability to grow to support additional methodologies), along with new developments in object-oriented software configuration to support this toolkit; experiments on the configuration options of widely-used algorithms to provide guidance on tuning and configuring them; an offline experiment on the differences in the errors made by different algorithms; and a user study on the user-perceptible differences between lists of movie recommendations produced by three common recommender algorithms. Much research is needed to fully realize the vision of recommender engineering in the coming years; it is our hope that LensKit will prove a valuable foundation for much of this work, and our experiments represent a small piece of the kinds of studies that must be carried out, replicated, and validated to enable recommender systems to be engineered.}, school = {University of Minnesota}, author = {Ekstrand, Michael D}, collaborator = {Konstan, Joseph A}, month = jul, year = {2014}, }
@inproceedings{konstan_teaching_2014, address = {New York, NY, USA}, title = {Teaching {Recommender} {Systems} at {Large} {Scale}: {Evaluation} and {Lessons} {Learned} from a {Hybrid} {MOOC}}, url = {http://doi.acm.org/10.1145/2556325.2566244}, doi = {10.1145/2556325.2566244}, abstract = {In Fall 2013 we offered an open online Introduction to Recommender Systems through Coursera, while simultaneously offering a for-credit version of the course on-campus using the Coursera platform and a flipped classroom instruction model. As the goal of offering this course was to experiment with this type of instruction, we performed extensive evaluation including surveys of demographics, self-assessed skills, and learning intent; we also designed a knowledge-assessment tool specifically for the subject matter in this course, administering it before and after the course to measure learning. We also tracked students through the course, including separating out students enrolled for credit from those enrolled only for the free, open course. This article reports on our findings.}, urldate = {2014-03-19}, booktitle = {L@{S} '14}, publisher = {ACM}, author = {Konstan, Joseph A and Walker, J D and Brooks, D Christopher and Brown, Keith and Ekstrand, Michael D}, month = mar, year = {2014}, note = {Journal Abbreviation: L@S '14}, pages = {61--70}, }
@inproceedings{kluver_evaluating_2014, title = {Evaluating {Recommender} {Behavior} for {New} {Users}}, url = {http://dx.doi.org/10.1145/2645710.2645742}, doi = {10.1145/2645710.2645742}, booktitle = {{RecSys} '14}, publisher = {ACM}, author = {Kluver, Daniel and Konstan, Joseph A}, month = oct, year = {2014}, }
@article{de_nart_personalized_2014, title = {A {Personalized} {Concept}-driven {Recommender} {System} for {Scientific} {Libraries}}, volume = {38}, issn = {1877-0509}, url = {http://www.sciencedirect.com/science/article/pii/S1877050914013751}, doi = {10.1016/j.procs.2014.10.015}, abstract = {Recommender Systems can greatly enhance the exploitation of large digital libraries; however, in order to achieve good accuracy with collaborative recommenders some domain assumptions must be met, such as having a large number of users sharing similar interests over time. Such assumptions may not hold in digital libraries, where users are structured in relatively small groups of experts whose interests may change in unpredictable ways: this is the case of scientific and technical documents archives. Moreover, when recommending documents, users often expect insights on the recommended content as well as a detailed explanation of why the system has selected it, which cannot be provided by collaborative techniques. In this paper we consider the domain of scientific publications repositories and propose a content-based recommender based upon a graph representation of concepts built up by linked keyphrases. This recommender is coupled with a keyphrase extraction system able to generate meaningful metadata for the documents, which are the basis for providing helpful and explainable recommendations.}, urldate = {2015-09-23}, journal = {Procedia Comput. Sci.}, author = {De Nart, D and Tasso, C}, year = {2014}, pages = {84--91}, }
@inproceedings{nguyen_improving_2014, address = {New York, NY, USA}, title = {Improving {Recommender} {Systems}: {User} {Roles} and {Lifecycles}}, url = {http://doi.acm.org/10.1145/2645710.2653363}, doi = {10.1145/2645710.2653363}, abstract = {In the era of big data, it is usually agreed that the more data we have, the better results we can get. However, for some domains that heavily depend on user inputs (such as recommender systems), the performance evaluation metrics are sensitive to the amount of noise introduced by users. Such noise can be from users who only wanted to explore the systems, and thus did not spend efforts to provide accurate inputs. Noise can also be introduced by the methods of collecting user ratings. In my dissertation, I study how user data can affect prediction accuracies and performances of recommendation algorithms. To that end, I investigate how the data collection methods and the life cycles of users affect the prediction accuracies and the performance of recommendation algorithms.}, urldate = {2015-09-23}, booktitle = {{RecSys} '14}, publisher = {ACM}, author = {Nguyen, Tien T}, year = {2014}, note = {Journal Abbreviation: RecSys '14}, pages = {417--420}, }
@inproceedings{zhao_privacy-aware_2014, address = {ICST, Brussels, Belgium, Belgium}, title = {Privacy-aware {Location} {Privacy} {Preference} {Recommendations}}, url = {http://dx.doi.org/10.4108/icst.mobiquitous.2014.258017}, doi = {10.4108/icst.mobiquitous.2014.258017}, abstract = {Location-Based Services have become increasingly popular due to the prevalence of smart devices and location-sharing applications such as Facebook and Foursquare. The protection of people's sensitive location data in such applications is an important requirement. Conventional location privacy protection methods, however, such as manually defining privacy rules or asking users to make decisions each time they enter a new location may be overly complex, intrusive or unwieldy. An alternative is to use machine learning to predict people's privacy preferences and automatically configure settings. Model-based machine learning classifiers may be too computationally complex to be used in real-world applications, or suffer from poor performance when training data are insufficient. In this paper we propose a location-privacy recommender that can provide people with recommendations of appropriate location privacy settings through user-user collaborative filtering. Using a real-world location-sharing dataset, we show that the prediction accuracy of our scheme (73.08\%) is similar to the best performance of model-based classifiers (75.30\%), and at the same time causes fewer privacy leaks (11.75\% vs 12.70\%). Our scheme further outperforms model-based classifiers when there are insufficient training data. Since privacy preferences are innately private, we make our recommender privacy-aware by obfuscating people's preferences. Our results show that obfuscation leads to a minimal loss of prediction accuracy (0.76\%).}, urldate = {2015-09-23}, booktitle = {{MOBIQUITOUS} '14}, publisher = {ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering)}, author = {Zhao, Yuchen and Ye, Juan and Henderson, Tristan}, year = {2014}, note = {Journal Abbreviation: MOBIQUITOUS '14}, pages = {120--129}, }
@article{amolochitis_implementing_2014, title = {Implementing a {Commercial}-{Strength} {Parallel} {Hybrid} {Movie} {Recommendation} {Engine}}, volume = {29}, issn = {1541-1672}, url = {http://dx.doi.org/10.1109/MIS.2014.23}, doi = {10.1109/MIS.2014.23}, abstract = {AMORE is a hybrid recommendation system that provides movie recommendations for a major triple-play services provider in Greece. Combined with our own implementations of several user-, item-, and content-based recommendation algorithms, AMORE significantly outperforms other state-of-the-art implementations both in solution quality and response time. AMORE currently serves daily recommendation requests for all active subscribers of the provider's video-on-demand services and has contributed to an increase of rental profits and customer retention.}, number = {2}, journal = {IEEE Intell. Syst.}, author = {Amolochitis, E and Christou, I T and Tan, Zheng-Hua}, month = mar, year = {2014}, pages = {92--96}, }
@inproceedings{nguyen_rating_2013, address = {New York, NY, USA}, title = {Rating {Support} {Interfaces} to {Improve} {User} {Experience} and {Recommender} {Accuracy}}, url = {http://doi.acm.org/10.1145/2507157.2507188}, doi = {10.1145/2507157.2507188}, abstract = {One of the challenges for recommender systems is that users struggle to accurately map their internal preferences to external measures of quality such as ratings. We study two methods for supporting the mapping process: (i) reminding the user of characteristics of items by providing personalized tags and (ii) relating rating decisions to prior rating decisions using exemplars. In our study, we introduce interfaces that provide these methods of support. We also present a set of methodologies to evaluate the efficacy of the new interfaces via a user experiment. Our results suggest that presenting exemplars during the rating process helps users rate more consistently, and increases the quality of the data.}, urldate = {2014-04-28}, booktitle = {{RecSys} '13}, publisher = {ACM}, author = {Nguyen, Tien T and Kluver, Daniel and Wang, Ting-Yu and Hui, Pik-Mai and Ekstrand, Michael D and Willemsen, Martijn C and Riedl, John}, year = {2013}, note = {Journal Abbreviation: RecSys '13}, pages = {149--156}, }
@article{benjamin_heitmann_technical_2013, title = {Technical {Report} on evaluation of recommendations generated by spreading activation}, url = {http://www.researchgate.net/publication/237020679_Technical_Report_on_evaluation_of_recommendations_generated_by_spreading_activation}, author = {Heitmann, Benjamin and Hayes, Conor}, year = {2013}, }
@inproceedings{ekstrand_when_2012, address = {New York, NY, USA}, title = {When recommenders fail: predicting recommender failure for algorithm selection and combination}, url = {http://doi.acm.org/10.1145/2365952.2366002}, doi = {10.1145/2365952.2366002}, abstract = {Hybrid recommender systems --- systems using multiple algorithms together to improve recommendation quality --- have been well-known for many years and have shown good performance in recent demonstrations such as the Netflix Prize. Modern hybridization techniques, such as feature-weighted linear stacking, take advantage of the hypothesis that the relative performance of recommenders varies by circumstance and attempt to optimize each item score to maximize the strengths of the component recommenders. Less attention, however, has been paid to understanding what these strengths and failure modes are. Understanding what causes particular recommenders to fail will facilitate better selection of the component recommenders for future hybrid systems and a better understanding of how individual recommender personalities can be harnessed to improve the recommender user experience. We present an analysis of the predictions made by several well-known recommender algorithms on the MovieLens 10M data set, showing that for many cases in which one algorithm fails, there is another that will correctly predict the rating.}, urldate = {2012-12-13}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Ekstrand, Michael D and Riedl, John T}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {233--236}, }
@inproceedings{kluver_how_2012, address = {New York, NY, USA}, title = {How many bits per rating?}, url = {http://doi.acm.org/10.1145/2365952.2365974}, doi = {10.1145/2365952.2365974}, abstract = {Most recommender systems assume user ratings accurately represent user preferences. However, prior research shows that user ratings are imperfect and noisy. Moreover, this noise limits the measurable predictive power of any recommender system. We propose an information theoretic framework for quantifying the preference information contained in ratings and predictions. We computationally explore the properties of our model and apply our framework to estimate the efficiency of different rating scales for real world datasets. We then estimate how the amount of information predictions give to users is related to the scale ratings are collected on. Our findings suggest a tradeoff in rating scale granularity: while previous research indicates that coarse scales (such as thumbs up / thumbs down) take less time, we find that ratings with these scales provide less predictive value to users. We introduce a new measure, preference bits per second, to quantitatively reconcile this tradeoff.}, urldate = {2013-09-12}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Kluver, Daniel and Nguyen, Tien T and Ekstrand, Michael and Sen, Shilad and Riedl, John}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {99--106}, }
@inproceedings{schelter_scalable_2012, address = {New York, NY, USA}, title = {Scalable {Similarity}-based {Neighborhood} {Methods} with {MapReduce}}, url = {http://doi.acm.org/10.1145/2365952.2365984}, doi = {10.1145/2365952.2365984}, abstract = {Similarity-based neighborhood methods, a simple and popular approach to collaborative filtering, infer their predictions by finding users with similar taste or items that have been similarly rated. If the number of users grows to millions, the standard approach of sequentially examining each item and looking at all interacting users does not scale. To solve this problem, we develop a MapReduce algorithm for the pairwise item comparison and top-N recommendation problem that scales linearly with respect to a growing number of users. This parallel algorithm is able to work on partitioned data and is general in that it supports a wide range of similarity measures. We evaluate our algorithm on a large dataset consisting of 700 million song ratings from Yahoo! Music.}, urldate = {2015-09-23}, booktitle = {{RecSys} '12}, publisher = {ACM}, author = {Schelter, Sebastian and Boden, Christoph and Markl, Volker}, year = {2012}, note = {Journal Abbreviation: RecSys '12}, pages = {163--170}, }
@article{guimera_predicting_2012, title = {Predicting {Human} {Preferences} {Using} the {Block} {Structure} of {Complex} {Social} {Networks}}, volume = {7}, url = {http://dx.doi.org/10.1371/journal.pone.0044620}, doi = {10.1371/journal.pone.0044620}, abstract = {With ever-increasing available data, predicting individuals' preferences and helping them locate the most relevant information has become a pressing need. Understanding and predicting preferences is also important from a fundamental point of view, as part of what has been called a “new” computational social science. Here, we propose a novel approach based on stochastic block models, which have been developed by sociologists as plausible models of complex networks of social interactions. Our model is in the spirit of predicting individuals' preferences based on the preferences of others but, rather than fitting a particular model, we rely on a Bayesian approach that samples over the ensemble of all possible models. We show that our approach is considerably more accurate than leading recommender algorithms, with major relative improvements between 38\% and 99\% over industry-level algorithms. Besides, our approach sheds light on decision-making processes by identifying groups of individuals that have consistently similar preferences, and enabling the analysis of the characteristics of those groups.}, number = {9}, urldate = {2014-10-04}, journal = {PLoS One}, author = {Guimerà, Roger and Llorente, Alejandro and Moro, Esteban and Sales-Pardo, Marta}, month = sep, year = {2012}, pages = {e44620}, }