2025 Releases#

2025 brought breaking changes across the LensKit APIs to improve ergonomics, correctness-by-default, and flexibility. The 2025 series also adopts SPEC0, a standard for supported versions of scientific Python libraries, and changes the LensKit version number scheme to “SemCalVer”. See Migrating from LensKit 0.x for information on how to upgrade your code.

2025.7.0#

Add EASE (EASEScorer) (🐞 942, ⛙ 1011, backported in ⛙ 1015).
Fix hyperparameter tuning for non-iterative scorers, and simplify configuration (⛙ 1014, backported in ⛙ 1015).
Improve performance of co_occurrences() (🐞 970, ⛙ 1008, ⛙ 1010).

2025.6.3#

Put LightGCN in eval mode after training (⛙ 975)
Drop x86 wheels from build and release

2025.6.2#

Fix 🐞 956: division by zero in NDCG (⛙ 957).

2025.6.1#

Fix a small bug in lenskit.data.ItemListCollection.from_df() that duplicated user or session ID columns (⛙ 952).
Fixed ItemList’s __getitem__ to work correctly with negative slice bounds.
Updated common pipelines to use the new TrainingItemCandidateSelector instead of the deprecated UnratedTrainingItemsCandidateSelector. The behavior of these pipelines in the standard user-based recommendation configuration is unchanged, but they will now transparently work with other sources of query items (⛙ 953).
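For readers unfamiliar with how negative slice bounds are usually resolved in a custom container, here is a minimal sketch. `TinyList` is a hypothetical stand-in, not LensKit's `ItemList`, and the notes above do not describe how the actual fix was implemented; the sketch only shows the standard `slice.indices()` approach.

```python
class TinyList:
    """Hypothetical list-like container (NOT LensKit's ItemList)."""

    def __init__(self, ids):
        self._ids = list(ids)

    def __len__(self):
        return len(self._ids)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # slice.indices() resolves negative and omitted bounds against
            # the container length, following built-in sequence rules.
            start, stop, step = key.indices(len(self._ids))
            return TinyList(self._ids[i] for i in range(start, stop, step))
        return self._ids[key]

    def ids(self):
        return list(self._ids)


items = TinyList([10, 20, 30, 40, 50])
last_two = items[-2:].ids()
without_last = items[:-1].ids()
middle = items[-4:-1].ids()
print(last_two, without_last, middle)
```

Delegating to `slice.indices()` keeps the container consistent with Python lists without hand-written bound arithmetic.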

2025.6.0#

Bug Fixes#

Disabled parallel scoring in ItemKNNScorer to fix problems with nondeterministic results (⛙ 922).

New Features#

Add new concat() and update() methods to ItemList to make it easier to extend and update item lists (⛙ 927).
Initial support for session-based recommendation, through additional methods on RecQuery (⛙ 929). Item KNN and the ALS recommenders now support session-based recommendation as well. These changes also add support for related-item recommendation through “context items” in the query. See session-queries for further information.
Add TrainingItemCandidateSelector to replace the Unrated and All candidate selectors with better configuration.
New function rank_biased_precision() to expose the inner computations of RBP.
Add new method co_occurrences() to count entity co-occurrences in a dataset.
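To illustrate what counting co-occurrences means, the following sketch counts how often pairs of items appear together in the same group (for example, the same user history or session). It is a plain-Python illustration only: `count_co_occurrences` and its input layout are assumptions made for this example, and it does not reflect the signature or implementation of LensKit's `co_occurrences()`.

```python
from collections import Counter
from itertools import combinations


def count_co_occurrences(groups):
    """Count how often each unordered pair of items shares a group.

    Hypothetical helper for illustration; not the LensKit API.
    """
    counts = Counter()
    for items in groups:
        # De-duplicate and sort so each pair is counted once per group
        # and is always stored in the same (a, b) order.
        unique = sorted(set(items))
        counts.update(combinations(unique, 2))
    return counts


histories = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c", "c"],
]
pair_counts = count_co_occurrences(histories)
print(pair_counts[("a", "b")], pair_counts[("b", "c")], pair_counts[("a", "c")])
```

This pairwise loop is quadratic in group size; an implementation for large datasets would typically use sparse matrix products instead.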

Deprecations#

UnratedTrainingItemsCandidateSelector is deprecated in favor of TrainingItemCandidateSelector with default configuration.
AllTrainingItemsCandidateSelector is deprecated in favor of TrainingItemCandidateSelector with suitable configuration.
lenskit.data.RecQuery.user_items is deprecated in favor of history_items.
Removed fixed defaults for row_entity and col_entity on matrix(), now defaulting to the first and last entities involved in the relationship. This does not change anything for code working on standard interactions.

Platform and Build Support#

Add and test support for Python 3.14.
Adjust dependencies to allow PyArrow 22 (but not 21, due to miscompiled Windows wheels).
Various improvements to the test suite to reduce testing overhead.

2025.5.0#

Documentation updates.
Rename rank metric k arguments to n (with k as a deprecated alias).
Rename MetricAccumulator to MeasurementCollector, to reserve the “accumulator” name for other accumulation purposes.
Add test coverage measurements for Rust accelerator code.
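The k-to-n rename above follows a common deprecated-alias pattern. The sketch below shows one way such an alias can be kept working while warning callers; `precision_at` is a hypothetical function written for this example and is not part of LensKit, whose metric classes may handle the alias differently.

```python
import warnings


def precision_at(ranked, relevant, n=None, k=None):
    """Hypothetical metric: fraction of the top-n items that are relevant."""
    if k is not None:
        # Accept the old name, but tell callers to migrate.
        warnings.warn("k is deprecated; use n instead", DeprecationWarning, stacklevel=2)
        if n is None:
            n = k
    top = ranked if n is None else ranked[:n]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_style = precision_at(["a", "b", "c", "d"], {"a", "c"}, n=2)
    old_style = precision_at(["a", "b", "c", "d"], {"a", "c"}, k=2)

print(new_style, old_style, len(caught))
```

Both calls return the same value; only the call using `k` emits a `DeprecationWarning`.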

2025.4.0#

Some small improvements. See milestone 2025.4 for details.

2025.3.2#

This is a small bug fix and minor improvement release.

Improve __repr__ for Dataset and ItemList.
Fix rate limit in Jupyter progress bar (⛙ 876).
Modify the scipy() and torch() methods of lenskit.data.MatrixRelationshipSet to filter out missing values of the selected attribute.
Fixed EmbeddingSizeMixin’s support for embedding_size_exp to correctly override embedding_size.
Added LIP and RBO functions.
Bug fixes and refactors for FA*IR reranker.
FlexMF: Fall back to uncompiled models when Dynamo fails (⛙ 883).
Internal refactors of some testing code.
Update repr for Dataset and ItemList (⛙ 877).
Exclude missing attribute values from SciPy and PyTorch sparse matrix representations of matrix relationship attributes (⛙ 884). This fixes breakage in Biased MF and Bias in the face of incomplete attribute data.

2025.3.1#

LensKit 2025.3.1 is a significant feature and bugfix upgrade to the LensKit 2025 series. It maintains backwards compatibility with 2025.2 for stable interfaces (see Stability Levels), except as noted below. See the GitHub milestone for a full list of changes in this release; key changes are noted below.

Highlights#

Added Rust acceleration for several data processing operations and kNN models.
Added a new, experimental PyTorch matrix factorization model (see Flexible Matrix Factorization).
lenskit.basic now exports the configuration classes for basic algorithms (⛙ 672).
The pipeline runner now supports Pipeline Hooks to inspect or modify pipeline operations.
The pipeline runner type checking logic has been refactored and simplified. As a consequence, when None is provided to a component input that does not accept None, the runner now raises TypeError instead of PipelineError, as this is a type error. Details on a type error’s input wiring are now provided as a note on the exception, instead of in the main exception message.
Pipelines can now be specified in configuration files, and used from the command line.
Datasets can now have repeated or duplicate interactions (although such data sets are typically slow, see 🐞 869).
Experimental and mostly undocumented hyperparameter tuning support.
Added a configuration facility in lenskit.config to configure random numbers, power measurement, etc. Scripts using LensKit should call lenskit.configure().
Added support for querying power consumption from Prometheus. Documentation on how to set this up to be useful is TBD.

Compatibility Changes#

Important

These changes should not break most programs, but do introduce and document stricter requirements on certain names.

Pipeline component input names are not allowed to be prefixed with _, as such names are reserved for LensKit internal operation. This is not yet enforced, but will be enforced beginning in LensKit 2026.
Entity and attribute names are not allowed to be prefixed with _, and attribute names cannot end with _id or _num.
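If you want to check your own dataset schema against the entity and attribute naming rules above before upgrading, a small validator along these lines may help. `check_names` is a hypothetical helper written for this example; LensKit's own validation may differ in wording and in when it is applied.

```python
def check_names(entity=None, attribute=None):
    """Return a list of rule violations (hypothetical helper, not LensKit API)."""
    problems = []
    for kind, name in (("entity", entity), ("attribute", attribute)):
        # Names starting with "_" are reserved.
        if name is not None and name.startswith("_"):
            problems.append(f"{kind} name {name!r} must not start with '_'")
    # Attribute names additionally must not end with _id or _num.
    if attribute is not None and attribute.endswith(("_id", "_num")):
        problems.append(
            f"attribute name {attribute!r} must not end with '_id' or '_num'"
        )
    return problems


print(check_names(entity="item", attribute="genre"))
print(check_names(attribute="genre_id"))
print(check_names(entity="_tmp"))
```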

Component Changes#

ItemKNNScorer and UserKNNScorer are rewritten to use Rust acceleration, along with changes to their internal data representation to use Arrow instead of SciPy. This also fixes a segfault with very large similarity matrices.

Note

The model parameters of the KNN scorers have changed. They are no longer suffixed with _, and the similarity matrix is a PyArrow list array. Code that was directly examining internal elements will need to change.
The lenskit.als scorers have been similarly refactored, and had their learned parameters renamed for better consistency.
Replaced the broken SoftmaxRanker with a proper stochastic sampler (⛙ 667, StochasticTopNRanker). The old ranker will be removed in LensKit 2026.
Added lenskit.training.UsesTrainer for more sophisticated iterative training support.
Added lenskit.data.ItemList.top_n() to get the top-N values of an item list efficiently.
lenskit.data.Vocabulary is now backed by a Rust hashtable instead of a Pandas Index. An index view is still available.

Data Handling#

Added versioning to the native data format, documented data format compatibility, and added compatibility tests.
Added compressed sparse row extension types for Arrow, and use them in the LensKit native format (as well as Python/Rust data interchange) to more reliably handle CSR matrix data in Arrow (previously, we had to carry the matrix width or row dimension in side information; it is now embedded into the Arrow metadata).
Fix MovieLens import to detect movies without genres (🐞 727, ⛙ 738).
Parallel Processing now supports comma-separated lists for configuring parallelism within worker processes, and LK_NUM_CHILD_THREADS is now deprecated.
Added importers for UCSD Amazon data sets.

Evaluation#

Reworked the design of the Metric interface, along with metric accumulation for run measurement, to facilitate more types of metrics and more flexible use of the evaluation facilities. More breaking changes will come in LensKit 2026.

CLI#

Added several new capabilities to the LensKit CLI.

Other Changes#

sample_negatives() now accepts "popular" as an alias for "popularity".
Several bug fixes for logging in niche setups (including ray clusters) (⛙ 673).

2025.2.0#

LensKit 2025.2.0 was released March 12, 2025.

Some small quality-of-life improvements (and removing invalid API compat).

Add lenskit.pipeline.PipelineCache to share components between pipelines (⛙ 605).
Only warn once for users without test data in bulk analysis (⛙ 664, 🐞 663).
Allow a Pandas data frame to be passed as the test data to the batch recommender (⛙ 660).

Note

This removes extra keyword arguments from the convenience batch.recommend, etc. functions that were leftovers from LensKit 0.14 and no longer did anything.
Support auto-detecting keys in lenskit.data.ItemListCollection.from_df() (⛙ 659).

2025.1.1#

LensKit 2025.1.1 was released March 7, 2025.

The changes in this release are too numerous and fundamental to fully document in traditional release notes. See the following for release update documentation:

Migrating from LensKit 0.x for conceptual changes and how to upgrade your code.
The notes below for behavior changes (e.g. new defaults, new metric capabilities), and small bits not covered in the migration guide.
The full changelog in the Git history and issue/PR milestone.

Breaking Changes#

LensKit 2025 has many breaking changes, with the migration guide (Migrating from LensKit 0.x) documenting the major ones. Below are some smaller ones not covered by that document:

Where Pandas data frames are still used, the standard user and item columns have been renamed to user_id and item_id respectively, with user_num and item_num for 0-based user and item numbers. This is to remove ambiguity about how users and items are being referenced.
The Popular recommender has been removed in favor of PopScore.
The DCG metric has been removed, as it is basically never used and was not useful as a part of the NDCG implementation.

New Features (incremental)#

Many LensKit components (batch running, model training, etc.) now report progress through the progress API in lenskit.logging.progress, and can be connected to Jupyter or Rich.
Added RBP top-N metric (⛙ 334).
Added command-line tool to fetch datasets (⛙ 347).
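For reference, rank-biased precision in its standard form weights the relevance of the item at rank i (counting from 1) by p^(i-1), where p is the patience parameter, and scales the sum by (1 - p). The sketch below implements that textbook definition with binary relevance. `rbp_sketch` is a name chosen for this example; LensKit's metric may add options such as normalization that are not shown here.

```python
def rbp_sketch(relevance, patience=0.85):
    """Textbook rank-biased precision over 0/1 relevance flags.

    Illustration only; not LensKit's implementation.
    """
    # enumerate() starts at 0, which matches the exponent (rank - 1).
    return (1 - patience) * sum(
        rel * patience**pos for pos, rel in enumerate(relevance)
    )


# Relevant items at ranks 1 and 3, patience 0.5:
# (1 - 0.5) * (1 + 0 + 0.25) = 0.625
score = rbp_sketch([1, 0, 1], patience=0.5)
print(score)
```

Lower patience concentrates the weight on the first few ranks; higher patience spreads it deeper into the list.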

Metric Behavior Changes#

Important

Some LensKit metric defaults have changed; this results in values different from those computed by previous versions, either more correct or more consistent with common practice.

The NDCG metric now defaults to ignore rating values.
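To see why this default matters, the sketch below computes NDCG for the same ranking twice: once with binary gains (rating values ignored) and once with ratings used as gains. The `ndcg_sketch` function and its `use_ratings` flag are hypothetical and exist only for this comparison; they do not mirror the options of LensKit's NDCG metric.

```python
import math


def dcg(gains):
    # Standard log2 discount: the item at 0-based position r is divided by log2(r + 2).
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains))


def ndcg_sketch(ranked, ratings, use_ratings=False):
    """NDCG where `ratings` maps each relevant test item to its rating."""

    def gain(item):
        if item not in ratings:
            return 0.0
        return float(ratings[item]) if use_ratings else 1.0

    actual = dcg([gain(i) for i in ranked])
    ideal_gains = sorted((gain(i) for i in ratings), reverse=True)[: len(ranked)]
    ideal = dcg(ideal_gains)
    return actual / ideal if ideal > 0 else 0.0


test_ratings = {"a": 5, "b": 1}
ranking = ["b", "a"]  # the low-rated item is ranked first

binary = ndcg_sketch(ranking, test_ratings)
graded = ndcg_sketch(ranking, test_ratings, use_ratings=True)
print(binary, graded)
```

With binary gains the ranking is perfect, since both test items are retrieved; with rating gains it is penalized for placing the 1-rated item above the 5-rated one. Scores from older versions are therefore not directly comparable unless the gain setting is matched.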

Model Behavior Changes#

Most models will exhibit some changes, hopefully mostly in performance, due to moving to PyTorch. There are some deliberate behavior changes in this new version, however, documented here.

ALS models only use Cholesky decomposition (previously selected with the erroneously-named method="lu" option); conjugate gradient and coordinate descent are no longer available. Cholesky decomposition is faster on PyTorch than it was with Numba, and is easier to maintain.
The default minimum similarity for UserUser is now \(10^{-6}\).
k-NN algorithms no longer support negative similarities; min_sim is clamped to be at least the smallest normal in 32-bit floating point (\(1.175 \times 10^{-38}\)).
The implicit bridge algorithms no longer look at rating values when they are present.
Bias is no longer optional for BiasedMFScorer and FunkSVD; both are inherently biased models, and FunkSVD is not commonly used.
lenskit.hpf.HPF no longer uses ratings as synthetic counts by default.
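The min_sim clamp for k-NN algorithms noted above can be pictured as follows. The smallest positive normal 32-bit float is 2^-126, approximately 1.175e-38; the sketch derives it from its bit pattern using only the standard library. `clamp_min_sim` is a hypothetical helper for illustration, not LensKit's code.

```python
import struct

# Smallest positive normal float32: bit pattern 0x00800000, i.e. 2**-126.
FLOAT32_TINY = struct.unpack("<f", struct.pack("<I", 0x00800000))[0]


def clamp_min_sim(min_sim):
    """Raise min_sim to at least the smallest normal float32 (hypothetical helper)."""
    return max(min_sim, FLOAT32_TINY)


print(FLOAT32_TINY)
print(clamp_min_sim(-0.5) == FLOAT32_TINY)
print(clamp_min_sim(1e-6))
```

In practice this means a negative or zero min_sim setting behaves like a very small positive threshold, so neighbors with non-positive similarity are never used.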

Bug Fixes#

Fixed bug in NDCG list truncation (🐞 309, ⛙ 312).
Corrected documentation errors for recall() and hit() (⛙ 369 by @lukas-wegmeth).

Dependencies and Maintenance#

Bumped minimum supported dependencies as per SPEC0 (Python 3.11, NumPy 1.24, Pandas 2.0, SciPy 1.10).
Added support for Pandas 2 (⛙ 364) and Python 3.12.
Improved Apple testing to include vanilla Python and Apple Silicon (⛙ 366).
Updated build environment, dependency setup, task running, and CI to be more consistent and maintainable.
Removed legacy random code and SeedBank usage in favor of SPEC 7 (see Random Seeds).
Code is now auto-formatted with Ruff.
