lenskit.testing#

LensKit test harnesses and utilities.

This package contains utility code for testing LensKit and its components, including in derived packages. It relies on pytest and Hypothesis.

Attributes#

ml_100k_zip

ml_test_dir

Classes#

BasicComponentTests

ScorerTests

Common tests for scorer components. Many of these just test that the component runs, not that it produces correct output.

DemoRecs

Functions#

coo_arrays([shape, dtype, elements])

scored_lists(*[, n, scores])

Hypothesis generator that produces scored lists.

sparse_arrays(*[, layout])

sparse_tensors(*[, layout])

demo_recs()

A demo set of train, test, and recommendation data.

ml_20m()

ml_100k()

Fixture to load the MovieLens 100K dataset (currently as a data frame). It skips the test if the ML100K data is not available.

ml_ds(ml_ds_unchecked)

Fixture to load the MovieLens test dataset. To use this, just include it as a parameter in your test.

ml_ds_unchecked()

Fixture to load the MovieLens dataset, without checking for modifications.

ml_ratings()

Fixture to load the test MovieLens ratings as a data frame. To use this, just include it as a parameter in your test.

pop_recs()

A demo set of train, test, and recommendation data from the most-popular recommender.

msweb()

set_env_var(var, val)

Set an environment variable & restore it.

Package Contents#

lenskit.testing.coo_arrays(shape=None, dtype=nph.floating_dtypes(endianness='=', sizes=[32, 64]), elements=st.floats(-1000000.0, 1000000.0, allow_nan=False, allow_infinity=False, width=32))#
Return type:

scipy.sparse.coo_array

lenskit.testing.scored_lists(*, n=st.integers(0, 1000), scores=None)#

Hypothesis generator that produces scored lists.

Parameters:
  • n (int | tuple[int, int] | hypothesis.strategies.SearchStrategy[int])

  • scores (hypothesis.strategies.SearchStrategy[float] | Literal['gaussian'] | None)

Return type:

lenskit.data.ItemList

lenskit.testing.sparse_arrays(*, layout='csr', **kwargs)#
lenskit.testing.sparse_tensors(*, layout='csr', **kwargs)#
class lenskit.testing.BasicComponentTests#
component: type[lenskit.pipeline.Component]#
configs = []#
test_instantiate_default()#
test_default_config_vars()#
test_default_config_round_trip()#
test_config_round_trip()#
class lenskit.testing.ScorerTests#

Bases: TrainingTests

Common tests for scorer components. Many of these just test that the component runs, not that it produces correct output.

component: ClassVar[type[lenskit.pipeline.Component]]#
can_score: ClassVar[Literal['some', 'known', 'all']] = 'known'#

What can this scorer score?

expected_rmse: ClassVar[float | tuple[float, float] | object | None] = None#

Asserts that the RMSE is either less than the provided expected value, or falls between the two values when a tuple is given.

expected_ndcg: ClassVar[float | tuple[float, float] | object | None] = None#

Asserts that the nDCG is either greater than the provided expected value, or falls between the two values when a tuple is given.

invoke_scorer(pipe, **kwargs)#
Parameters:

pipe (lenskit.pipeline.Pipeline)

Return type:

lenskit.data.ItemList

verify_models_equivalent(orig, copy)#

Verify that two models are equivalent.

test_score_known(rng, ml_ds, trained_pipeline)#
test_pickle_roundrip(rng, ml_ds, trained_pipeline, trained_model)#
test_score_unknown_user(rng, ml_ds, trained_pipeline)#

Score with an unknown user ID.

test_score_unknown_item(rng, ml_ds, trained_pipeline)#

Score with one target item unknown.

test_score_empty_query(rng, ml_ds, trained_pipeline)#

Score with an empty query.

test_score_query_history(rng, ml_ds, trained_pipeline)#

Score when the query has a user ID and history.

test_score_query_history_only(rng, ml_ds, trained_pipeline)#

Score when the query only has history.

test_score_empty_items(rng, ml_ds, trained_pipeline)#

Score an empty list of items.

test_train_score_items_missing_data(rng, ml_ds)#

Train and score when some entities are missing data.

test_train_recommend(rng, ml_ds, trained_topn_pipeline)#

Test that a full train-recommend pipeline works.

test_ray_recommend(rng, ml_ds, trained_topn_pipeline)#

Ensure the pipeline can be used via Ray.

test_run_with_doubles(ml_ratings)#
Parameters:

ml_ratings (pandas.DataFrame)

test_batch_prediction_accuracy(rng, ml_100k)#
test_batch_top_n_accuracy(rng, ml_100k)#
class lenskit.testing.DemoRecs#

Bases: NamedTuple

split: lenskit.splitting.TTSplit#
recommendations: lenskit.data.ItemListCollection[lenskit.data.UserIDKey]#
lenskit.testing.demo_recs()#

A demo set of train, test, and recommendation data.

Return type:

DemoRecs

lenskit.testing.ml_20m()#
Return type:

Generator[lenskit.data.Dataset, None, None]

lenskit.testing.ml_100k()#

Fixture to load the MovieLens 100K dataset (currently as a data frame). It skips the test if the ML100K data is not available.

Return type:

Generator[pandas.DataFrame, None, None]

lenskit.testing.ml_100k_zip#
lenskit.testing.ml_ds(ml_ds_unchecked)#

Fixture to load the MovieLens test dataset. To use this, just include it as a parameter in your test:

def test_thing_with_data(ml_ds: Dataset):
    ...

Note

This is imported in conftest.py so it is always available in LensKit tests.

Parameters:

ml_ds_unchecked (lenskit.data.Dataset)

Return type:

Generator[lenskit.data.Dataset, None, None]

lenskit.testing.ml_ds_unchecked()#

Fixture to load the MovieLens dataset, without checking for modifications.

Usually use ml_ds() instead.

Return type:

Generator[lenskit.data.Dataset, None, None]

lenskit.testing.ml_ratings()#

Fixture to load the test MovieLens ratings as a data frame. To use this, just include it as a parameter in your test:

def test_thing_with_data(ml_ratings: pd.DataFrame):
    ...

Note

This is imported in conftest.py so it is always available in LensKit tests.

Return type:

Generator[pandas.DataFrame, None, None]

lenskit.testing.ml_test_dir#
lenskit.testing.pop_recs()#

A demo set of train, test, and recommendation data from the most-popular recommender.

Return type:

DemoRecs

lenskit.testing.msweb()#
Return type:

lenskit.splitting.TTSplit

lenskit.testing.set_env_var(var, val)#

Set an environment variable & restore it.