lenskit.batch#

Batch-run recommendation pipelines for evaluation.

Classes#

BatchResults

Results from a batch recommendation run.

BatchPipelineRunner

Apply a pipeline to a collection of test users.

InvocationSpec

Specification for a single pipeline invocation, to record one or more pipeline component outputs for a test user.

Functions#

predict(pipeline, test, *[, n_jobs, use_ray])

Convenience function to batch-generate rating predictions (or other per-item scores) from a pipeline.

recommend(pipeline, queries[, n, n_jobs, use_ray, ...])

Convenience function to batch-generate recommendations from a pipeline.

Package Contents#

class lenskit.batch.BatchResults(key)#

Results from a batch recommendation run. Results consist of the outputs of various pipeline components for each of the test users. Results may be None, if the pipeline produced no output for that query.

Stability:
Caller (see Stability Levels).
Parameters:

key (type[tuple] | Sequence[str])

property outputs: list[str]#

Get the list of output names in these results.

Return type:

list[str]

output(name)#

Get the item lists for a particular output component.

Parameters:

name (str) – The output name. This may or may not be the same as the component name.

Return type:

lenskit.data.ItemListCollection[lenskit.data.GenericKey]

add_result(name, key, result)#

Add a single result for one of the outputs.

Parameters:
  • name (str) – The output name in which to save this result.

  • key (lenskit.data.GenericKey) – The key (for example, the user identifier) for this result.

  • result (object) – The result object to save.
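
To make the shape of these results concrete, the sketch below models them as a mapping from output name to per-key results. ResultsSketch is a hypothetical stand-in written for illustration, not LensKit's implementation: only the names add_result, output, and outputs mirror the interface documented above, and plain dictionaries stand in for ItemListCollection.

```python
class ResultsSketch:
    """Hypothetical stand-in for BatchResults (illustration only)."""

    def __init__(self):
        # output name -> {key -> result}; a result may be None
        self._data = {}

    @property
    def outputs(self):
        # Names of the outputs recorded so far, in insertion order.
        return list(self._data.keys())

    def output(self, name):
        # A plain dict here; the documented method returns an ItemListCollection.
        return dict(self._data.get(name, {}))

    def add_result(self, name, key, result):
        self._data.setdefault(name, {})[key] = result


results = ResultsSketch()
results.add_result("recommendations", ("u1",), ["i3", "i7"])
results.add_result("recommendations", ("u2",), None)  # no output for this query
results.add_result("predictions", ("u1",), {"i3": 4.5})

print(results.outputs)
print(results.output("recommendations")[("u2",)])
```

The second result shows the None case described above: a query for which the pipeline produced no output still has an entry under its key.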

class lenskit.batch.BatchPipelineRunner(*, n_jobs=None, use_ray=None, profiler=None, batch_size=None)#

Apply a pipeline to a collection of test users.

Stability:
Caller (see Stability Levels).
Parameters:
pipeline:

The pipeline to evaluate. It does not appear in the constructor signature above; it is passed to run().

n_jobs:

The number of parallel threads to use, or None to use the default defined by LensKit configuration and environment variables (see Configuring Parallelism).

use_ray:

Use Ray instead of threads to parallelize batch inference, overriding any option set in an environment variable or lenskit.toml.

batch_size:

The batch size for multiprocess execution. If None, a batch size based on the number of inputs is used, with a maximum batch size of 1000.
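
The default batch size rule is only described qualitatively above. As a rough illustration, the sketch below assumes the inputs are split evenly across workers and the result is then capped at 1000. The function default_batch_size, its n_workers parameter, and the even-split rule are assumptions made for this example; only the cap of 1000 comes from the description above.

```python
import math

MAX_BATCH_SIZE = 1000  # the documented upper bound


def default_batch_size(n_inputs, n_workers):
    """Hypothetical heuristic: even split across workers, capped at 1000."""
    if n_inputs <= 0:
        return 1
    per_worker = math.ceil(n_inputs / max(n_workers, 1))
    return max(1, min(per_worker, MAX_BATCH_SIZE))


def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


users = list(range(2500))
size = default_batch_size(len(users), n_workers=4)
chunks = list(batches(users, size))
print(size, len(chunks))
```

Passing an explicit batch_size to the runner bypasses whatever default rule the library actually applies.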

n_jobs: int#
use_ray: bool#
batch_size: int | None = None#
profiler: lenskit.pipeline.PipelineProfiler | None#
invocations: list[InvocationSpec]#
add_invocation(inv)#
Parameters:

inv (InvocationSpec)

score(component='scorer', *, output='scores')#

Request the batch run to generate test item scores.

Parameters:
  • component (str) – The name of the scorer component to run.

  • output (str) – The name of the results in the output dictionary.

predict(component='rating-predictor', *, output='predictions')#

Request the batch run to generate test item rating predictions. It is identical to score() but with different defaults.

Parameters:
  • component (str) – The name of the rating predictor component to run.

  • output (str) – The name of the results in the output dictionary.

recommend(component='recommender', *, output='recommendations', **extra)#

Request the batch run to generate recommendations.

Parameters:
  • component (str) – The name of the recommender component to run.

  • output (str) – The name of the results in the output dictionary.

  • extra (Any) – Extra inputs to the recommender. A common option is n, the number of recommendations to return (a default may be baked into the pipeline).

run(pipeline: lenskit.pipeline.Pipeline, queries: collections.abc.Iterable[lenskit.data.RecQuery] | collections.abc.Iterable[tuple[lenskit.data.RecQuery, lenskit.data.ItemList]] | collections.abc.Iterable[lenskit.data.ID | lenskit.data.GenericKey] | lenskit.data.ItemListCollection[lenskit.data.GenericKey] | Mapping[lenskit.data.ID, lenskit.data.ItemList] | pandas.DataFrame) → lenskit.batch._results.BatchResults#
run(pipeline: lenskit.pipeline.Pipeline, *, test_data: collections.abc.Iterable[lenskit.data.ID | lenskit.data.GenericKey] | lenskit.data.ItemListCollection[lenskit.data.GenericKey] | Mapping[lenskit.data.ID, lenskit.data.ItemList] | pandas.DataFrame) → lenskit.batch._results.BatchResults

Run the pipeline and return its results.

Note

The runner does not guarantee that results are in the same order as the original inputs.

Parameters:
  • pipeline – The pipeline to run.

  • queries – The collection of test queries to use. See Batch Queries for details on the various input formats.

Returns:

The results, as a BatchResults object. Call output() with an output name to get the item lists recorded for that output.
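
The request methods and run() are meant to be combined: requests are registered first, then a single run produces all outputs. The sketch below shows one way this might look, using only the calls documented above. The helper name run_batch is hypothetical, and it assumes the pipeline has components with the default names 'recommender' and 'rating-predictor' and that queries carries test items for the prediction step.

```python
def run_batch(pipeline, queries, n=10, n_jobs=None):
    """Generate recommendations and rating predictions in one batch run."""
    # Imported lazily so this sketch can be loaded without LensKit installed.
    from lenskit.batch import BatchPipelineRunner

    runner = BatchPipelineRunner(n_jobs=n_jobs)
    runner.recommend(n=n)  # saved under the default output name "recommendations"
    runner.predict()       # saved under the default output name "predictions"
    results = runner.run(pipeline, queries)

    # Result order is not guaranteed to match input order, so callers
    # should look item lists up by key rather than by position.
    return results.output("recommendations"), results.output("predictions")
```

Registering both requests on one runner avoids a second pass over the queries compared with calling the convenience functions separately.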

class lenskit.batch.InvocationSpec#

Specification for a single pipeline invocation, to record one or more pipeline component outputs for a test user.

name: str#

A name for this invocation.

components: dict[str, str]#

The names of pipeline components to measure and return, mapped to their output names.

items: ItemSource = None#

The target or candidate items (if any) to provide to the recommender.

extra_inputs: dict[str, Any]#

Additional inputs to pass to the pipeline.
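
As an illustration of how these fields fit together, the sketch below mirrors them in a small dataclass and fills one in for a recommendation request. InvocationSketch is a hypothetical stand-in, not the library class, and the example values (the name "recommend" and the extra input n=10) are assumptions about what such a request could look like.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InvocationSketch:
    """Hypothetical mirror of the documented InvocationSpec fields."""

    name: str
    # component name -> output name under which its result is saved
    components: dict[str, str]
    # target or candidate items, if any
    items: Any = None
    # additional pipeline inputs, such as a list length
    extra_inputs: dict[str, Any] = field(default_factory=dict)


recommend_inv = InvocationSketch(
    name="recommend",
    components={"recommender": "recommendations"},
    extra_inputs={"n": 10},
)
print(recommend_inv.components["recommender"])
```

Keeping component names and output names separate is what allows the output name to differ from the component name, as noted for BatchResults.output().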

lenskit.batch.predict(pipeline, test, *, n_jobs=None, use_ray=None)#

Convenience function to batch-generate rating predictions (or other per-item scores) from a pipeline. This is a batch version of lenskit.predict(), and is a convenience wrapper around using a BatchPipelineRunner() to generate predictions.

Note

If test is just a sequence of IDs, this method will still work, but it will score _all candidate items_ for each of the IDs.

Stability:
Caller (see Stability Levels).
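Parameters:
  • pipeline – The pipeline to run.

  • test – The test data to generate predictions for. See the note above for the case where it is just a sequence of IDs.

  • n_jobs – The number of parallel threads to use, or None to use the default (see Configuring Parallelism).

  • use_ray – Use Ray instead of threads to parallelize batch inference.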
Return type:

lenskit.data.ItemListCollection[lenskit.data.GenericKey]
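
Because a bare sequence of IDs makes predict() score every candidate item, it can be useful to flag that case before starting a large run. The wrapper below is a sketch under that assumption: batch_predict and is_bare_ids are hypothetical helpers, the check only recognizes lists or tuples of strings and integers, and the only LensKit call used is lenskit.batch.predict() with the signature shown above.

```python
import warnings


def is_bare_ids(test):
    """Return True for a non-empty list or tuple of plain string/integer IDs."""
    return (
        isinstance(test, (list, tuple))
        and len(test) > 0
        and all(isinstance(t, (str, int)) for t in test)
    )


def batch_predict(pipeline, test, n_jobs=None):
    """Call lenskit.batch.predict(), warning when test holds only IDs."""
    # Imported lazily so this sketch can be loaded without LensKit installed.
    from lenskit.batch import predict

    if is_bare_ids(test):
        warnings.warn(
            "test contains only IDs; all candidate items will be scored for each ID"
        )
    return predict(pipeline, test, n_jobs=n_jobs)
```

Passing per-user item lists instead (for example, a mapping from IDs to item lists) limits scoring to the items in those lists.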

lenskit.batch.recommend(pipeline, queries, n=None, *, n_jobs=None, use_ray=None, profiler=None, users=None)#

Convenience function to batch-generate recommendations from a pipeline. This is a batch version of lenskit.recommend(), and is a convenience wrapper around using a BatchPipelineRunner() to generate recommendations.

See also

BatchPipelineRunner.run() for details on the arguments, and Batch Queries for details on the valid inputs for queries.

Parameters:
Stability:
Caller (see Stability Levels).
Return type:

lenskit.data.ItemListCollection[lenskit.data.UserIDKey]

Exported Aliases#

lenskit.batch.ID#

Re-exported alias for lenskit.data.ID.

lenskit.batch.GenericKey#

Re-exported alias for lenskit.data.GenericKey.

class lenskit.batch.ItemList#

Re-exported alias for lenskit.data.ItemList.

class lenskit.batch.ItemListCollection#

Re-exported alias for lenskit.data.ItemListCollection.

class lenskit.batch.RecQuery#

Re-exported alias for lenskit.data.RecQuery.

class lenskit.batch.UserIDKey#

Re-exported alias for lenskit.data.UserIDKey.

class lenskit.batch.Pipeline#

Re-exported alias for lenskit.pipeline.Pipeline.

class lenskit.batch.PipelineProfiler#

Re-exported alias for lenskit.pipeline.PipelineProfiler.