lenskit.data#

Data abstractions and data set access.

Submodules#

accum

Data accumulation support

amazon

Load Amazon ratings data from Julian McAuley's group.

matrix

Classes for working with matrix data.

movielens

Code to import MovieLens data sets into LensKit.

msweb

Support for the MSWeb datasets.

repr

Utility functions for implementing __str__ and __repr__ methods

schema

Pydantic models for LensKit data schemas. These models define the data

types

Basic data types used in data representations.

Attributes#

GenericKey

A generic collection key with no bounds or type information. Key types must

QueryInput

Types that can be converted to a query by RecQuery.create().

QueryItemSource

Valid sources for query items.

Classes#

EntityAttribute

Base class for an attribute associated with an entity class. This class

BatchedRange

Iterator over a range by batches.

DatasetBuilder

Construct data sets from data and tables.

ItemListCollection

A collection of item lists. This protocol defines read access to the

ItemListCollector

Collect item lists with associated keys, as in ItemListCollection.

ListILC

Mutable item list collection backed by a Python list.

MutableItemListCollection

Intersection type of ItemListCollection and

QueryIDKey

Key type for query IDs. This is used for item list collections

UserIDKey

Key type for user IDs. This is used for item list collections

DataContainer

A general container for the data backing a dataset.

Dataset

Representation of a data set for LensKit training, evaluation, etc. Data can

EntitySet

Representation of a set of entities from the dataset. Obtained from

ItemList

Representation of a (usually ordered) list of items, possibly with scores

RecQuery

Representation of the data available for a recommendation query. This is

MatrixRelationshipSet

Two-entity relationships without duplicates, accessible in matrix form.

RelationshipSet

Representation for a set of relationship records. This is the class for

Vocabulary

Vocabularies of entity identifiers for the LensKit data model.

Functions#

from_interactions_df(df, *[, user_col, item_col, ...])

Create a dataset from a data frame of ratings or other user-item

key_dict(kt)

Package Contents#

lenskit.data.from_interactions_df(df, *, user_col=None, item_col=None, rating_col=None, timestamp_col=None, users=None, items=None, class_name='rating')#

Create a dataset from a data frame of ratings or other user-item interactions.

Stability:
Caller (see Stability Levels).
Parameters:
Returns:

The initialized data set.

Return type:

lenskit.data._dataset.Dataset
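
The following is a minimal usage sketch, not an official example. It assumes a small, made-up interaction table whose column names (`user`, `item`, `rating`) are illustrative; only the keyword arguments `user_col`, `item_col`, and `rating_col` come from the signature above. The helper `build_dataset` is hypothetical, and its imports are deferred to the function body so the snippet can be loaded without pandas or LensKit installed.

```python
# Hypothetical sample interactions; the column names are illustrative only.
ROWS = [
    {"user": 1, "item": 10, "rating": 4.0},
    {"user": 1, "item": 11, "rating": 3.5},
    {"user": 2, "item": 10, "rating": 5.0},
]

# Keyword arguments taken from the from_interactions_df signature,
# mapped to the column names used in ROWS.
COLUMNS = {"user_col": "user", "item_col": "item", "rating_col": "rating"}


def build_dataset(rows=ROWS):
    """Build a LensKit dataset from the rows (requires pandas and lenskit)."""
    import pandas as pd
    from lenskit.data import from_interactions_df

    df = pd.DataFrame(rows)
    # All column arguments are keyword-only in the signature above.
    return from_interactions_df(df, **COLUMNS)
```

Calling `build_dataset()` should return a `Dataset`, per the return type documented above. Naming the columns explicitly avoids relying on whatever defaults apply when the `*_col` arguments are left as `None`, which this page does not describe.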

type lenskit.data.GenericKey = tuple[ID, ...]#

A generic collection key with no bounds or type information. Key types must also be named tuples (the Python type system does not allow us to express this).
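
As a sketch of why the named-tuple requirement matters, the standard-library example below compares a plain tuple with a named tuple holding the same value. `ExampleUserKey` is a hypothetical key type defined here for illustration; it is not part of LensKit.

```python
from typing import NamedTuple


class ExampleUserKey(NamedTuple):
    """Hypothetical key type with a single named field."""

    user_id: int


plain = (42,)
named = ExampleUserKey(user_id=42)

# Both satisfy tuple[ID, ...] as far as a type checker is concerned,
# but only the named tuple carries field names.
plain_has_fields = hasattr(plain, "_fields")
named_fields = named._fields
same_values = tuple(named) == plain
```

Both values are tuples of IDs, so the type alias cannot distinguish them, yet only `named` exposes field names through `_fields`.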

lenskit.data.key_dict(kt)#
Parameters:

kt (tuple[lenskit.data.types.ID, ...])

Return type:

dict[str, Any]
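
This page gives no description for `key_dict`, so the sketch below only illustrates what the signature suggests: turning a named-tuple key into a `dict[str, Any]`. The stand-in `key_to_dict` and the key type `ExampleListKey` are hypothetical, and the real function's behavior is an assumption here.

```python
from typing import Any, NamedTuple


class ExampleListKey(NamedTuple):
    """Hypothetical two-field key for an item list collection."""

    user_id: int
    seq_no: int


def key_to_dict(key: tuple) -> dict[str, Any]:
    """Stand-in for key_dict (assumed behavior): map field names to values."""
    # Named tuples provide _asdict(); plain tuples have no field names.
    return dict(key._asdict())


key = ExampleListKey(user_id=42, seq_no=3)
fields = key_to_dict(key)
```

Here `fields` holds one entry per key field, in field order.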

type lenskit.data.QueryInput = RecQuery | ID | ItemList | None#

Types that can be converted to a query by RecQuery.create().

type lenskit.data.QueryItemSource = Literal['history', 'session', 'context']#

Valid sources for query items.
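
Because `QueryItemSource` is a `Literal` alias, static type checkers can reject unknown source names, but nothing is checked at runtime. The sketch below shows one way to validate such a value; it redeclares the alias locally so it runs without LensKit, and `check_source` is a hypothetical helper.

```python
from typing import Literal, get_args

# Local copy of the alias shown above, so the sketch runs without LensKit.
QueryItemSource = Literal["history", "session", "context"]


def check_source(value: str) -> str:
    """Return value if it is an allowed source name, else raise ValueError."""
    allowed = get_args(QueryItemSource)
    if value not in allowed:
        raise ValueError(f"unknown source {value!r}; expected one of {allowed}")
    return value


checked = check_source("session")
```

Reading the allowed names with `get_args` keeps the runtime check in sync with the alias instead of repeating the strings.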

Exported Aliases#

exception lenskit.data.FieldError#

Re-exported alias for lenskit.diagnostics.FieldError.

lenskit.data.load_amazon_ratings()#

Re-exported alias for lenskit.data.amazon.load_amazon_ratings().

lenskit.data.load_movielens()#

Re-exported alias for lenskit.data.movielens.load_movielens().

lenskit.data.load_movielens_df()#

Re-exported alias for lenskit.data.movielens.load_movielens_df().

lenskit.data.load_ms_web()#

Re-exported alias for lenskit.data.msweb.load_ms_web().

lenskit.data.ID#

Re-exported alias for lenskit.data.types.ID.

lenskit.data.NPID#

Re-exported alias for lenskit.data.types.NPID.

lenskit.data.FeedbackType#

Re-exported alias for lenskit.data.types.FeedbackType.