lenskit.data#
Data abstractions and data set access.
Submodules#
Data accumulation support |
|
Load Amazon ratings data from Julian McAuley's group. |
|
Classes for working with matrix data. |
|
Code to import MovieLens data sets into LensKit. |
|
Support for the MSWeb datasets. |
|
Utility functions for implementing |
|
Pydantic models for LensKit data schemas. These models define the data
|
Basic data types used in data representations. |
Attributes#
A generic collection key with no bounds or type information. Key types must also be named tuples.
|
Types that can be converted to a query by RecQuery.create().
|
Valid sources for query items. |
Classes#
Base class for an attribute associated with an entity class. This class |
|
Iterator over a range by batches. |
|
Construct data sets from data and tables. |
|
A collection of item lists. This protocol defines read access to the |
|
Collect item lists with associated keys, as in |
|
Mutable item list collection backed by a Python list. |
|
Intersection type of |
|
Key type for query IDs. This is used for :ref:`item list collections |
|
Key type for user IDs. This is used for :ref:`item list collections |
|
A general container for the data backing a dataset. |
|
Representation of a data set for LensKit training, evaluation, etc. Data can |
|
Representation of a set of entities from the dataset. Obtained from |
|
Representation of a (usually ordered) list of items, possibly with scores |
|
Representation of the data available for a recommendation query. This is
|
Two-entity relationships without duplicates, accessible in matrix form. |
|
Representation for a set of relationship records. This is the class for |
|
Vocabularies of entity identifiers for the LensKit data model. |
Functions#
|
Create a dataset from a data frame of ratings or other user-item interactions.
|
Package Contents#
- lenskit.data.from_interactions_df(df, *, user_col=None, item_col=None, rating_col=None, timestamp_col=None, users=None, items=None, class_name='rating')#
Create a dataset from a data frame of ratings or other user-item interactions.
- Stability:
- Caller (see Stability Levels).
- Parameters:
df (pandas.DataFrame) – The user-item interactions (e.g. ratings). The dataset code takes ownership of this data frame and may modify it.
user_col (str | None) – The name of the user ID column. By default, looks for columns named user, user_id, or userId, with several case variants.
item_col (str | None) – The name of the item ID column. By default, looks for columns named item, item_id, or itemId, with several case variants.
rating_col (str | None) – The name of the rating column.
timestamp_col (str | None) – The name of the timestamp column.
users (lenskit.data.types.IDSequence | pandas.Index | Iterable[lenskit.data.types.ID] | lenskit.data._vocab.Vocabulary | None) – A vocabulary of user IDs. The data frame is subset to this set of IDs.
items (lenskit.data.types.IDSequence | pandas.Index | Iterable[lenskit.data.types.ID] | lenskit.data._vocab.Vocabulary | None) – A vocabulary of item IDs. The data frame is subset to this set of IDs.
class_name (str) – The interaction class name.
- Returns:
The initialized data set.
- Return type:
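
The default column lookup described above ("user, user_id, or userId, with several case variants") can be pictured with a small standalone sketch. This is not LensKit's implementation: the helper names `_normalize` and `find_column` are hypothetical, and the matching rule (ignore case and underscores) is an assumption chosen only to illustrate the idea.

```python
from typing import Iterable, Optional

# Candidate names taken from the parameter descriptions above.
USER_NAMES = ("user", "user_id", "userId")
ITEM_NAMES = ("item", "item_id", "itemId")


def _normalize(name: str) -> str:
    """Hypothetical normalization: drop underscores and ignore case."""
    return name.replace("_", "").lower()


def find_column(columns: Iterable[str], candidates: Iterable[str]) -> Optional[str]:
    """Return the first column whose normalized name matches a candidate.

    Assumption: this approximates "several case variants"; the real
    library may accept a different set of spellings.
    """
    wanted = {_normalize(c) for c in candidates}
    for col in columns:
        if _normalize(col) in wanted:
            return col
    return None


# Column names as they might appear in a ratings data frame.
columns = ["UserID", "item_id", "rating", "timestamp"]
user_col = find_column(columns, USER_NAMES)
item_col = find_column(columns, ITEM_NAMES)
print(user_col, item_col)
```

When the data frame uses other names, passing `user_col` and `item_col` explicitly to `from_interactions_df` avoids relying on any auto-detection.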
- type lenskit.data.GenericKey = tuple[ID, ...]#
A generic collection key with no bounds or type information. Key types must also be named tuples (the Python type system does not allow us to express this).
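
As an illustration of the named-tuple requirement, a key type can be declared with `typing.NamedTuple`. The class `UserSessionKey` and its fields are hypothetical, and the sketch assumes integer and string IDs are acceptable `ID` values; it does not import LensKit.

```python
from typing import NamedTuple


class UserSessionKey(NamedTuple):
    """Hypothetical key type: a plain tuple at runtime, with named fields."""

    user_id: int
    session_id: str


key = UserSessionKey(user_id=42, session_id="s1")

# A named tuple is still a tuple, so it fits the tuple[ID, ...] shape.
print(isinstance(key, tuple))

# The field names are what a bare tuple type cannot express.
print(UserSessionKey._fields)
print(key._asdict())
```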
- lenskit.data.key_dict(kt)#
- Parameters:
kt (tuple[lenskit.data.types.ID, ...])
- Return type:
- type lenskit.data.QueryInput = RecQuery | ID | ItemList | None#
Types that can be converted to a query by RecQuery.create().
- type lenskit.data.QueryItemSource = Literal['history', 'session', 'context']#
Valid sources for query items.
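
Because the alias is a `Literal`, its allowed values can be recovered at runtime with `typing.get_args`. The sketch below redefines the alias locally (copied from the signature above) so that it runs without LensKit; `check_source` is a hypothetical helper, not part of the package.

```python
from typing import Literal, get_args

# Local copy of the alias shown above, so the example is self-contained.
QueryItemSource = Literal["history", "session", "context"]


def check_source(value: str) -> str:
    """Hypothetical validator: accept only the values listed in the Literal."""
    allowed = get_args(QueryItemSource)
    if value not in allowed:
        raise ValueError(f"invalid source {value!r}; expected one of {allowed}")
    return value


print(check_source("history"))

try:
    check_source("basket")
except ValueError as err:
    print(err)
```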
Exported Aliases#
- exception lenskit.data.FieldError#
Re-exported alias for lenskit.diagnostics.FieldError.
- lenskit.data.load_amazon_ratings()#
Re-exported alias for lenskit.data.amazon.load_amazon_ratings().
- lenskit.data.load_movielens()#
Re-exported alias for lenskit.data.movielens.load_movielens().
- lenskit.data.load_movielens_df()#
Re-exported alias for lenskit.data.movielens.load_movielens_df().
- lenskit.data.load_ms_web()#
Re-exported alias for lenskit.data.msweb.load_ms_web().
- lenskit.data.ID#
Re-exported alias for lenskit.data.types.ID.
- lenskit.data.NPID#
Re-exported alias for lenskit.data.types.NPID.
- lenskit.data.FeedbackType#
Re-exported alias for lenskit.data.types.FeedbackType.