lenskit.data.movielens#
Code to import MovieLens data sets into LensKit.
Attributes#
Classes#
- MLData: Internal class representing an open ML data set.
- ML100KLoader: Loader for the ML100K data set.
- MLMLoader: Loader for the ML 1M and 10M data sets.
- MLModernLoader: Loader for modern MovieLens data sets (20M and later).
Functions#
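- load_movielens(path): Load a MovieLens dataset. The appropriate MovieLens format is detected based on the file contents.
- load_movielens_df(path): Load the ratings from a MovieLens dataset as a raw data frame. The appropriate MovieLens format is detected based on the file contents.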
Module Contents#
- class lenskit.data.movielens.MLData(version, source, prefix='')#
Internal class representing an open ML data set.
Stability: Internal
This API is at the internal or experimental stability level: it may change at any time, and breaking changes will not necessarily be described in the release notes. See Stability Levels for details.
- Parameters:
version (str)
source (pathlib.Path | zipfile.ZipFile)
prefix (str)
- source: pathlib.Path | zipfile.ZipFile#
- static version_impl(version)#
- Parameters:
version (str)
- Return type:
collections.abc.Callable[Ellipsis, MLData]
- __enter__()#
- __exit__(*args)#
- abstractmethod dataset()#
Load the full dataset.
- Return type:
- abstractmethod ratings_df()#
Load the ratings data frame.
- Return type:
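The signature above takes a source that is either a directory path or an open zip file, plus a prefix, and the class supports the `with` statement. The following is a minimal sketch of that general pattern, not LensKit's implementation: the class name `OpenArchive` and its `read_text` method are hypothetical, and treating `prefix` as a path prefix prepended to member names is an assumption.

```python
import zipfile
from pathlib import Path


class OpenArchive:
    """Hypothetical helper: read members from a directory or an open zip file."""

    def __init__(self, source, prefix=""):
        self.source = source  # pathlib.Path (directory) or zipfile.ZipFile
        self.prefix = prefix  # assumed meaning: e.g. "ml-100k/" for nested files

    def __enter__(self):
        return self

    def __exit__(self, *args):
        # Only a zip file holds an open handle that needs closing.
        if isinstance(self.source, zipfile.ZipFile):
            self.source.close()

    def read_text(self, name, encoding="utf-8"):
        """Return the decoded contents of one member file."""
        if isinstance(self.source, zipfile.ZipFile):
            return self.source.read(self.prefix + name).decode(encoding)
        return (Path(self.source) / self.prefix / name).read_text(encoding=encoding)
```

Used as `with OpenArchive(zipfile.ZipFile("ml-100k.zip"), prefix="ml-100k/") as arc: ...`, the zip handle is released when the block exits, while a directory source needs no cleanup.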
- class lenskit.data.movielens.ML100KLoader(version, source, prefix='')#
Bases: MLData
Loader for the ML100K data set.
- Parameters:
version (str)
source (pathlib.Path | zipfile.ZipFile)
prefix (str)
- dataset()#
Load the full dataset.
- Return type:
- genres()#
- Return type:
- movies_df(genres=None)#
- Parameters:
- Return type:
- users_df()#
- Return type:
- ratings_df()#
Load the ratings data frame.
- Return type:
- class lenskit.data.movielens.MLMLoader(version, source, prefix='')#
Bases: MLData
Loader for the ML 1M and 10M data sets.
- Parameters:
version (str)
source (pathlib.Path | zipfile.ZipFile)
prefix (str)
- dataset()#
Load the full dataset.
- Return type:
- movies_df()#
- users_df()#
- ratings_df()#
Load the ratings data frame.
- tagging_df()#
- class lenskit.data.movielens.MLModernLoader(version, source, prefix='')#
Bases: MLData
Loader for modern MovieLens data sets (20M and later).
- Parameters:
version (str)
source (pathlib.Path | zipfile.ZipFile)
prefix (str)
- dataset()#
Load the full dataset.
- Return type:
- movies_df()#
- tagging_df()#
- genome_df()#
- ratings_df()#
Load the ratings data frame.
- lenskit.data.movielens.load_movielens(path)#
Load a MovieLens dataset. The appropriate MovieLens format is detected based on the file contents.
- Stability: Caller (see Stability Levels).
- Parameters:
path (str | pathlib.Path) – The path to the dataset, either as an unpacked directory or a zip file.
- Returns:
The dataset.
- Return type:
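To illustrate what detecting the format "based on the file contents" could look like, here is a rough sketch that inspects a directory or zip file for a characteristic ratings file. It is not LensKit's detection code: the function names `list_names` and `guess_format` are hypothetical, and the marker files (`u.data`, `ratings.dat`, `ratings.csv`) are an assumption based on the layout of the published MovieLens archives.

```python
import zipfile
from pathlib import Path, PurePosixPath

# Assumed marker files for each format family (not taken from LensKit).
MARKERS = {
    "u.data": "ML100K",
    "ratings.dat": "ML1M/10M",
    "ratings.csv": "modern (20M and later)",
}


def list_names(path):
    """Return the base names of the files in a directory or zip archive."""
    path = Path(path)
    if path.is_dir():
        return {p.name for p in path.rglob("*") if p.is_file()}
    with zipfile.ZipFile(path) as zf:
        # Zip member names always use forward slashes.
        return {
            PurePosixPath(info.filename).name
            for info in zf.infolist()
            if not info.is_dir()
        }


def guess_format(path):
    """Guess the MovieLens format family from the files that are present."""
    names = list_names(path)
    for marker, family in MARKERS.items():
        if marker in names:
            return family
    raise ValueError(f"no known MovieLens ratings file found in {path}")
```

The same call works for both kinds of input, for example `guess_format("ml-100k")` for an unpacked directory or `guess_format("ml-latest-small.zip")` for an archive, which mirrors the two forms of `path` that the function above accepts.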
- lenskit.data.movielens.load_movielens_df(path)#
Load the ratings from a MovieLens dataset as a raw data frame. The appropriate MovieLens format is detected based on the file contents.
- Stability: Caller (see Stability Levels).
- Parameters:
path (str | pathlib.Path) – The path to the dataset, either as an unpacked directory or a zip file.
- Returns:
The ratings, with columns user_id, item_id, rating, and timestamp.
- Return type:
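To make the four-column layout concrete, the sketch below parses lines in the `UserID::MovieID::Rating::Timestamp` layout used by the ML-1M `ratings.dat` file into records with the same column names. It uses only the standard library and is not how LensKit reads the data; the `Rating` type and `parse_rating_line` function are hypothetical, and the sample lines are illustrative values rather than data from this page.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    """One rating record, using the column names listed above."""

    user_id: int
    item_id: int
    rating: float
    timestamp: int


def parse_rating_line(line, sep="::"):
    """Parse one 'UserID::MovieID::Rating::Timestamp' line into a Rating."""
    user, item, rating, ts = line.strip().split(sep)
    return Rating(int(user), int(item), float(rating), int(ts))


# Illustrative sample lines in the ML-1M layout.
sample = ["1::1193::5::978300760", "1::661::3::978302109"]
rows = [parse_rating_line(line) for line in sample]
```

Each element of `rows` exposes the fields by name, for example `rows[0].item_id`, which corresponds to the `item_id` column of the returned ratings.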
Exported Aliases#
- lenskit.data.movielens.get_logger()#
Re-exported alias for lenskit.logging.get_logger().
- class lenskit.data.movielens.DatasetBuilder#
Re-exported alias for lenskit.data._builder.DatasetBuilder.
- class lenskit.data.movielens.Dataset#
Re-exported alias for lenskit.data._dataset.Dataset.