k-NN Collaborative Filtering
LKPY provides user- and item-based classical k-NN collaborative filtering implementations. These lightly configurable implementations are intended to capture the behavior of the Java-based LensKit implementations, both to provide a good upgrade path and to enable basic experiments out of the box.
Item-based k-NN
-
class
lenskit.algorithms.item_knn.ItemItem(nnbrs, min_nbrs=1, min_sim=1e-06, save_nbrs=None, center=True, aggregate='weighted-average')
Bases: lenskit.algorithms.Trainable, lenskit.algorithms.Predictor
Item-item nearest-neighbor collaborative filtering with ratings. This item-item implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
-
load_model(path) Load a trained model from a file.
Parameters: path (str) – the path to the file from which to load the model.
Returns: the re-loaded model (of an implementation-defined type).
-
predict(model, user, items, ratings=None) Compute predictions for a user and items.
Parameters:
- model – the trained model to use. Either None or the ratings matrix if the algorithm has no concept of training.
- user – the user ID
- items (array-like) – the items to predict
- ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they may be used to override or augment the model’s notion of a user’s preferences.
Returns: scores for the items, indexed by item id.
Return type: pandas.Series
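The scoring step that the constructor parameters control can be sketched in plain Python. This is a minimal illustration, not LensKit's implementation: the function `predict_item_knn`, the dictionary-based `means` and `sims` structures, and the choice to return `None` for unscorable items are all assumptions made for the example. It assumes similarities are positive, that `center=True` means subtracting each neighbor item's mean rating before averaging, and that `aggregate='weighted-average'` means a similarity-weighted mean.

```python
def predict_item_knn(user_ratings, items, means, sims,
                     nnbrs=20, min_nbrs=1, center=True):
    """Score items for one user from the items that user has already rated.

    user_ratings: dict of item id -> rating for the target user.
    means: dict of item id -> mean rating (hypothetical stand-in for a model).
    sims: dict of item id -> list of (neighbor item id, positive similarity).
    """
    scores = {}
    for item in items:
        # Candidate neighbors: similar items that this user has rated.
        nbrs = [(j, s) for j, s in sims.get(item, []) if j in user_ratings]
        # Keep only the nnbrs most similar candidates.
        nbrs = sorted(nbrs, key=lambda pair: pair[1], reverse=True)[:nnbrs]
        if not nbrs or len(nbrs) < min_nbrs:
            scores[item] = None  # too few neighbors to score this item
            continue
        total = sum(s for _, s in nbrs)
        if center:
            # Average the neighbors' offsets from their item means,
            # then add the target item's mean back.
            weighted = sum(s * (user_ratings[j] - means[j]) for j, s in nbrs)
            scores[item] = means[item] + weighted / total
        else:
            # Average the raw ratings directly.
            weighted = sum(s * user_ratings[j] for j, s in nbrs)
            scores[item] = weighted / total
    return scores


# Hypothetical toy data: item "c" is similar to "a" (0.8) and "b" (0.2).
means = {"a": 3.0, "b": 4.0, "c": 2.5}
sims = {"c": [("a", 0.8), ("b", 0.2)]}
user_ratings = {"a": 4.0, "b": 3.0}

centered = predict_item_knn(user_ratings, ["c"], means, sims)
raw = predict_item_knn(user_ratings, ["c"], means, sims, center=False)
too_few = predict_item_knn(user_ratings, ["c"], means, sims, min_nbrs=3)
```

In this sketch, raising `min_nbrs` above the number of rated neighbors leaves the item unscored, and lowering `nnbrs` restricts the average to the most similar items.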
-
save_model(model, path) Save a trained model to a file or directory. The default implementation pickles the model.
Algorithms are allowed to use any format for saving their models, including directories.
Parameters:
- model – the trained model.
- path (str) – the path at which to save the model.
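A pickle-based save and load round trip can be sketched with the standard library alone. The helper names `save_pickled` and `load_pickled` and the dictionary used as a model are hypothetical stand-ins for illustration; the real methods may store additional data or use another format.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickled(model, path):
    """Write a model object to a single file with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_pickled(path):
    """Read back a model object written by save_pickled."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for a trained model.
model = {"items": ["a", "b"], "means": [3.5, 4.0]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    save_pickled(model, path)
    restored = load_pickled(path)
```

As with any pickle file, a saved model should only be loaded from a trusted source, since unpickling can execute arbitrary code.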
-
train(ratings) Train a model.
The model-training process depends on save_nbrs and min_sim, but not on other algorithm parameters.
Parameters: ratings (pandas.DataFrame) – (user, item, rating) data for computing item similarities.
Returns: a trained item-item CF model.
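To illustrate why only `save_nbrs` and `min_sim` matter at training time, here is a small pure-Python sketch of building an item similarity table. It is an illustration under stated assumptions, not the library's code: the function `train_item_sims` is hypothetical, similarities are taken to be cosine similarities over item-mean-centered ratings, `min_sim` is treated as an inclusive lower bound, and `save_nbrs` as a cap on the neighbors stored per item.

```python
from collections import defaultdict
from math import sqrt


def train_item_sims(ratings, min_sim=1e-6, save_nbrs=None):
    """Build item means and a truncated item-item similarity table.

    ratings: iterable of (user, item, rating) triples.
    """
    by_item = defaultdict(dict)
    for user, item, rating in ratings:
        by_item[item][user] = rating

    # Mean-center each item's ratings before comparing items.
    means = {i: sum(rs.values()) / len(rs) for i, rs in by_item.items()}
    centered = {
        i: {u: r - means[i] for u, r in rs.items()}
        for i, rs in by_item.items()
    }
    norms = {
        i: sqrt(sum(v * v for v in vec.values()))
        for i, vec in centered.items()
    }

    sims = {}
    for i, vi in centered.items():
        row = []
        for j, vj in centered.items():
            if i == j or norms[i] == 0 or norms[j] == 0:
                continue
            # Cosine similarity over the users who rated both items.
            dot = sum(v * vj[u] for u, v in vi.items() if u in vj)
            sim = dot / (norms[i] * norms[j])
            if sim >= min_sim:  # drop weak and negative similarities
                row.append((j, sim))
        row.sort(key=lambda pair: pair[1], reverse=True)
        # Optionally keep only the save_nbrs most similar items.
        sims[i] = row[:save_nbrs] if save_nbrs is not None else row
    return means, sims


# Hypothetical toy data: "a" and "b" are rated alike, "c" the opposite way.
ratings = [
    ("u1", "a", 5), ("u1", "b", 4), ("u1", "c", 1),
    ("u2", "a", 4), ("u2", "b", 5), ("u2", "c", 2),
    ("u3", "a", 1), ("u3", "b", 2), ("u3", "c", 5),
]
means, sims = train_item_sims(ratings)
```

Parameters such as `nnbrs` and `min_nbrs` do not appear here because, in this sketch, they only affect how stored neighbors are used when scoring.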
-
-
class
lenskit.algorithms.item_knn.IIModel
Item-item recommendation model. This stores the necessary data to run the item-based k-NN recommender.
-
items – the index of item IDs.
Type: pandas.Index
-
means – the mean rating for each known item.
Type: numpy.ndarray
-
counts – the number of saved neighbors for each item.
Type: numpy.ndarray
-
sim_matrix – the similarity matrix.
Type: matrix.CSR
-
users – the index of known user IDs for the rating matrix.
Type: pandas.Index
-
rating_matrix – the user-item rating matrix for looking up users’ ratings.
Type: matrix.CSR
-
User-based k-NN
-
class
lenskit.algorithms.user_knn.UserUser(nnbrs, min_nbrs=1, min_sim=0, center=True, aggregate='weighted-average')
Bases: lenskit.algorithms.Trainable, lenskit.algorithms.Predictor
User-user nearest-neighbor collaborative filtering with ratings. This user-user implementation is not terribly configurable; it hard-codes design decisions found to work well in the previous Java-based LensKit code.
-
load_model(path) Load a trained model from a file.
Parameters: path (str) – the path to the file from which to load the model.
Returns: the re-loaded model (of an implementation-defined type).
-
predict(model, user, items, ratings=None) Compute predictions for a user and items.
Parameters:
- model (UUModel) – the memorized data to use.
- user – the user ID
- items (array-like) – the items to predict
- ratings (pandas.Series) – the user’s ratings (indexed by item id); if provided, they will be used to recompute the user’s bias at prediction time.
Returns: scores for the items, indexed by item id.
Return type: pandas.Series
-
save_model(model, path) Save a trained model to a file or directory. The default implementation pickles the model.
Algorithms are allowed to use any format for saving their models, including directories.
Parameters:
- model – the trained model.
- path (str) – the path at which to save the model.
-
train(ratings) “Train” a user-user CF model. This memorizes the rating data in a format that is usable for future computations.
Parameters: ratings (pandas.DataFrame) – (user, item, rating) data for collaborative filtering.
Returns: a memorized model for efficient user-based CF computation.
Return type: UUModel
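The memorize-then-predict flow for user-based k-NN can be sketched in plain Python. Everything here is an assumption made for illustration rather than the library's code: the helpers `memorize`, `cosine`, and `predict_user_knn` are hypothetical, similarity is taken to be cosine similarity over user-mean-centered ratings, neighbors must have a similarity strictly greater than `min_sim`, and unscorable items are returned as `None`. The optional `ratings` argument mirrors the documented behavior of recomputing the user's bias (mean) at prediction time.

```python
from math import sqrt


def memorize(ratings):
    """Store user means and mean-centered ratings from (user, item, rating)."""
    by_user = {}
    for user, item, rating in ratings:
        by_user.setdefault(user, {})[item] = rating
    user_means = {u: sum(rs.values()) / len(rs) for u, rs in by_user.items()}
    centered = {
        u: {i: r - user_means[u] for i, r in rs.items()}
        for u, rs in by_user.items()
    }
    return user_means, centered


def cosine(x, y):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * y[k] for k, v in x.items() if k in y)
    nx = sqrt(sum(v * v for v in x.values()))
    ny = sqrt(sum(v * v for v in y.values()))
    return dot / (nx * ny) if nx > 0 and ny > 0 else 0.0


def predict_user_knn(user, items, user_means, centered, ratings=None,
                     nnbrs=20, min_nbrs=1, min_sim=0.0):
    """Score items as the user's mean plus a weighted neighbor offset."""
    if ratings is not None:
        # Recompute the user's mean and centered vector from fresh ratings.
        mean = sum(ratings.values()) / len(ratings)
        target = {i: r - mean for i, r in ratings.items()}
    else:
        mean, target = user_means[user], centered[user]

    scores = {}
    for item in items:
        nbrs = []
        for other, vec in centered.items():
            if other == user or item not in vec:
                continue
            sim = cosine(target, vec)
            if sim > min_sim:
                nbrs.append((sim, vec[item]))
        # Keep the nnbrs most similar users who rated this item.
        nbrs.sort(key=lambda pair: pair[0], reverse=True)
        nbrs = nbrs[:nnbrs]
        if not nbrs or len(nbrs) < min_nbrs:
            scores[item] = None
            continue
        total = sum(s for s, _ in nbrs)
        scores[item] = mean + sum(s * off for s, off in nbrs) / total
    return scores


# Hypothetical toy data: u2 rates like u1, u3 rates the opposite way.
data = [
    ("u1", "a", 5), ("u1", "b", 3), ("u1", "c", 4),
    ("u2", "a", 4), ("u2", "b", 2), ("u2", "c", 3), ("u2", "d", 5),
    ("u3", "a", 1), ("u3", "b", 5), ("u3", "d", 2),
]
user_means, centered = memorize(data)
scores = predict_user_knn("u1", ["d"], user_means, centered)
fresh = predict_user_knn("u1", ["d"], user_means, centered,
                         ratings={"a": 3, "b": 1})
```

Here only `u2` qualifies as a neighbor for item `d`, so the prediction is the target user's mean plus `u2`'s offset for that item; passing `ratings` shifts the result by changing the mean that is added back.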
-
-
class
lenskit.algorithms.user_knn.UUModel
Memorized data for user-user collaborative filtering.
-
matrix – the normalized user-item rating matrix.
Type: matrix.CSR
-
users – the index of user IDs.
Type: pandas.Index
-
user_means – the user mean ratings.
Type: numpy.ndarray
-
items – the index of item IDs.
Type: pandas.Index
-
transpose – the transposed rating matrix (with data transformations but without L2 normalization).
Type: matrix.CSR
-