lenskit.flexmf#
Flexible PyTorch matrix factorization models for LensKit.
The components in this package implement several matrix factorization models for LensKit, and also serve as an example for practical PyTorch recommender training.
Stability: Internal
This API is at the internal or experimental stability level: it may change at any time, and breaking changes will not necessarily be described in the release notes. See Stability Levels for details. FlexMF is provided as a preview release, and may change in the coming months as we gain more experience with it.
Classes#
FlexMFConfigBase: Common configuration for all FlexMF scoring components.
FlexMFScorerBase: Base class for the FlexMF scorers, providing common Torch support.
FlexMFExplicitConfig: Configuration for FlexMFExplicitScorer.
FlexMFExplicitScorer: Explicit-feedback rating prediction with FlexMF. This realizes a biased matrix factorization model.
FlexMFImplicitConfig: Configuration for FlexMFImplicitScorer.
FlexMFImplicitScorer: Implicit-feedback rating prediction with FlexMF. This is capable of realizing multiple models.
Package Contents#
- class lenskit.flexmf.FlexMFConfigBase#
Bases:
lenskit.config.common.EmbeddingSizeMixin, pydantic.BaseModel
Common configuration for all FlexMF scoring components.
- Stability:
Experimental
- embedding_size: pydantic.PositiveInt = 64#
The dimension of user and item embeddings (number of latent features to learn).
- regularization: float = 0.01#
The regularization strength.
Note
The explicit-feedback model uses a different default strength.
- reg_method: Literal['AdamW', 'L2'] | None = 'AdamW'#
The regularization method to use.
With the default AdamW regularization, training will use the torch.optim.AdamW optimizer with weight decay. With L2 regularization, training will use sparse gradients and the torch.optim.SparseAdam optimizer.
Note
The explicit-feedback model defaults this setting to "L2".
None: Use no regularization.
"L2": Use L2 regularization on the parameters used in each training batch. The strength is applied to the _mean_ norms in a batch, so that the regularization term scale is not dependent on the batch size.
"AdamW": Use torch.optim.AdamW with the specified regularization strength. This configuration does not use sparse gradients, but training time is often comparable.
Note
Regularization values do not necessarily have the same range or meaning for the different regularization methods.
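To make the batch-size independence of the "L2" method concrete, here is an illustrative pure-Python sketch (a hypothetical helper, not the FlexMF implementation): the penalty is the strength times the _mean_ of the squared parameter norms touched by a batch, so doubling the batch does not change the term's scale.

```python
# Illustrative sketch (not LensKit code): mean-norm L2 penalty whose
# scale does not depend on the batch size.

def l2_penalty(embeddings: list[list[float]], strength: float) -> float:
    """Strength times the mean squared norm of the embeddings in a batch."""
    sq_norms = [sum(x * x for x in emb) for emb in embeddings]
    return strength * sum(sq_norms) / len(sq_norms)

batch = [[0.1, -0.2], [0.3, 0.0]]
p1 = l2_penalty(batch, strength=0.01)
p2 = l2_penalty(batch * 2, strength=0.01)  # same batch repeated twice
assert abs(p1 - p2) < 1e-12  # doubling the batch leaves the penalty scale unchanged
```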
- class lenskit.flexmf.FlexMFScorerBase(config=None, **kwargs)#
Bases:
lenskit.training.UsesTrainer, lenskit.pipeline.Component
Base class for the FlexMF scorers, providing common Torch support.
- Stability:
Experimental
- Parameters:
config (object | None)
kwargs (Any)
- config: FlexMFConfigBase#
The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.
- users: lenskit.data.Vocabulary#
- items: lenskit.data.Vocabulary#
- model: lenskit.flexmf._model.FlexMFModel#
- to(device)#
Move the model to a different device.
- __call__(query, items)#
Generate item scores for a user.
Note that the query and items are specified by user and item IDs, not positions.
- Parameters:
query (lenskit.data.QueryInput)
items (lenskit.data.ItemList)
- Return type:
- score_items(users, items)#
Score the given users and items, after resolving them and limiting to known users and items.
- Parameters:
users (torch.Tensor)
items (torch.Tensor)
- Return type:
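For intuition about what scoring resolved user and item indices involves, here is a hedged pure-Python sketch of biased matrix-factorization scoring, as realized by the explicit-feedback model (hypothetical names; the real FlexMFModel is a PyTorch module operating on tensors):

```python
# Hypothetical sketch (not the FlexMF internals) of biased-MF scoring
# for resolved (user, item) index pairs:
#   score(u, i) = global_bias + b_u + b_i + p_u . q_i

def score_pairs(users, items, u_emb, i_emb, u_bias, i_bias, global_bias):
    scores = []
    for u, i in zip(users, items):
        dot = sum(pu * qi for pu, qi in zip(u_emb[u], i_emb[i]))
        scores.append(global_bias + u_bias[u] + i_bias[i] + dot)
    return scores

u_emb = [[0.5, 1.0]]                  # one user, 2 latent features
i_emb = [[1.0, 0.0], [0.0, 2.0]]      # two items
scores = score_pairs([0, 0], [0, 1], u_emb, i_emb,
                     u_bias=[0.1], i_bias=[0.2, -0.3], global_bias=3.0)
# scores[0] = 3.0 + 0.1 + 0.2 + 0.5 = 3.8
# scores[1] = 3.0 + 0.1 - 0.3 + 2.0 = 4.8
```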
- class lenskit.flexmf.FlexMFExplicitConfig#
Bases:
lenskit.flexmf._base.FlexMFConfigBase
Configuration for FlexMFExplicitScorer. This class overrides certain base class defaults for better explicit-feedback performance.
- Stability:
Experimental
- regularization: float = 0.1#
The regularization strength.
Note
The explicit-feedback model uses a different default strength.
- reg_method: Literal['AdamW', 'L2'] | None = 'L2'#
The regularization method to use.
With the default AdamW regularization, training will use the torch.optim.AdamW optimizer with weight decay. With L2 regularization, training will use sparse gradients and the torch.optim.SparseAdam optimizer.
Note
The explicit-feedback model defaults this setting to "L2".
None: Use no regularization.
"L2": Use L2 regularization on the parameters used in each training batch. The strength is applied to the _mean_ norms in a batch, so that the regularization term scale is not dependent on the batch size.
"AdamW": Use torch.optim.AdamW with the specified regularization strength. This configuration does not use sparse gradients, but training time is often comparable.
Note
Regularization values do not necessarily have the same range or meaning for the different regularization methods.
- class lenskit.flexmf.FlexMFExplicitScorer(config=None, **kwargs)#
Bases:
lenskit.flexmf._base.FlexMFScorerBase
Explicit-feedback rating prediction with FlexMF. This realizes a biased matrix factorization model (similar to lenskit.als.BiasedMF) trained with PyTorch.
- Stability:
Experimental
- Parameters:
config (object | None)
kwargs (Any)
- config: FlexMFExplicitConfig#
The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.
- create_trainer(data, options)#
Create a model trainer to train this model.
- score_items(users, items)#
Score the given users and items, after resolving them and limiting to known users and items.
- Parameters:
users (torch.Tensor)
items (torch.Tensor)
- Return type:
- class lenskit.flexmf.FlexMFImplicitConfig#
Bases:
lenskit.flexmf._base.FlexMFConfigBase
Configuration for FlexMFImplicitScorer. It inherits base model options from FlexMFConfigBase.
- Stability:
Experimental
- preset: Literal['bpr', 'warp', 'lightgcn'] | None = None#
Select preset defaults to mimic a particular model’s original presentation.
- loss: ImplicitLoss = 'logistic'#
The loss to use for model training.
- negative_strategy: NegativeStrategy | None = None#
The negative sampling strategy. The default is
"misranked" for WARP loss and "uniform" for other losses.
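To illustrate the difference between the two strategies, here is a conceptual sketch (a hypothetical helper, not LensKit code): "uniform" draws any non-positive item at random, while "misranked" (the WARP-style approach) keeps drawing until it finds a negative the model currently scores above the positive, which concentrates training effort on ranking mistakes.

```python
# Illustrative sketch of "uniform" vs. "misranked" negative sampling
# (hypothetical helper, not the FlexMF sampler).
import random

def sample_negative(strategy, positive, n_items, score, rng, max_tries=50):
    candidates = [i for i in range(n_items) if i != positive]
    if strategy == "uniform":
        return rng.choice(candidates)
    # "misranked": prefer a negative the model scores above the positive
    for _ in range(max_tries):
        neg = rng.choice(candidates)
        if score(neg) > score(positive):
            return neg
    return neg  # give up and use the last draw

rng = random.Random(42)
score = [0.1, 0.9, 0.2, 0.8].__getitem__  # current model scores per item
neg = sample_negative("misranked", positive=2, n_items=4, score=score, rng=rng)
assert score(neg) > score(2)  # items 1 and 3 both outscore the positive
```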
- negative_count: pydantic.PositiveInt = 1#
The number of negative items to sample for each positive item in the training data. With BPR loss, the positive item is compared to each negative item; with logistic loss, the positive item is treated once per learning round, so this setting effectively makes the model learn on _n_ negatives per positive, rather than giving positive and negative examples equal weight.
- positive_weight: pydantic.PositiveFloat = 1.0#
A weighting multiplier to apply to the positive item’s loss, to adjust the relative importance of positive and negative classifications. Only applies to logistic loss.
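The interaction between negative_count and positive_weight under logistic loss can be sketched in pure Python (an illustrative formulation, not the FlexMF training code): each positive contributes one weighted term, and each of its sampled negatives contributes an unweighted term.

```python
# Illustrative weighted logistic loss for one positive and its sampled
# negatives (not the FlexMF implementation).
import math

def logistic_loss(pos_score, neg_scores, positive_weight=1.0):
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    # one weighted term for the positive item...
    loss = -positive_weight * math.log(sigmoid(pos_score))
    # ...and one unweighted term per sampled negative
    loss += sum(-math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return loss

# Raising positive_weight increases the positive item's share of the loss:
base = logistic_loss(1.5, [0.2, -0.4], positive_weight=1.0)
weighted = logistic_loss(1.5, [0.2, -0.4], positive_weight=2.0)
assert weighted > base
```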
- user_bias: bool | None = None#
Whether to learn a user bias term. If unspecified, the default depends on the loss function (
Falsefor pairwise andTruefor logistic).
- convolution_layers: pydantic.NonNegativeInt = 0#
The number of LightGCN convolution layers to use. 0 (the default) configures for standard matrix factorization.
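As a rough, simplified sketch of what LightGCN-style convolution layers do (hypothetical dense-Python version; the actual model works on the sparse user-item graph in PyTorch): each layer propagates embeddings over a normalized adjacency matrix, and the final embedding averages the outputs of all layers, with zero layers reducing to plain matrix factorization.

```python
# Simplified, illustrative LightGCN-style propagation (not LensKit code).

def propagate(adj, emb):
    """One convolution layer: new_emb[i] = sum_j adj[i][j] * emb[j]."""
    return [
        [sum(adj[i][j] * emb[j][k] for j in range(len(emb)))
         for k in range(len(emb[0]))]
        for i in range(len(adj))
    ]

def lightgcn_embeddings(adj, emb, n_layers):
    """Average the initial embeddings with each layer's propagated output."""
    layers = [emb]
    for _ in range(n_layers):
        layers.append(propagate(adj, layers[-1]))
    n = len(layers)
    return [
        [sum(layer[i][k] for layer in layers) / n for k in range(len(emb[0]))]
        for i in range(len(emb))
    ]

# With n_layers=0 the embeddings pass through unchanged (plain MF):
emb = [[1.0, 0.0], [0.0, 1.0]]
adj = [[0.0, 1.0], [1.0, 0.0]]
assert lightgcn_embeddings(adj, emb, 0) == emb
```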
- selected_negative_strategy()#
- Return type:
NegativeStrategy
- classmethod apply_preset(data)#
- check_strategies()#
- class lenskit.flexmf.FlexMFImplicitScorer(config=None, **kwargs)#
Bases:
lenskit.flexmf._base.FlexMFScorerBase
Implicit-feedback rating prediction with FlexMF. This is capable of realizing multiple models, including:
- BPR-MF (Bayesian personalized ranking) [RFGSchmidtThieme09] (with "pairwise" loss)
- Logistic matrix factorization [Joh14] (with "logistic" loss)
All use configurable negative sampling, including the sampling approach from WARP.
- Stability:
Experimental
- Parameters:
config (object | None)
kwargs (Any)
- config: FlexMFImplicitConfig#
The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.
- create_trainer(data, options)#
Create a model trainer to train this model.
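For comparison with the logistic loss, the BPR pairwise loss used by the implicit scorer can be sketched as follows (an illustrative formulation, not the FlexMF training code): the positive item is compared against each sampled negative, maximizing the log-probability that the positive outranks the negative.

```python
# Illustrative BPR pairwise loss (not the FlexMF implementation):
# sum over negatives of -log sigmoid(pos_score - neg_score).
import math

def bpr_loss(pos_score, neg_scores):
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return sum(-math.log(sigmoid(pos_score - s)) for s in neg_scores)

# A larger positive-over-negative margin yields a smaller loss:
assert bpr_loss(2.0, [0.0]) < bpr_loss(1.0, [0.0])
```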