lenskit.flexmf#

Flexible PyTorch matrix factorization models for LensKit.

The components in this package implement several matrix factorization models for LensKit, and also serve as an example for practical PyTorch recommender training.

Stability: Internal

This API is at the internal or experimental stability level: it may change at any time, and breaking changes will not necessarily be described in the release notes. See Stability Levels for details. FlexMF is provided as a preview release, and may change in the next months as we gain more experience with it.

Classes#

FlexMFConfigBase

Common configuration for all FlexMF scoring components.

FlexMFScorerBase

Base class for the FlexMF scorers, providing common Torch support.

FlexMFExplicitConfig

Configuration for FlexMFExplicitScorer. This class overrides certain base class defaults for better explicit-feedback performance.

FlexMFExplicitScorer

Explicit-feedback rating prediction with FlexMF. This realizes a biased matrix factorization model trained with PyTorch.

FlexMFImplicitConfig

Configuration for FlexMFImplicitScorer. It inherits base model options from FlexMFConfigBase.

FlexMFImplicitScorer

Implicit-feedback rating prediction with FlexMF. This is capable of realizing multiple models.

Package Contents#

class lenskit.flexmf.FlexMFConfigBase#

Bases: lenskit.config.common.EmbeddingSizeMixin, pydantic.BaseModel

Common configuration for all FlexMF scoring components.

Stability:

Experimental

embedding_size: pydantic.PositiveInt = 64#

The dimension of user and item embeddings (number of latent features to learn).

batch_size: int = 8192#

The training batch size.

learning_rate: float = 0.01#

The learning rate for training.

epochs: int = 10#

The number of training epochs.

regularization: float = 0.01#

The regularization strength.

Note

The explicit-feedback model uses a different default strength.

reg_method: Literal['AdamW', 'L2'] | None = 'AdamW'#

The regularization method to use.

With the default AdamW regularization, training will use the AdamW optimizer with weight decay. With L2 regularization, training will use sparse gradients and the torch.optim.SparseAdam optimizer.

Note

The explicit-feedback model defaults this setting to "L2".

None

Use no regularization.

"L2"

Use L2 regularization on the parameters used in each training batch. The strength is applied to the _mean_ norms in a batch, so that the regularization term scale is not dependent on the batch size.

"AdamW"

Use torch.optim.AdamW with the specified regularization strength. This configuration does not use sparse gradients, but training time is often comparable.

Note

Regularization values do not necessarily have the same range or meaning for the different regularization methods.
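The batch-size independence of the "L2" method can be sketched in plain Python. This is a conceptual illustration, not FlexMF's actual implementation, and the function name is hypothetical:

```python
import math

def l2_reg_term(embeddings, strength):
    """Batch-mean L2 penalty: the strength multiplies the *mean* squared
    norm of the embeddings used in this batch, so the term's scale does
    not grow with the batch size."""
    mean_sq_norm = sum(sum(x * x for x in e) for e in embeddings) / len(embeddings)
    return strength * mean_sq_norm

small_batch = [[0.5, -0.5], [1.0, 0.0]]
large_batch = small_batch * 8  # same vectors repeated: 16 rows instead of 2

# Because the penalty averages over the batch, both batches yield the same term.
assert math.isclose(l2_reg_term(small_batch, 0.01), l2_reg_term(large_batch, 0.01))
```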

class lenskit.flexmf.FlexMFScorerBase(config=None, **kwargs)#

Bases: lenskit.training.UsesTrainer, lenskit.pipeline.Component

Base class for the FlexMF scorers, providing common Torch support.

Stability:

Experimental

Parameters:
  • config (object | None)

  • kwargs (Any)

config: FlexMFConfigBase#

The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.

users: lenskit.data.Vocabulary#
items: lenskit.data.Vocabulary#
model: lenskit.flexmf._model.FlexMFModel#
to(device)#

Move the model to a different device.

__call__(query, items)#

Generate item scores for a user.

Note that users and items are identified by user and item IDs, not by positions.

Return type:

lenskit.data.ItemList
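Conceptually, a biased matrix factorization scorer computes each score as the sum of bias terms and the inner product of the user and item embeddings. A minimal sketch (illustrative only; `mf_score` is a hypothetical name, not part of the FlexMF API):

```python
def mf_score(global_bias, user_bias, item_bias, user_vec, item_vec):
    """Biased matrix-factorization score: bias terms plus the dot
    product of the user and item embeddings."""
    return (global_bias + user_bias + item_bias
            + sum(u * i for u, i in zip(user_vec, item_vec)))

# Score one (user, item) pair with 3 latent features.
score = mf_score(3.5, 0.2, -0.1, [0.1, 0.4, -0.2], [0.3, 0.5, 0.1])
```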

score_items(users, items)#

Score users and items, after resolving them and limiting to known users and items.

Return type:

torch.Tensor

class lenskit.flexmf.FlexMFExplicitConfig#

Bases: lenskit.flexmf._base.FlexMFConfigBase

Configuration for FlexMFExplicitScorer. This class overrides certain base class defaults for better explicit-feedback performance.

Stability:

Experimental

regularization: float = 0.1#

The regularization strength.

Note

The explicit-feedback model uses a different default strength.

reg_method: Literal['AdamW', 'L2'] | None = 'L2'#

The regularization method to use.

With AdamW regularization, training will use the AdamW optimizer with weight decay. With L2 regularization (the default for this class), training will use sparse gradients and the torch.optim.SparseAdam optimizer.

Note

The explicit-feedback model defaults this setting to "L2".

None

Use no regularization.

"L2"

Use L2 regularization on the parameters used in each training batch. The strength is applied to the _mean_ norms in a batch, so that the regularization term scale is not dependent on the batch size.

"AdamW"

Use torch.optim.AdamW with the specified regularization strength. This configuration does not use sparse gradients, but training time is often comparable.

Note

Regularization values do not necessarily have the same range or meaning for the different regularization methods.

class lenskit.flexmf.FlexMFExplicitScorer(config=None, **kwargs)#

Bases: lenskit.flexmf._base.FlexMFScorerBase

Explicit-feedback rating prediction with FlexMF. This realizes a biased matrix factorization model (similar to lenskit.als.BiasedMF) trained with PyTorch.

Stability:

Experimental

Parameters:
  • config (object | None)

  • kwargs (Any)

global_bias: float#
config: FlexMFExplicitConfig#

The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.

create_trainer(data, options)#

Create a model trainer to train this model.

score_items(users, items)#

Score users and items, after resolving them and limiting to known users and items.

Return type:

torch.Tensor

class lenskit.flexmf.FlexMFImplicitConfig#

Bases: lenskit.flexmf._base.FlexMFConfigBase

Configuration for FlexMFImplicitScorer. It inherits base model options from FlexMFConfigBase.

Stability:

Experimental

preset: Literal['bpr', 'warp', 'lightgcn'] | None = None#

Select preset defaults to mimic a particular model’s original presentation.

loss: ImplicitLoss = 'logistic'#

The loss to use for model training.

negative_strategy: NegativeStrategy | None = None#

The negative sampling strategy. The default is "misranked" for WARP loss and "uniform" for other losses.

negative_count: pydantic.PositiveInt = 1#

The number of negative items to sample for each positive item in the training data. With BPR loss, the positive item is compared to each negative item; with logistic loss, the positive item is treated once per learning round, so this setting effectively makes the model learn on _n_ negatives per positive, rather than giving positive and negative examples equal weight.
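Per-positive negative sampling with the "uniform" strategy can be sketched as follows. This is an illustration only, not FlexMF's sampler, and the function name is hypothetical:

```python
import random

def sample_negatives(positive_items, all_items, negative_count, rng):
    """For each positive item, uniformly sample `negative_count` items
    the user did not interact with (the "uniform" strategy)."""
    candidates = [i for i in all_items if i not in positive_items]
    return {pos: rng.sample(candidates, negative_count) for pos in positive_items}

rng = random.Random(42)
negs = sample_negatives({10, 20}, range(100), 3, rng)
# Each positive item gets 3 negatives, none of which are positives.
assert all(len(v) == 3 for v in negs.values())
assert all(n not in {10, 20} for v in negs.values() for n in v)
```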

positive_weight: pydantic.PositiveFloat = 1.0#

A weighting multiplier to apply to the positive item’s loss, to adjust the relative importance of positive and negative classifications. Only applies to logistic loss.

user_bias: bool | None = None#

Whether to learn a user bias term. If unspecified, the default depends on the loss function (False for pairwise and True for logistic).

item_bias: bool = True#

Whether to learn an item bias term.

convolution_layers: pydantic.NonNegativeInt = 0#

The number of LightGCN convolution layers to use. 0 (the default) configures for standard matrix factorization.
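A toy sketch of what LightGCN-style layer propagation computes: each layer averages neighbor embeddings over the user-item graph with symmetric square-root degree normalization, and the final embedding averages all layers. This is illustrative only, not FlexMF's implementation, and `lightgcn_embeddings` is a hypothetical name:

```python
import math

def lightgcn_embeddings(user_vecs, item_vecs, interactions, layers):
    """LightGCN-style propagation on a bipartite user-item graph.
    With layers == 0 this reduces to standard matrix factorization."""
    u_deg = {u: sum(1 for uu, _ in interactions if uu == u) for u in user_vecs}
    i_deg = {i: sum(1 for _, ii in interactions if ii == i) for i in item_vecs}
    u_layers, i_layers = [dict(user_vecs)], [dict(item_vecs)]
    for _ in range(layers):
        prev_u, prev_i = u_layers[-1], i_layers[-1]
        new_u = {u: [0.0] * len(v) for u, v in prev_u.items()}
        new_i = {i: [0.0] * len(v) for i, v in prev_i.items()}
        for u, i in interactions:
            # Symmetric sqrt-degree normalization for this edge.
            w = 1.0 / math.sqrt(u_deg[u] * i_deg[i])
            for d in range(len(prev_u[u])):
                new_u[u][d] += w * prev_i[i][d]
                new_i[i][d] += w * prev_u[u][d]
        u_layers.append(new_u)
        i_layers.append(new_i)
    # Final embeddings: mean over all layers, including layer 0.
    n = len(u_layers)
    final_u = {u: [sum(layer[u][d] for layer in u_layers) / n
                   for d in range(len(v))] for u, v in user_vecs.items()}
    final_i = {i: [sum(layer[i][d] for layer in i_layers) / n
                   for d in range(len(v))] for i, v in item_vecs.items()}
    return final_u, final_i

users = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
items = {"i1": [0.5, 0.5], "i2": [0.2, 0.8]}
edges = [("u1", "i1"), ("u2", "i1"), ("u2", "i2")]
fu, fi = lightgcn_embeddings(users, items, edges, layers=1)
# With layers=0, the embeddings are unchanged (standard MF):
fu0, _ = lightgcn_embeddings(users, items, edges, layers=0)
assert fu0 == users
```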

selected_negative_strategy()#
Return type:

NegativeStrategy

classmethod apply_preset(data)#
check_strategies()#
class lenskit.flexmf.FlexMFImplicitScorer(config=None, **kwargs)#

Bases: lenskit.flexmf._base.FlexMFScorerBase

Implicit-feedback rating prediction with FlexMF. This is capable of realizing multiple models, including:

  • BPR-MF (Bayesian personalized ranking) [RFGSchmidtThieme09] (with "pairwise" loss)

  • Logistic matrix factorization [Joh14] (with "logistic" loss)

All use configurable negative sampling, including the sampling approach from WARP.
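The difference between the pairwise (BPR) and logistic objectives can be sketched in a few lines. These are conceptual definitions, not FlexMF's training code, and the function names are hypothetical:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_loss(pos_score, neg_score):
    """BPR-style pairwise loss: push the positive item's score above
    the sampled negative's score."""
    return -math.log(sigmoid(pos_score - neg_score))

def logistic_loss(pos_score, neg_score, positive_weight=1.0):
    """Logistic loss: classify the positive as 1 and the negative as 0,
    optionally up-weighting the positive term (cf. positive_weight)."""
    return (-positive_weight * math.log(sigmoid(pos_score))
            - math.log(1.0 - sigmoid(neg_score)))

# A correctly ranked pair (positive above negative) has low pairwise loss.
assert pairwise_loss(2.0, -1.0) < pairwise_loss(-1.0, 2.0)
```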

Stability:

Experimental

Parameters:
  • config (object | None)

  • kwargs (Any)

config: FlexMFImplicitConfig#

The component configuration object. Component classes that support configuration must redefine this attribute with their specific configuration class type, which can be a Python dataclass or a Pydantic model class.

create_trainer(data, options)#

Create a model trainer to train this model.