lenskit.metrics.predict#

Prediction accuracy metrics. See eval-predict-accuracy for an overview and instructions on using these metrics.

Attributes#

Classes#

PredictMetric

Extension to the metric function interface for prediction metrics.

RMSE

Compute RMSE (root mean squared error). This is computed as:

MAE

Compute MAE (mean absolute error). This is computed as:

AvgErrorAccumulator

Module Contents#

type lenskit.metrics.predict.MissingDisposition = Literal['error', 'ignore']#
type lenskit.metrics.predict.ScoreArray = NDArray[np.floating] | pd.Series#
type lenskit.metrics.predict.PredMetric = Callable[[ScoreArray, ScoreArray], float]#
class lenskit.metrics.predict.PredictMetric(missing_scores='error', missing_truth='error')#

Bases: lenskit.metrics._base.Metric

Extension to the metric function interface for prediction metrics.

In addition to the general metric interface, predict metrics can be called with a single item list (or item list collection) that has both scores and a rating field.

Parameters:
  • missing_scores (MissingDisposition) – The action to take when a test item has not been scored. The default throws an exception, avoiding situations where non-scored items are silently excluded from overall statistics.

  • missing_truth (MissingDisposition) – The action to take when no test items are available for a scored item. The default is also to fail; if you are scoring a superset of the test items for computational efficiency, set this to "ignore".

Stability:
Caller (see Stability Levels).
default = None#
missing_scores: MissingDisposition#
missing_truth: MissingDisposition#
align_scores(predictions, truth=None)#

Align prediction scores and rating values, applying the configured missing dispositions. The result is two pandas Series, predictions and truth, that are aligned and checked for missing data in accordance with the configured options.

Return type:

tuple[pandas.Series[float], pandas.Series[float]]
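To illustrate the alignment behavior described above, here is a simplified sketch (not LensKit's implementation; the helper and its defaults are hypothetical) that aligns two pandas Series on their index and applies the two dispositions:

```python
import pandas as pd


def align_scores(predictions, truth, missing_scores="error", missing_truth="error"):
    """Sketch: align predictions with truth ratings on item ID."""
    preds, truth = predictions.align(truth, join="outer")
    # Test items the scorer did not score appear as NaN predictions.
    if preds.isna().any():
        if missing_scores == "error":
            raise ValueError("unscored test items present")
        mask = preds.notna()
        preds, truth = preds[mask], truth[mask]
    # Scored items with no test rating appear as NaN truth values.
    if truth.isna().any():
        if missing_truth == "error":
            raise ValueError("scored items without test ratings present")
        mask = truth.notna()
        preds, truth = preds[mask], truth[mask]
    return preds, truth
```

With missing_truth="ignore", scoring a superset of the test items simply drops the extra scored items before the error is computed.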

class lenskit.metrics.predict.RMSE(missing_scores='error', missing_truth='error')#

Bases: PredictMetric

Compute RMSE (root mean squared error). This is computed as:

\[\sqrt{\frac{1}{|R|} \sum_{r_{ui} \in R} \left(r_{ui} - s(i|u)\right)^2}\]

This metric does not apply any fallbacks; if you want to compute RMSE with fallback predictions (e.g. using a bias model when a collaborative filter cannot predict), generate predictions with FallbackScorer.
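As a sketch of the formula above (with hypothetical ratings and scores), RMSE is the root of the mean squared error over the rated items:

```python
import numpy as np

# Hypothetical test ratings r_ui and predicted scores s(i|u).
truth = np.array([4.0, 3.0, 5.0, 2.0])
scores = np.array([3.5, 3.0, 4.0, 2.5])

# Square the per-item errors, average over |R| items, then take the root.
rmse = np.sqrt(np.mean((truth - scores) ** 2))
```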

Stability:
Caller (see Stability Levels).
measure_list(predictions, test=None, /)#

Compute measurements for a single list.

Returns:

  • A float for simple metrics

  • Intermediate data for decomposed metrics

  • A dict mapping metric names to values for multi-metric classes

Return type:

tuple[float, int]

extract_list_metrics(data)#

Extract per-list metric(s) from intermediate measurement data.

Returns:

  • A float for simple metrics

  • A dict mapping metric names to values for multi-metric classes

  • None if no per-list metrics are available

Parameters:

data (tuple[float, int])

create_accumulator()#

Create an accumulator to aggregate per-list measurements into summary metrics.

Each result from measure_list() is passed to Accumulator.add().

class lenskit.metrics.predict.MAE(missing_scores='error', missing_truth='error')#

Bases: PredictMetric

Compute MAE (mean absolute error). This is computed as:

\[\frac{1}{|R|} \sum_{r_{ui} \in R} \left|r_{ui} - s(i|u)\right|\]

This metric does not apply any fallbacks; if you want to compute MAE with fallback predictions (e.g. using a bias model when a collaborative filter cannot predict), generate predictions with FallbackScorer.
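As with RMSE above, the MAE formula can be sketched directly (hypothetical ratings and scores) as the mean of the absolute per-item errors:

```python
import numpy as np

# Hypothetical test ratings r_ui and predicted scores s(i|u).
truth = np.array([4.0, 3.0, 5.0, 2.0])
scores = np.array([3.5, 3.0, 4.0, 2.5])

# Average the absolute errors over the |R| rated items.
mae = np.mean(np.abs(truth - scores))
```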

Stability:
Caller (see Stability Levels).
measure_list(predictions, test=None, /)#

Compute measurements for a single list.

Returns:

  • A float for simple metrics

  • Intermediate data for decomposed metrics

  • A dict mapping metric names to values for multi-metric classes

Return type:

tuple[float, int]

extract_list_metrics(data)#

Extract per-list metric(s) from intermediate measurement data.

Returns:

  • A float for simple metrics

  • A dict mapping metric names to values for multi-metric classes

  • None if no per-list metrics are available

Parameters:

data (tuple[float, int])

create_accumulator()#

Create an accumulator to aggregate per-list measurements into summary metrics.

Each result from measure_list() is passed to Accumulator.add().

class lenskit.metrics.predict.AvgErrorAccumulator(root=False)#
Parameters:

root (bool)

add(value)#
Parameters:

value (tuple[float, int])

accumulate()#
Return type:

dict[str, float]
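Since add() receives the tuple[float, int] values produced by measure_list(), the accumulator's behavior can be sketched as summing per-list (error total, item count) pairs and dividing, taking a square root when root=True. This is a simplified illustration, not LensKit's code, and the "mean" result key is an assumption:

```python
import math


class AvgErrorAccumulator:
    """Sketch: aggregate per-list (error sum, item count) pairs."""

    def __init__(self, root=False):
        self.root = root  # take sqrt of the mean (RMSE) vs. plain mean (MAE)
        self.total = 0.0
        self.count = 0

    def add(self, value):
        err, n = value  # one measure_list() result
        self.total += err
        self.count += n

    def accumulate(self):
        mean = self.total / self.count
        return {"mean": math.sqrt(mean) if self.root else mean}
```

Accumulating totals and counts separately (rather than averaging per-list averages) weights every rated item equally in the overall metric.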

Exported Aliases#

class lenskit.metrics.predict.ItemList#

Re-exported alias for lenskit.data.ItemList.

lenskit.metrics.predict.ITEM_COMPAT_COLUMN#

Re-exported alias for lenskit.data._adapt.ITEM_COMPAT_COLUMN.

lenskit.metrics.predict.normalize_columns()#

Re-exported alias for lenskit.data._adapt.normalize_columns().

class lenskit.metrics.predict.ValueStatAccumulator#

Re-exported alias for lenskit.data.accum.ValueStatAccumulator.

class lenskit.metrics.predict.AliasedColumn#

Re-exported alias for lenskit.data.types.AliasedColumn.

class lenskit.metrics.predict.Metric#

Re-exported alias for lenskit.metrics._base.Metric.