Data Splitting

The lenskit.splitting package provides tools for splitting data into training and test sets for evaluation.

Output Types

TTSplit(train, test[, name])

A train-test set from splitting or other sources.

Temporal Splitting

split_global_time

Global temporal train-test split.
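As an illustration of the idea only, a global temporal split can be sketched in plain Python. This is not LensKit's implementation and does not use its API: the helper name `global_time_split`, the `timestamp` field name, and the choice to send rows at or after the cutoff to the test set are all assumptions made for this example.

```python
from datetime import datetime

def global_time_split(rows, cutoff, field="timestamp"):
    """Split rows into (train, test) around one cutoff shared by all users."""
    # Assumption: rows at or after the cutoff go to the test set.
    train = [r for r in rows if r[field] < cutoff]
    test = [r for r in rows if r[field] >= cutoff]
    return train, test

# Hypothetical interaction records, for illustration only.
rows = [
    {"user": 1, "item": 10, "timestamp": datetime(2020, 1, 5)},
    {"user": 2, "item": 11, "timestamp": datetime(2020, 3, 9)},
    {"user": 1, "item": 12, "timestamp": datetime(2020, 7, 1)},
    {"user": 3, "item": 10, "timestamp": datetime(2020, 9, 20)},
]
train, test = global_time_split(rows, datetime(2020, 6, 1))
```

Because the cutoff is global, a user may end up entirely in the training set or entirely in the test set, which per-user holdout methods avoid.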

User-Based Splitting

crossfold_users

Partition a dataset user-by-user for user-based cross-validation.
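The general shape of user-based cross-validation can be sketched as follows. This is a minimal stdlib illustration, not the LensKit function: `crossfold_users_sketch`, its parameters, and the `(user, item)` tuple layout are hypothetical. It assumes users are partitioned into `k` disjoint groups, and that for each group a fixed number of each test user's rows is held out while every other row stays in the training data.

```python
import random

def crossfold_users_sketch(rows, k, n_test, seed=42):
    """Yield (train, test) pairs, one per fold, partitioning by user."""
    rng = random.Random(seed)
    users = sorted({user for user, _ in rows})
    rng.shuffle(users)
    for fold in range(k):
        # Each user lands in exactly one fold's test-user group.
        test_users = set(users[fold::k])
        by_user = {}
        for idx, (user, _) in enumerate(rows):
            if user in test_users:
                by_user.setdefault(user, []).append(idx)
        # Hold out up to n_test randomly chosen rows per test user.
        test_idx = set()
        for idxs in by_user.values():
            test_idx.update(rng.sample(idxs, min(n_test, len(idxs))))
        train = [r for i, r in enumerate(rows) if i not in test_idx]
        test = [r for i, r in enumerate(rows) if i in test_idx]
        yield train, test

# Six hypothetical users with four interactions each.
rows = [(user, item) for user in range(6) for item in range(4)]
folds = list(crossfold_users_sketch(rows, k=3, n_test=1))
```

With 6 users and 3 folds, each fold tests 2 users and holds out 1 row from each, so the remaining rows of those users still inform training.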

sample_users

Create train-test splits by sampling users.

LastFrac

Select a fraction of test rows per user/item, based on ordering by a field.

LastN

Select a fixed number of test rows per user/item, based on ordering by a field.

SampleFrac

Randomly select a fraction of test rows per user/item.

SampleN

Randomly select a fixed number of test rows per user/item.
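The difference between the ordered and random holdout strategies can be sketched on a single user's rows. The helpers below (`last_n`, `last_frac`, `sample_n`) are hypothetical stand-ins written for this example, not the LensKit classes of similar name. The `timestamp` ordering field and the use of `round` to turn a fraction into a row count are assumptions.

```python
import random

def last_n(rows, n, field="timestamp"):
    """Return the n rows that come last when ordered by `field`."""
    ordered = sorted(rows, key=lambda r: r[field])
    return ordered[-n:] if n > 0 else []

def last_frac(rows, frac, field="timestamp"):
    """Return the last `frac` share of rows (count rounded; an assumption)."""
    return last_n(rows, round(len(rows) * frac), field)

def sample_n(rows, n, rng):
    """Return n rows chosen uniformly at random, ignoring order."""
    return rng.sample(rows, min(n, len(rows)))

# One hypothetical user's interaction history.
history = [{"item": i, "timestamp": t} for i, t in [(7, 3), (4, 1), (9, 5), (2, 2), (5, 4)]]

newest_two = last_n(history, 2)
newest_share = last_frac(history, 0.4)
random_two = sample_n(history, 2, random.Random(0))
```

Ordered holdouts mimic predicting a user's future from their past, while random holdouts treat all of a user's interactions as exchangeable.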

Record-Based Splitting

crossfold_records

Partition a dataset by records into cross-fold partitions.
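Record-based cross-folding ignores user boundaries and partitions the interaction rows themselves. A minimal sketch of that idea, assuming a shuffled round-robin assignment of rows to folds, is shown below. The function name `crossfold_records_sketch` and its parameters are hypothetical, and this is not LensKit's implementation.

```python
import random

def crossfold_records_sketch(rows, k, seed=42):
    """Yield (train, test) pairs where each row is a test row in exactly one fold."""
    rng = random.Random(seed)
    order = list(range(len(rows)))
    rng.shuffle(order)
    for fold in range(k):
        # Every k-th shuffled index forms this fold's test set.
        test_idx = set(order[fold::k])
        train = [r for i, r in enumerate(rows) if i not in test_idx]
        test = [r for i, r in enumerate(rows) if i in test_idx]
        yield train, test

# Ten hypothetical (user, item) interactions.
records = [(user, item) for user in range(5) for item in range(2)]
record_folds = list(crossfold_records_sketch(records, k=5))
```

Unlike user-based splitting, nothing here guarantees that a given user appears in both the training and test data of a fold.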

sample_records

Create a train-test split of data by randomly sampling individual interactions.