lenskit.splitting.sample_users

lenskit.splitting.sample_users(data: Dataset, size: int, method: HoldoutMethod, *, repeats: int, disjoint: bool = True, test_only: bool = False, rng: RNGInput = None) → Iterator[TTSplit]
lenskit.splitting.sample_users(data: Dataset, size: int, method: HoldoutMethod, *, disjoint: bool = True, rng: RNGInput = None, test_only: bool = False, repeats: None = None) → TTSplit

Create train-test splits by sampling users. When repeats is None, this function returns a single train-test split; otherwise, it returns an iterator over multiple splits. In particular, repeats=1 still returns an iterator, which yields a single train-test pair.

Stability:
Caller (see Stability Levels).
Parameters:
  • data (Dataset) – The dataset containing ratings or other data you wish to partition.

  • size (int) – The sample size (the number of users to sample for each split).

  • method (HoldoutMethod) – The method for obtaining user test ratings.

  • repeats (int | None) – The number of samples to produce. If None, a single split is returned directly instead of an iterator.

  • test_only (bool) – If True, returns splits with empty training sets (useful when you just want to save the test data).

  • rng (lenskit.random.RNGInput) – The random number generator or seed (see Random Seeds).

  • disjoint (bool) – If True, force the user samples for different splits to be disjoint.

Returns:

The train-test pair(s).

Return type:

Iterator[TTSplit] | TTSplit
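
The interaction between repeats and disjoint can be illustrated with a small standard-library sketch. This is not LensKit's implementation: sample_user_ids is a hypothetical stand-in that only picks the test users, and raising an error when there are too few users for disjoint samples is an assumption made here for simplicity. It mirrors the return-shape contract described above: a bare result when repeats is None, an iterator otherwise.

```python
import random
from typing import Iterator, List, Optional, Sequence, Union


def sample_user_ids(
    users: Sequence[str],
    size: int,
    *,
    repeats: Optional[int] = None,
    disjoint: bool = True,
    seed: Optional[int] = None,
) -> Union[List[str], Iterator[List[str]]]:
    """Hypothetical stand-in (not the LensKit API): choose test users per split."""
    rng = random.Random(seed)

    def generate(n: int) -> Iterator[List[str]]:
        if disjoint:
            # Assumption: fail when disjoint samples cannot be formed.
            if n * size > len(users):
                raise ValueError("not enough users for disjoint samples")
            # Shuffle once, then cut consecutive non-overlapping slices.
            pool = list(users)
            rng.shuffle(pool)
            for i in range(n):
                yield pool[i * size:(i + 1) * size]
        else:
            # Independent draws: a user may appear in several samples.
            for _ in range(n):
                yield rng.sample(list(users), size)

    if repeats is None:
        # A single sample is returned directly, not wrapped in an iterator.
        return next(generate(1))
    return generate(repeats)


users = [f"u{i}" for i in range(10)]

single = sample_user_ids(users, 3, seed=42)                   # one list of 3 users
splits = list(sample_user_ids(users, 3, repeats=3, seed=42))  # 3 lists of 3 users
one = sample_user_ids(users, 3, repeats=1, seed=0)            # iterator with 1 item
```

With disjoint=True, no user appears in more than one of the three samples in splits; with disjoint=False, overlaps between samples are possible.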