lenskit.pipeline.Pipeline
=========================

.. py:class:: lenskit.pipeline.Pipeline(config, nodes, *, run_hooks)
   :canonical: lenskit.pipeline._impl.Pipeline

   LensKit recommendation pipeline.  This is the core abstraction for using
   LensKit models and other components to produce recommendations in a useful
   way.  It allows you to wire together components in (mostly) arbitrary graphs,
   train them on data, and serialize pipelines to disk for use elsewhere.

   Pipelines should not be directly instantiated; they must be built with a
   :class:`~lenskit.pipeline.PipelineBuilder`, or loaded from a configuration
   with :meth:`from_config`.  If you have a scoring model and just want to
   generate recommendations with a default setup and minimal configuration, see
   :func:`~lenskit.pipeline.topn_pipeline` or
   :class:`~lenskit.pipeline.RecPipelineBuilder`.

   Pipelines are also :class:`~lenskit.training.Trainable`, and train all
   trainable components.

   :Stability: Caller

   .. py:property:: config
      :type: lenskit.pipeline.config.PipelineConfig

      Get the pipeline configuration.

      .. important::

         Do not modify the configuration returned, or it will become
         out-of-sync with the pipeline and likely not behave correctly.

   .. py:property:: name
      :type: str | None

      Get the pipeline name (if configured).

   .. py:property:: version
      :type: str | None

      Get the pipeline version (if configured).

   .. py:method:: meta()

      Get the metadata (name, version, hash, etc.) for this pipeline without
      returning the whole config.

   .. py:method:: nodes()

      Get the nodes in the pipeline graph.

   .. py:method:: node(node: str, *, missing: Literal['error'] = 'error') -> lenskit.pipeline.nodes.Node[object]
                  node(node: str, *, missing: Literal['none'] | None) -> lenskit.pipeline.nodes.Node[object] | None
                  node(node: lenskit.pipeline.nodes.Node[T]) -> lenskit.pipeline.nodes.Node[T]

      Get the pipeline node with the specified name.  If passed a node, it
      returns the node or fails if the node is not a member of the pipeline.

      :param node: The name of the pipeline node to look up, or a node to check
                   for membership.

      :returns: The pipeline node, if it exists.

      :raises KeyError: The specified node does not exist.

   .. py:property:: default_node
      :type: lenskit.pipeline.nodes.Node[Any] | None

      Get the default node for this pipeline.

   .. py:method:: component_names()

      Get the component names (in topological order).

   .. py:method:: node_input_connections(node)

      Get the input wirings for a node.

   .. py:method:: component(node)

      Get the component at a particular node, if any.

   .. py:method:: clone()

      Clone the pipeline, **without** its trained parameters.

      :returns: A new pipeline with the same components and wiring, but fresh
                instances created by round-tripping the configuration.

   .. py:property:: config_hash
      :type: str

      Get a hash of the pipeline's configuration to uniquely identify it for
      logging, version control, or other purposes.

      The hash format and algorithm are not guaranteed, but are stable within a
      LensKit version.  For the same version of LensKit and component code, the
      same configuration will produce the same hash, so long as there are no
      literal nodes.  Literal nodes will *usually* hash consistently, but since
      literals other than basic JSON values are hashed by pickling, hash
      stability depends on the stability of the pickle bytestream.

      In LensKit 2025.1, the configuration hash is computed by serializing the
      pipeline configuration to JSON *without* a hash and returning the
      hex-encoded SHA256 hash of that serialization.

   .. py:method:: from_config(config, file_path = None)
      :staticmethod:

      Reconstruct a pipeline from a serialized configuration.

      :param config: The configuration object, as loaded from JSON, TOML, YAML,
                     or similar.  Will be validated into a
                     :class:`PipelineConfig`.
      :param file_path: The path of the file from which the configuration was
                        read, when available.

      :returns: The configured (but not trained) pipeline.

      :raises PipelineError: If there is a configuration error reconstructing
                             the pipeline.

      :Warns: **PipelineWarning** -- If the configuration is unusual but usable;
              for example, the configuration includes a hash but the constructed
              pipeline does not have a matching hash.

   .. py:method:: load_config(cfg_file)
      :classmethod:

      Load a pipeline from a saved configuration file.

      :param cfg_file: The path to a TOML, YAML, or JSON file containing the
                       pipeline configuration.

      :returns: The constructed pipeline.

      .. seealso::

         :meth:`from_config` for the actual pipeline instantiation logic.

   .. py:method:: modify()

      Create a pipeline builder from this pipeline in order to modify it.

      Pipelines cannot be modified in-place, but this method sets up a new
      builder that will create a modified copy of the pipeline.  Unmodified
      component instances are reused as-is.

      .. note::

         Since default connections are applied in
         :meth:`~lenskit.pipeline.PipelineBuilder.build`, the modifying builder
         does not have default connections.

   .. py:method:: train(data, options = None)

      Trains the pipeline's trainable components (those implementing the
      :class:`Trainable` interface) on some training data.

      If the :attr:`~TrainingOptions.retrain` option is ``False``, this method
      will skip training components that are already trained (as reported by
      the :meth:`~Trainable.is_trained` method).

      .. admonition:: Random Number Generation
         :class: note

         If :attr:`TrainingOptions.rng` is set and is not a generator or bit
         generator (i.e. it is a seed), then this method wraps the seed in a
         :class:`~numpy.random.SeedSequence` and calls
         :meth:`~numpy.random.SeedSequence.spawn` to generate a distinct seed
         for each component in the pipeline.

      :param data: The dataset to train on.
      :param options: The training options.  If ``None``, default options are
                      used.

   .. py:method:: run(/, **kwargs: object) -> object
                  run(node: str, /, **kwargs: object) -> object
                  run(nodes: tuple[str, Ellipsis], /, **kwargs: object) -> tuple[object, Ellipsis]
                  run(node: lenskit.pipeline.nodes.Node[T], /, **kwargs: object) -> T
                  run(nodes: tuple[lenskit.pipeline.nodes.Node[T1], lenskit.pipeline.nodes.Node[T2]], /, **kwargs: object) -> tuple[T1, T2]
                  run(nodes: tuple[lenskit.pipeline.nodes.Node[T1], lenskit.pipeline.nodes.Node[T2], lenskit.pipeline.nodes.Node[T3]], /, **kwargs: object) -> tuple[T1, T2, T3]
                  run(nodes: tuple[lenskit.pipeline.nodes.Node[T1], lenskit.pipeline.nodes.Node[T2], lenskit.pipeline.nodes.Node[T3], lenskit.pipeline.nodes.Node[T4]], /, **kwargs: object) -> tuple[T1, T2, T3, T4]
                  run(nodes: tuple[lenskit.pipeline.nodes.Node[T1], lenskit.pipeline.nodes.Node[T2], lenskit.pipeline.nodes.Node[T3], lenskit.pipeline.nodes.Node[T4], lenskit.pipeline.nodes.Node[T5]], /, **kwargs: object) -> tuple[T1, T2, T3, T4, T5]

      Run the pipeline and obtain the return value(s) of one or more of its
      components.  See :ref:`pipeline-execution` for details of the pipeline
      execution model.

      :param nodes: The component(s) to run.
      :param kwargs: The pipeline's inputs, as defined with
                     :meth:`create_input`.  These are passed as-is to
                     :meth:`run_all`, so they can also contain auxiliary
                     options like ``_profile``.

      :returns: The pipeline result.  If no nodes are supplied, this is the
                result of the default node.  If a single node is supplied, it is
                the result of that node.  If a tuple of nodes is supplied, it is
                a tuple of their results.

      :raises PipelineError: when there is a pipeline configuration error (e.g.
                             a cycle).
      :raises ValueError: when one or more required inputs are missing.
      :raises TypeError: when one or more required inputs has an incompatible
                         type.
      :raises other: exceptions thrown by components are passed through.

   .. py:method:: run_all(*nodes, _profile = None, **kwargs)

      Run all nodes in the pipeline, or all nodes required to fulfill the
      requested node, and return a mapping with the full pipeline state (the
      data attached to each node).  This is useful in cases where client code
      needs to be able to inspect the data at arbitrary steps of the pipeline.
      It differs from :meth:`run` in two ways:

      1. It returns the data from all nodes as a mapping (dictionary-like
         object), not just the specified nodes as a tuple.
      2. If no nodes are specified, it runs *all* nodes.  This has the
         consequence of running nodes that are not required to fulfill the
         last node (such scenarios typically result from using
         :meth:`use_first_of`).

      :param nodes: The nodes to run, as positional arguments (if no nodes are
                    specified, this method runs all nodes).
      :param _profile: A profiler to profile this pipeline run.
      :param kwargs: The pipeline inputs.

      :returns: The full pipeline state, with :attr:`~PipelineState.default`
                set to the last node specified.
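
The hashing scheme described under ``config_hash`` (serialize the configuration to JSON without its hash, then take the hex-encoded SHA256 digest) can be illustrated with the standard library alone. This is a minimal sketch and not LensKit's implementation: the function name ``sketch_config_hash``, the plain-dictionary configuration, the ``meta``/``hash`` key layout, and the JSON serialization settings are all assumptions made for illustration, so the digests it produces will not match real ``config_hash`` values.

```python
import hashlib
import json


def sketch_config_hash(config: dict) -> str:
    """Hash a configuration mapping, ignoring any hash already stored in it.

    Hypothetical helper for illustration only; not part of LensKit.
    """
    unhashed = dict(config)
    # Assumption: the stored hash lives under config["meta"]["hash"].
    # Drop it so the stored hash never feeds into the newly computed one.
    if "meta" in unhashed:
        unhashed["meta"] = {k: v for k, v in unhashed["meta"].items() if k != "hash"}
    # Sorted keys and fixed separators make the serialization deterministic.
    # Assumption: LensKit's actual serialization settings may differ.
    payload = json.dumps(unhashed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A made-up configuration; the component class name is a placeholder.
config = {
    "meta": {"name": "demo", "version": "1"},
    "components": {"scorer": {"class": "example.Scorer"}},
}
digest = sketch_config_hash(config)

# Storing the digest back into the metadata does not change the computed hash.
stored = {**config, "meta": {**config["meta"], "hash": digest}}
```

Because the stored hash is excluded before serializing, a loader can recompute the digest from a saved configuration and compare it with the recorded one, which is the kind of mismatch that ``from_config`` is documented to warn about.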