lenskit.data.EntityAttribute#
- class lenskit.data.EntityAttribute(name, spec, table, vocab, rows)#
Bases:
abc.ABC
Base class for an attribute associated with an entity class. This class effectively represents a _column_ of a data table of entities: the attribute values for one or more entities. In that regard, it is similar to a Pandas series: it records entity IDs/numbers, like an index, and associated attribute values.
This is the general interface for all entity attributes. Not all access methods are supported for all layouts.
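The relationship between entity IDs, entity numbers, and attribute values can be pictured with a small NumPy sketch. This is only an illustration of the "column with an index" idea described above, not LensKit's actual storage, which is Arrow-based. All names here (`vocab_ids`, `numbers`, `values`, `lookup`) are hypothetical, and it assumes an entity number is the entity's position in the vocabulary.

```python
import numpy as np

# Hypothetical vocabulary: an entity's number is assumed to be its position here
vocab_ids = np.array([101, 205, 318, 422])

# Hypothetical attribute column covering only some of the entities
numbers = np.array([0, 2, 3], dtype=np.int32)  # entity numbers, like an index
values = np.array([1995, 2001, 2010])          # one attribute value per row

# Translate entity numbers back to entity IDs
ids = vocab_ids[numbers]

# Look up an attribute value by entity ID, as one would with a Pandas series
lookup = dict(zip(ids.tolist(), values.tolist()))
```

In this sketch, `ids` plays the role of the result of `ids()`, `numbers` that of `numbers()`, and `values` that of `numpy()`.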
- Stability:
- Caller (see Stability Levels).
- Parameters:
name (str)
table (pyarrow.Table)
vocab (lenskit.data._vocab.Vocabulary)
rows (pyarrow.Int32Array | None)
- layout: lenskit.data.schema.AttrLayout#
The attribute layout.
- property data_type: pyarrow.DataType#
Get the data type of this attribute set.
- Abstractmethod:
- Return type:
pyarrow.DataType
- ids()#
Get the entity IDs for this collection of entities.
- Return type:
- id_index()#
Get the entity IDs as a Pandas index.
- Return type:
- numbers()#
Get the entity numbers for the attributes.
- Return type:
numpy.ndarray[tuple[int], numpy.dtype[numpy.int32]]
- abstractmethod cat_matrix(*, normalize=None)#
Compute a categorical matrix representation of the attribute.
- Parameters:
normalize (Literal['unit', 'distribution'] | None) – Optional normalization method. “unit”: normalize each row to unit length. “distribution”: normalize each row so that its elements sum to 1.
- Returns:
- A tuple containing:
matrix (numpy.ndarray or scipy.sparse.csr_array): The categorical matrix.
vocab (Vocabulary or None): The vocabulary associated with the categories.
- Return type:
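As a rough illustration of what the two `normalize` options mean, the sketch below applies them to a small dense NumPy matrix. It is not LensKit's implementation: `normalize_rows` is a hypothetical helper, "unit length" is assumed to mean the Euclidean (L2) norm, and leaving all-zero rows unchanged is a choice made for this example only.

```python
import numpy as np


def normalize_rows(matrix, normalize=None):
    """Row-normalize a dense matrix (illustrative helper, not a LensKit function)."""
    matrix = np.asarray(matrix, dtype=np.float64)
    if normalize is None:
        return matrix
    if normalize == "unit":
        # Assumption: "unit length" means each row has Euclidean norm 1
        scale = np.linalg.norm(matrix, axis=1, keepdims=True)
    elif normalize == "distribution":
        # Each row is divided by its sum so that its elements sum to 1
        scale = matrix.sum(axis=1, keepdims=True)
    else:
        raise ValueError(f"unknown normalization: {normalize!r}")
    # Leave all-zero rows unchanged instead of dividing by zero
    scale = np.where(scale == 0, 1.0, scale)
    return matrix / scale


# Rows are entities, columns are categories
counts = np.array([[3.0, 4.0, 0.0], [1.0, 1.0, 2.0], [0.0, 0.0, 0.0]])
unit = normalize_rows(counts, "unit")
dist = normalize_rows(counts, "distribution")
```

With "unit", the rows are suited to cosine-style comparisons; with "distribution", each non-empty row can be read as a probability distribution over categories.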
- property dim_names: list[str] | None#
Get the names attached to this attribute’s dimensions.
Note
Only applicable to vector and sparse attributes.
- abstractmethod pandas(*, missing='null')#
- Parameters:
missing (Literal['null', 'omit'])
- Return type:
- numpy()#
Get the attribute values as a NumPy array.
Note
Undefined attribute values may have undefined contents; they will _usually_ be NaN or similar, but this is not fully guaranteed.
- Return type:
numpy.typing.NDArray[Any]
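Because undefined entries are not guaranteed to be NaN, callers may want to mask them explicitly before computing statistics. The sketch below uses a hand-built array as a stand-in for the result of `numpy()` and assumes a floating-point attribute in which undefined values do appear as NaN. When that assumption is not safe, `drop_null()` is the documented way to restrict the attribute to defined entities.

```python
import numpy as np

# Stand-in for the array returned by numpy(); NaN marks undefined entries here
values = np.array([4.5, np.nan, 3.0, np.nan, 5.0])

# Boolean mask of the entries that are defined (not NaN)
defined = ~np.isnan(values)

# Compute the mean over defined entries only
mean_defined = values[defined].mean()
```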
- arrow()#
Get the attribute values as an Arrow array.
- Return type:
pyarrow.Array[Any] | pyarrow.ChunkedArray[Any]
- scipy()#
Get this attribute as a SciPy sparse array (if it is sparse), or a NumPy array if it is dense.
- Return type:
numpy.typing.NDArray[Any] | scipy.sparse.csr_array
- torch()#
- Return type:
- drop_null()#
Subset this attribute set to only entities for which it is defined.
- __len__()#