myfm.utils.encoders.MultipleValuesToSparseEncoder
- class myfm.utils.encoders.MultipleValuesToSparseEncoder(items: typing.Iterable[str], min_freq: int = 1, sep: str = ',', normalize: bool = True, handle_unknown: typing_extensions.Literal['create', 'ignore', 'raise'] = 'create')
Bases: myfm.utils.encoders.categorical.CategoryValueToSparseEncoder[str]
The class to N-hot encode a list of items into a sparse matrix representation.
- __init__(items: typing.Iterable[str], min_freq: int = 1, sep: str = ',', normalize: bool = True, handle_unknown: typing_extensions.Literal['create', 'ignore', 'raise'] = 'create')
Construct the encoder from an iterable of strings, each of which is a list of items concatenated by sep.
- Parameters
items (Iterable[str]) – Iterable of strings, each of which is a concatenated list of possibly multiple items.
min_freq (int, optional) – The minimum frequency an item must have to be retained in the list of known items. Defaults to 1.
sep (str, optional) – The separator used to split each string back into a list of items. Defaults to ','.
normalize (bool, optional) – If True, each non-zero entry in the encoded matrix has the value 1 / N ** 0.5, where N is the number of non-zero entries in that row. Defaults to True.
handle_unknown (Literal["create", "ignore", "raise"], optional) – How to handle previously unseen values during encoding. If "create", there is a single category named "__UNK__" for unknown values, and it is treated as the 0th category. If "ignore", such an item is ignored. If "raise", a KeyError is raised. Defaults to "create".
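The parameters above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the helper names build_vocabulary and encode_row are hypothetical, the sorted vocabulary order is an assumption made only for determinism, and only the "create" behaviour of handle_unknown is modelled. Each row is returned as a dict of column index to value instead of a sparse matrix.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Iterable

UNKNOWN = "__UNK__"  # category name taken from the handle_unknown description


def build_vocabulary(
    items: Iterable[str], min_freq: int = 1, sep: str = ","
) -> Dict[str, int]:
    """Map each retained item to a column index; column 0 is kept for unknowns."""
    counts = Counter(
        token for row in items for token in row.split(sep) if token
    )
    # Assumption: sorted order, chosen only to make this sketch deterministic.
    kept = sorted(token for token, freq in counts.items() if freq >= min_freq)
    vocab = {UNKNOWN: 0}
    for token in kept:
        vocab[token] = len(vocab)
    return vocab


def encode_row(
    row: str, vocab: Dict[str, int], sep: str = ",", normalize: bool = True
) -> Dict[int, float]:
    """N-hot encode one row as {column index: value}, "create" mode only."""
    # Unseen items fall back to column 0, the "__UNK__" category.
    columns = {vocab.get(token, 0) for token in row.split(sep) if token}
    if not columns:
        return {}
    # With normalize=True every non-zero entry is 1 / N ** 0.5.
    value = 1.0 / sqrt(len(columns)) if normalize else 1.0
    return {col: value for col in sorted(columns)}


# "drama" appears only once, so min_freq=2 drops it from the known items.
vocab = build_vocabulary(
    ["action,comedy", "comedy", "drama,comedy,action"], min_freq=2
)
# "western" was never seen, so it lands in column 0.
encoded = encode_row("action,comedy,western", vocab)
```

Here the encoded row has three non-zero columns (the unknown column plus "action" and "comedy"), so with normalize=True each entry equals 1 / 3 ** 0.5 and the row has unit Euclidean norm.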
Methods
__init__(items[, min_freq, sep, normalize, ...]) – Construct the encoder from an iterable of strings, each of which is a list of items concatenated by sep.
names() – Description of each non-zero entry.
to_sparse(items)
- names() → List[str]
Description of each non-zero entry.