myfm.utils.benchmark_data.MovieLens100kDataManager

class myfm.utils.benchmark_data.MovieLens100kDataManager(zippath: Optional[pathlib.Path] = None)

Bases: myfm.utils.benchmark_data.loader_base.MovieLensBase

The data manager for the MovieLens 100k dataset.

__init__(zippath: Optional[pathlib.Path] = None)

Methods

__init__([zippath])

genres()

load_movie_info()

Load movie meta information.

load_rating_all()

Load the entire rating dataset.

load_rating_kfold_split(K, fold[, random_state])

Load the entire dataset and split it into train/test sets.

load_rating_predefined_split(fold)

Read the pre-defined train/test split.

load_user_info()

Load user meta information.

Attributes

DEFAULT_PATH

DOWNLOAD_URL

load_movie_info() → pandas.core.frame.DataFrame

Load movie meta information.

Returns

A dataframe containing meta-information (id, title, release_date, url, genres) about the movies. When a movie has multiple genres, they are concatenated with “|”.

Return type

pd.DataFrame
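Because the genres of a movie arrive as one “|”-separated string, downstream code usually has to split them back into a list. The following is a minimal sketch of that step using only the standard library; the helper name `split_genres` and the sample rows are hypothetical and are not part of myfm.

```python
from typing import Dict, List


def split_genres(genres: str) -> List[str]:
    """Split a '|'-joined genre string into a list of genre names.

    Hypothetical helper for illustration; it is not provided by myfm.
    """
    # Drop empty pieces so that an empty string yields an empty list.
    return [genre for genre in genres.split("|") if genre]


# Made-up rows that mimic the "title" and "genres" columns described above.
sample_rows: List[Dict[str, str]] = [
    {"title": "Hypothetical Movie A", "genres": "Action|Comedy"},
    {"title": "Hypothetical Movie B", "genres": "Drama"},
]

genres_by_title = {row["title"]: split_genres(row["genres"]) for row in sample_rows}
```

On the real dataframe, the pandas string accessor performs the same split column-wise, for example `movie_info["genres"].str.split("|")`, assuming `movie_info` holds the result of `load_movie_info()`.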

load_rating_all() → pandas.core.frame.DataFrame

Load the entire rating dataset.

Returns

All the available ratings.

Return type

pd.DataFrame

load_rating_kfold_split(K: int, fold: int, random_state: Optional[int] = 0) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

Load the entire dataset and split it into train/test sets using a K-fold scheme.

Parameters
  • K (int) – K in the K-fold splitting scheme.

  • fold (int) – fold index.

  • random_state (Union[np.random.RandomState, int, None], optional) – Controls the random state of the split.

Returns

Train and test dataframes.

Return type

Tuple[pd.DataFrame, pd.DataFrame]

Raises

ValueError – When 0 <= fold < K is not met.
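To make the roles of K, fold, and random_state concrete, here is a minimal sketch of a K-fold split over row indices. It is an illustration under stated assumptions, not the library's implementation: the function name `kfold_indices` is hypothetical, it shuffles with the standard-library `random` module rather than NumPy, and it returns index lists instead of dataframes. The only behaviour taken from the documentation above is that a fold outside 0 <= fold < K raises ValueError.

```python
import random
from typing import List, Optional, Tuple


def kfold_indices(
    n_rows: int, K: int, fold: int, random_state: Optional[int] = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) for one fold of a K-fold split.

    Hypothetical helper for illustration; myfm's actual splitting logic
    may differ in how it shuffles and partitions the rows.
    """
    if not 0 <= fold < K:
        raise ValueError("fold must satisfy 0 <= fold < K")

    # Shuffle row indices reproducibly; the same seed gives the same order
    # for every fold, so the K test sets partition the data.
    indices = list(range(n_rows))
    random.Random(random_state).shuffle(indices)

    # The fold-th contiguous chunk of the shuffled order is the test set.
    start = (n_rows * fold) // K
    end = (n_rows * (fold + 1)) // K
    test = indices[start:end]
    train = indices[:start] + indices[end:]
    return train, test


train_idx, test_idx = kfold_indices(n_rows=10, K=5, fold=0, random_state=0)
```

With a fixed random_state, calling the function once per fold in range(K) yields test sets that are disjoint and together cover every row, which is the property that makes the folds usable for cross-validation.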

load_rating_predefined_split(fold: int) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]

Read the pre-defined train/test split. Fold index ranges from 1 to 5.

Parameters

fold (int) – Specifies the fold index.

Returns

Train and test dataframes.

Return type

Tuple[pd.DataFrame, pd.DataFrame]
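Since the pre-defined folds are indexed from 1 to 5, a small loop is enough to visit all of them. The sketch below assumes only the method signature documented above; the helper name `iter_predefined_folds` is hypothetical, and it accepts any object exposing `load_rating_predefined_split`, so it can be tried without downloading the dataset.

```python
from typing import Any, Iterable, Iterator, Tuple


def iter_predefined_folds(
    manager: Any, folds: Iterable[int] = range(1, 6)
) -> Iterator[Tuple[int, Any, Any]]:
    """Yield (fold, train, test) for each pre-defined fold.

    Hypothetical helper for illustration; `manager` is expected to be a
    MovieLens100kDataManager or any object with the same method.
    """
    for fold in folds:
        # Fold indices are 1-based for the pre-defined split.
        train, test = manager.load_rating_predefined_split(fold)
        yield fold, train, test


# Intended use with the real data manager (not executed here because it
# needs the myfm package and the MovieLens 100k archive):
#
#   from myfm.utils.benchmark_data import MovieLens100kDataManager
#   manager = MovieLens100kDataManager()
#   for fold, train, test in iter_predefined_folds(manager):
#       print(fold, len(train), len(test))
```

Note the indexing difference from load_rating_kfold_split, where the fold index is 0-based and must satisfy 0 <= fold < K.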

load_user_info() → pandas.core.frame.DataFrame

Load user meta information.

Returns

User information.

Return type

pd.DataFrame