Toloka documentation


crowdkit.datasets.load_dataset.load_dataset

load_dataset(dataset: str, data_dir: Optional[str] = None)

Downloads a dataset from a remote source and loads it into pandas objects.

If the dataset has already been downloaded, it is loaded from the cache instead.

Parameters Description

Parameters | Type | Description
dataset | str | Name of the dataset to load.
data_dir | Optional[str] | Path to the folder in which to store the downloaded dataset. Defaults to None, in which case ~/crowdkit_data is used. The folder can alternatively be set through the 'CROWDKIT_DATA' environment variable.
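
The lookup order for the storage folder can be sketched as follows. This is only an illustration: resolve_data_dir is a hypothetical helper, not part of crowdkit, and the precedence it encodes (explicit data_dir first, then the CROWDKIT_DATA environment variable, then ~/crowdkit_data) is an assumption based on the parameter description above.

```python
import os
from typing import Optional


def resolve_data_dir(data_dir: Optional[str] = None) -> str:
    """Hypothetical helper: pick the folder used for downloaded datasets."""
    if data_dir is None:
        # Assumed fallback chain: CROWDKIT_DATA if it is set,
        # otherwise the documented default ~/crowdkit_data.
        data_dir = os.environ.get(
            "CROWDKIT_DATA", os.path.join("~", "crowdkit_data")
        )
    # Expand a leading "~" to the current user's home directory.
    return os.path.expanduser(data_dir)
```

Under these assumptions, an explicit argument such as resolve_data_dir("/data/crowdkit") is returned unchanged, while a call with no argument falls back to the environment variable or the default folder.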

  • Returns:

    Tuple[pd.DataFrame, pd.Series], a tuple of workers' answers and ground truth labels.

  • Return type:

    Tuple[DataFrame, Series]
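
A minimal usage sketch, assuming crowdkit is installed and that 'relevance-2' is one of the available dataset names. The wrapper load_relevance2 is a hypothetical name introduced here for illustration only.

```python
from typing import Optional


def load_relevance2(data_dir: Optional[str] = None):
    """Hypothetical wrapper: return (answers, ground_truth) for one dataset."""
    # Imported lazily so that defining this function needs neither
    # crowdkit nor network access; both are required when it is called.
    from crowdkit.datasets import load_dataset

    # 'relevance-2' is an assumed dataset name; substitute the one you need.
    answers, ground_truth = load_dataset("relevance-2", data_dir=data_dir)
    return answers, ground_truth
```

Calling answers, ground_truth = load_relevance2() unpacks the returned tuple into the DataFrame of workers' answers and the Series of ground truth labels. Per the description above, the first call downloads the data and later calls read it from the cache.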