MM-cat DaRe

Dataset Repository

DaRe (Dataset Repository) was created to document and showcase the transformation of single-model datasets into their various multi-model representations. For each dataset, DaRe provides a comprehensive record of the initial data as well as details of the multi-model data generation process. Learn more here or choose a dataset and explore!




Yelp dataset

The Yelp dataset contains business information, such as reviews, user data, and business attributes, offering a view of consumer interactions and feedback.

Original Contents: 5 JSON files, ~30,000KB

Yelp dataset SK


IMDb dataset

The IMDb dataset contains multiple files covering a wide range of information about films, TV shows, and media professionals.

Original Contents: 7 TSV files, ~1,100KB

IMDb dataset SK


SWAPI dataset

The SWAPI dataset holds detailed information about the Star Wars universe, covering a variety of entities and their interrelations.

Original Contents: 6 JSON files, ~61KB

SWAPI dataset SK


BibleData dataset

The BibleData dataset is a complex dataset containing structured information on Bible texts, translations, and metadata.

Original Contents: 9 CSV files, ~2,800KB

BibleData dataset SK


NASA dataset

The NASA dataset details NASA’s various code projects.

Original Contents: 1 JSON file, ~3,200KB

NASA dataset SK