MM-cat DaRe

SWAPI Dataset

SWAPI (Star Wars API) is a web-based API providing structured and detailed information about the Star Wars universe, covering a variety of entities and their interrelations. The SWAPI dataset covers all the available data from the API.

The data is stored in JSON format, in the following files:

  • people with detailed information about characters
  • films with information about the Star Wars movies
  • planets with details about planets
  • species with characteristics of species
  • vehicles with specifications of vehicles
  • starships with information similar to vehicles, plus some additional details

SWAPI dataset

Initial Dataset Specifications

Entity    | Data Link | Mapping
People    |           | Mapping
Films     |           | Mapping
Planets   |           | Mapping
Species   |           | Mapping
Vehicles  |           | Mapping
Starships |           | Mapping

Generated Dataset Specifications

The film data is enriched by embedding detailed information from all referenced entities. Instead of storing only references (URLs) to related data, each film document includes selected details of the referenced entities as embedded objects. The transformed dataset is then stored in MongoDB, with each film represented as a comprehensive, self-contained document.

By embedding all related details into each film document and storing it in MongoDB, the transformed SWAPI dataset becomes highly accessible, efficient, and scalable. It supports a wide range of applications, from backend APIs to analytical tools, and aligns well with MongoDB’s strengths in handling nested, hierarchical data.
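
The enrichment described above can be illustrated with a minimal Python sketch. It is not the MM-cat transformation itself, only an in-memory illustration of the idea. The sample records are hand-written in the shape SWAPI uses (entities identified by a url field, films referencing other entities through arrays of URLs), and the names SELECTED_FIELDS, build_index and embed_film are hypothetical helpers introduced for this example.

```python
from typing import Any

# Hand-written sample records in the SWAPI shape (illustrative only):
# every entity carries a "url", and a film references entities by URL.
people = [
    {"name": "Luke Skywalker", "height": "172", "url": "https://swapi.dev/api/people/1/"},
    {"name": "Leia Organa", "height": "150", "url": "https://swapi.dev/api/people/5/"},
]
planets = [
    {"name": "Tatooine", "climate": "arid", "url": "https://swapi.dev/api/planets/1/"},
]
film = {
    "title": "A New Hope",
    "episode_id": 4,
    "characters": [
        "https://swapi.dev/api/people/1/",
        "https://swapi.dev/api/people/5/",
    ],
    "planets": ["https://swapi.dev/api/planets/1/"],
}

# Assumption: which details are embedded for each reference array.
SELECTED_FIELDS: dict[str, list[str]] = {
    "characters": ["name", "height"],
    "planets": ["name", "climate"],
}


def build_index(*collections: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Map each entity URL to its record so references can be resolved."""
    return {record["url"]: record for coll in collections for record in coll}


def embed_film(
    film: dict[str, Any],
    index: dict[str, dict[str, Any]],
    selected: dict[str, list[str]] = SELECTED_FIELDS,
) -> dict[str, Any]:
    """Return a copy of the film with URL arrays replaced by embedded objects."""
    enriched = dict(film)  # shallow copy; the input film is left unchanged
    for key, fields in selected.items():
        enriched[key] = [
            {field: index[url][field] for field in fields}
            for url in film.get(key, [])
            if url in index  # skip references that cannot be resolved
        ]
    return enriched


index = build_index(people, planets)
enriched_film = embed_film(film, index)
```

Each resulting document is a plain dictionary, so a list of them could be written to MongoDB with, for example, pymongo's insert_many. Note that every reference here is an array of URLs, which is the kind of reference the note below identifies as the current obstacle.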

Entity | Output Mapping
Films  | Output Mapping

Note: While it is technically possible to define a mapping that embeds details from referenced entities into the film data, it is currently not possible to generate the transformed data. The Transformation modules in MM-cat cannot yet handle array references effectively, so we do not provide the transformed dataset at this stage. Enhancements to the Transformation modules that address this limitation are planned for future development.