Bookshelf

bookshelf is one way Climate Resource reuses datasets across projects.

The bookshelf represents a shared collection of curated datasets, or Books. Each Book is a preprocessed, versioned dataset that includes the notebooks used to produce it. As the underlying datasets or processing are updated, new Books can be created: with an updated version if the data changed, or an updated edition if only the processing changed. A single dataset may produce multiple Resources if different representations are useful. These Books can be deployed to a shared Bookshelf so that they are accessible to other users.

Users can use specific Books within other projects. The dataset and its associated metadata are fetched and cached locally. Specific versions of Books can also be pinned for reproducibility.
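
To make the idea of pinning concrete, here is a minimal sketch of how a request for a Book could be resolved against the versions and editions available on a shelf. It does not use the bookshelf package itself: `BookVersion` and `resolve` are hypothetical names introduced for illustration, and representing a version as a tuple of integers is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True, order=True)
class BookVersion:
    """Hypothetical identifier for one Book (not part of bookshelf)."""

    version: Tuple[int, ...]  # assumed to change when the underlying data changes
    edition: int              # assumed to change when only the processing changes


def resolve(
    available: Iterable[BookVersion],
    version: Optional[Tuple[int, ...]] = None,
    edition: Optional[int] = None,
) -> BookVersion:
    """Return the pinned Book if one is requested, otherwise the newest one."""
    candidates = [
        book
        for book in available
        if (version is None or book.version == version)
        and (edition is None or book.edition == edition)
    ]
    if not candidates:
        raise LookupError(f"no Book matches version={version}, edition={edition}")
    # order=True compares (version, edition), so max() is the newest candidate
    return max(candidates)
```

Leaving both arguments unset returns the newest Book, while passing a version (and optionally an edition) keeps a project on a fixed dataset even after newer Books are deployed.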

This repository contains the notebooks that are used to generate the Books as well as a CLI tool for managing these datasets.

This is a prototype and will likely change in future. Other potential ideas:

  • Deployed data are made available via api.climateresource.com.au so that they can be consumed and queried smartly
  • Simple web page to allow querying the data

Each Book is described by a datapackage that holds its metadata. This datapackage lists the associated Resources and their hashes. Each Resource is fetched when it is first used and then cached for later use.
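
The fetch-on-first-use behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not bookshelf's implementation: the function `get_resource`, the resource keys `name`, `path` and `hash`, the use of SHA-256, and the injected `download` callable are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def get_resource(
    resource: dict,
    cache_dir: Path,
    download: Callable[[str], bytes],
) -> Path:
    """Return a local path for a Resource, downloading it only if needed.

    `resource` is assumed to look like one entry of the datapackage:
    {"name": ..., "path": ..., "hash": ...} (hypothetical keys).
    """
    target = cache_dir / resource["name"]

    # Cache hit: the file exists and still matches the recorded hash.
    if target.exists() and sha256_of(target.read_bytes()) == resource["hash"]:
        return target

    # Cache miss (or stale file): fetch, verify, then store.
    data = download(resource["path"])
    if sha256_of(data) != resource["hash"]:
        raise ValueError(f"hash mismatch for resource {resource['name']!r}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Passing the downloader in as a callable keeps the sketch independent of any particular HTTP client. Checking the hash both on a cache hit and after a download means a corrupted or outdated file is never returned silently.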

Where to next?

If you want to use bookshelf in your own project, we recommend going to our how-to guides. Some other potential points of interest: