Development#
Notes for developers. If you want to get involved, please do! We welcome all kinds of contributions, for example:
- docs fixes/clarifications
- bug reports
- bug fixes
- feature requests
- pull requests
- tutorials
Workflows#
We don't mind whether you use a branching or forking workflow. However, please only push to your own branches. Pushing to other people's branches is often a recipe for disaster and, in our experience, is never required, so it is best avoided.
Try to keep your merge requests as small as possible (focus on one thing if you can). This makes life much easier for reviewers, which allows contributions to be accepted at a faster rate.
Installation#
For development, we rely on uv for all our dependency management. To get started, you will need to make sure that uv is installed (instructions here).
This project is a uv workspace, which means that it contains more than one Python package. By default, uv commands target the root bookshelf package, but if you wish to target another package you can use the --package flag.
For all of our work, we use our Makefile. You can read the instructions out of it and run the commands by hand if you wish, but we generally discourage this because it can be error-prone. To create your environment, run make virtual-environment.
If there are any issues, the messages from the Makefile should guide you through. If not, please raise an issue in the issue tracker.
Language#
We use British English for our development. We do this for consistency with the broader work context of our lead developers.
Versioning#
This package follows the version format described in PEP 440 and uses Semantic Versioning to describe how the version should change depending on the updates to the code base.
Our commit messages are written to follow the conventional commits standard, which makes it easy to find the commits that matter when traversing the commit history.
Note
We don't use the commit messages from conventional commits to automatically generate the changelog and release documentation.
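As a rough illustration of what a conventional commit header looks like, the sketch below checks the first line of a commit message against the general `type(scope)!: description` shape. The function name `is_conventional_commit` and the list of accepted types are assumptions made for this example; they are not taken from this project's tooling or configuration.

```python
import re

# Header shape: type, optional (scope), optional "!" for breaking changes,
# then ": " and a non-empty description.
# ASSUMPTION: this list of types reflects common usage, not this project's rules.
CONVENTIONAL_COMMIT_HEADER = re.compile(
    r"^(?P<type>build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional_commit(message: str) -> bool:
    """Return True if the first line of `message` looks like a conventional commit header."""
    lines = message.splitlines()
    if not lines:
        # An empty message has no header line to check
        return False
    return CONVENTIONAL_COMMIT_HEADER.match(lines[0]) is not None
```

For example, `is_conventional_commit("feat(notebooks): add new volume")` would be accepted, whereas a free-form message such as `"Fixed a bug"` would not.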
The notebooks generating the datasets#
The top-level directory notebooks contains the notebooks used to produce the Books. Each notebook corresponds to a single Volume (a collection of Books with the same name).
Each notebook also has a corresponding .yaml file containing the latest metadata for the Book. See the NotebookMetadata schema (bookshelf.schema.NotebookMetadata) for the expected format of this file.
Creating a new Volume#
- Start by copying example.py and example.yaml and renaming them to the name of the new volume. This provides a simple example to get started.
- Update {volume}.yaml with the correct metadata
- Update the fetch and processing steps as needed, adding additional Resources to the Book as needed.
- Run the notebook and check the output
- TODO Perform the release procedure to upload the built book to the remote BookShelf
bookshelf save {volume}
Updating a Volume's version#
- Update the version attribute in the metadata file
- Modify other metadata attributes as needed
- Update the data fetching and processing steps in the notebook
- Run the notebook and check the output
- TODO Perform the release procedure to upload the built book to the remote BookShelf
Testing a notebook locally#
You can run a notebook with a specified output directory for local testing:
The generated book can then be used directly from the local directory. Note that the path to the custom directory needs to include the version of the Book. When loading the Book, you must also specify the version and the edition; otherwise, it will query the remote bookshelf.
import bookshelf
shelf = bookshelf.BookShelf("/path/to/custom/directory/{version}")
edition = 1
new_book = shelf.load("{notebook_name}", version="{version}", edition=edition)
Releasing#
Releasing is semi-automated via a CI job. The CI job requires the type of version bump that will be performed to be manually specified. See the poetry docs for the list of available bump rules.
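To make the effect of a bump rule concrete, here is a minimal sketch of how the three basic rules (major, minor and patch) change a Semantic Versioning string. The helper `bump_version` is hypothetical and exists only for illustration. It is not how the CI job or poetry implements the bump, and it deliberately ignores the pre-release rules that poetry also supports.

```python
def bump_version(version: str, bump_rule: str) -> str:
    """Apply a major, minor or patch bump to a MAJOR.MINOR.PATCH version string.

    ASSUMPTION: `version` is a plain three-part version with no pre-release
    or local suffix, e.g. "1.2.3".
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if bump_rule == "major":
        # Breaking change: reset minor and patch
        return f"{major + 1}.0.0"
    if bump_rule == "minor":
        # Backwards-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if bump_rule == "patch":
        # Backwards-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unsupported bump rule: {bump_rule!r}")
```

With this sketch, bumping "1.2.3" with the rule "minor" gives "1.3.0", while "major" gives "2.0.0".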
Standard process#
The steps required are the following:
- Bump the version: manually trigger the "bump" stage from the latest commit in main (pipelines are here). A valid "bump_rule" (see https://python-poetry.org/docs/cli/#version) will need to be specified via the "BUMP_RULE" CI variable (see https://docs.gitlab.com/ee/ci/variables/). This will then trigger a release, including publication to PyPI.
- Download the artefacts from the release job. The release_notes.md artefact will be pre-filled with the list of changes included in this release. You can find it in the release-bundle zip file in the artefacts section. The announcements section should be completed manually to highlight any particularly notable changes or other announcements (or deleted if not relevant for this release).
- Once the release notes are filled out, use them to make a release.
- That's it, release done, make noise on social media of choice, do whatever else
- Enjoy the newly available version