Book data analysis example
Loading a dataset
Begin by initializing a BookShelf object. Specify the desired volume and version to retrieve the corresponding book:
In [1]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
from matplotlib.pyplot import figure
from bookshelf import BookShelf
shelf = BookShelf()
volume = "rcmip-emissions"
version = "v5.1.0"
book = shelf.load(volume, version)
/home/runner/work/bookshelf/bookshelf/.venv/lib/python3.10/site-packages/scmdata/database/_database.py:9: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html import tqdm.autonotebook as tqdman
Once the book is loaded, access timeseries data in wide format by calling the book's timeseries method with the name of the timeseries to retrieve (here, "complete"). The data is returned as an scmdata.ScmRun object. Alternatively, call get_long_format_data with the same name to obtain the data in long format as a pd.DataFrame:
In [2]:
data_wide = book.timeseries("complete")
# data_long = book.get_long_format_data("complete")
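To make the wide/long distinction concrete, here is a minimal pandas-only sketch. It does not use bookshelf or scmdata; the table, its column names, and its values are made up for illustration, and the actual layout returned by get_long_format_data may differ.

```python
import pandas as pd

# Toy wide-format table: one row per timeseries, one column per year.
# Metadata columns and values are hypothetical, not taken from the book.
wide_df = pd.DataFrame(
    {
        "scenario": ["ssp119", "ssp585"],
        "variable": ["Emissions|CO2", "Emissions|CO2"],
        "unit": ["Mt CO2/yr", "Mt CO2/yr"],
        2020: [100.0, 120.0],
        2030: [80.0, 150.0],
    }
)

# Long format: one row per (timeseries, year) pair.
long_df = wide_df.melt(
    id_vars=["scenario", "variable", "unit"],
    var_name="year",
    value_name="value",
)

print(long_df)
```

Wide format keeps each timeseries on a single row, which suits timeseries-level operations; long format is usually more convenient for row-wise filtering and grouping in pandas.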
In [3]:
data_wide.filter(variable="Emissions|CO2|MAGICC AFOLU")
Out[3]:
<ScmRun (timeseries: 77, timepoints: 751)>
Time:
Start: 1750-01-01T00:00:00
End: 2500-01-01T00:00:00
Meta:
         activity_id  mip_era          model         region          scenario  \
4             ZECMIP    CMIP6      idealised          World  esm-bell-1000PgC
44            ZECMIP    CMIP6      idealised          World  esm-bell-2000PgC
84            ZECMIP    CMIP6      idealised          World   esm-bell-750PgC
124   not_applicable    CMIP5            AIM          World             rcp60
164   not_applicable    CMIP5          IMAGE          World             rcp26
...              ...      ...            ...            ...               ...
8937  not_applicable    CMIP6  REMIND-MAGPIE  World|R5.2REF       ssp534-over
9061  not_applicable    CMIP6  REMIND-MAGPIE  World|R5.2REF            ssp585
9146  not_applicable    CMIP6      idealised          World   esm-pi-CO2pulse
9186  not_applicable    CMIP6      idealised          World  esm-pi-cdr-pulse
9226  not_applicable    CMIP6      idealised          World     esm-piControl

           unit                    variable
4     Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
44    Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
84    Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
124   Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
164   Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
...         ...                         ...
8937  Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
9061  Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
9146  Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
9186  Mt CO2/yr  Emissions|CO2|MAGICC AFOLU
9226  Mt CO2/yr  Emissions|CO2|MAGICC AFOLU

[77 rows x 7 columns]
For long-format data, use standard pandas operations, such as boolean indexing on the DataFrame, to apply the filters you need.
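As a rough sketch of what such filtering can look like, the example below builds a small stand-in DataFrame and filters it with boolean masks. The column names ("scenario", "variable", "year", "value") and the values are assumptions for illustration only; check the columns of the DataFrame returned by get_long_format_data before adapting it.

```python
import pandas as pd

# Hypothetical long-format table; real column names and values may differ.
data_long = pd.DataFrame(
    {
        "scenario": ["rcp26", "rcp26", "rcp60", "rcp60"],
        "variable": [
            "Emissions|CO2|MAGICC AFOLU",
            "Emissions|CH4",
            "Emissions|CO2|MAGICC AFOLU",
            "Emissions|CH4",
        ],
        "year": [2020, 2020, 2020, 2020],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Exact match on a single variable using a boolean mask.
afolu = data_long[data_long["variable"] == "Emissions|CO2|MAGICC AFOLU"]

# Combine conditions with &; each condition needs its own parentheses.
afolu_rcp60 = data_long[
    (data_long["variable"] == "Emissions|CO2|MAGICC AFOLU")
    & (data_long["scenario"] == "rcp60")
]

# Prefix match on the variable hierarchy.
co2 = data_long[data_long["variable"].str.startswith("Emissions|CO2")]

print(afolu_rcp60)
```

Plain string comparison is used here rather than a regular expression because the "|" separator in variable names is a regex metacharacter and would otherwise need escaping.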
In [4]:
figure(figsize=(10, 6), dpi=160)
data_wide.filter(variable="Emissions|CO2|MAGICC AFOLU").lineplot()
Out[4]:
<Axes: xlabel='time', ylabel='Mt CO2/yr'>
Together, these steps cover the basic workflow: load a book from the bookshelf, filter its timeseries, and plot the result.