Package nowcastlib
Nowcast Library
🧙‍♂️🔧 Utils that can be reused and shared across and beyond the ESO Nowcast project
This is a public repository hosted on GitHub via a push mirror setup in the internal ESO GitLab repository
Installation
Simply run
pip install nowcastlib
Usage and Documentation
Nowcast Library (nowcastlib) consists of a collection of functions organized in submodules (API) and a tool accessible via the command line (CLI). The latter is primarily intended for accessing the Nowcast Library Pipeline, an opinionated yet configurable set of processing steps for wrangling data and evaluating models in a consistent and rigorous way. More information can be found on the nowcastlib pipeline index page (link to markdown and link to hosted docs).
Please refer to the examples folder on GitHub for usage examples.
API
Here is a quick example of how one may import nowcastlib and access one of its functions:
"""Example showing how to access compute_trig_fields function"""
import nowcastlib as ncl
import pandas as pd
import numpy as np
data_df = pd.DataFrame(
    np.array([[0, 3, 4, np.nan], [32, 4, np.nan, 4], [56, 8, 0, np.nan]]).T,
    columns=["A", "B", "C"],
    index=pd.date_range(start="1/1/2018", periods=4, freq="2min"),
)
result = ncl.datasets.compute_trig_fields(data_df, ["A", "C"])
More in-depth API documentation can be found here.
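This README does not describe what compute_trig_fields returns. Judging by its name alone, it plausibly derives sine and cosine fields from angular data; that is an assumption, not documented behaviour. The sketch below is independent of nowcastlib and only illustrates the general idea of trigonometric encoding. The helper name trig_encode and the choice of degrees as the input unit are hypothetical.

```python
import math


def trig_encode(angles_deg):
    """Return a (sine, cosine) pair for each angle given in degrees.

    Hypothetical helper for illustration only; not part of nowcastlib.
    NaN inputs propagate to NaN outputs.
    """
    encoded = []
    for angle in angles_deg:
        radians = math.radians(angle)
        encoded.append((math.sin(radians), math.cos(radians)))
    return encoded


# 0 and 360 degrees describe the same direction, so their encodings agree
pairs = trig_encode([0.0, 90.0, 360.0, float("nan")])
```

Encoding a periodic quantity this way removes the artificial jump between 359 and 0 degrees, which is one common reason to replace an angle with its sine and cosine before modelling.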
CLI
Some of the library's functionality is bundled in configurable subcommands accessible via the terminal with the command nowcastlib:
usage: nowcastlib [-h] [-v]
{triangulate,preprocess,sync,postprocess,datapipe} ...
positional arguments:
{triangulate,preprocess,sync,postprocess,datapipe}
available commands
triangulate Run `nowcastlib triangulate -h` for further help
preprocess Run `nowcastlib preprocess -h` for further help
sync Run `nowcastlib sync -h` for further help
postprocess Run `nowcastlib postprocess -h` for further help
datapipe Run `nowcastlib datapipe -h` for further help
optional arguments:
-h, --help show this help message and exit
-v, --verbose increase verbosity level from INFO to DEBUG
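For readers curious how a command line layout like the one above can be put together, here is a minimal sketch using the standard library's argparse. It is not nowcastlib's actual CLI code: build_parser and log_level are hypothetical names, and only the subcommand names and the -v flag are taken from the help output above.

```python
import argparse
import logging

# Subcommand names copied from the help output; everything else is illustrative
SUBCOMMANDS = ["triangulate", "preprocess", "sync", "postprocess", "datapipe"]


def build_parser():
    """Build a parser with a global verbosity flag and one subparser per command."""
    parser = argparse.ArgumentParser(prog="nowcastlib")
    parser.add_argument(
        "-v",
        "--verbose",
        action="store_true",
        help="increase verbosity level from INFO to DEBUG",
    )
    subparsers = parser.add_subparsers(dest="command", help="available commands")
    for name in SUBCOMMANDS:
        subparsers.add_parser(
            name, help=f"Run `nowcastlib {name} -h` for further help"
        )
    return parser


def log_level(args):
    """Map the parsed verbosity flag to a logging level."""
    return logging.DEBUG if args.verbose else logging.INFO


# Equivalent to typing `nowcastlib -v sync` in a terminal
args = build_parser().parse_args(["-v", "sync"])
```

Note that in this sketch the -v flag belongs to the top-level parser, so it has to appear before the subcommand name.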
Repository Structure
The following output is generated with tree . -I 'dist|docs|*.pyc|__pycache__'
.
├── LICENSE
├── Makefile # currently used to build docs
├── README.md
├── de421.bsp # not committed
├── docs/ # html files for the documentation static website
├── examples
│ ├── README.md
│ ├── cli_triangulate_config.yaml
│ ├── data/ # not committed
│ ├── datasync.ipynb
│ ├── output/ # not committed
│ ├── pipeline_datapipe.json
│ ├── pipeline_preprocess.json
│ ├── pipeline_sync.json
│ ├── signals.ipynb
│ └── triangulation.ipynb
├── images
│ └── pipeline_flow.png
├── nowcastlib # the actual source code for the library
│ ├── __init__.py
│ ├── cli
│ │ ├── __init__.py
│ │ └── triangulate.py
│ ├── datasets.py
│ ├── dynlag.py
│ ├── gis.py
│ ├── pipeline
│ │ ├── README.md
│ │ ├── __init__.py
│ │ ├── cli.py
│ │ ├── process
│ │ │ ├── __init__.py
│ │ │ ├── postprocess
│ │ │ │ ├── __init__.py
│ │ │ │ ├── cli.py
│ │ │ │ └── generate.py
│ │ │ ├── preprocess
│ │ │ │ ├── __init__.py
│ │ │ │ └── cli.py
│ │ │ └── utils.py
│ │ ├── split
│ │ │ └── __init__.py
│ │ ├── structs.py
│ │ ├── sync
│ │ │ ├── __init__.py
│ │ │ └── cli.py
│ │ └── utils.py
│ ├── signals.py
│ └── utils.py
├── poetry.lock # lock file generated by python poetry for dependency mgmt
└── pyproject.toml # general information file, handled by python poetry
Directories and Files not Committed
There are a number of files and folders that are not committed because their large size and static nature make them unsuitable for git version control. The following files and folders warrant a brief explanation.
- Certain functions (time since sunset, sun elevation) of the Nowcast Library rely on the use of a .bsp file, containing information on the locations through time of various celestial bodies in the sky. This file will be automatically downloaded upon using one of these functions for the first time.
- The example scripts make use of a data/ directory containing a series of csv files. Most of the data used in the examples can be downloaded from the ESO Ambient Condition Database. Users can then change the paths set in the examples to fit their needs. For users interested in replicating the exact structure and contents of the data directory, a compressed copy of it (1.08 GB) is available to ESO members through this Microsoft SharePoint link.
- At times the examples show the serialization functionality of the nowcastlib pipeline or need to output some data. In these situations the output/ directory in the examples folder is used.
Development Setup
This repository relies on Poetry for tracking dependencies, building and publishing. It is therefore recommended that developers install poetry and make use of it throughout their development of the project.
Dependencies
Make sure you are in the right Python environment and run
poetry install
This reads pyproject.toml, resolves the dependencies, and installs them.
Deployment
The repository is published to PyPI to make it accessible via the pip install command mentioned earlier.
To publish changes follow these steps. Ideally this process is automated via a CI tool triggered by a push/merge to the master branch:
- Optionally run
  poetry version
  with the appropriate argument based on semver guidelines.
- Update the documentation by running
  make document
- Prepare the package by running
  poetry build
- Ensure you have TestPyPI and PyPI configured as your poetry repositories:
  poetry config repositories.testpypi https://test.pypi.org/legacy/
  poetry config repositories.pypi https://pypi.org/
- Publish the repository to TestPyPI, to see that everything works as expected:
  poetry publish -r testpypi
- Stage, commit and push your changes (to master) with git.
- Publish the repository to PyPI:
  poetry publish -r pypi
Upon successful deployment, the library should be available for install via pip.
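The poetry version step accepts bump rules such as patch, minor and major. As a rough illustration of what those three rules mean under semver, here is a small sketch. The function bump_version is hypothetical and written for this README only; it is not how Poetry implements the command, and it ignores pre-release and build suffixes.

```python
def bump_version(version, rule):
    """Return the next MAJOR.MINOR.PATCH version for a semver bump rule.

    Hypothetical helper for illustration; handles plain X.Y.Z versions only.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        # Incompatible API change: reset minor and patch
        return f"{major + 1}.0.0"
    if rule == "minor":
        # Backwards-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        # Backwards-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump rule: {rule}")


next_patch = bump_version("1.2.3", "patch")
next_minor = bump_version("1.2.3", "minor")
next_major = bump_version("1.2.3", "major")
```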
"""
.. include:: ../README.md
"""
import logging
import nowcastlib.datasets
import nowcastlib.gis
import nowcastlib.signals
import nowcastlib.utils
import nowcastlib.dynlag
import nowcastlib.pipeline
try:
    import importlib.metadata as importlib_metadata
except ModuleNotFoundError:
    import importlib_metadata
__version__ = importlib_metadata.version(__name__)
__pdoc__ = {
    "cli": False,
}
# logging config {{{
root_logger = logging.getLogger(__name__)
root_logger.setLevel(logging.INFO)
logger_formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(logger_formatter)
root_logger.addHandler(console_handler)
# }}}
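The listing above sets both the package logger and its console handler to INFO, so a caller who wants DEBUG output from Python code (rather than through the CLI's -v flag) would need to raise both levels. The following is a minimal sketch of that idea. The helper enable_debug is hypothetical, and a stand-in logger named nowcastlib_demo is configured here so the example runs without the library installed.

```python
import logging


def enable_debug(logger_name):
    """Set the named logger and every handler attached to it to DEBUG."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    for handler in logger.handlers:
        handler.setLevel(logging.DEBUG)
    return logger


# Stand-in logger configured the same way as in the listing above
demo_logger = logging.getLogger("nowcastlib_demo")
demo_logger.setLevel(logging.INFO)
demo_handler = logging.StreamHandler()
demo_handler.setLevel(logging.INFO)
demo_logger.addHandler(demo_handler)

# With the real package one would presumably pass "nowcastlib" instead
enable_debug("nowcastlib_demo")
```

Raising only the logger level would not be enough in this setup, because the handler would still filter out records below INFO.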
Sub-modules
- nowcastlib.datasets: Functions for syncing and chunking multiple datasets.
- nowcastlib.dynlag: Functions for generating dynamically lagged time series.
- nowcastlib.gis: Functions for computing metrics related to Geographical information science.
- nowcastlib.pipeline: Data processing and Model evaluation pipeline for the Nowcast Library Pipeline …
- nowcastlib.signals: Utils for working with signal processing.
- nowcastlib.utils: Uncategorised utilities.