Package nowcastlib

Nowcast Library

🧙‍♂️🔧 Utils that can be reused and shared across and beyond the ESO Nowcast project

This is a public repository hosted on GitHub via a push mirror set up in the internal ESO GitLab repository.

Installation

Simply run

pip install nowcastlib

Usage and Documentation

Nowcast Library (nowcastlib) consists of a collection of functions organized in submodules (API) and a tool accessible from the command line (CLI). The latter is primarily intended for accessing the Nowcast Library Pipeline, an opinionated yet configurable set of processing steps for wrangling data and evaluating models in a consistent and rigorous way. More information can be found on the nowcastlib pipeline index page (link to markdown and link to hosted docs).

Please refer to the examples folder on GitHub for usage examples.

API

Here is a quick example of how one may import nowcastlib and access one of its functions:

"""Example showing how to access compute_trig_fields function"""
import nowcastlib as ncl
import pandas as pd
import numpy as np

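# Three columns of four samples each; NaN marks missing readings.
# np.nan is used because the np.NaN alias was removed in NumPy 2.0.
data_df = pd.DataFrame(
    np.array([[0, 3, 4, np.nan], [32, 4, np.nan, 4], [56, 8, 0, np.nan]]).T,
    columns=["A", "B", "C"],
    index=pd.date_range(start="1/1/2018", periods=4, freq="2min"),
)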

result = ncl.datasets.compute_trig_fields(data_df, ["A", "C"])

More in-depth API documentation can be found here.
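
The function name suggests a sine/cosine encoding of angular columns, although the exact behavior of compute_trig_fields is not described on this page. As a rough illustration of that general technique only, the sketch below assumes the angles are in degrees and uses a hypothetical helper, add_trig_columns, which is not part of nowcastlib:

```python
import numpy as np
import pandas as pd


def add_trig_columns(df, fields):
    """Hypothetical helper (not nowcastlib API): add sine and cosine columns.

    Assumes each listed field holds angles in degrees.
    """
    out = df.copy()
    for field in fields:
        radians = np.deg2rad(out[field])
        out[f"{field}_sin"] = np.sin(radians)
        out[f"{field}_cos"] = np.cos(radians)
    return out


# Illustrative data: a wind direction column with one missing reading
angles = pd.DataFrame({"wind_dir": [0.0, 90.0, 180.0, np.nan]})
encoded = add_trig_columns(angles, ["wind_dir"])
```

Encoding an angle as a sine/cosine pair keeps values such as 359° and 1° close together, which a raw degree column does not. Missing readings stay NaN in both new columns.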

CLI

Some of the library's functionality is bundled in configurable subcommands accessible via the terminal with the command nowcastlib:

usage: nowcastlib [-h] [-v]
                  {triangulate,preprocess,sync,postprocess,datapipe} ...

positional arguments:
  {triangulate,preprocess,sync,postprocess,datapipe}
                        available commands
    triangulate         Run `nowcastlib triangulate -h` for further help
    preprocess          Run `nowcastlib preprocess -h` for further help
    sync                Run `nowcastlib sync -h` for further help
    postprocess         Run `nowcastlib postprocess -h` for further help
    datapipe            Run `nowcastlib datapipe -h` for further help

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         increase verbosity level from INFO to DEBUG
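
The help text above has the shape that Python's argparse produces for a parser with subcommands. The following is a minimal sketch of such a layout, not the library's actual CLI code; build_parser and SUBCOMMANDS are hypothetical names, and only the subcommand names and the -v flag are taken from the help output:

```python
import argparse

# Subcommand names copied from the help output; everything else is illustrative
SUBCOMMANDS = ["triangulate", "preprocess", "sync", "postprocess", "datapipe"]


def build_parser():
    """Build a parser with a global verbosity flag and one subparser per command."""
    parser = argparse.ArgumentParser(prog="nowcastlib")
    parser.add_argument(
        "-v",
        "--verbose",
        action="store_true",
        help="increase verbosity level from INFO to DEBUG",
    )
    subparsers = parser.add_subparsers(dest="command", help="available commands")
    for name in SUBCOMMANDS:
        subparsers.add_parser(
            name, help=f"Run `nowcastlib {name} -h` for further help"
        )
    return parser


# Parse an example command line: global flag first, then the subcommand
args = build_parser().parse_args(["-v", "sync"])
```

With this layout, the global flag must precede the subcommand name, and each subcommand can define its own options on its own subparser.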

Repository Structure

The following output is generated with tree . -I 'dist|docs|*.pyc|__pycache__'

.
├── LICENSE
├── Makefile # currently used to build docs
├── README.md
├── de421.bsp # not committed
├── docs/ # html files for the documentation static website
├── examples
│   ├── README.md
│   ├── cli_triangulate_config.yaml
│   ├── data/  # not committed
│   ├── datasync.ipynb
│   ├── output/ # not committed
│   ├── pipeline_datapipe.json
│   ├── pipeline_preprocess.json
│   ├── pipeline_sync.json
│   ├── signals.ipynb
│   └── triangulation.ipynb
├── images
│   └── pipeline_flow.png
├── nowcastlib # the actual source code for the library
│   ├── __init__.py
│   ├── cli
│   │   ├── __init__.py
│   │   └── triangulate.py
│   ├── datasets.py
│   ├── dynlag.py
│   ├── gis.py
│   ├── pipeline
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── cli.py
│   │   ├── process
│   │   │   ├── __init__.py
│   │   │   ├── postprocess
│   │   │   │   ├── __init__.py
│   │   │   │   ├── cli.py
│   │   │   │   └── generate.py
│   │   │   ├── preprocess
│   │   │   │   ├── __init__.py
│   │   │   │   └── cli.py
│   │   │   └── utils.py
│   │   ├── split
│   │   │   └── __init__.py
│   │   ├── structs.py
│   │   ├── sync
│   │   │   ├── __init__.py
│   │   │   └── cli.py
│   │   └── utils.py
│   ├── signals.py
│   └── utils.py
├── poetry.lock # lock file generated by python poetry for dependency mgmt
└── pyproject.toml # general information file, handled by python poetry

Directories and Files not Committed

A number of files and folders are not committed because their large size and static nature make them unsuitable for git version control. The following files and folders warrant a brief explanation.

  • Certain functions (time since sunset, sun elevation) of the Nowcast Library rely on the use of a .bsp file, containing information on the locations through time of various celestial bodies in the sky. This file will be automatically downloaded upon using one of these functions for the first time.
  • The example scripts make use of a data/ directory containing a series of CSV files. Most of the data used in the examples can be downloaded from the ESO Ambient Condition Database. Users can then change the paths set in the examples to fit their needs. For users interested in replicating the exact structure and contents of the data directory, a compressed copy of it (1.08 GB) is available to ESO members through this Microsoft Sharepoint link.
  • At times the examples show the serialization functionality of the nowcastlib pipeline or need to output some data. In these situations the output/ directory in the examples folder is used.
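
How the library fetches the .bsp file is not specified here. The general "download on first use" pattern can be sketched as follows, with a hypothetical ensure_file helper and a stand-in fetch function that writes placeholder bytes instead of contacting a server:

```python
import tempfile
from pathlib import Path


def ensure_file(path, fetch):
    """Hypothetical helper: call fetch(path) only if the file is missing."""
    path = Path(path)
    if not path.exists():
        fetch(path)
    return path


fetch_calls = []


def fake_fetch(path):
    """Stand-in for a real download; writes placeholder bytes locally."""
    fetch_calls.append(path)
    path.write_bytes(b"placeholder ephemeris data")


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "de421.bsp"
    ensure_file(target, fake_fetch)  # first use: file is created
    ensure_file(target, fake_fetch)  # second use: existing file is reused
    file_existed = target.exists()
```

Because the file is static, checking for its presence is enough; no freshness check is needed in this sketch.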

Development Setup

This repository relies on Poetry for tracking dependencies, building and publishing. It is therefore recommended that developers install poetry and make use of it throughout their development of the project.

Dependencies

Make sure you are in the right Python environment and run

poetry install

This reads pyproject.toml, resolves the dependencies, and installs them.

Deployment

The repository is published to PyPI so that it can be installed with pip, as mentioned earlier.

Ideally this process is automated via a CI tool triggered by a push/merge to the master branch. To publish changes manually, follow these steps:

  1. Optionally run poetry version with the appropriate argument based on semver guidelines.

  2. Update the documentation by running

    make document

  3. Prepare the package by running

    poetry build

  4. Ensure you have TestPyPI and PyPI configured as your poetry repositories:

    poetry config repositories.testpypi https://test.pypi.org/legacy/
    poetry config repositories.pypi https://pypi.org/

  5. Publish the repository to TestPyPI, to see that everything works as expected:

    poetry publish -r testpypi

  6. Stage, commit and push your changes (to master) with git.

  7. Publish the repository to PyPI:

    poetry publish -r pypi

Upon successful deployment, the library should be available for installation via pip.
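
One way to confirm which release ended up in an environment after installing is to query the package metadata from Python. A minimal sketch, assuming Python 3.8 or newer; installed_version is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if nowcastlib is installed, otherwise None
print(installed_version("nowcastlib"))
```

Comparing the printed value against the version set with poetry version gives a quick check that the intended release was published.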

"""
.. include:: ../README.md
"""
import logging
import nowcastlib.datasets
import nowcastlib.gis
import nowcastlib.signals
import nowcastlib.utils
import nowcastlib.dynlag
import nowcastlib.pipeline

try:
    import importlib.metadata as importlib_metadata
except ModuleNotFoundError:
    import importlib_metadata

__version__ = importlib_metadata.version(__name__)

__pdoc__ = {
    "cli": False,
}


# logging config {{{
root_logger = logging.getLogger(__name__)
root_logger.setLevel(logging.INFO)

logger_formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")

console_handler = logging.StreamHandler()
console_handler.setLevel(logging.INFO)
console_handler.setFormatter(logger_formatter)

root_logger.addHandler(console_handler)
# }}}
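
Since both the package logger and its console handler are set to INFO above, debug messages need both thresholds lowered before they appear. The sketch below shows one way a caller might do that; enable_debug is a hypothetical helper, and a stand-in logger named nowcastlib_demo is configured so the example runs without the package installed:

```python
import logging


def enable_debug(logger_name="nowcastlib"):
    """Lower a named logger and all handlers attached to it to DEBUG."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    for handler in logger.handlers:
        handler.setLevel(logging.DEBUG)
    return logger


# Stand-in logger configured like the package logger above (INFO on both levels)
demo_logger = logging.getLogger("nowcastlib_demo")
demo_logger.setLevel(logging.INFO)
demo_handler = logging.StreamHandler()
demo_handler.setLevel(logging.INFO)
demo_logger.addHandler(demo_handler)

enable_debug("nowcastlib_demo")
demo_logger.debug("debug messages are now emitted")
```

Lowering only the logger level would not be enough here, because the handler filters records independently of the logger.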

Sub-modules

nowcastlib.datasets

Functions for syncing and chunking multiple datasets.

nowcastlib.dynlag

Functions for generating dynamically lagged time series.

nowcastlib.gis

Functions for computing metrics related to Geographical information science.

nowcastlib.pipeline

Data processing and Model evaluation pipeline for the Nowcast Library Pipeline …

nowcastlib.signals

Utilities for signal processing.

nowcastlib.utils

Uncategorised utilities.