qa4sm_autoreports package

Submodules

qa4sm_autoreports.data module

class qa4sm_autoreports.data.ConfigData(validation_run: ValidationRun)[source]

Bases: RunData

Collect variables from the validation run config

collect()[source]

Collect Configuration Variables from all datasets in this validation run.

Returns:

run_vars: dict

Collection of config variables from this run

collect_datasets(i=0)[source]

Collect the Variables from a dataset configuration

Parameters:

i (int, optional) – Id of the dataset to read from the config

class qa4sm_autoreports.data.Data(data=None)[source]

Bases: object

Data container base class for variables from various sources

add(content: dict, section: str = 'Content')[source]

Add content to the collection.

Parameters:
  • content (dict) – Yaml storable content, usually {KEY: Value, …}

  • section (str) – Multiple contents can be stored, specify name of content group (e.g. summary statistics). Each name will be a separate yml section upon export.
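
A minimal sketch of how sectioned content could be organised, assuming a plain nested dict as the backing store. `SectionStore` is a hypothetical stand-in, not the actual Data class; it only illustrates that each section name holds its own group of key/value pairs.

```python
class SectionStore:
    """Hypothetical stand-in for a sectioned data container."""

    def __init__(self):
        self.sections = {}

    def add(self, content: dict, section: str = "Content") -> None:
        # Merge the new keys into the named section, creating it if needed.
        self.sections.setdefault(section, {}).update(content)


store = SectionStore()
store.add({"n_datasets": 3})
store.add({"mean_R": 0.71}, section="summary statistics")
store.add({"median_R": 0.74}, section="summary statistics")
```

Each top-level key of `store.sections` would then correspond to one yml section on export.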

append(other)[source]

Add data from other variable

dump(path: Path | str, overwrite: bool = False)[source]

Write content to yml to import later for the report.

Parameters:
  • path (str) – File path to write data to

  • overwrite (bool, optional) – Overwrite will replace an existing file, otherwise will append to it.

classmethod from_yml(path: Path | str)[source]

Load data from a previously stored yml file

Parameters:

path (str or Path) – Path to the yml file to load content from

load(path, mode='r')[source]

Load data from the passed yml file into this Data container.

Parameters:
  • path (str) – Path to the yml file to take content from

  • mode (str, optional) – ‘r’: read mode drops any already loaded content; ‘a’: append mode adds the content from the file to anything already loaded.
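
The difference between the two modes can be sketched on plain dicts. `merge_loaded` is a hypothetical helper, and the assumption that newly loaded keys win over existing ones in append mode is not confirmed by the documentation above.

```python
def merge_loaded(existing: dict, loaded: dict, mode: str = "r") -> dict:
    """Combine already loaded sections with freshly loaded ones (assumed semantics)."""
    if mode == "r":
        # Read mode: discard whatever was loaded before.
        return {sec: dict(vals) for sec, vals in loaded.items()}
    if mode == "a":
        # Append mode: keep existing sections and add the new content.
        merged = {sec: dict(vals) for sec, vals in existing.items()}
        for sec, vals in loaded.items():
            merged.setdefault(sec, {}).update(vals)
        return merged
    raise ValueError(f"Unknown mode: {mode!r}")


existing = {"A": {"x": 1}}
loaded = {"A": {"y": 2}, "B": {"z": 3}}
replaced = merge_loaded(existing, loaded, mode="r")
appended = merge_loaded(existing, loaded, mode="a")
```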

class qa4sm_autoreports.data.NetcdfData(validation_run: ValidationRun)[source]

Bases: RunData

Collect variables from the results netcdf file

collect()[source]
collect_content() dict[source]
stats()[source]
class qa4sm_autoreports.data.NetcdfMetaData(validation_run: ValidationRun)[source]

Bases: RunData

Collect meta variables from the results netcdf file

collect()[source]
collect_metadata_content() dict[str, str][source]
class qa4sm_autoreports.data.RemoteData(validation_run: ValidationRun)[source]

Bases: RunData

Collect variables from service API sources

collect()[source]

Collect Configuration Variables from all datasets in this validation run.

Returns:

run_vars: dict

Collection of config variables from this run

class qa4sm_autoreports.data.RunData(validation_run: ValidationRun)[source]

Bases: Data

Collection of data from multiple validation runs for a report

class qa4sm_autoreports.data.SummaryStatsData(validation_run: ValidationRun)[source]

Bases: RunData

Collect variables from the csv summary stats file of the validation run

collect()[source]

Collect all relevant stats from the downloaded summary table

export_table(path=None)[source]

Export the data to a csv table that is used in the latex report

Parameters:

path (str) – Path to csv file

qa4sm_autoreports.data.load_yml_to_dict(filepath: str | Path) dict[source]

Load a QA4SM-style YAML config/results file into a nested dictionary.

Parameters:
  • filepath (str or Path) – Path to the yml content

Returns:

data – The first level contains the content section names; sub-levels are the variables in that section.

Return type:

dict

qa4sm_autoreports.extent module

class qa4sm_autoreports.extent.GeographicExtent(min_lat: float, min_lon: float, max_lat: float, max_lon: float)[source]

Bases: object

An immutable geographic bounding box defined by its corner coordinates.

Coordinate conventions

  • Latitude : -90.0 (South Pole) to +90.0 (North Pole)

  • Longitude : -180.0 (antimeridian west) to +180.0 (antimeridian east)

Note

Extents that wrap around the antimeridian (e.g. parts of Alaska / Pacific) are not handled by this class. All longitudes are assumed to satisfy min_lon <= max_lon.

property center: tuple[float, float]

(latitude, longitude) of the geometric centre.

contains(other: GeographicExtent) bool[source]

Return True if other is fully enclosed by this extent.

contains_point(lat: float, lon: float) bool[source]

Return True if the point (lat, lon) lies within this extent.
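
A minimal sketch of the two containment checks, assuming inclusive bounds (the documentation does not state whether points on the boundary count). `Box` and both functions are hypothetical stand-ins rather than the class itself.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Hypothetical stand-in for a bounding box in degrees."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def contains_point(box: Box, lat: float, lon: float) -> bool:
    # Inclusive bounds are an assumption of this sketch.
    return box.min_lat <= lat <= box.max_lat and box.min_lon <= lon <= box.max_lon


def contains(outer: Box, inner: Box) -> bool:
    # A box is enclosed when both of its extreme corners lie inside the outer box.
    return (contains_point(outer, inner.min_lat, inner.min_lon)
            and contains_point(outer, inner.max_lat, inner.max_lon))
```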

property corners: tuple[tuple[float, float], ...]

All four corners as (lat, lon) tuples in order: SW, NW, NE, SE.
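
The corner order can be illustrated with a small standalone function. This is only a sketch on raw coordinates; `corners` here is a hypothetical helper, not the property itself.

```python
def corners(min_lat, min_lon, max_lat, max_lon):
    """Return the corners as (lat, lon) tuples in the order SW, NW, NE, SE."""
    return (
        (min_lat, min_lon),  # SW
        (max_lat, min_lon),  # NW
        (max_lat, max_lon),  # NE
        (min_lat, max_lon),  # SE
    )
```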

equals(other: GeographicExtent, tolerance: float = 0.0) bool[source]

Return True when this extent and other cover the same region.

Parameters:
  • other (GeographicExtent) – The extent to compare against.

  • tolerance (float) – Allowed absolute difference in degrees for each boundary (default: exact match).

Examples

>>> a = GeographicExtent(0, 0, 10, 10)
>>> b = GeographicExtent(0.0001, 0, 10, 10)
>>> a.equals(b)
False
>>> a.equals(b, tolerance=0.001)
True
classmethod from_corners(lat1: float, lon1: float, lat2: float, lon2: float) GeographicExtent[source]

Convenience constructor that accepts two arbitrary corner points and sorts the coordinates automatically.

Parameters:
  • lat1 (float) – First corner (any order).

  • lon1 (float) – First corner (any order).

  • lat2 (float) – Opposite corner (any order).

  • lon2 (float) – Opposite corner (any order).
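
Sorting two arbitrary corners into bounds amounts to taking the minimum and maximum per axis. A minimal sketch, with `sorted_bounds` as a hypothetical helper that returns the values in constructor order:

```python
def sorted_bounds(lat1, lon1, lat2, lon2):
    """Return (min_lat, min_lon, max_lat, max_lon) from two opposite corners."""
    return (min(lat1, lat2), min(lon1, lon2), max(lat1, lat2), max(lon1, lon2))
```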

property height: float

North–south span in degrees.

intersection(other: GeographicExtent) GeographicExtent | None[source]

Compute the intersection of this extent and other.

Returns:

The overlapping region, or None if the extents do not overlap.

Return type:

GeographicExtent

Examples

>>> a = GeographicExtent(0, 0, 10, 10)
>>> b = GeographicExtent(5, 5, 15, 15)
>>> a.intersection(b)
GeographicExtent(min_lat=5, min_lon=5, max_lat=10, max_lon=10)
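
The overlap of two boxes can be sketched on plain (min_lat, min_lon, max_lat, max_lon) tuples. The function below is a hypothetical illustration, and returning a zero-area box for touching edges is an assumption of this sketch.

```python
def intersection(a, b):
    """Intersect two (min_lat, min_lon, max_lat, max_lon) tuples, or return None."""
    min_lat, min_lon = max(a[0], b[0]), max(a[1], b[1])
    max_lat, max_lon = min(a[2], b[2]), min(a[3], b[3])
    # An inverted range on either axis means the boxes do not overlap.
    if min_lat > max_lat or min_lon > max_lon:
        return None
    return (min_lat, min_lon, max_lat, max_lon)
```
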
max_lat: float
max_lon: float
min_lat: float
min_lon: float
static multi_intersection(*extents: GeographicExtent) GeographicExtent | None[source]

Compute the common intersection of two or more extents.

Parameters:

*extents (GeographicExtent) – Two or more extents to intersect.

Returns:

The region common to all supplied extents, or None if there is no common region (or fewer than two extents are supplied).

Return type:

GeographicExtent

Examples

>>> a = GeographicExtent(0, 0, 20, 20)
>>> b = GeographicExtent(5, 5, 25, 25)
>>> c = GeographicExtent(8, 3, 15, 18)
>>> GeographicExtent.multi_intersection(a, b, c)
GeographicExtent(min_lat=8, min_lon=5, max_lat=15, max_lon=18)
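
For more than two boxes, the common region follows from the largest lower bound and the smallest upper bound on each axis. A sketch on plain tuples, with `multi_intersection` as a hypothetical standalone function:

```python
def multi_intersection(*boxes):
    """Common region of (min_lat, min_lon, max_lat, max_lon) tuples, or None."""
    if len(boxes) < 2:
        return None  # mirrors the documented behaviour for fewer than two extents
    min_lat = max(b[0] for b in boxes)
    min_lon = max(b[1] for b in boxes)
    max_lat = min(b[2] for b in boxes)
    max_lon = min(b[3] for b in boxes)
    if min_lat > max_lat or min_lon > max_lon:
        return None  # no region shared by all boxes
    return (min_lat, min_lon, max_lat, max_lon)
```
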
overlaps(other: GeographicExtent) bool[source]

Return True if this extent and other have any area in common (touching edges count as overlapping).

plot_map(global_map=False) Figure[source]

Alternative version with higher detail and different styling options.

static union(*extents: GeographicExtent) GeographicExtent[source]

Return the smallest extent that contains all supplied extents.

Parameters:

*extents (GeographicExtent) – Two or more extents.
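
The smallest enclosing box is the mirror image of an intersection: the smallest lower bound and the largest upper bound per axis. A sketch on plain tuples; `union` here is a hypothetical standalone function.

```python
def union(*boxes):
    """Smallest (min_lat, min_lon, max_lat, max_lon) tuple enclosing all boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )
```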

property width: float

East–west span in degrees.

qa4sm_autoreports.report module

class qa4sm_autoreports.report.AutoReportCreator(runs, report_root)[source]

Bases: object

Trigger multiple validation runs, check status, compile PDF.

collect_content(force_download=False)[source]

Collect all content variables for a given run. Write to single file.

Parameters:

force_download (bool, optional (default: False)) – Always download new results. If this is False, only download results if they don’t yet exist.

compile(template_path, main_tex='main.tex', run_tex='run.tex', tex_ignore=None, from_scratch=False)[source]

Collect contents to compile PDF report from templates.

Parameters:
  • template_path (str or Path) – Path where the templates latex files are stored.

  • main_tex (str, optional) – Main tex file

  • run_tex (str, optional) – Tex file template to use for runs (have separate yml bindings).

  • tex_ignore (list, optional) – A list of tex files in the template path to ignore

  • from_scratch (bool, optional) – Download and collect data, even if it already exists.

delete(remote=True)[source]

Delete all runs in this report.

Parameters:
  • local (bool, optional) – Delete the local copy of the validation run

  • remote (bool, optional) – Delete the remote version of the run

download_all_results(delay=1)[source]

Download all results from the server for all runs.

Parameters:

delay (int, optional (default: 1)) – Delay in seconds between consecutive API calls.

classmethod from_results(report_root, connection=None)[source]

Set up report creator from previously created local runs.

Parameters:
  • report_root (str or Path) – Path to the report folder (is created / overwritten)

  • connection (Connection, optional) – Connection to use for all runs. If None, connections will be created based on the instance in each run’s config file.

classmethod from_scratch(report_root, templates_path, connection, run_name_long=False, force=False)[source]

Set up report creator from scratch, i.e. from template configs. If report_root already exists, runs will be loaded from files.

Parameters:
  • report_root (str or Path) – Path to the report folder (is created / overwritten)

  • templates_path (str or Path) – Path where the config templates (json) are found (we use all available files).

  • connection (Connection) – QA4SM Connection

  • run_name_long (bool, optional) – Instead of naming runs “runX”, name them “run X - <template>” instead.

  • force (bool, optional) – Force creating a new report_root from scratch. If False, an error is thrown if it exists.

open_datasets() dict[source]
override_params(**kwargs)[source]

Override parameters in all runs loaded for this report.

Parameters:

kwargs – Kwargs are passed to each run’s override_params method.

populate_latex(template_file: str, out_file: str, yaml_bindings: dict, placeholder=re.compile('(?:\\\\detokenize\\{)?\\$<(.+?)>\\$(?:\\})?')) None[source]

Populate run latex file with run data.

Parameters:
  • template_file (str or Path) – Path to the run latex template

  • out_file (str or Path, optional) – Path where the variables are stored (yaml bindings) and where the output is written to.

  • yaml_bindings (dict) – Specify the yaml bindings, if None is passed we use the default bindings from the run and report root.

  • placeholder (re.Pattern, optional) – Placeholder pattern to replace in the tex files. The default looks like \detokenize{$<...>$} and contains python f-strings.
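
How the default pattern from the signature matches placeholders can be sketched with the standard `re` module. The `fill` helper and the plain key lookup are assumptions of this sketch; the actual method may evaluate richer expressions inside the placeholder.

```python
import re

# Default pattern as shown in the signature above, written as a raw string.
PLACEHOLDER = re.compile(r"(?:\\detokenize\{)?\$<(.+?)>\$(?:\})?")


def fill(template: str, bindings: dict) -> str:
    """Replace each placeholder with the value bound to its name."""
    # Plain dict lookup is an assumption made for illustration.
    return PLACEHOLDER.sub(lambda m: str(bindings[m.group(1)]), template)


line = r"Reference: \detokenize{$<reference>$}, runs: $<n_runs>$"
filled = fill(line, {"reference": "ERA5-Land", "n_runs": 3})
print(filled)  # Reference: ERA5-Land, runs: 3
```

Both the wrapped form `\detokenize{$<...>$}` and the bare form `$<...>$` are matched, because the wrapper parts of the pattern are optional.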

rollback(status=0)[source]

Roll back the report to the selected stage.

Parameters:

status (int) – Target status after rollback.

start_all_runs(delay=1, override=None)[source]

Trigger all validation runs with the run configurations currently loaded in here (self.runs). Use self.runs[i].start() to trigger them individually.

Parameters:
  • delay (int, optional (default: 1)) – Delay in seconds between API calls to start a run.

  • override (dict, optional (default: None)) –

    To override certain settings in all validation runs before starting them, pass them here. Example:

    {'interval_from': "2023-01-01", 'interval_to': "2023-03-31",
     'min_lat': -17.0, 'max_lon': 150.0, ...}
    

property status: int

Status between all validation runs, returned as a numerical code in order of progress:

  • 0 - Staged: Local setup created, not triggered online

  • 1 - Started: All runs were triggered

  • 2 - Processed: All runs have finished online

  • 3 - Collected: All results were downloaded locally

  • 4 - Compiled: PDF was created
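
Because the codes are ordered by progress, they can be compared numerically. A small sketch that gives the codes readable names; `ReportStatus` and `ready_to_compile` are hypothetical and not part of the package.

```python
from enum import IntEnum


class ReportStatus(IntEnum):
    """Hypothetical names for the numerical status codes listed above."""
    STAGED = 0
    STARTED = 1
    PROCESSED = 2
    COLLECTED = 3
    COMPILED = 4


def ready_to_compile(status: int) -> bool:
    # Results must at least have been collected locally before compiling.
    return status >= ReportStatus.COLLECTED
```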

validation_run_table(short_url=True)[source]

Create a table in .csv format that lists all validation runs for this report.

Validation run; URL; Reference; Completed
#1; https://test.qa4sm.eu/ui/validation-result/e95eeaeb-1d2f-43c4-b019-b7f3b3dbd29e; ERA5-Land; December 2, 2025

Parameters:

short_url (bool, optional) – URL as link, not full URL

Returns:

df – A table containing the validation runs

Return type:

pd.DataFrame
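
A table with this layout could be written with the standard `csv` module, as sketched below. The semicolon delimiter is inferred from the example row above and is an assumption, as is the use of `csv` instead of the pandas DataFrame the method returns.

```python
import csv
import io

columns = ["Validation run", "URL", "Reference", "Completed"]
rows = [
    {
        "Validation run": "#1",
        "URL": "https://test.qa4sm.eu/ui/validation-result/"
               "e95eeaeb-1d2f-43c4-b019-b7f3b3dbd29e",
        "Reference": "ERA5-Land",
        "Completed": "December 2, 2025",
    },
]

# Write header and rows into an in-memory buffer instead of a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, delimiter=";", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
table = buffer.getvalue()
```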

validations_complete() bool[source]

Check whether all remote runs have already completed.

Returns:

all_done – False if at least one run is not complete yet, else True

Return type:

bool

verify_dataset_availability() bool[source]

Verify for each run that the datasets cover the required period.

Returns:

avail – True if all datasets are available for the requested period, False otherwise.

Return type:

bool

qa4sm_autoreports.run module

class qa4sm_autoreports.run.ValidationRun(config: ValidationConfiguration, root_local: str | Path, connection: Connection, remote_id=None, name_tag=None)[source]

Bases: object

delete(local=True, remote=True)[source]

Delete validation run. Online and/or offline.

Parameters:
  • local (bool, optional) – Delete the local copy of the validation run

  • remote (bool, optional) – Delete the remote version of the run

download_data(force_download=False)[source]

Download the run’s results, i.e., netcdf file, plots.

Parameters:

force_download (bool, optional) – Always download, replace any existing local files. If False, only downloads results that don’t exist locally.

property extent: tuple[float, float, float, float]
classmethod from_remote(local_root: str | Path, connection: Connection, remote_id: str)[source]

Set up ValidationRun based on a remote validation run with a local folder for synchronization.

Parameters:
  • local_root (str) – Local folder where the run data is stored

  • connection (Connection) – Service connection for your user

  • remote_id (str) – Name of the remote run (UID).

Returns:

run

Return type:

ValidationRun

classmethod from_results(local_dir: str | Path, connection: Connection = None, name_tag=None)[source]

Set up ValidationRun based on a previously synchronized, now local, run. Uses the run_id and instance URL from the response/results files to restore a connection.

Parameters:
  • local_dir (Union[str, Path]) – Local run folder containing at least the config.json or some previously downloaded results.

  • connection (Connection, optional) – Connection to use for the run. If None, a new connection will be created based on the instance in the config file.

  • name_tag (str, optional) – Name to assign to the new run. If None is passed, the name of the local_dir is used.

Returns:

run

Return type:

ValidationRun

classmethod from_template(local_dir: str | Path, connection: Connection, name_tag=None)[source]

Set up ValidationRun based on a previously synchronized, now local, run.

Parameters:
  • local_dir (Union[str, Path]) – Local run folder containing at least the config.json or some previously downloaded results.

  • connection (Connection) – Connection to QA4SM instance to which the validation run should be assigned.

  • name_tag (str, optional) – Name to assign to the new run. If None is passed, the name of the local_dir is used.

Returns:

run

Return type:

ValidationRun

get_reference(reftype='spatial')[source]

Get reference dataset for this run.

Parameters:

reftype (Literal['spatial', 'temporal', 'scaling']) – Which type of reference to get

Returns:

  • dataset (str) – Dataset name

  • version (str) – Version name

  • variable (str) – Variable name

get_results_url()[source]

Get the UI URL of the validation run.

has_remote(raise_error: bool = False)[source]

Check if the validation run has a remote counterpart

property instance: str
load_results() Dataset[source]

Load downloaded results as xarray.

open_dataset() Dataset[source]

Read local netcdf data as xarray Dataset

override_params(**kwargs)[source]

Override certain parameters in the validation config file, such as name_tag or the start/end date.

Parameters:

kwargs – Keys and new values. Keys must already exist in the config. You cannot add anything new, only change existing fields!
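
The "existing keys only" rule can be sketched on a plain dict. `override_config` is a hypothetical helper, and raising a KeyError for unknown keys is an assumption; the documentation only states that new fields cannot be added.

```python
def override_config(config: dict, **kwargs) -> dict:
    """Return a copy of config with existing fields replaced by kwargs."""
    unknown = [key for key in kwargs if key not in config]
    if unknown:
        # Assumed behaviour: refuse to introduce fields the config does not have.
        raise KeyError(f"Cannot add new fields: {unknown}")
    updated = dict(config)
    updated.update(kwargs)
    return updated


config = {"name_tag": "run1", "interval_from": "2022-01-01", "interval_to": "2022-12-31"}
new_config = override_config(config, interval_from="2023-01-01", interval_to="2023-03-31")
```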

plot_extent(global_map=False)[source]

Create a map plot of the area covered by the validation run.

setup_workdir(clear=False)[source]
start()[source]

Start the current Validation Run on the chosen instance. Creates a local folder and dumps the config and the response from the server there.

Returns:

response – Response from validation run

Return type:

dict

property status: Tuple[str, int]

Check the status of the remote run.

Returns:

see Connection.validation_status()

Return type:

status[str], progress[int]

timing() dict[source]

Get timing information for the remote validation run

Returns:

time – Time information as a dict

Return type:

dict

update_name(new_name: str)[source]
update_remote_id(pk)[source]
property url

Get the API URL of the validation run.

verify_period()[source]

Checks if the chosen validation period is within the range available for all datasets on the service.

Returns:

status – True if all datasets are available, False otherwise

Return type:

bool

qa4sm_autoreports.series module

class qa4sm_autoreports.series.AutoReportSeries(series_root, reports=None, connection=None)[source]

Bases: object

delete_report(report_name, remote=True)[source]

Delete report from series. By default, also deletes the online runs and local copies of the validation runs.

Parameters:
  • report_name (str, int) – Name of the report to delete from the series

  • remote (bool, optional) – Remove the online version of the respective validation runs

new_report(report_name, config_template_path, override_params=None, instance='qa4sm.eu', token=None)[source]

Start a new validation report from config templates on the chosen instance, download and collect all results.

Parameters:
  • report_name (str) – Name of the report (will be added to the list)

  • config_template_path (Path or str) – Path where the .json templates are stored

  • override_params (dict, optional) – Params to override settings in config file

  • instance (str, optional) – Instance to use for the report

  • token (str, optional) – API token for authentication. If None, uses the connection from the series if available, otherwise creates a new connection without token.

reports_complete() bool[source]

Check whether all reports in the collection are complete, i.e. collected.

Returns:

status – True if all are done -> Series up-to-date

Return type:

bool

track_metric(metric, ref_epoch=-1, n_epochs=10, run=None, path_out=None, pretty_name='ubRMSD', unit='m³m⁻³', p_mask_var=None, p_mask_thres=0.05, tsw='bulk', preprocess=None)[source]

Create metric tracking data and plot

Parameters:
  • metric (str) – Metric to track across the epochs, e.g. R_between_0-ISMN_and_1-C3S_combined

  • ref_epoch (int or str, optional) – Reference epoch, i.e. latest one. -1 uses the last report (ordered by name). A number refers to the epoch index, a string to the report name

  • n_epochs (int, optional) – Number of epochs BEFORE the reference epoch to include (includes the reference).

  • run (str, optional) – If the metric should be used from a certain run (from all reports), indicate the run name here. None means we search the metric in all runs, and use the first one if it’s contained in multiple runs for a single report.

  • path_out (str or path) – Where the output files are stored. None will store all results in the folder of the reference epoch.

  • pretty_name (str, optional) – Display name of the metric, e.g. ubRMSD

  • unit (str, optional) – Pretty unit, no brackets, e.g. m³m⁻³

  • p_mask_var (str, optional) – To mask data points where p>thres, pass the p variable name here. The same can be achieved via the preprocess function.

  • p_mask_thres (float, optional) – The p value threshold used for masking, only used when p_mask_var is passed.

  • tsw (str, optional) – Temporal sub-window to use (netcdf dimension). Default is “bulk”

  • preprocess (Callable, optional) –

    Apply to dataset after loading, can be used for e.g. p value masking. Must take and return a dataset. Example:

    lambda ds: ds
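
How ref_epoch and n_epochs might select reports can be sketched on a list of report names. `select_epochs` is a hypothetical helper; the sketch assumes that reports are ordered by name and that the window holds n_epochs entries in total, ending at and including the reference.

```python
def select_epochs(report_names, ref_epoch=-1, n_epochs=10):
    """Pick up to n_epochs report names ending at the reference epoch."""
    ordered = sorted(report_names)
    if isinstance(ref_epoch, str):
        ref_index = ordered.index(ref_epoch)  # look up the report by name
    else:
        ref_index = ref_epoch % len(ordered)  # maps -1 to the last report
    start = max(0, ref_index - n_epochs + 1)
    return ordered[start:ref_index + 1]


names = ["2023-Q3", "2023-Q1", "2023-Q4", "2023-Q2"]
latest_two = select_epochs(names, ref_epoch=-1, n_epochs=2)
up_to_q2 = select_epochs(names, ref_epoch="2023-Q2", n_epochs=10)
```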
    

qa4sm_autoreports.utils module

exception qa4sm_autoreports.utils.ValidationReportError(message='Validation report failed')[source]

Bases: Exception

qa4sm_autoreports.utils.escape_latex(value: str) str[source]

Escape LaTeX special characters in a plain-text string so that it can be safely embedded in a .tex document without breaking compilation.

Parameters:

value (str) – The raw string value to escape.

Returns:

The escaped string, safe for use in LaTeX text mode.

Return type:

str
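
A minimal sketch of this kind of escaping, using the commonly escaped LaTeX text-mode characters. The character set and replacements below are assumptions and may differ from what escape_latex actually does; `escape_latex_sketch` is a hypothetical name.

```python
import re

# Common LaTeX text-mode special characters and their escaped forms (assumed set).
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_PATTERN = re.compile("|".join(re.escape(ch) for ch in _LATEX_SPECIALS))


def escape_latex_sketch(value: str) -> str:
    """Escape special characters in a single pass over the input."""
    # One pass means backslashes introduced by a replacement are never re-escaped.
    return _PATTERN.sub(lambda m: _LATEX_SPECIALS[m.group(0)], value)
```

Replacing all characters in one regex pass avoids the ordering problems of chained `str.replace` calls, where an inserted backslash or brace could be escaped a second time.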

Module contents