Quickstart

Prerequisites

Install TeX Live, which provides the pdflatex binary used for PDF compilation:

sudo apt install texlive-full

Install the package:

pip install qa4sm-autoreports

Authenticate by creating ~/.qa4smapirc as described in the qa4sm-api docs, or pass the token directly in code.

Connecting to QA4SM

from qa4sm_autoreports import Connection

# Option 1: read the token from ~/.qa4smapirc
qa4sm = Connection("qa4sm.eu")

# Option 2: pass the token explicitly
qa4sm = Connection("qa4sm.eu", token="<your-token>")

Run configuration templates

Each validation run in a report is defined by a JSON configuration template. Place one .json file per run in the templates_path directory. The filename (without the extension) becomes the run’s folder name inside the report directory.
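
The filename-to-folder mapping can be previewed before any run is created. The following is a minimal sketch using only the standard library; run_folder_names is a hypothetical helper, not part of qa4sm-autoreports, and the file names in the demonstration are made up.

```python
from pathlib import Path
import tempfile


def run_folder_names(templates_path):
    """Return the run folder name implied by each .json template, sorted by name."""
    return sorted(p.stem for p in Path(templates_path).glob("*.json"))


# Demonstration with a throwaway directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("ismn_c3s.json", "ismn_era5.json", "notes.txt"):
        (Path(tmp) / name).write_text("{}")
    folders = run_folder_names(tmp)

print(folders)  # non-JSON files are ignored
```

Since each stem becomes a folder name, it is worth choosing template filenames that are short and safe to use in paths.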

The template follows the QA4SM validation API schema. A working example (ISMN vs. C3S, used by the integration tests) is provided in tests/testdata/report_config_templates/ismn_c3s.json:

{
  "name_tag": "test",
  "interval_from": "2020-06-01",
  "interval_to": "2020-08-31",
  "temporal_matching": 12,
  "anomalies_method": "none",
  "anomalies_from": null,
  "anomalies_to": null,
  "min_lat": 19.095097036262146,
  "min_lon": -156.3971863811718,
  "max_lat": 20.4067416381494,
  "max_lon": -154.67868591465736,
  "scaling_method": "none",
  "metrics": [
    {
      "id": "tcol",
      "value": false
    },
    {
      "id": "bootstrap_tcol_cis",
      "value": false
    },
    {
      "id": "stability_metrics",
      "value": false
    }
  ],
  "intra_annual_metrics": {
    "intra_annual_metrics": false,
    "intra_annual_type": "",
    "intra_annual_overlap": null
  },
  "dataset_configs": [
    {
      "dataset_id": 1,
      "version_id": 70,
      "variable_id": 1,
      "is_spatial_reference": false,
      "is_temporal_reference": true,
      "is_scaling_reference": false,
      "basic_filters": [
        1
      ],
      "parametrised_filters": []
    },
    {
      "dataset_id": 4,
      "version_id": 69,
      "variable_id": 4,
      "is_spatial_reference": true,
      "is_temporal_reference": false,
      "is_scaling_reference": false,
      "basic_filters": [
        1,
        2
      ],
      "parametrised_filters": [
        {
          "id": 24,
          "parameters": "0.00,0.10"
        },
        {
          "id": 18,
          "parameters": "AMMA-CATCH,DAHRA,TAHMO,SD_DEM,CHINA,CTP_SMTMN,HiWATER_EHWSN,HSC_SEOLMACHEON,IIT_KANPUR,KHOREZM,MAQU,MONGOLIA,MySMNet,RUSWET-AGRO,RUSWET-GRASS,RUSWET-VALDAI,SKKU,SW-WHU,KIHS_CMC,KIHS_SMC,VDS,NAQU,NGARI,SMN-SDR,SONTE-China,WIT-Network,AACES,OZNET,SASMAS,BIEBRZA_S-1,CALABRIA,CAMPANIA,FMI,FR_Aqui,GROW,GTK,HOBE,HYDROL-NET_PERUGIA,IMA_CAN1,METEROBS,MOL-RAO,ORACLE,REMEDHUS,RSMN,SMOSMANIA,SWEX_POLAND,TERENO,UDC_SMOS,UMBRIA,UMSUOL,VAS,WEGENERNET,WSMN,HOAL,IPE,COSMOS-UK,LABFLUX,NVE,Ru_CFR,STEMS,TWENTE,XMS-CAT,ARM,AWDN,BNZ-LTER,FLUXNET-AMERIFLUX,ICN,IOWA,PBO_H2O,RISMA,SCAN,SNOTEL,SOILSCAPE,USCRN,USDA-ARS,TxSON,LAB-net,PTSMN"
        }
      ]
    }
  ],
  "val_type": "temporal",
  "settings_changes": {
    "filters": [],
    "anomalies": false,
    "scaling": false,
    "variables": [],
    "versions": []
  }
}
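
Before submitting a template, a few local sanity checks can catch obvious mistakes. The sketch below is not the QA4SM schema validation; check_template is a hypothetical helper, and the rule that exactly one dataset carries each of the spatial and temporal reference flags is an assumption inferred from the example above.

```python
from datetime import date


def check_template(cfg):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    start = date.fromisoformat(cfg["interval_from"])
    end = date.fromisoformat(cfg["interval_to"])
    if start > end:
        problems.append("interval_from is after interval_to")

    # Only check the bounding box when all four corners are given.
    box = [cfg.get(k) for k in ("min_lat", "min_lon", "max_lat", "max_lon")]
    if None not in box:
        min_lat, min_lon, max_lat, max_lon = box
        if min_lat >= max_lat or min_lon >= max_lon:
            problems.append("bounding box is empty or inverted")

    # Assumption: exactly one dataset per reference flag, as in the example.
    for flag in ("is_spatial_reference", "is_temporal_reference"):
        count = sum(bool(ds.get(flag)) for ds in cfg["dataset_configs"])
        if count != 1:
            problems.append(f"{flag}: expected 1 dataset, found {count}")

    return problems


# Reduced copy of the example template (only the fields checked here).
example = {
    "interval_from": "2020-06-01",
    "interval_to": "2020-08-31",
    "min_lat": 19.095097036262146,
    "min_lon": -156.3971863811718,
    "max_lat": 20.4067416381494,
    "max_lon": -154.67868591465736,
    "dataset_configs": [
        {"is_spatial_reference": False, "is_temporal_reference": True},
        {"is_spatial_reference": True, "is_temporal_reference": False},
    ],
}
print(check_template(example))
```

For a real template, load the file with json.load and pass the resulting dict to the function.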

Any top-level field can be overridden at runtime without editing the template, which is useful when only the time period changes between epochs:

report.override_params(
    interval_from="2024-01-01",
    interval_to="2024-03-31",
)
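
Conceptually, an override replaces top-level keys of the loaded template and leaves everything else untouched. The sketch below illustrates that idea on a plain dict. It is not the implementation of override_params; apply_overrides is a hypothetical helper, and rejecting unknown keys is an assumption made here to catch typos.

```python
def apply_overrides(template, **overrides):
    """Return a copy of template with the given top-level fields replaced."""
    unknown = sorted(set(overrides) - set(template))
    if unknown:
        # Assumption: a key missing from the template is most likely a typo.
        raise KeyError(f"not top-level template fields: {unknown}")
    return {**template, **overrides}


template = {
    "name_tag": "test",
    "interval_from": "2020-06-01",
    "interval_to": "2020-08-31",
}
updated = apply_overrides(
    template,
    interval_from="2024-01-01",
    interval_to="2024-03-31",
)
print(updated["interval_from"], updated["interval_to"])
```

The original dict is left unchanged, so the same template can be reused for the next epoch.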

LaTeX report templates

The template_path directory passed to compile() must contain a root main.tex and any supporting files (.tex, .bib, images). All files are copied into a pdf_report/ subfolder inside the report root, then pdflatex is run on main.tex there.
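
To make the copy step concrete, the following sketch stages a template directory the way the paragraph above describes. It is an illustration under stated assumptions, not the package's code: stage_latex_templates and PDFLATEX_CMD are hypothetical names, and the exact pdflatex options the package uses are not documented here.

```python
import shutil
from pathlib import Path

# Assumed invocation; nonstopmode keeps pdflatex from waiting for input on errors.
PDFLATEX_CMD = ["pdflatex", "-interaction=nonstopmode", "main.tex"]


def stage_latex_templates(template_path, report_root):
    """Copy all template files into <report_root>/pdf_report and return that path."""
    template_path = Path(template_path)
    if not (template_path / "main.tex").is_file():
        raise FileNotFoundError(f"no main.tex in {template_path}")
    build_dir = Path(report_root) / "pdf_report"
    # dirs_exist_ok lets a later compilation overwrite earlier copies.
    shutil.copytree(template_path, build_dir, dirs_exist_ok=True)
    return build_dir
```

A manual compilation would then run PDFLATEX_CMD with subprocess.run(..., cwd=build_dir). With a natbib bibliography, a full manual build usually needs pdflatex, bibtex, and two more pdflatex passes.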

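Placeholders of the form \detokenize{$<expr>$} are replaced before compilation, where expr is a Python expression evaluated against two kinds of variable namespaces: ReportVars for report-wide values (period, QA4SM version, URL, …) and Run1ContentVars, Run2ContentVars, … for per-run values (dataset names, metrics, …). Inside run.tex, the example below accesses the current run's values as ContentVars.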
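
The substitution can be pictured as a regular-expression pass over the template text. The sketch below is one possible way to do it, not the package's implementation: fill_placeholders is a hypothetical helper, the version string is made up, and keeping the surrounding \detokenize{...} wrapper around the inserted value is an assumption.

```python
import re

# Matches the $<expr>$ marker; the \detokenize{...} wrapper is left in place.
PLACEHOLDER = re.compile(r"\$<(.+?)>\$")


def fill_placeholders(tex, namespace):
    """Replace every $<expr>$ with str() of expr evaluated in namespace."""

    def _substitute(match):
        # eval runs arbitrary code: only use it on templates you trust.
        value = eval(match.group(1), {"__builtins__": {}}, namespace)
        return str(value)

    return PLACEHOLDER.sub(_substitute, tex)


namespace = {"ReportVars": {"Common": {"qa4sm_version": "1.2.3"}}}  # made-up value
line = r"\newcommand{\platformversion}{\detokenize{$<ReportVars['Common']['qa4sm_version']>$}}"
print(fill_placeholders(line, namespace))
```

Because the match is non-greedy, several placeholders on one line (as in the \timecovered definition of the example main.tex) are replaced independently.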

run.tex is a per-run template: it is copied into each run’s subdirectory and included via \import{./runN/}{run.tex}, so relative paths to QA4SM graphics resolve correctly.

Example main.tex:

\documentclass[11pt]{article}

% Packages
\usepackage{amsmath}
\usepackage{hyperref}
\usepackage{graphicx}
\usepackage{booktabs}
\usepackage{csvsimple}
\usepackage{wrapfig}
\usepackage[
    top=2cm,
    bottom=2cm,
    left=2.5cm,
    right=2.5cm,
    headheight=14pt
]{geometry}
\hypersetup{
    colorlinks=true,
    linkcolor=black,
    filecolor=magenta,
    urlcolor=blue,
    citecolor=black,
}
\usepackage[authoryear]{natbib}
\usepackage{lmodern}
\usepackage{microtype}      % Better text justification and spacing
\usepackage{import}         % Provides \import, used to include each run's run.tex
\usepackage{placeins}
\usepackage{subcaption}
\pagestyle{plain}

\newcommand{\platformversion}{\detokenize{$<ReportVars['Common']['qa4sm_version']>$}}
\newcommand{\reportfreq}{Blabla}
\newcommand{\timecovered}{\detokenize{$<ReportVars['Common']['interval_days']>$} days (\detokenize{$<ReportVars['Common']['interval_from']>$} to \detokenize{$<ReportVars['Common']['interval_to']>$})}
\newcommand{\otherreports}{Blabla}
\newcommand{\website}{\detokenize{$<ReportVars['Common']['qa4sm_url']>$}}
\newcommand{\contact}{Blabla}
\newcommand{\manual}{Blabla}

% Document
\begin{document}

    \begin{center}
    {\Large{TEST REPORT}} \\
    \vspace{1mm}
    {\Large{Date of creation: \detokenize{$<ReportVars['Common']['compilation_date']>$}}}
    \end{center}

    \vspace{5mm}
    \hrule
    \vspace{1mm}
    \hrule

    \vspace{3mm}
    \begin{tabular}{ll}
    Blabla: 	                    & {\contact}  \\
    QA4SM version:                  & {\platformversion}   \\
    Time: 	                        & {\timecovered}  \\
    Website: 	                    & {\website}  \\
    \end{tabular}

    \vspace{1mm}
    \hrule

    \section{Test}

    \noindent Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ul

    \noindent Test citation in \citet{hersbach2020} and the paper \citep{hersbach2020}.

    \begin{table}[h]
    \centering
    \csvreader[
        separator=semicolon,
        tabular=llll,
        table head=\toprule Run & URL & Name & Completed \\ \midrule,
        late after last line=\\\bottomrule
    ]{val_run_list.csv}{}%
    { \csvcoli & \csvcolii & \csvcoliii & \csvcoliv }
    \caption{Validation runs included in this report}  % caption first
    \label{tab:val_run_list}                           % label after caption
    \end{table}

    \clearpage

    \include{something}
    \import{./run1/}{run.tex}

    \bibliographystyle{plainnat}
    \bibliography{references.bib}

\end{document}

Example run.tex:

% This template is used for all validation runs in the report (which can use different input datasets!)
\section{Run.tex content for the run}

Fig. \ref{fig:coverage\detokenize{$<ContentVars['report_run_index']>$}} Lorem ipsum dolor sit amet consectetur adipiscing elit.

\vspace{1cm}

\noindent Here is a number for you: \detokenize{$<ContentVars['ConfigVars']['DS1']['version_id']>$}.

\vspace{1cm}

\noindent Lorem ipsum dolor sit amet consectetur adipiscing elit. Quisque faucibus
 ex sapien vitae pellentesque sem placerat. In id cursus mi pretium
tellus duis convallis. Tempus leo eu aenean sed diam urna tempor.
Pulvinar vivamus fringilla lacus nec metus bibendum egestas. Iaculis
massa nisl malesuada lacinia integer nunc posuere. Ut hendrerit semper
vel class aptent taciti sociosqu. Ad litora torquent per conubia nostra
inceptos himenaeos.

% A graphic created during the validation run data collection showing the
% covered area is included.
\begin{figure}[ht!]
    \centering
    \includegraphics[width=.6\textwidth]{"extent.png"}
    \caption{Area covered by all validation runs.}
    \label{fig:coverage\detokenize{$<ContentVars['report_run_index']>$}}
\end{figure}


Lorem ipsum dolor sit amet consectetur adipiscing elit. Quisque faucibus
 ex sapien vitae pellentesque sem placerat. In id cursus mi pretium
tellus duis convallis. Tempus leo eu aenean sed diam urna tempor.
Pulvinar vivamus fringilla lacus nec metus bibendum egestas. Iaculis
massa nisl malesuada lacinia integer nunc posuere. Ut hendrerit semper
vel class aptent taciti sociosqu. Ad litora torquent per conubia nostra
inceptos himenaeos.

We reference Fig. \ref{fig:nobs}.

\begin{figure}[ht!]
    \centering
    \includegraphics[width=.6\textwidth]{"run1/qa4sm_graphics/bulk_overview_n_obs.png"}
    \caption{A plot from a run.}
    \label{fig:nobs}
\end{figure}

For series reports that include metric tracking plots, add a tracking.tex:

\section{Tracking a bunch of metrics}

Here we visualize the tracking plots made after each validation run in the series.

Tracking uses the results from previous runs of the same series like
in Fig. \ref{fig:tracking_ubrmsd}.

\begin{figure}[ht!]
    \centering
    \includegraphics[width=.6\textwidth]{"tracking_ubRMSD.png"}
    \caption{Tracking ubRMSD statistics.}
    \label{fig:tracking_ubrmsd}
\end{figure}

Creating a single report

A report combines multiple validation runs (one per config template). Place JSON config templates in a folder, then:

from qa4sm_autoreports.report import AutoReportCreator

# 1. Set up from templates (creates local directory structure)
report = AutoReportCreator.from_scratch(
    report_root="/results/my_report",
    templates_path="/configs/my_report_templates",
    connection=qa4sm,
)

# 2. Optionally override parameters (e.g. the validation period)
report.override_params(
    interval_from="2024-01-01",
    interval_to="2024-03-31",
)

# 3. Start all runs on QA4SM
report.start_all_runs()

# 4. Check status
print(report)  # shows each run with status and progress

# 5. Once all runs are done, collect results and compile PDF
if report.validations_complete():
    report.compile(template_path="/configs/my_latex_templates")

Note

compile() calls collect_content() internally, which downloads results and collects variables into YAML files used to populate the LaTeX templates.
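
The workflow above checks for completion only once. For unattended use, a small polling helper can wait until the runs have finished. This is a sketch under assumptions: wait_for_validations is a hypothetical helper, and it assumes that each call to validations_complete() reflects the current state on QA4SM.

```python
import time


def wait_for_validations(report, poll_seconds=60.0, timeout_seconds=3600.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll report.validations_complete() until it is true or the timeout expires.

    Returns True if the validations completed, False on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        if report.validations_complete():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_seconds)
```

Usage would look like: if wait_for_validations(report, poll_seconds=300): report.compile(template_path="/configs/my_latex_templates"). The sleep and clock parameters exist only so the helper can be tested without real waiting.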

Resume an interrupted workflow

If runs are already triggered or results already downloaded, load the existing local state instead of starting from scratch:

report = AutoReportCreator.from_results("/results/my_report",
                                        connection=qa4sm)

Managing a report series

A series is a folder of reports sharing the same configuration but covering different time periods (epochs).

from qa4sm_autoreports.series import AutoReportSeries

series = AutoReportSeries("/results/my_series", connection=qa4sm)

# Add a new report for the next period
report = series.new_report(
    "2024-04-01_to_2024-06-30",
    config_template_path="/configs/my_report_templates",
    override_params={"interval_from": "2024-04-01",
                     "interval_to": "2024-06-30"},
)

# Check series status
print(series)

# Once all validations are done, compile
if series["2024-04-01_to_2024-06-30"].validations_complete():
    series["2024-04-01_to_2024-06-30"].compile("/configs/my_latex_templates")
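
When epochs follow a regular calendar pattern, the report name and the override_params dict can be derived rather than typed by hand. The sketch below does this for calendar quarters; quarter_epoch is a hypothetical helper, and the naming scheme simply mirrors the example above.

```python
import calendar
from datetime import date


def quarter_epoch(year, quarter):
    """Return (report_name, override_params) for a calendar quarter (1-4)."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    start = date(year, first_month, 1)
    # monthrange returns (weekday of first day, number of days in month).
    end = date(year, last_month, calendar.monthrange(year, last_month)[1])
    name = f"{start.isoformat()}_to_{end.isoformat()}"
    overrides = {
        "interval_from": start.isoformat(),
        "interval_to": end.isoformat(),
    }
    return name, overrides


name, overrides = quarter_epoch(2024, 2)
print(name)       # 2024-04-01_to_2024-06-30
print(overrides)  # {'interval_from': '2024-04-01', 'interval_to': '2024-06-30'}
```

The two return values can be passed to series.new_report as the report name and override_params.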

Tracking metrics over time

Once results for several reports in a series have been collected, track a metric across epochs:

series.track_metric(
    metric="urmsd_between_0-ISMN_and_1-C3S_combined",
    pretty_name="ubRMSD",
    unit="m³m⁻³",
    ref_epoch=-1,   # last epoch
    n_epochs=12,    # look back 12 epochs
    p_mask_var="p_R_between_0-ISMN_and_1-C3S_combined",  # optional p-masking
    path_out="/results/my_series/latest_report/tracking",
)

This saves a .yml file with per-epoch statistics and a boxplot .png that can be embedded in the LaTeX template.
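
The interplay of ref_epoch and n_epochs can be illustrated with plain lists. The sketch below is not the track_metric implementation: select_epochs and summarise are hypothetical helpers, counting the reference epoch as part of the n_epochs window is an assumption, and the statistics written to the .yml file may differ from the two shown here.

```python
from statistics import median


def select_epochs(epochs, ref_epoch=-1, n_epochs=12):
    """Return up to n_epochs consecutive epochs ending at (and including) ref_epoch."""
    # Indexing a range normalises negative indices and raises IndexError if out of range.
    ref = range(len(epochs))[ref_epoch]
    start = max(0, ref - n_epochs + 1)
    return epochs[start:ref + 1]


def summarise(values_by_epoch):
    """Compute simple per-epoch statistics from lists of metric values."""
    return {
        epoch: {"n": len(values), "median": median(values)}
        for epoch, values in values_by_epoch.items()
    }


epochs = [f"epoch{i:02d}" for i in range(1, 16)]  # 15 made-up epoch names
window = select_epochs(epochs, ref_epoch=-1, n_epochs=12)
print(window[0], window[-1], len(window))  # epoch04 epoch15 12

stats = summarise({"epoch15": [0.04, 0.05, 0.06]})  # made-up ubRMSD values
print(stats)
```

If fewer than n_epochs epochs exist before the reference epoch, the window simply starts at the first one.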

Report status codes

AutoReportCreator.status returns an integer:

Code  Name       Meaning
0     Staged     Local config created; runs not yet triggered.
1     Started    All runs triggered on QA4SM.
2     Processed  All runs completed on QA4SM.
3     Collected  Results downloaded; ReportVars.yml written.
4     Compiled   PDF exists in pdf_report/ subfolder.
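
In scripts, comparing against bare integers is easy to get wrong. One option is to mirror the table in a local enum, as sketched below. ReportStatus and describe_status are hypothetical names defined here for illustration; the package itself is only documented as returning an integer.

```python
from enum import IntEnum


class ReportStatus(IntEnum):
    """Local mirror of the status codes listed in the table above."""
    STAGED = 0
    STARTED = 1
    PROCESSED = 2
    COLLECTED = 3
    COMPILED = 4


def describe_status(code):
    """Return the table's name for an integer status code."""
    return ReportStatus(code).name.capitalize()


print(describe_status(2))                    # Processed
print(ReportStatus.COLLECTED >= ReportStatus.PROCESSED)  # True
```

Because IntEnum members compare like integers, a check such as report.status >= ReportStatus.PROCESSED works directly on the returned value.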