QA4SM Python Client API - Basic Overview

This is the Python client library for interacting with the QA4SM web service for soil moisture validation. It provides programmatic access to discover datasets, run validation jobs, monitor their status, and download results.

Installation

pip install qa4sm-api

Prerequisites

Your account needs to be approved to use the API. Go to [your profile page](https://qa4sm.eu/ui/user-profile) and request a new API token.

Afterwards, store your token on your PC in the file ~/.qa4smapirc. You can either do this manually or use the CLI tool as described below.

Option 1: Automated Setup with CLI (Recommended)

Use the qa4sm api setup command to automatically create the configuration file:

qa4sm api setup

You will be prompted for your QA4SM credentials (username and password). The command will:

Retrieve your API token from QA4SM.eu
Create or update ~/.qa4smapirc with your credentials
Store your token securely for future use

Option 2: Manual Configuration

Create the ~/.qa4smapirc file manually with your credentials:

Obtain your API token from your user profile page
Create or edit ~/.qa4smapirc and add the following content:

[qa4sm.eu]
token: your_api_token_here
username: your_username
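
A hand-written file can be checked with Python's standard configparser module, which accepts the `key: value` layout shown above. The following is a minimal sketch that parses a sample string; the helper name `read_token` is hypothetical and not part of qa4sm-api, and how the library itself reads the file is not covered here.

```python
import configparser
from pathlib import Path

# Sample text with the same layout as ~/.qa4smapirc (placeholder values)
SAMPLE = """\
[qa4sm.eu]
token: your_api_token_here
username: your_username
"""


def read_token(text, instance="qa4sm.eu"):
    """Return (username, token) for the given instance section.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(instance):
        raise KeyError(f"no [{instance}] section found")
    section = parser[instance]
    return section.get("username"), section["token"]


username, token = read_token(SAMPLE)
print(username, "- token length:", len(token))

# To check the real file instead of the sample:
# read_token(Path.home().joinpath(".qa4smapirc").read_text())
```

If the section header or the token key is missing, the helper raises a KeyError, which points to a typo in the file before any request is sent.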

Verifying Your Setup

To verify that your authentication is working, run the following console command:

qa4sm api check

It should print a success message if everything is configured correctly.

Using the API

Once you’ve set up your authentication, you can use the API in your Python code.

Initial Connection

At the start of your application, you need to set up the connection to QA4SM. API commands are then sent through this connection.

from qa4sm_api.client_api import Connection

qa4sm = Connection()

Connection - Core API Methods

The Connection class is the main interface to the QA4SM service.

Dataset Discovery

from qa4sm_api.client_api import Connection

qa4sm = Connection(instance="qa4sm.eu", token="file")

# Get all available datasets (and their IDs) as a DataFrame
datasets = qa4sm.datasets()

# Get versions for a specific dataset (by name or ID), e.g.
versions = qa4sm.versions("C3S_combined")
# or via the dataset ID
versions = qa4sm.versions(1)

# Get detailed metadata for a dataset
dataset_info = qa4sm.dataset_info(qa4sm.dataset_id("C3S_combined"))
# or via the ID
dataset_info = qa4sm.dataset_info(1)

# Get detailed metadata for a specific version
version_info = qa4sm.version_info(
    qa4sm.version_id("v202505", "C3S_combined"))
# or via the version ID
version_info = qa4sm.version_info(70)


# Get the time period for a version
start_date, end_date = qa4sm.get_period(70)

Filter and Variable Information

# Get details about a data filter
filter_info = qa4sm.filter_info(1)

# Get details about a dataset variable
variable_info = qa4sm.variable_info(1)

Validation Status Monitoring

# Check if a validation run exists
exists = qa4sm.validation_exists("9aeb663b-e24e-4541-8331-6ec3e0318d1f")

# Get validation status and progress (0-100%)
status, progress = qa4sm.validation_status("9aeb663b-e24e-4541-8331-6ec3e0318d1f")
# Status can be: 'NOT FOUND', 'SCHEDULED', 'RUNNING', 'DONE', 'CANCELLED', 'ERROR'

# Get validation run timing
start_time, end_time = qa4sm.validation_time("9aeb663b-e24e-4541-8331-6ec3e0318d1f")

# Get validation duration (seconds and formatted string)
duration_seconds, duration_string = qa4sm.validation_duration("9aeb663b-e24e-4541-8331-6ec3e0318d1f")
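
When waiting for a run to finish, it can help to wrap the status call in a loop with an upper time limit. The sketch below is one way to do that, using the status strings listed above. The function `wait_for_validation` and its parameters are hypothetical and not part of qa4sm-api; it takes any callable that returns a `(status, progress)` pair, so `qa4sm.validation_status` could be passed in directly. A stand-in callable is used here so the example runs offline.

```python
import time

# Status strings as listed above
FINISHED = {"DONE", "ERROR", "CANCELLED", "NOT FOUND"}


def wait_for_validation(get_status, run_id, poll_seconds=30.0,
                        timeout_seconds=3600.0):
    """Poll get_status(run_id) until a final status or the timeout is reached.

    get_status must return a (status, progress) tuple.
    Returns the final status string; raises TimeoutError otherwise.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status, progress = get_status(run_id)
        if status in FINISHED:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run {run_id} still {status} after {timeout_seconds} s")
        time.sleep(poll_seconds)


# Stand-in for qa4sm.validation_status (illustration only)
_fake_states = iter([("SCHEDULED", 0), ("RUNNING", 40), ("DONE", 100)])
final = wait_for_validation(lambda rid: next(_fake_states), "demo-run",
                            poll_seconds=0)
print(final)  # DONE
```

With a real connection the call would look like `wait_for_validation(qa4sm.validation_status, run_id)`. The caller still has to check whether the returned status is 'DONE' before downloading results.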

ValidationConfiguration

The ValidationConfiguration class manages validation job configurations.

from qa4sm_api.client_api import ValidationConfiguration

# Load configuration from a JSON file
config = ValidationConfiguration.from_file("my_config.json")

# Download configuration from an existing validation run
config = ValidationConfiguration.from_remote(
    run_id="9aeb663b-e24e-4541-8331-6ec3e0318d1f",
    instance="qa4sm.eu",
    token="file"
)

# Save configuration to a JSON file
config.dump("output_config.json")

# Access configuration values
name_tag = config['name_tag']
start_date = config['interval_from']
end_date = config['interval_to']
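
Since configurations are stored as JSON files, a variant of an existing file can also be prepared with the standard json module before loading it with `ValidationConfiguration.from_file`. This is a rough sketch under the assumption that the keys shown above are top-level entries of the JSON object; the helper `write_variant` is hypothetical, and the date values are placeholders because the expected date format is not specified here.

```python
import json
import tempfile
from pathlib import Path


def write_variant(src, dst, **changes):
    """Copy the JSON config at src to dst, replacing selected top-level keys.

    Hypothetical helper; rejects keys that the source config does not have,
    so a misspelled key does not silently create a new entry.
    """
    config = json.loads(Path(src).read_text())
    unknown = set(changes) - set(config)
    if unknown:
        raise KeyError(f"keys not present in source config: {sorted(unknown)}")
    config.update(changes)
    Path(dst).write_text(json.dumps(config, indent=2))
    return config


# Demonstration with a placeholder file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "my_config.json"
    dst = Path(tmp) / "my_config_variant.json"
    src.write_text(json.dumps({
        "name_tag": "baseline",
        "interval_from": "<start>",  # placeholder, format not specified
        "interval_to": "<end>",      # placeholder, format not specified
    }))
    variant = write_variant(src, dst, name_tag="variant run")
    print(variant["name_tag"])
```

A real configuration file contains more entries than these three; the helper leaves everything it is not asked to change untouched.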

Running Validations

Submit validation jobs for processing on the QA4SM server.

# Method 1: Submit a ValidationConfiguration object
response = qa4sm.run_validation(config)
# The response is a pandas Series with the run_id as the index
run_id = response.index.tolist()[0] if len(response.index) > 0 else None
if run_id:
    print("Validation run started with ID:", run_id)

# Method 2: Submit a configuration file path
response = qa4sm.run_config_validation("my_config.json")
# The response is a pandas Series with the run_id as the index
run_id = response.index.tolist()[0] if len(response.index) > 0 else None

# Method 3: Submit with overrides
response = qa4sm.run_config_validation(
    "my_config.json",
    override={"name_tag": "my_custom_name"}
)

Downloading Results

Retrieve validation outputs including graphics, NetCDF files, and statistics.

# Download all results for a validation run
# This creates:
#   - <run_id>.nc (NetCDF data file)
#   - summary_stats.csv (summary statistics)
#   - qa4sm_graphics/ (directory with visualization files)
qa4sm.download_results(
    run_id="9aeb663b-e24e-4541-8331-6ec3e0318d1f",
    out_dir="/path/to/output",
    force_download=True
)

# Download only the configuration file
config = qa4sm.download_configuration(
    run_id="9aeb663b-e24e-4541-8331-6ec3e0318d1f",
    out_dir="/path/to/output"
)
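
After a download it can be useful to confirm that the three outputs listed above are actually present. The sketch below checks for them with pathlib. The helper `missing_outputs` is hypothetical, and it assumes the files are written directly into `out_dir` rather than into a subdirectory.

```python
import tempfile
from pathlib import Path


def missing_outputs(out_dir, run_id):
    """Return the names of expected result files that are not in out_dir.

    Hypothetical helper; the expected names follow the list in the
    download example above.
    """
    out = Path(out_dir)
    missing = []
    if not (out / f"{run_id}.nc").is_file():
        missing.append(f"{run_id}.nc")
    if not (out / "summary_stats.csv").is_file():
        missing.append("summary_stats.csv")
    if not (out / "qa4sm_graphics").is_dir():
        missing.append("qa4sm_graphics/")
    return missing


# Demonstration with a temporary directory that only holds the NetCDF file
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo-run.nc").touch()
    result = missing_outputs(tmp, "demo-run")
    print(result)  # ['summary_stats.csv', 'qa4sm_graphics/']
```

An empty list means all expected outputs were found.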

Quick Start Example

Complete workflow from discovery to results:

from qa4sm_api.client_api import Connection, ValidationConfiguration
import time

# Initialize connection
qa4sm = Connection(instance="qa4sm.eu", token="file")

# Discover available datasets
datasets = qa4sm.datasets()
print(datasets[['short_name', 'pretty_name']])

# Get versions for a dataset
versions = qa4sm.versions("C3S_combined")
print(versions[['short_name', 'pretty_name', 'time_range_start', 'time_range_end']])

# Load a validation configuration from a JSON file
config = ValidationConfiguration.from_file("validation_config.json")

# Submit validation job
response = qa4sm.run_validation(config)
# The response is a pandas Series with the run_id as the index
if len(response.index) == 0:
    raise RuntimeError("No run ID returned; the validation was not started")
run_id = response.index.tolist()[0]
print(f"Validation started: {run_id}")

# Monitor progress
while True:
    status, progress = qa4sm.validation_status(run_id)
    print(f"Status: {status}, Progress: {progress}%")
    
    if status == 'DONE':
        print("Validation completed successfully!")
        break
    elif status in ['ERROR', 'CANCELLED', 'NOT FOUND']:
        print(f"Validation failed: {status}")
        break
    else:
        time.sleep(30)

# Download results (only if the run finished successfully)
if status == 'DONE':
    qa4sm.download_results(run_id, out_dir="./my_results")
    print(f"Results downloaded to ./my_results/{run_id}.nc")

Error Handling

Common exceptions you may encounter:

from qa4sm_api.client_api import Connection
from qa4sm_api.globals import ValidationRunNotFoundError, ValidationInstanceError

qa4sm = Connection()

try:
    status, progress = qa4sm.validation_status("invalid-run-id")
except ValidationRunNotFoundError:
    print("Validation run not found")

try:
    connection = Connection(instance="unknown-instance", token="file")
except ValidationInstanceError:
    print("Unknown validation instance")