Data pre-processing: wps_climdexInput_csv

WPS wrapper for climdexInput.csv data pre-processing functions

In [ ]:
import os
import requests
from birdy import WPSClient
from rpy2 import robjects
from urllib.request import urlretrieve
from importlib.resources import files
from tempfile import NamedTemporaryFile

from wps_tools.file_handling import csv_handler
from wps_tools.testing import get_target_url
from quail.utils import test_ci_output
In [2]:
# Ensure we are in the working directory with access to the data
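while os.path.basename(os.getcwd()) != "quail":
    # Stop at the filesystem root instead of looping forever
    if os.getcwd() == os.path.dirname(os.getcwd()):
        raise RuntimeError("No 'quail' directory found above the starting directory")
    os.chdir('../')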
In [3]:
# NBVAL_IGNORE_OUTPUT
url = get_target_url("quail")
print(f"Using quail on {url}")
Using quail on https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/quail/wps
In [4]:
quail = WPSClient(url)

Help for individual processes can be displayed using the ? command (e.g. bird.process?)

In [5]:
# NBVAL_IGNORE_OUTPUT
quail.climdex_input_csv?
Signature:
quail.climdex_input_csv(
    prec_file_content,
    na_strings='NULL',
    tmax_column='tmax',
    tmin_column='tmin',
    prec_column='prec',
    tavg_column='tavg',
    base_range='c(1961, 1990)',
    cal='gregorian',
    date_fields="c('year', 'jday')",
    date_format='%Y %j',
    n=5,
    northern_hemisphere=True,
    quantiles='NULL',
    temp_qtiles='c(0.1, 0.9)',
    prec_qtiles='c(0.95, 0.99)',
    max_missing_days='c(annual = 15, monthly =3)',
    min_base_data_fraction_present=0.1,
    loglevel='INFO',
    tmax_file_content=None,
    tmin_file_content=None,
    tavg_file_content=None,
    output_file='output.rda',
    vector_name='days',
    output_formats=None,
)
Docstring:
Process for creating climdexInput object from CSV files

Parameters
----------
tmax_file_content : string
    Content of file with daily maximum temperature data (temporary alternative to taking file).
tmin_file_content : string
    Content of file with daily minimum temperature data (temporary alternative to taking file).
prec_file_content : string
    Content of file with daily total precipitation data (temporary alternative to taking file).
tavg_file_content : string
    Content of file with daily mean temperature data (temporary alternative to taking file).
na_strings : string
    Strings used for NA values; passed to read.csv
tmax_column : string
    Column name for tmax data.
tmin_column : string
    Column name for tmin data.
prec_column : string
    Column name for prec data.
tavg_column : string
    Column name for tavg data.
base_range : string
    Years to use for the baseline
cal : string
    The calendar type used in the input files.
date_fields : string
    Vector of names consisting of the columns to be concatenated together with spaces.
date_format : string
    Date format as taken by strptime.
n : integer
    Number of days to use as window for daily quantiles.
northern_hemisphere : boolean
    Whether the data is from the northern hemisphere.
quantiles : string
    Threshold quantiles for supplied variables.
temp_qtiles : string
    Quantiles to calculate for temperature variables
prec_qtiles : string
    Quantiles to calculate for precipitation
max_missing_days : string
    Vector containing thresholds for number of days allowed missing per year (annual) and per month (monthly).
min_base_data_fraction_present : float
    Minimum fraction of base data that must be present for quantile to be calculated for a particular day
output_file : string
    Filename to store the output Rdata (extension .rda)
vector_name : string
    Name to label the output vector
loglevel : {'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'}string
    Logging level

Returns
-------
climdexinput : ComplexData:mimetype:`application/x-gzip`
    Output R data file for generated climdexInput
File:      ~/github/quail/</tmp/quail-venv/lib/python3.8/site-packages/birdy/client/base.py-2>
Type:      method
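
As the docstring notes, the columns named in date_fields are joined with spaces and the result is parsed with date_format. The sketch below illustrates the default '%Y %j' (year and day of year) using Python's datetime.strptime, which understands the same two directives. The row values are made up for illustration, and the service itself does its parsing in R, so treat this only as a way to check a format string locally.

```python
from datetime import datetime

# Hypothetical row values for the default date_fields, c('year', 'jday')
row = {"year": "1971", "jday": "032"}
date_fields = ["year", "jday"]
date_format = "%Y %j"

# Join the date columns with spaces, then parse with the format string
date_string = " ".join(row[field] for field in date_fields)
parsed = datetime.strptime(date_string, date_format)

print(date_string, "->", parsed.date())
```

If the input files carry separate year, month and day columns instead, the same check can be repeated with the corresponding date_fields and a format such as '%Y %m %d'.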

Run wps_climdexInput_csv process

In [ ]:
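# NOTE: the temporary file is never passed to the process, so the service
# uses its default output_file name ('output.rda')
with NamedTemporaryFile(suffix=".rda", prefix="climdex_input_csv_", dir="/tmp", delete=True) as output_file:
    output = quail.climdex_input_csv(
        # File contents are sent as strings rather than as file references
        tmax_file_content=csv_handler(
            (files("tests") / "data/1018935_MAX_TEMP.csv").resolve()),
        tmin_file_content=csv_handler(
            (files("tests") / "data/1018935_MIN_TEMP.csv").resolve()),
        prec_file_content=csv_handler(
            (files("tests") / "data/1018935_ONE_DAY_PRECIPITATION.csv").resolve()),
        # Column names as they appear in the CSV headers
        tmax_column='MAX_TEMP',
        tmin_column='MIN_TEMP',
        prec_column='ONE_DAY_PRECIPITATION',
        base_range="c(1971, 2000)",
        vector_name="climdexInput",
    )
# The first output is the reference to the generated climdexInput Rdata file
ci_url = output.get()[0]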
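
Several parameters, such as base_range, temp_qtiles and max_missing_days, are passed as strings holding R vector expressions, as the defaults in the signature show. A minimal sketch of building such strings from Python values follows. The r_vector helper is hypothetical and is not part of quail, birdy or wps_tools; it only reproduces the c(...) spelling seen in the defaults.

```python
def r_vector(values):
    """Format a Python list or dict as an R c(...) expression string."""
    if isinstance(values, dict):
        # Named vector, e.g. c(annual = 15, monthly = 3)
        items = [f"{name} = {value}" for name, value in values.items()]
    else:
        # Strings are quoted, numbers are written as-is
        items = [repr(v) if isinstance(v, str) else str(v) for v in values]
    return f"c({', '.join(items)})"


base_range = r_vector([1971, 2000])
temp_qtiles = r_vector([0.1, 0.9])
date_fields = r_vector(["year", "jday"])
max_missing_days = r_vector({"annual": 15, "monthly": 3})

print(base_range)
print(max_missing_days)
```

The resulting strings can be passed directly as keyword arguments, for example base_range=r_vector([1971, 2000]).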
In [7]:
test_ci_output(ci_url, "climdexInput", "climdexInput.rda", "ci")