Data pre-processing: wps_climdexInput_csv
WPS wrapper for climdexInput.csv data pre-processing functions
In [ ]:
import os
import requests
from birdy import WPSClient
from rpy2 import robjects
from urllib.request import urlretrieve
from importlib.resources import files
from tempfile import NamedTemporaryFile
from wps_tools.file_handling import csv_handler
from wps_tools.testing import get_target_url
from quail.utils import test_ci_output
In [2]:
# Ensure we are in the working directory with access to the data
while os.path.basename(os.getcwd()) != "quail":
    os.chdir('../')
In [3]:
# NBVAL_IGNORE_OUTPUT
url = get_target_url("quail")
print(f"Using quail on {url}")
Using quail on https://docker-dev03.pcic.uvic.ca/twitcher/ows/proxy/quail/wps
In [4]:
quail = WPSClient(url)
Help for individual processes can be displayed using the ? command (e.g. bird.process?)
In [5]:
# NBVAL_IGNORE_OUTPUT
quail.climdex_input_csv?
Signature:
quail.climdex_input_csv(
    prec_file_content,
    na_strings='NULL',
    tmax_column='tmax',
    tmin_column='tmin',
    prec_column='prec',
    tavg_column='tavg',
    base_range='c(1961, 1990)',
    cal='gregorian',
    date_fields="c('year', 'jday')",
    date_format='%Y %j',
    n=5,
    northern_hemisphere=True,
    quantiles='NULL',
    temp_qtiles='c(0.1, 0.9)',
    prec_qtiles='c(0.95, 0.99)',
    max_missing_days='c(annual = 15, monthly =3)',
    min_base_data_fraction_present=0.1,
    loglevel='INFO',
    tmax_file_content=None,
    tmin_file_content=None,
    tavg_file_content=None,
    output_file='output.rda',
    vector_name='days',
    output_formats=None,
)
Docstring:
Process for creating climdexInput object from CSV files

Parameters
----------
tmax_file_content : string
    Content of file with daily maximum temperature data (temporary alternative to taking file).
tmin_file_content : string
    Content of file with daily minimum temperature data (temporary alternative to taking file).
prec_file_content : string
    Content of file with daily total precipitation data (temporary alternative to taking file).
tavg_file_content : string
    Content of file with daily mean temperature data (temporary alternative to taking file).
na_strings : string
    Strings used for NA values; passed to read.csv
tmax_column : string
    Column name for tmax data.
tmin_column : string
    Column name for tmin data.
prec_column : string
    Column name for prec data.
tavg_column : string
    Column name for tavg data.
base_range : string
    Years to use for the baseline
cal : string
    The calendar type used in the input files.
date_fields : string
    Vector of names consisting of the columns to be concatenated together with spaces.
date_format : string
    Date format as taken by strptime.
n : integer
    Number of days to use as window for daily quantiles.
northern_hemisphere : boolean
    Number of days to use as window for daily quantiles.
quantiles : string
    Threshold quantiles for supplied variables.
temp_qtiles : string
    Quantiles to calculate for temperature variables
prec_qtiles : string
    Quantiles to calculate for precipitation
max_missing_days : string
    Vector containing thresholds for number of days allowed missing per year (annual) and per month (monthly).
min_base_data_fraction_present : float
    Minimum fraction of base data that must be present for quantile to be calculated for a particular day
output_file : string
    Filename to store the output Rdata (extension .rda)
vector_name : string
    Name to label the output vector
loglevel : {'CRITICAL', 'ERROR', 'WARNING', 'INFO', 'DEBUG', 'NOTSET'}string
    Logging level

Returns
-------
climdexinput : ComplexData:mimetype:`application/x-gzip`
    Output R data file for generated climdexInput
File:      ~/github/quail/</tmp/quail-venv/lib/python3.8/site-packages/birdy/client/base.py-2>
Type:      method
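The defaults `date_fields="c('year', 'jday')"` and `date_format='%Y %j'` mean that the named columns are joined with a space and the result is parsed as a year followed by a day of the year. The sketch below illustrates that convention with Python's own `datetime.strptime`, which shares the `%Y` and `%j` directives with R's `strptime`. The helper `parse_year_jday` is hypothetical and not part of quail. The process itself does this parsing in R, so this is only a way to check what a given row should resolve to.

```python
from datetime import datetime


def parse_year_jday(year: int, jday: int) -> datetime:
    """Hypothetical helper: join the two fields with a space and parse
    them with '%Y %j' (four-digit year, day of year)."""
    return datetime.strptime(f"{year} {jday}", "%Y %j")


# Day 32 of 1971 is the first day of February.
print(parse_year_jday(1971, 32).date())
```

If the CSV files carry a single date column instead, `date_fields` and `date_format` would both need to change to match it.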
Run wps_climdexInput process
In [ ]:
with NamedTemporaryFile(suffix=".rda", prefix="summer_days_", dir="/tmp", delete=True) as output_file:
    output = quail.climdex_input_csv(
        tmax_file_content=csv_handler(
            (files("tests") / "data/1018935_MAX_TEMP.csv").resolve()),
        tmin_file_content=csv_handler(
            (files("tests") / "data/1018935_MIN_TEMP.csv").resolve()),
        prec_file_content=csv_handler(
            (files("tests") / "data/1018935_ONE_DAY_PRECIPITATION.csv").resolve()),
        tmax_column='MAX_TEMP',
        tmin_column='MIN_TEMP',
        prec_column='ONE_DAY_PRECIPITATION',
        base_range="c(1971, 2000)",
        vector_name="climdexInput",
    )
ci_url = output.get()[0]
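Several parameters in the call above, such as `base_range="c(1971, 2000)"`, are passed as strings containing an R `c(...)` vector literal. As a minimal sketch, those strings could be built from Python values rather than typed by hand. The helper names `r_vector` and `_r_literal` are hypothetical and not part of quail or wps_tools, and the sketch assumes the process accepts any well-formed `c(...)` expression in these fields.

```python
def _r_literal(value):
    """Render one Python scalar as an R literal (hypothetical helper).

    Strings are wrapped in single quotes without escaping, so values that
    themselves contain quotes are not supported by this sketch.
    """
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, str):
        return f"'{value}'"
    return repr(value)


def r_vector(*values, **named):
    """Build an R c(...) literal from positional and/or named values."""
    parts = [_r_literal(v) for v in values]
    parts += [f"{name} = {_r_literal(v)}" for name, v in named.items()]
    return f"c({', '.join(parts)})"


print(r_vector(1971, 2000))            # base_range
print(r_vector("year", "jday"))        # date_fields
print(r_vector(annual=15, monthly=3))  # max_missing_days
```

With such a helper, the call could use `base_range=r_vector(1971, 2000)` in place of the hand-written string.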
In [7]:
test_ci_output(
ci_url, "climdexInput", "climdexInput.rda", "ci"
)