OOI Rolls Out Initial QARTOD Tests
As part of the Ocean Observatories Initiative's (OOI) ongoing effort to improve data quality, OOI is implementing Quality Assurance of Real-Time Oceanographic Data (QARTOD) tests on an instrument-by-instrument basis. Led by the United States Integrated Ocean Observing System (U.S. IOOS), the QARTOD effort draws on the ocean observing community to produce manuals that identify and outline tests for evaluating data quality by variable and instrument type. Currently, OOI is focused on implementing the Gross Range and Climatology Tests for the variables associated with CTD, pH, and pCO2 sensors. Over the coming months, tests will be applied to data collected by pressure sensors, bio-optical sensors, and dissolved oxygen sensors. Ultimately, where and when appropriate, QARTOD tests will be applied to the relevant variables for all OOI sensors.
The Gross Range Test aims to identify data that either fall outside the sensor measurement range or are statistical outliers. OOI identifies failed/bad data with thresholds based on the calibration range for a given sensor. We also calculate suspicious/interesting data thresholds as the mean ± 3 standard deviations of the historical OOI data for the variable at a deployed location. As implemented by OOI, the Gross Range Test therefore flags data that either fall outside the sensor calibration range, and are thus “bad”, or are statistical outliers relative to the historical OOI data for that location.
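The two-tier logic described above can be sketched in a few lines of Python. This is a minimal illustration, not OOI's implementation: the function name `gross_range_flags`, its parameters, and the use of the population standard deviation are assumptions made for the example. The flag values 1, 3, and 4 follow the general QARTOD convention for pass, suspect, and fail.

```python
import numpy as np

# QARTOD-style flag values: 1 = pass, 3 = suspect/interesting, 4 = fail/bad.
PASS, SUSPECT, FAIL = 1, 3, 4


def gross_range_flags(values, fail_span, history, n_std=3.0):
    """Flag each value against calibration and statistical thresholds.

    values    : new observations to test
    fail_span : (low, high) sensor calibration range (hypothetical input)
    history   : historical observations for this variable and location
    n_std     : half-width of the suspect span in standard deviations
    """
    values = np.asarray(values, dtype=float)
    history = np.asarray(history, dtype=float)

    # Suspect span: historical mean +/- n_std standard deviations.
    center = np.nanmean(history)
    spread = np.nanstd(history)  # population standard deviation (assumption)
    suspect_low = center - n_std * spread
    suspect_high = center + n_std * spread

    # Start from "pass", mark outliers as suspect, then let the calibration
    # range override so that out-of-range data always end up as "fail".
    # NaN inputs are not treated specially in this sketch.
    flags = np.full(values.shape, PASS, dtype=int)
    flags[(values < suspect_low) | (values > suspect_high)] = SUSPECT
    flags[(values < fail_span[0]) | (values > fail_span[1])] = FAIL
    return flags
```

Applying the fail check last is a deliberate choice in this sketch: a value outside the calibration range is also a statistical outlier in most cases, and the more severe flag should win.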
The Climatology Test is a variation on the Gross Range Test, modifying the relevant suspicious/interesting data thresholds for each calendar-month by accounting for seasonal cycles. The OOI time series are short (<8 years) relative to the World Meteorological Organization (WMO) recommended 30-year climatology reference period. To help ensure quality, we calculate seasonal cycles for a given variable using harmonic analysis, a method that is less susceptible to spurious values that can arise from data gaps, from measurement errors, or from the presence of real, but anomalous, geophysical conditions in the available record. First, we group the data by calendar-month (e.g. January, February, …, December) and calculate the average for each month. Then, we fit the monthly-averaged data with a two-cycle (annual plus semiannual) harmonic model. Each harmonic is determined using a least-squares fit, a procedure that minimizes the sum of the squares of the differences between the data points and the fitted curve. This produces a “climatological” fit value for each calendar-month.
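As an illustration of the harmonic fit, the sketch below fits a constant plus annual and semiannual sine and cosine terms to twelve monthly means by least squares. It is a sketch under stated assumptions rather than OOI's code: the function name `fit_two_cycle_harmonic` is hypothetical, months are treated as equally spaced points on a 12-month cycle, and both harmonics are solved in a single least-squares step.

```python
import numpy as np


def fit_two_cycle_harmonic(monthly_means):
    """Least-squares fit of annual + semiannual harmonics to monthly means.

    monthly_means : 12 values (January..December); NaN marks a month
                    with no data, which is left out of the fit.
    Returns the fitted climatological value for each of the 12 months.
    """
    monthly_means = np.asarray(monthly_means, dtype=float)
    months = np.arange(1, 13)
    have_data = ~np.isnan(monthly_means)

    def design(m):
        # Columns: constant, annual cos/sin, semiannual cos/sin.
        w = 2.0 * np.pi * m / 12.0
        return np.column_stack(
            [np.ones_like(w), np.cos(w), np.sin(w), np.cos(2 * w), np.sin(2 * w)]
        )

    # Solve for the five coefficients that minimize the squared misfit.
    coeffs, *_ = np.linalg.lstsq(
        design(months[have_data]), monthly_means[have_data], rcond=None
    )

    # Evaluate the model at every calendar-month, including any gaps.
    return design(months) @ coeffs
```

Because the model has only five free coefficients, a single anomalous monthly mean pulls the fitted curve far less than it would distort a month-by-month average, which is the robustness the paragraph above describes.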
Next, we calculate the standard deviation for each calendar-month from the grouped observations for the month. The thresholds for suspicious/interesting data are set as the climatological-fit ± 3 standard deviations. Occasionally, data gaps may mean that there are no historical observations for a given calendar-month. In these instances, we linearly interpolate the threshold from the nearest months. For sensors mounted on profiler moorings or vehicles, we first divide the data into subsets using standardized depth bins to account for differences in seasonality and variability at different depths in the water column. The resulting test identifies data that fall outside of typical seasonal variability determined from the historic OOI data for that location.
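The threshold step can be sketched as follows. The function name `climatology_thresholds` and its inputs are hypothetical, the per-month climatological fit is taken as given, and the interpolation wraps around the year (so December and January count as neighbors), which is an assumption rather than a documented OOI choice. Depth binning for profilers is not shown; in that case the same calculation would be repeated for each bin.

```python
import numpy as np


def climatology_thresholds(obs_months, obs_values, monthly_fit, n_std=3.0):
    """Suspect thresholds per calendar-month: fit +/- n_std * monthly std.

    obs_months  : calendar-month (1..12) of each historical observation
    obs_values  : the historical observations themselves
    monthly_fit : 12 climatological fit values (January..December)
    Returns (lower, upper), each an array of 12 thresholds.
    """
    obs_months = np.asarray(obs_months)
    obs_values = np.asarray(obs_values, dtype=float)
    monthly_fit = np.asarray(monthly_fit, dtype=float)
    months = np.arange(1, 13)

    lower = np.full(12, np.nan)
    upper = np.full(12, np.nan)
    for i, month in enumerate(months):
        sample = obs_values[obs_months == month]
        sample = sample[~np.isnan(sample)]
        if sample.size > 0:
            spread = sample.std()  # population standard deviation (assumption)
            lower[i] = monthly_fit[i] - n_std * spread
            upper[i] = monthly_fit[i] + n_std * spread

    have_data = ~np.isnan(lower)
    if not have_data.any():
        raise ValueError("no historical observations in any calendar-month")

    # Fill months without observations by linear interpolation between the
    # nearest months that have thresholds, treating the year as periodic.
    lower = np.interp(months, months[have_data], lower[have_data], period=12)
    upper = np.interp(months, months[have_data], upper[have_data], period=12)
    return lower, upper
```

A new observation would then be flagged as suspicious/interesting when it falls below `lower` or above `upper` for its calendar-month.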