Quality Control

All OOI data quality control procedures are based on community best practices and the goal of meeting the U.S. Integrated Ocean Observing System (IOOS) Quality Assurance of Real Time Ocean Data (QARTOD) standards. Where QARTOD manuals do not exist, we are developing our own automated quality tests. We also conduct extensive Human-In-The-Loop (HITL) reviews as the data streams are collected. This multi-tiered approach ensures that OOI instruments are highly reliable and that data quality is monitored, managed, and communicated to the community. These quality processes result in dependable and consistent OOI data for research and the classroom.

If you have a question or identify an issue concerning an OOI data product or QA/QC procedure, please contact the Data Team through the HelpDesk.

Automated Testing

As part of the ongoing OOI effort to improve data quality, OOI is implementing automated data quality testing based on Quality Assurance of Real Time Ocean Data (QARTOD) quality control standards. Led by the United States Integrated Ocean Observing System (U.S. IOOS), the QARTOD effort draws on the broad oceanographic observing community to provide manuals for different instrument classes (e.g. salinity, pH, or waves), which outline best practices and identify tests for evaluating data quality. A common code-base is available on GitHub and actively maintained by IOOS partner Axiom Data Science.

OOI has committed to implementing available QARTOD tests where appropriate. QARTOD is well documented and actively maintained, with an engaged user-base across multiple data collection and repository programs. The publicly available code-base provides standardized tests and flag definitions that result in simplified, easy-to-interpret results. For instruments with no existing QARTOD manual, such as seawater pCO2 sensors, OOI is implementing “QARTOD-like” quality control (QC) using similar approaches. However, some instruments deployed by OOI, such as seismic sensors and multispectral sensors, are not well suited to QARTOD testing.
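
As an example of working with the IOOS code-base, the sketch below runs a gross range check with the ioos_qc Python package. The thresholds are illustrative, not OOI operational values, and the exact module layout may differ between package releases.

    import numpy as np
    from ioos_qc import qartod  # the IOOS QARTOD code-base on GitHub

    salinity = np.array([34.9, 35.1, 35.0, 48.2, 35.2])  # example values (PSU)

    # Outside fail_span -> flag 4 (fail); outside suspect_span -> flag 3 (suspect)
    flags = qartod.gross_range_test(
        inp=salinity,
        fail_span=[0, 42],
        suspect_span=[30, 38],
    )
    print(flags)  # expected: flags 1, 1, 1, 4, 1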

How are Automated Test results communicated?

OOI utilizes a standardized data-flagging scheme, where each data point for an evaluated variable (e.g. salinity) receives one of the following flags:

  • 1 if the data point passed the test;
  • 2 if the test was not evaluated;
  • 3 if the data point is interesting/unusual or suspect;
  • 4 if the data point failed the test;
  • 9 if the data point is missing.

In interpreting the flag designations, it is important to remember that a flag of 3 does not necessarily mean a data point is bad: it can also mean that something interesting or unusual occurred that placed the data point outside the expected test threshold. Climatology test data are an exception to this scheme; they are classified only as “pass” or “suspect/interesting.” See Table 3 for a description of each test and its implementation.

Importantly, automated testing only flags data; it does not remove data. OOI is committed to delivering all available data, and the flags simply provide further information on the possible quality of those data.
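
In practice, this means users decide how to act on the flags in their own analyses. A minimal sketch, with illustrative values and variable names:

    import numpy as np

    values = np.array([35.0, 35.1, 47.9, 35.2, np.nan])
    flags = np.array([1, 1, 4, 3, 9])  # per-point QARTOD flags

    good = values[flags == 1]                 # passed all tests
    suspect = values[flags == 3]              # unusual/interesting: review, don't discard
    usable = values[np.isin(flags, [1, 3])]   # a common permissive choice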

How is Automated Testing implemented by OOI?

OOI prioritized initial implementation of automated testing on instruments and variables that are shared across arrays and are of broad scientific interest, such as CTDs, seawater pH and pCO2, dissolved oxygen, and chlorophyll/fluorescence. Table 3 provides an overview of tests currently in operation.

Location and syntax tests are handled by the OOI Operations and Management Systems (OMS) and the data ingestion procedures, respectively. Flags are not generated for these tests: OMS alerts trigger HITL reviews and result in annotations concerning location, while particles with incorrect syntax are rejected during ingestion.

OOI implemented the gross range and climatology tests first. Note that the climatology test is a site-specific, seasonally varying range test based on OOI data where possible, not a World Ocean Atlas-style climatology. These two tests, along with other tests under development, use thresholds and ranges that are calculated from existing OOI datasets.
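
To illustrate the idea of thresholds derived from the existing record, the sketch below computes a seasonally varying range (monthly mean ± 3 standard deviations) from a synthetic series. The operational OOI threshold calculation is more involved; this only shows the concept of a site-specific, seasonally varying range.

    import numpy as np
    import pandas as pd

    # Hypothetical multi-year hourly record of one variable at one site
    t = pd.date_range("2016-01-01", "2020-12-31", freq="h")
    rng = np.random.default_rng(0)
    series = pd.Series(
        10 + 5 * np.sin(2 * np.pi * t.dayofyear / 365) + rng.normal(0, 0.5, t.size),
        index=t,
    )

    # Seasonally varying range: monthly mean +/- 3 standard deviations
    monthly = series.groupby(series.index.month)
    thresholds = pd.DataFrame({
        "lower": monthly.mean() - 3 * monthly.std(),
        "upper": monthly.mean() + 3 * monthly.std(),
    })
    print(thresholds)  # one (lower, upper) pair per calendar month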

Automated Test Overview

Location

Checks that the reported physical location (latitude/longitude) is within operator-determined limits. Implementation: OOI OMS provides location alerts that trigger HITL review; OOI operators generate annotations.

Syntax

Checks that the received data message (full message) contains the proper structure, without any indicators of flawed transmission. Implementation: during the ingestion process, data points that fail the syntax check are not ingested.
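
A minimal sketch of such a syntax check, using a hypothetical message format and regex; each OOI instrument driver defines its own expected pattern.

    import re

    # Hypothetical message format: ISO timestamp followed by numeric fields
    PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z,[\d.,-]+$")

    def passes_syntax(message: str) -> bool:
        """Return True if the message parses; malformed messages are
        rejected at ingestion rather than flagged."""
        return bool(PATTERN.match(message))

    passes_syntax("2021-06-01T00:00:00Z,35.1,10.4")  # True -> ingested
    passes_syntax("2021-06-01T00:00:00Z,35.1,#ERR")  # False -> rejected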

Gross Range

Checks whether a data point exceeds vendor- or operator-selected minimum/maximum values. Fail Range: a threshold test based on the vendor operational range, flagging fail data points only. User Range: applies OOI operator-selected thresholds, flagging pass, fail, and interesting/suspect data points.

Climatology

A variation of the gross range test that checks whether data points fall within seasonal expectations. Implementation: OOI operators generate and apply site-specific thresholds, flagging pass and interesting/suspect data points.

The code used to calculate the thresholds is publicly available in the Ocean Observatories GitHub repository, and the resulting threshold tables are available in the Ocean Observatories qc-lookup GitHub repository. The tests executed and their results are added to the datasets as variables named _qartod_results and _qartod_executed, with the name of the tested data variable prepended (e.g., for practical_salinity these are practical_salinity_qartod_results and practical_salinity_qartod_executed). The _qartod_executed variable stores the individual result of each applied test as a string; the tests applied, and the order in which they were applied, are recorded in the variable's metadata attributes. The _qartod_results variable provides a summary result of all the tests applied.
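
For example, a user working with a downloaded OOI NetCDF file could inspect these variables with xarray; the file name below is a placeholder.

    import xarray as xr

    ds = xr.open_dataset("ooi_ctd_example.nc")  # placeholder file name

    # Summary flag for each practical_salinity point (1, 2, 3, 4, or 9)
    results = ds["practical_salinity_qartod_results"]

    # Per-test results stored as a string; the tests applied and their
    # order are recorded in this variable's metadata attributes
    executed = ds["practical_salinity_qartod_executed"]
    print(executed.attrs)

    # Keep only points whose summary flag is "pass"
    good_salinity = ds["practical_salinity"].where(results == 1)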

Status of OOI Automated Testing

Automated testing of OOI data streams was initiated upon commissioning of the arrays circa 2016. In 2020, the Program re-dedicated efforts towards automated QC based on QARTOD standards. These QC flags are available to the user in NetCDF files and in Data Explorer.

Phase 1 includes gross range tests (fail range and user range), and climatology tests. Phase 2 includes the development of gap, timing, and flatline tests. Phase 3 is an analysis of non-QARTOD manual instruments and development of potential tests. The syntax and location tests are considered operational checks and are handled within OOI operations and management systems and data ingestion processes prior to delivery to the public data sites.
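
As an illustration of the Phase 2 work, a QARTOD flat line test flags runs of near-constant values. The sketch below shows the concept only; the window lengths and tolerance are illustrative, not OOI operational settings.

    import numpy as np

    def flat_line_flags(x, suspect_n=3, fail_n=5, tol=1e-6):
        """Flag points ending a run of near-constant values:
        3 (suspect) after suspect_n repeats, 4 (fail) after fail_n."""
        flags = np.ones(len(x), dtype=int)
        for i in range(len(x)):
            if i >= fail_n and np.ptp(x[i - fail_n:i + 1]) < tol:
                flags[i] = 4
            elif i >= suspect_n and np.ptp(x[i - suspect_n:i + 1]) < tol:
                flags[i] = 3
        return flags

    x = np.array([35.0, 35.1, 35.1, 35.1, 35.1, 35.1, 35.1, 35.2])
    print(flat_line_flags(x))  # [1 1 1 1 3 3 4 1]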

Automated test products will be available in specific locations where OOI supplies data, initially via the Machine-to-Machine (M2M) process, and then propagated to THREDDS and Data Explorer access.

Automated Testing Implementation Timeline

An overview of the QARTOD-based OOI automated QC implementation plan.

Phase 1

Focus: Instruments with QARTOD Manuals

1. Fail Range Tests: Complete
2. QA/QC Flagging Statistics Tool: Complete

Phase 2

Focus: Development of gap, timing, and flatline tests

Phase 3

Focus: Analysis of instruments without QARTOD manuals and development of potential tests

Table 3: Automated Test Status by Instrument-Class

The current status and availability of automated test development, by instrument class.

1 Instrument specific to a profiler or vehicle
2 Location test performed within OMS; if instruments are off station, the operator will annotate the data
3 Syntax test performed during ingestion; particles that do not match the expected regex are rejected, and no flag is generated

Table 4: Current Data Parameters undergoing Automated Testing

A more detailed breakdown of test status, showing the type of automated test (fail range, user range, climatology) applied to the parameters available for each instrument class.