Metadata Review Improves OOI Data
OOI’s data teams have just completed an extensive, year-long review of critical metadata to ensure the quality and usability of data for OOI data users. The review covered data collected through the end of 2019 and included instrument calibration coefficients, instrument deployment assignments, and deployment dates. Moving forward, all metadata verification will conform to the standards established during the review.
“Our reason for undertaking this review was no more complicated than to make the data better for our data users,” explains Jeffrey Glatstein, Senior Manager of Cyberinfrastructure and OOI Data Delivery Lead. “It is the first time since the inception of the program that we’ve really gone in and looked at the metadata from top to bottom. If there was a calibration that was off, a depth missing, or something misspelled, we found it.
“This intense and deliberative review process brought historic metadata up to current standards to ensure continuity, completeness of records, and consistency in how metadata are reported now and moving forward.”
The data teams used a combination of human review and automated scripts to identify and correct data issues. The human-in-the-loop (HITL) process ensured that, whenever possible, two sets of eyes verified each metadata product, while the scripts performed automated verification and generated reports that were passed back into the HITL workflow.
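The pairing of automated checks with human review can be illustrated with a minimal sketch. It is not the OOI teams' actual tooling: the record fields, the checks, and the report format below are all assumptions chosen for illustration. A script flags suspect records and hands only those to a reviewer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CalibrationRecord:
    # Hypothetical fields; the real OOI metadata schema may differ.
    instrument: str
    coefficient_name: str
    value: Optional[float]
    deployment_start: Optional[date]
    deployment_end: Optional[date]


def verify(record: CalibrationRecord) -> list[str]:
    """Return a list of human-readable problems found in one record."""
    problems = []
    if record.value is None:
        problems.append("missing coefficient value")
    if record.deployment_start is None:
        problems.append("missing deployment start date")
    elif (record.deployment_end is not None
          and record.deployment_end < record.deployment_start):
        problems.append("deployment ends before it starts")
    return problems


def build_report(records):
    """Collect flagged records so a human reviewer can confirm or correct them."""
    report = []
    for record in records:
        problems = verify(record)
        if problems:
            report.append((record.instrument, record.coefficient_name, problems))
    return report
```

In a workflow like this, records that pass every check need no attention, and the report becomes the reviewer's work list.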
“This initiative is part of ongoing OOI efforts to make its data more accessible, user friendly, and integrated into ongoing science,” adds Glatstein.
Check Previously Downloaded Data
The OOI Data Portal operates on a process-on-demand model, in which data products are generated at the time of request using the metadata then in place. Data downloaded prior to the end of 2019 should therefore be checked to see whether relevant metadata have since been modified.
Users can check whether changes were made to relevant metadata by clicking here. The link provides a database, searchable by array, platform, and instrument, that helps users determine whether previously downloaded data are correct or need to be re-downloaded, so that they are working with the best available data. The OOI data teams are continuing to verify the historical deployment assignments and dates, and the results will be updated accordingly.
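For users who keep a record of when they downloaded each dataset, the check can be sketched in a few lines. This is a sketch under stated assumptions: the change-log structure and its field names (`array`, `platform`, `instrument`, `changed_on`) are hypothetical and are not the format of the OOI database.

```python
from datetime import date


def needs_redownload(downloaded_on, metadata_changes, array, platform, instrument):
    """Return True if any logged metadata change for this instrument
    happened on or after the day the data were downloaded."""
    return any(
        change["array"] == array
        and change["platform"] == platform
        and change["instrument"] == instrument
        and change["changed_on"] >= downloaded_on
        for change in metadata_changes
    )


# Hypothetical change log; the entries are illustrative placeholders, not real OOI records.
changes = [
    {"array": "ARRAY-A", "platform": "PLATFORM-1", "instrument": "CTD",
     "changed_on": date(2019, 11, 15)},
]

# Data downloaded before the logged change should be requested again.
stale = needs_redownload(date(2019, 3, 1), changes, "ARRAY-A", "PLATFORM-1", "CTD")
```

Because the portal processes data on demand, requesting the data again is enough to get products computed with the corrected metadata.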
A Gargantuan Effort
As part of the transition of OOI to 2.0 in October 2018, the RCA data team initiated a comprehensive audit of all critical metadata to ensure that data products served by the OOI Cyberinfrastructure system meet the Quality Assurance/Quality Control (QA/QC) standards set by the program and expected by the user community. This daunting task included the examination of over 700 calibration files from 2013 to the present. The results of this audit were used to help evaluate current processes and to guide the adaptation of workflows to improve QA/QC efforts and communication to users, a vital component of building confidence in the OOI datasets as reliable and valuable resources that can be used in scientific research and education.
Wendi Ruef, Research Scientist, Regional Cabled Array
The CGSN Data Team worked carefully and methodically through thousands of files containing over 30,000 calibration coefficients and other critical metadata. We now have a high level of confidence in past metadata and a strong process for continued review going forward.
Al Plueddemann, Chief Scientist, Coastal Global Scale Nodes