Posts Tagged ‘metadata’
Efforts to Standardize Data Continue
The OOI Data Teams have recently made great strides in ongoing efforts to standardize data, making it easier for users to understand what OOI data and metadata are available. Efforts have focused on improving labels and descriptions and on correcting units to ensure consistency. A major improvement underway is matching variable naming conventions with those governed by Climate and Forecast (CF) metadata standards.
The first round of changes is expected to be completed by the end of June 2022. Once these changes are implemented, existing scripts used to download and process OOI data files could be impacted depending on how the code was written. The Data Teams will publish a list of affected streams and recommended code updates prior to the release of these changes, to highlight the improvements and to allow for processing script modifications.
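One way a processing script could tolerate a variable rename is to look a variable up under either its old or its new name. The sketch below is illustrative only: the name pairs in `RENAMED`, the `get_variable` helper, and the plain dictionaries standing in for downloaded files are all assumptions, not the list the Data Teams will publish.

```python
# Hypothetical mapping from pre-change variable names to CF-style names.
# These pairs are illustrative; use the list published by the Data Teams.
RENAMED = {
    "seawater_temperature": "sea_water_temperature",
    "practical_salinity": "sea_water_practical_salinity",
}


def get_variable(dataset, old_name):
    """Return the variable stored under old_name or its renamed equivalent."""
    for name in (old_name, RENAMED.get(old_name)):
        if name is not None and name in dataset:
            return dataset[name]
    raise KeyError(f"neither {old_name!r} nor a known replacement is present")


# Stand-ins for downloaded files: plain dicts of variable name -> values.
new_style_file = {"sea_water_temperature": [10.2, 10.4]}
old_style_file = {"seawater_temperature": [9.8, 9.9]}

# The same call works on files written before and after the rename.
temps_new = get_variable(new_style_file, "seawater_temperature")
temps_old = get_variable(old_style_file, "seawater_temperature")
```

Keeping the mapping in one place means a script needs only a single update once the affected streams are announced.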
New Discrete Water Sampling Spreadsheets Available
To provide context and comparison for data from OOI instrumentation, OOI collects and disseminates data from shipboard underway sensors and from water samples taken during CTD casts. Shipboard underway data, organized by cruise, can be accessed on the OOI Alfresco Document Management System using the username and password ‘guest’. Each cruise folder contains a Ship Data folder, with data in the format provided by the ship operators, and a Water Sampling subfolder. The Water Sampling subfolder includes scanned and digitized versions of the CTD logs, as well as discrete water sample analyses in the formats provided by the labs that conducted the analyses.
[caption id="attachment_20259" align="alignleft" width="199"] Collecting water samples from the CTD rosette on the Pioneer 8 cruise aboard the R/V Neil Armstrong. ©WHOI.[/caption]
To make these data more easily accessible to the science community, we have developed a common template to provide a full set of discrete water sample data from a cruise. These “Discrete_Sample_Summary” spreadsheets include the details for each Niskin bottle fired on a CTD cast, the CTD instrument rosette data from the time of bottle closure, and the water sample data and quality flags based on World Ocean Circulation Experiment (WOCE) standards.
These CSV files with common data formats can easily be read and manipulated in MATLAB, Python, or other computing programs and languages. Because water analysis data are received at different times from different labs, these spreadsheets are updated as data become available. An accompanying README file contains version history, general notes, and a description of the quality flags. The original spreadsheets from labs, which may contain additional data and methodology, will also be posted.
An example of how to read and use these discrete sample data can be found in this Jupyter notebook. Discrete_Sample_Summary spreadsheets have been posted for the Regional Cabled Array cruises 6-10, the Coastal Endurance Array cruises 1-13, and the Global Irminger Sea Array cruises 1-6. We will continue to work on completing spreadsheets for past cruises as well as for future cruises.
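As a rough illustration of reading such a CSV file in Python, the sketch below filters oxygen samples by quality flag using only the standard library. The column names, the sample values, the fill value, and the single-digit flag column are all assumptions made for this example; the README that accompanies each spreadsheet describes the actual columns and flag encoding, which may differ.

```python
import csv
import io

# Made-up sample standing in for a Discrete_Sample_Summary CSV file.
# Column names, values, and the fill value are assumptions for illustration.
SAMPLE = """Cast,Niskin,CTD Pressure [db],Discrete Oxygen [uM],Discrete Oxygen Flag
1,1,500.2,210.5,2
1,2,250.1,-9999999,9
1,3,10.3,265.0,3
"""

FILL_VALUE = -9999999.0  # assumed placeholder for missing data
ACCEPTABLE = "2"         # WOCE water sample flag 2 = acceptable measurement


def good_oxygen(rows):
    """Yield (pressure, oxygen) pairs whose flag marks them acceptable."""
    for row in rows:
        oxygen = float(row["Discrete Oxygen [uM]"])
        if row["Discrete Oxygen Flag"] == ACCEPTABLE and oxygen != FILL_VALUE:
            yield float(row["CTD Pressure [db]"]), oxygen


# With a real file, replace io.StringIO(SAMPLE) with open(path, newline="").
reader = csv.DictReader(io.StringIO(SAMPLE))
samples = list(good_oxygen(reader))
print(samples)
```

Only the first row survives the filter here: the second carries a fill value and a "not drawn" flag, and the third is flagged as questionable.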
[caption id="attachment_20261" align="aligncenter" width="640"] Comparison of dissolved oxygen data on the Washington Shelf Surface Mooring with water sampling data from Endurance Cruise 13. Data from Deployment 10 and Deployment 11 are plotted together, and overlap during 5-7 July.[/caption]
A Hat for a New Name
In July, OOI will launch the Beta version of our new and improved data discovery tool. We are asking for the community’s help in naming the tool, which will make it possible to:
- search and download data from cabled and moored arrays, as well as recovered data, for in situ physical, chemical, geological, and biological observations
- compare datasets across regions and disciplines
- generate and share custom data views
- download full data sets using ERDDAP
Want to take a crack at coming up with a name? The selected winner will receive an OOI hat in recognition of their creativity. If there is more than one winner, each will receive an OOI hat.
The deadline for submission is 15 May 2020. Please submit your nominations to dtrewcrist@whoi.edu with the subject line “I deserve a hat!”
Metadata Review Improves OOI Data
OOI’s data teams have just completed an extensive, year-long review of critical metadata to ensure the quality and usability of data for OOI data users. The review covered data collected through the end of 2019 and included instrument calibration coefficients, instrument deployment assignments, and deployment dates. Moving forward, all metadata verification will conform to the standards established during the review.
“Our reason for undertaking this review was no more complicated than to make the data better for our data users,” explains Jeffrey Glatstein, Senior Manager of Cyberinfrastructure and OOI Data Delivery Lead. “It is the first time since the inception of the program that we’ve really gone in and looked at the metadata from top to bottom. If there was a calibration that was off, a depth missing, or something misspelled, we found it.
“This intense and deliberative review process brought historic metadata up to current standards to ensure continuity, completeness of records, and consistency in how metadata are reported now and moving forward.”
The data teams used a combination of human review and an automated script development process to identify and correct data issues. The human-in-the-loop (HITL) process ensured that two sets of eyes verified each metadata product, whenever possible, while the scripts performed automated verification and generated reports to pass back into the HITL workflow.
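The kind of automated check that feeds a human review queue might look like the following sketch. It is not the data teams' actual tooling: the record fields, the two rules in `check_record`, and the report format are hypothetical, chosen to mirror the sorts of problems mentioned above, such as a missing depth or inconsistent deployment dates.

```python
from datetime import date

# Hypothetical deployment records; field names and values are illustrative.
RECORDS = [
    {"id": "D0001", "start": date(2018, 4, 1), "stop": date(2018, 10, 1), "depth_m": 7.0},
    {"id": "D0002", "start": date(2019, 5, 1), "stop": date(2019, 3, 1), "depth_m": 7.0},
    {"id": "D0003", "start": date(2019, 10, 1), "stop": None, "depth_m": None},
]


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    if record["depth_m"] is None:
        problems.append("depth missing")
    # A stop of None is treated as a deployment that is still in the water.
    if record["stop"] is not None and record["stop"] < record["start"]:
        problems.append("stop date precedes start date")
    return problems


def build_report(records):
    """Map record id -> problems, keeping only records that need human review."""
    report = {}
    for record in records:
        problems = check_record(record)
        if problems:
            report[record["id"]] = problems
    return report


report = build_report(RECORDS)
for record_id, problems in report.items():
    print(f"{record_id}: {'; '.join(problems)}")
```

In a workflow like the one described, the script only flags candidates; a reviewer still decides whether each flagged record is actually wrong and how to correct it.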
“This initiative is part of ongoing OOI efforts to make its data more accessible, user friendly, and integrated into ongoing science,” adds Glatstein.
Check Previously Downloaded Data
The OOI Data Portal operates on a process-on-demand model: data products are generated when they are requested, using the metadata in place at that time. Data downloaded prior to the end of 2019 should therefore be checked to see if relevant metadata have since been modified.
Users can check whether changes were made to relevant metadata by clicking here. This link provides a database, searchable by array, platform, and instrument, that helps users determine whether previously downloaded data are correct or need to be re-downloaded, so that they are working with the best available data. The OOI data teams are continuing to verify the historical deployment assignments and dates, and the results will be updated accordingly.
[feature]A Gargantuan Effort
As part of the transition of OOI to 2.0 in October 2018, the RCA data team initiated a comprehensive audit of all critical metadata to ensure that data products served by the OOI Cyberinfrastructure system meet Quality Assurance/Quality Control standards set by the program and expected from the user community. This daunting task included the examination of over 700 calibration files from 2013 to the present. The results of this audit were used to aid in evaluation of current processes and guide in adapting workflows to improve QA/QC efforts and communication to the users, a vital component to building confidence in the OOI datasets as reliable and valuable resources that can be used in scientific research and education.
Wendi Ruef, Research Scientist, Regional Cabled Array
The CGSN Data Team worked carefully and methodically through thousands of files containing over 30,000 calibration coefficients and other critical metadata. We now have a high level of confidence in past metadata and a strong process for continued review going forward.
Al Plueddemann, Chief Scientist, Coastal Global Scale Nodes
[/feature]