Easing Sharing of Glider Data
The OOI’s Coastal and Global Array teams regularly use Teledyne-Webb Slocum Gliders to collect ocean observations within and around the array moorings. The gliders fly up and down the water column from the surface down to a maximum depth of 1000 meters, collecting data such as dissolved oxygen concentrations, temperature, salinity, and other physical parameters to measure ocean conditions.
OOI shares its glider data with the Integrated Ocean Observing System (IOOS) Glider Data Assembly Center (DAC). IOOS serves as a national repository for glider data sets, providing a centralized location for wide distribution and use. It allows researchers to access and analyze glider data sets using common tools, regardless of the glider type or the organization that deployed the glider.
OOI serves data to the DAC in two ways. While the gliders are in the water, data are telemetered to the DAC in near real time. Once the gliders are recovered, the data are downloaded, metadata are added, and the data are resubmitted to the Glider DAC as a permanent record.
The behind-the-scenes process of transmitting this large volume of data is quite complex. OOI Data Team members Collin Dobson of the Coastal and Global Scale Nodes at Woods Hole Oceanographic Institution (WHOI) and Stuart Pearce of the Coastal Endurance Array at Oregon State University (OSU) teamed up to streamline the process and clear a backlog of recovered-data submissions.
Pearce took the lead in getting the OOI data into the DAC. In 2018, he began writing code for a system to transmit near real-time and recovered data. Once the scripts (processing code) were operational, by about mid-2019, Pearce used them to streamline the flow of Endurance Array glider data into the DAC. Dobson then adopted the code and applied it to the transmission of glider data from the Pioneer, Station Papa, and Irminger Sea Arrays into the repository.
As it turned out, the timing was ideal. “I finished my code at the same time that the Glider DAC allowed higher resolution recovered datasets to be uploaded,” said Pearce. “So I was able to adjust my code to accommodate the upload of any scientific variable as long as it had a CF compliant standard name to go with it.” This opened up a whole range of data that could be transmitted in a consistent fashion to the DAC. CF refers to the “Climate and Forecast” metadata conventions, which provide community-accepted guidance for metadata variables and set standards for designating the time ranges and locations of data collection. Dobson gave an example of the naming convention for density: sea_water_density.
“Being CF compliant ensures your data have the required metadata and makes the data so much more usable across the board,” added Dobson. “If I wanted to include oxygen as a variable, for example, I have to make sure to use the CF standard name for dissolved oxygen and report the results in CF standard units.”
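The kind of check Dobson describes can be sketched in a few lines of Python. This is only an illustration and is not the code Pearce wrote or the validation the Glider DAC performs: the function name check_variable and the table CF_EXCERPT are hypothetical, and the table is a small hand-copied excerpt of standard names and canonical units that should be verified against the official CF standard name table. The sketch also assumes units must match the canonical string exactly, which is stricter than the CF conventions themselves, since they accept convertible units.

```python
# Hypothetical excerpt of CF standard names mapped to canonical units.
# Verify against the official CF standard name table before relying on it.
CF_EXCERPT = {
    "sea_water_temperature": "K",
    "sea_water_practical_salinity": "1",
    "sea_water_density": "kg m-3",
    "mole_concentration_of_dissolved_molecular_oxygen_in_sea_water": "mol m-3",
}


def check_variable(name, attrs):
    """Return a list of metadata problems for one variable (empty if none)."""
    standard_name = attrs.get("standard_name")
    if standard_name is None:
        return [f"{name}: missing standard_name"]
    if standard_name not in CF_EXCERPT:
        return [f"{name}: '{standard_name}' is not in the excerpt table"]

    expected = CF_EXCERPT[standard_name]
    units = attrs.get("units")
    # Assumption: require an exact match with the canonical units string.
    if units != expected:
        return [f"{name}: units '{units}' do not match '{expected}'"]
    return []


if __name__ == "__main__":
    variables = {
        "density": {"standard_name": "sea_water_density", "units": "kg m-3"},
        "oxygen": {
            "standard_name": "mole_concentration_of_dissolved_molecular_oxygen_in_sea_water",
            "units": "umol/L",
        },
        "backscatter": {"units": "m-1"},
    }
    for var_name, var_attrs in variables.items():
        for problem in check_variable(var_name, var_attrs):
            print(problem)
```

In this sketch the density variable passes, the oxygen variable is flagged because its units differ from the canonical string, and the backscatter variable is flagged because it has no standard name at all.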
The Endurance Array team was the first group to add non-CTD variables to the Glider DAC. This important step forward was recognized by the glider community and announced at a May 2019 workshop at Rutgers with 150 conveyors of glider data in attendance. One of Pearce’s gliders was used as the example of how and what could be achieved with the new code.
To help expedite the transfer of all glider data into the DAC, Pearce made his code open access. The additional metadata will help advance the work of storm forecasters, researchers, and others interested in improving the understanding of ocean processes.