The National Science Foundation’s Ocean Observatories Initiative (OOI) currently supports approximately 760 deployed sensors that continuously produce over 200 unique data products from the seafloor to the air-sea interface. The OOI Cyberinfrastructure now serves over 250 terabytes of data, with more data coming in every second.
“The size and complexity of ocean data is growing beyond the capacity of what one person and one computer can handle,” says Friedrich Knuth, OOI Data Evaluator. “We need to be thinking of collaborative, cloud-based tools to really explore the capacity of these data.”
“Big data” is not a challenge exclusive to the OOI or the oceanographic community. The issue permeates the scientific world and is the impetus for the creation of hackweeks by the eScience Institute.
“Our goal is data democratization,” says Amanda Tan, eScience Institute Cloud Technology Lead. “We want to get data out to anyone and everyone who wants to use and work with it in a meaningful way. To do that, we use hackweeks as an avenue to help users build open source, reproducible tools that can turn sensor data into publishable results.”
In February 2018, the UW eScience Institute, in collaboration with the UW Applied Physics Laboratory and the OOI, hosted the first-ever ocean-related hackweek, the Cabled Array Hack Week (CAHW), focusing on data from the OOI Cabled Array. The event was a three-day immersive experience with about 25 participants ranging from students to senior scientists.
The CAHW was a grassroots effort led by Wu-Jung Lee, a researcher at the Applied Physics Laboratory (APL) and an early adopter of OOI data.
“I started out as a domain scientist, really focused in my one area. But I have realized that I need to dig more into my computational skill set to work with these large data sets,” says Lee. “After working with eScience last year on another project, I had the idea that we should create a Hackweek for the oceans.”
Lee then brought her idea to the eScience Institute, the OOI Cabled Array team based at the UW, and the OOI Data Team based at Rutgers University to see if they would be interested. “I knew what I wanted to see happen with this hackweek, but not how to get there. The team was what made it a successful event.”
The goal of the hackweek was to build a stronger user community around the Regional Cabled Array, and to create and promote effective computational data analysis workflows for the real-time data stream.
For three days, hackweek participants devoted themselves to OOI data and collaborative group projects. The hackweek included tutorials focused on “how-tos” for data mining, processing, and visualization techniques, but participants spent most of the time in small groups hacking away on their own projects.
“The key to hackweeks is that they are immersive,” says Knuth. “Everyone brings a unique perspective to the table and is committed to being there together sweating through the code and the data. For example, we had APL Cabled Array engineer Eric McCrae, designer of the Shallow Profiler Mooring, working with a group developing code to process the data from that platform. That was a big highlight of the week for me, to see the potential of that kind of end-to-end relationship.”
Projects currently being developed include “Whaldr,” a mobile app to identify whale vocalizations in OOI Cabled Array hydrophone data. As the name suggests, it is loosely based on the social networking app Tinder: users swipe right or left to indicate whether or not a whale is making noises. Other projects include the creation of a draft common data format to better share and process acoustic data from scientific echosounders and Acoustic Doppler Current Profilers. More details on the projects can be found on the Cabled Array Hack Week website.
“We hold the standard for data visualization and access in our pockets,” says Knuth, holding up his smartphone. “Scientists should have the same tools to make data equally as easily accessible and understandable. eScience is working towards helping us make that possible.”
After a successful workshop on Cabled Array data, the team is not done. In August, they will run an OOI-wide Oceanhackweek with data spanning all of the arrays.
The OOI Cabled Array Hack Week was supported by the University of Washington School of Oceanography, the Applied Physics Laboratory, the eScience Institute, and the Ocean Observatories Initiative. In addition to Knuth, Lee, and Tan, organizers included Valentina Staneva, Rob Fatland, and Aaron Marburg.