Crowdsourced Bathymetry in the NOAA Big Data Program

The International Hydrographic Organization (IHO) defines crowdsourced bathymetry (CSB) as the collection of depth measurements from vessels, using standard navigation instruments, while engaged in routine maritime operations.

As of 2020, nearly 85% of the seafloor remained unmapped and unexplored, in part because of the technical challenges and high costs of data collection. For the last several years, the IHO has encouraged innovative supplementary data gathering activities, such as the collection of CSB, to help address these gaps in bathymetric data. NOAA chairs the IHO Crowdsourced Bathymetry Working Group and hosts the IHO Data Centre for Digital Bathymetry (IHO DCDB) at NOAA’s National Centers for Environmental Information (NCEI).

This page includes information on data structure and sample use cases to help you get started. You can find additional information about the project from the Crowdsourced Bathymetry tab at the IHO Data Centre for Digital Bathymetry website.

Accessing the Archive Data

CSB data is hosted in the noaa-bathymetry-pds Amazon S3 bucket in the us-east-1 AWS region. The address for the public bucket is: https://noaa-bathymetry-pds.s3.amazonaws.com/.

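Each file is available as an object in Amazon S3. Object keys follow this basic structure (the AWS CLI examples later on this page show the dated folders under a csv/ prefix):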

/<Year>/<Month>/<Day>/<filename>

Where:

  • <Year> is the year the data was collected
  • <Month> is the month of the year the data was collected
  • <Day> is the day of the month the data was collected
  • <filename> is the name of the file containing the data. These are comma-separated values (CSV) files.
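
As a minimal sketch of the key layout described above, the following Python helpers build the prefix and public URL for one day of data. The csv/ root and the zero-padded month and day are assumptions taken from the AWS CLI examples on this page, and the function names csb_prefix and csb_url are hypothetical.

```python
from datetime import date

# Public bucket address given on this page.
BUCKET_URL = "https://noaa-bathymetry-pds.s3.amazonaws.com/"


def csb_prefix(day: date, root: str = "csv") -> str:
    """Return the key prefix for one day of data.

    Assumption: keys sit under a 'csv/' root with zero-padded month and
    day, as in the CLI examples (csv/2019/06/26/).
    """
    return f"{root}/{day:%Y/%m/%d}/"


def csb_url(day: date, filename: str) -> str:
    """Return the public HTTPS URL of one file collected on the given day."""
    return BUCKET_URL + csb_prefix(day) + filename


print(csb_prefix(date(2019, 6, 26)))
```

The prefix can be pasted after s3://noaa-bathymetry-pds/ in an AWS CLI command, or appended to the bucket address to download a single file over HTTPS.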

All files in the archive use the same CSV format (.csv). The first line is a header naming the columns of data:

UNIQUE_ID,FILE_UUID,LON,LAT,DEPTH,TIME,PLATFORM_NAME,PROVIDER

Where:

  • UNIQUE_ID = Unique ID of the platform/ship
  • FILE_UUID = Unique ID of data file submitted
  • LON = Longitude
  • LAT = Latitude
  • DEPTH = Depth in meters
  • TIME = Time formatted as ISO 8601
  • PLATFORM_NAME = Ship name
  • PROVIDER = Organization providing the data
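
The column list above can be read with Python's standard csv module. The sketch below is illustrative only: the sample row is invented (not real archive data), the output field names are hypothetical, and the handling of a trailing "Z" on TIME is an assumption about how the ISO 8601 timestamps might be written.

```python
import csv
import io
from datetime import datetime

# Invented sample in the documented column order; not real archive data.
SAMPLE = (
    "UNIQUE_ID,FILE_UUID,LON,LAT,DEPTH,TIME,PLATFORM_NAME,PROVIDER\n"
    "EXAMPLE-0001,00000000-0000-0000-0000-000000000000,"
    "-70.25,42.5,35.5,2019-06-26T12:00:00Z,Example Vessel,Example Provider\n"
)


def read_soundings(text_stream):
    """Yield one dict per CSV row with numeric and time fields converted."""
    for row in csv.DictReader(text_stream):
        yield {
            "platform_id": row["UNIQUE_ID"],
            "file_uuid": row["FILE_UUID"],
            "lon": float(row["LON"]),
            "lat": float(row["LAT"]),
            "depth_m": float(row["DEPTH"]),  # depth in meters
            # Assumption: a trailing "Z" means UTC; older Python versions
            # need it rewritten before datetime.fromisoformat accepts it.
            "time": datetime.fromisoformat(row["TIME"].replace("Z", "+00:00")),
            "platform_name": row["PLATFORM_NAME"],
            "provider": row["PROVIDER"],
        }


for sounding in read_soundings(io.StringIO(SAMPLE)):
    print(sounding["platform_name"], sounding["depth_m"], sounding["time"])
```

For a downloaded file, pass an open file object (opened with newline="") instead of the io.StringIO wrapper.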

Accessing the Archive Data using AWS CLI

Using the AWS CLI is the most convenient way to get the data from S3. The CLI is a set of command line tools that provides commands such as ls, cp, and sync for S3 buckets. To install the CLI, read the installation instructions for your platform.

For example, to list all the data for June 26th, 2019, run:

aws s3 ls s3://noaa-bathymetry-pds/csv/2019/06/26/ --no-sign-request

The --no-sign-request flag enables you to run the command without providing credentials. This works because the bucket is publicly accessible.

To download all of the data for June 26th, 2019:

aws s3 cp s3://noaa-bathymetry-pds/csv/2019/06/26/ . --recursive --no-sign-request

Here the --recursive flag tells the CLI to copy every object whose key begins with the prefix csv/2019/06/26/ in the noaa-bathymetry-pds bucket.
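
The same anonymous listing can be sketched without the CLI by calling the S3 ListObjectsV2 REST endpoint from Python's standard library. This assumes the bucket permits unauthenticated listing over HTTPS in the same way it does for --no-sign-request; the helper names list_url, parse_listing, and list_keys are hypothetical.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET_URL = "https://noaa-bathymetry-pds.s3.amazonaws.com/"
# XML namespace used by S3 list responses.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def list_url(prefix, token=None):
    """Build a ListObjectsV2 request URL for keys under the given prefix."""
    params = {"list-type": "2", "prefix": prefix}
    if token:
        params["continuation-token"] = token
    return BUCKET_URL + "?" + urllib.parse.urlencode(params)


def parse_listing(xml_text):
    """Return (keys, next_token) from one ListObjectsV2 XML response."""
    root = ET.fromstring(xml_text)
    keys = [el.text for el in root.findall("s3:Contents/s3:Key", S3_NS)]
    token = root.findtext("s3:NextContinuationToken", default=None,
                          namespaces=S3_NS)
    return keys, token


def list_keys(prefix):
    """Yield every key under prefix, following pagination (network access)."""
    token = None
    while True:
        with urllib.request.urlopen(list_url(prefix, token)) as resp:
            keys, token = parse_listing(resp.read())
        yield from keys
        if not token:
            break

# Example call (performs network requests):
#     for key in list_keys("csv/2019/06/26/"):
#         print(key)
```

Each key returned can be appended to the bucket address to download that file directly.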

SNS

We have set up public Amazon Simple Notification Service (SNS) topics that create a notification for every new object added to the Amazon S3 bucket on AWS. To start, you can subscribe to these notifications using email, Amazon SQS, or AWS Lambda. This means you can automatically trigger event-based processing to derive value-added products.

The Amazon Resource Name (ARN) for the SNS topic is:

arn:aws:sns:us-east-1:123901341784:NewBathymetryObject
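
As a rough sketch of event-based processing, the AWS Lambda handler below pulls the bucket and key of each new object out of an SNS delivery. It assumes the topic forwards standard S3 event notifications, with the S3 event as a JSON string in the SNS Message field; this page does not confirm the message format. The function names are hypothetical.

```python
import json
from urllib.parse import unquote_plus


def new_object_keys(sns_event):
    """Return (bucket, key) pairs found in an SNS-delivered Lambda event.

    Assumption: each SNS Message is a JSON-encoded S3 event notification
    whose Records carry s3.bucket.name and a URL-encoded s3.object.key.
    """
    found = []
    for record in sns_event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        for s3_record in message.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            found.append((bucket, key))
    return found


def handler(event, context):
    """Lambda entry point: log each newly added object."""
    for bucket, key in new_object_keys(event):
        print(f"new object: s3://{bucket}/{key}")
```

A real handler would replace the print call with whatever processing step derives the downstream product, for example downloading the new CSV file and gridding its soundings.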