S3 Storage Service Overview

GBDX stores ancillary data and derived products in an Amazon Web Services (AWS) S3 bucket. When a workflow is run on the GBDX platform, task outputs can be "persisted" to the GBDX Customer Data S3 bucket.

The purpose of the GBDX S3 Storage Service is to allow users to access this data. The service provides the temporary credentials required to access a Prefix, Folder, or Object in the S3 bucket.

This course provides an overview of the S3 Storage Service. When you've completed it, go to Lesson: Get S3 Bucket Contents.

That step-by-step tutorial shows you how to request temporary credentials and how to use the AWS CLI to access the contents of your S3 bucket.

AWS S3 Bucket

The full path to an object in the S3 bucket is: s3://gbd-customer-data/{prefix}/{folder}/{object}

AWS requires temporary credentials to access the S3 bucket. Most commonly, you'll request access to the prefix, but it's possible to request access to a specific folder or object within the prefix.

Resource | Definition
Prefix | A Prefix is a storage location within the main bucket. Storage buckets are associated with account IDs. Only users that belong to the account have access to the Prefix.
Folder | A Folder is a namespace inside a Prefix. For example, /my/folder/ is a valid folder name. All objects would have {Prefix}/my/folder/ at the start of their name. The folder name is typically your Account ID.
Object | Individual files stored on S3 are called Objects. Objects are prepended with the user's Prefix and folder name (when there is one).
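
To make the naming scheme concrete, here is a minimal Python sketch that joins a prefix, an optional folder, and an optional object into a full S3 path. The bucket name comes from the text above; the function name s3_uri and the sample values are hypothetical and only illustrate the layout.

```python
BUCKET = "gbd-customer-data"  # bucket name given in this overview


def s3_uri(prefix, folder="", obj=""):
    """Join prefix, optional folder, and optional object into an s3:// path."""
    parts = [prefix.strip("/")]
    if folder:
        # Folder names such as /my/folder/ may carry leading or trailing slashes.
        parts.append(folder.strip("/"))
    uri = f"s3://{BUCKET}/" + "/".join(parts)
    if obj:
        return f"{uri}/{obj.lstrip('/')}"
    # With no object, return the "directory" form ending in a slash.
    return uri + "/"


if __name__ == "__main__":
    # "acct-123" and "scene.tif" are made-up sample values.
    print(s3_uri("acct-123"))
    print(s3_uri("acct-123", "/my/folder/", "scene.tif"))
```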

GBDX S3 Storage Service

The S3 Storage Service is a set of API endpoints that allow users to request temporary credentials to an S3 bucket. You'll need an OAuth2 token to make API requests.
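
As a rough illustration of attaching the OAuth2 token to a request, the sketch below builds (but does not send) a GET request using Python's standard library. It assumes the token is passed as a standard Bearer authorization header. The URL is a hypothetical placeholder, not the real endpoint, and build_creds_request is an illustrative name; take the actual endpoint from the API request pages listed below.

```python
import urllib.request

# Hypothetical placeholder URL; substitute the endpoint from the API documentation.
CREDS_URL = "https://gbdx.example.invalid/s3creds/prefix"


def build_creds_request(token, url=CREDS_URL):
    """Return a GET request that carries the OAuth2 token as a Bearer header."""
    if not token:
        raise ValueError("an OAuth2 token is required to make API requests")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},  # assumed Bearer scheme
        method="GET",
    )


if __name__ == "__main__":
    req = build_creds_request("my-oauth2-token")  # made-up token value
    print(req.get_method(), req.full_url)
```

Sending the request (for example with urllib.request.urlopen) is left out on purpose, since the placeholder URL does not resolve.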

API Requests:

Get Temp S3 Creds for Prefix
Update a Prefix
Get Prefix Size
Get temp creds-Folder
Get temp creds-Object
Get Download URL for an object

Response Codes

A 404 error code means that a Prefix, Folder, or Object required by the request does not exist.

Temporary Credentials

The credentials provided by the GBDX S3 Storage Service are temporary and session-based. When a request is made for credentials, the system will return:

  • Bucket (The S3 Bucket is gbd-customer-data)
  • Prefix (The Prefix is typically your account ID)
  • S3_access_key
  • S3_secret_key
  • S3_session_token
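
One way to hand these values to the AWS CLI is through its standard environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN). The Python sketch below maps a credentials response onto those variables. It assumes the response is a dictionary keyed exactly as in the list above; the helper names and sample values are hypothetical.

```python
import shlex


def aws_env(creds):
    """Map a credentials response onto the environment variables the AWS CLI reads."""
    return {
        "AWS_ACCESS_KEY_ID": creds["S3_access_key"],
        "AWS_SECRET_ACCESS_KEY": creds["S3_secret_key"],
        "AWS_SESSION_TOKEN": creds["S3_session_token"],
    }


def export_lines(creds):
    """Render shell 'export' lines, quoting each value for safe pasting."""
    return "\n".join(
        f"export {name}={shlex.quote(value)}" for name, value in aws_env(creds).items()
    )


if __name__ == "__main__":
    # Made-up sample response; real values come from the service.
    sample = {
        "bucket": "gbd-customer-data",
        "prefix": "acct-123",
        "S3_access_key": "AKIAEXAMPLE",
        "S3_secret_key": "secret",
        "S3_session_token": "token",
    }
    print(export_lines(sample))
```

After the exported variables are set in a shell, a command such as aws s3 ls s3://gbd-customer-data/{prefix}/ runs with the temporary credentials until they expire.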

Session Duration

By default, the credentials are valid for 3600 seconds. The requester can change the duration when making an API request. The valid range is 900 seconds to 129600 seconds.

Duration Type | Value
Minimum | 900 seconds (0.25 hours)
Default | 3600 seconds (1 hour)
Maximum | 129600 seconds (36 hours)
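
A client can check a requested duration against these limits before calling the service. The following Python sketch does so using the minimum, default, and maximum stated above; the function name check_duration is hypothetical, and how the service itself responds to out-of-range values is not covered here.

```python
MIN_DURATION = 900        # 0.25 hours
DEFAULT_DURATION = 3600   # 1 hour
MAX_DURATION = 129600     # 36 hours


def check_duration(seconds=DEFAULT_DURATION):
    """Return the duration if it lies in the valid range, else raise ValueError."""
    if not MIN_DURATION <= seconds <= MAX_DURATION:
        raise ValueError(
            f"duration must be between {MIN_DURATION} and {MAX_DURATION} seconds, "
            f"got {seconds}"
        )
    return seconds


if __name__ == "__main__":
    print(check_duration())       # falls back to the default
    print(check_duration(7200))   # two hours, inside the range
```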

Step by Step Tutorial

Our step-by-step tutorial walks you through getting temporary S3 credentials and accessing the contents of an S3 bucket using the AWS CLI. See Lesson: Get S3 Bucket Contents.

