GBDX

S3 Access Course

This document describes how to access and manage the contents of a GBDX S3 location.

Last Updated: April 24, 2019

Definitions

| Term | Definition |
| --- | --- |
| Amazon S3 Management Console (S3 Console) | Amazon's S3 Graphical User Interface for managing S3 content. The GBDX Dashboard links to the S3 Console and automatically signs the user in with their GBDX credentials. |
| AWS CLI | The Amazon Web Services Command Line Interface for controlling Amazon Cloud Services. |
| Folder | A Folder is a namespace inside a Prefix. The folder name is typically set by the task. |
| GBDX Dashboard | A Graphical User Interface (GUI) for managing user accounts and account information. |
| GBDX S3 bucket | An AWS S3 bucket where files are stored. All files created as a result of running a workflow are stored in a "prefix" in a GBDX S3 bucket. 1B image products that are ordered through GBDX are also stored in a GBDX S3 bucket. This bucket is only accessible by the workflow system. |
| GBDX S3 contents | The files stored in a GBDX S3 location. |
| GBDX S3 location | The location where workflow output files are stored for an account. Input files that are uploaded for use in a workflow are stored here too. The location is the S3 bucket plus the S3 prefix. The prefix is typically the account ID. |
| GBDX S3 storage | The contractually defined storage allotment. This is always reported in GB. |
| GBDX S3 Storage Service | A set of GBDX APIs used to request temporary credentials. |
| GBDXtools | An open-source Python-based project for using the GBDX platform. |
| GUI | Graphical User Interface |
| Object | Individual files stored on S3 are called Objects. Object names are prefixed with the user's Prefix and the folder name (when there is one). |
| Prefix | A Prefix is a storage location within the main S3 bucket. It may also be referred to as a directory. Prefixes are associated with account IDs, and the Prefix name is typically the account ID. Only users that belong to the account have access to the Prefix. |
| S3 | Amazon's Cloud Storage Service |
| Workflow | A series of tasks (algorithms) chained together and run to produce a specified output. |

Overview

When a workflow is run on the GBDX platform, the output data is saved to a GBDX S3 location. This location is a "prefix" within a GBDX S3 bucket. This document describes the several ways to access and manage the contents of the GBDX S3 location.

The full path to your GBDX S3 contents is:

s3://{s3 bucket}/{prefix}/{folder}/{object}
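
As an illustration of how this path is composed, the following Python sketch joins and splits its four parts. The function names and the example values (bucket, prefix, folder, and object names) are hypothetical placeholders, not real GBDX identifiers.

```python
def build_s3_path(bucket, prefix, folder, obj):
    """Join the parts into s3://{bucket}/{prefix}/{folder}/{object}."""
    parts = [bucket, prefix, folder, obj]
    # Skip an empty folder: an object may sit directly under the prefix.
    return "s3://" + "/".join(p.strip("/") for p in parts if p)


def split_s3_path(path):
    """Split an S3 path into (bucket, key); the key is everything after the bucket."""
    if not path.startswith("s3://"):
        raise ValueError("expected a path starting with s3://")
    bucket, _, key = path[len("s3://"):].partition("/")
    return bucket, key


# Placeholder values for illustration only.
path = build_s3_path("example-bucket", "example-account-id", "my-output", "image.tif")
print(path)
print(split_s3_path(path))
```

In S3 terms, the key returned by `split_s3_path` is the Object name, which already carries the Prefix and folder.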

GBDX S3 Management Tools

The contents of a GBDX S3 location can be accessed and managed using the following tools:

| Tool | Description | Dependencies | Tutorial |
| --- | --- | --- | --- |
| AWS CLI | A command line interface that supports downloading a file or the contents of a folder, uploading files, and deleting files. Use the GBDX S3 Storage Service to get credentials to the GBDX S3 location first. | GBDX S3 Storage Service | AWS CLI tutorial |
| Amazon S3 Management Console (S3 Console) | A Graphical User Interface that supports downloading a file, deleting files, uploading files, and more. | The GBDX Dashboard, located at https://dashboard.geobigdata.io. Log in to the dashboard with GBDX credentials (email address and password). | S3 Console tutorial |
| GBDXtools | A Python-based project that supports downloading, deleting, and uploading files from S3. | GBDXtools setup or GBDX Notebooks | GBDXtools tutorial |
| GBDX S3 Storage Service | A set of API endpoints that provide temporary credentials to a GBDX S3 location. The temporary credentials are used with the AWS CLI. The service also supports getting the size of a prefix and getting a download URL for an object. | An API client, e.g. Postman | S3 Storage Service |

Supported Features

This table shows the actions that are supported by each tool.

Amazon S3 Management Console S3 Storage Service AWS CLI GBDXtools
Delete a file X X X X
Delete a folder or directory X X X
Download a file X X X
Download the contents of a directory X X
Download file as . . . X
Get a download url for an Object X X
Get the size of a Prefix X X
Upload a file X X X
Upload a folder or Directory X X X

S3 Management Tools Descriptions

AWS CLI

The AWS CLI lets you download, delete, and upload files or directories. To use it, you must install the AWS CLI and obtain temporary credentials from the S3 Storage Service.

Tutorial: Manage S3 Contents with AWS CLI
AWS CLI Installation: AWS CLI
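
The following Python sketch shows one way the temporary credentials could be handed to the AWS CLI. The environment variable names and the `aws s3 cp ... --recursive` command are standard AWS CLI usage. The credential field names (`S3_access_key`, `S3_secret_key`, `S3_session_token`, `bucket`, `prefix`) are assumptions about the shape of the S3 Storage Service response, and the helper names and example values are hypothetical.

```python
import os
import subprocess


def aws_env(creds):
    """Return a copy of the environment with the temporary credentials set."""
    env = dict(os.environ)
    # Standard AWS CLI variables; the keys read from `creds` are assumed names.
    env["AWS_ACCESS_KEY_ID"] = creds["S3_access_key"]
    env["AWS_SECRET_ACCESS_KEY"] = creds["S3_secret_key"]
    env["AWS_SESSION_TOKEN"] = creds["S3_session_token"]
    return env


def download_folder_cmd(creds, folder, local_dir):
    """Build an `aws s3 cp --recursive` command for one folder under the prefix."""
    source = "s3://{}/{}/{}".format(creds["bucket"], creds["prefix"], folder)
    return ["aws", "s3", "cp", source, local_dir, "--recursive"]


def run(cmd, creds):
    """Run an AWS CLI command with the temporary credentials (requires the AWS CLI)."""
    return subprocess.run(cmd, env=aws_env(creds), check=True)


# Placeholder values for illustration; real values come from the S3 Storage Service.
example_creds = {
    "S3_access_key": "ACCESS",
    "S3_secret_key": "SECRET",
    "S3_session_token": "TOKEN",
    "bucket": "example-bucket",
    "prefix": "example-account-id",
}
print(" ".join(download_folder_cmd(example_creds, "my-output", "./my-output")))
```

Calling `run(download_folder_cmd(...), creds)` would execute the download. Because the credentials are temporary, a long-running script may need to request fresh ones before they expire.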

Amazon S3 Management Console (S3 Console)

A link to the S3 Console is available from the Accounts section of the GBDX Dashboard. After you sign in to the Dashboard with your GBDX account credentials, the link takes you to the GBDX S3 contents for your account.

The S3 Console allows you to:

  • Download a file (individual files only)
  • Upload a file to a folder
  • Delete a file
  • Bulk delete files or folders

Note: Bulk download is not supported by the S3 Console. Use the AWS CLI or GBDXtools to download the contents of a directory or folder.

Tutorial: Manage S3 Contents with the Amazon S3 Management Console

GBDXtools

If you're a GBDXtools user, you can manage S3 contents using Python scripts. Downloading, deleting, and uploading files and directories are supported.
Tutorial: Manage S3 Contents with GBDXtools
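
A minimal sketch of such a script is shown below. It assumes the GBDXtools S3 helper (reached as `Interface().s3`) exposes `download(location, local_dir)` and `delete(location)`, where `location` is relative to the account Prefix. Confirm the exact signatures in the GBDXtools documentation for your version. The wrapper function names and the folder name are hypothetical.

```python
def download_folder(s3, folder, local_dir="."):
    """Download everything under `folder` in the account Prefix to `local_dir`."""
    # `s3` is assumed to be the GBDXtools S3 helper, e.g. Interface().s3.
    s3.download(location=folder, local_dir=local_dir)


def delete_folder(s3, folder):
    """Delete everything under `folder` in the account Prefix."""
    s3.delete(location=folder)


# Typical usage, assuming GBDXtools is installed and configured with GBDX credentials:
#   from gbdxtools import Interface
#   gbdx = Interface()
#   download_folder(gbdx.s3, "my-output", "./my-output")
#   delete_folder(gbdx.s3, "my-output")
```

Passing the S3 helper in as an argument keeps the functions easy to exercise with a stand-in object before pointing them at real account data.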

S3 Storage Service

The S3 Storage Service is a set of API endpoints that allow you to:

  • Get temporary credentials to the S3 location. You can then use these credentials to manage your S3 contents with the AWS CLI.
  • Get the size of your S3 Prefix
  • Get a download url for a file
