GBDX

Lesson: Get S3 Bucket Contents

By the end of this lesson, you'll be able to access the contents of your S3 bucket.

Step 1: Get an OAuth2 Token (prerequisite)

All GBDX API requests require a header with a valid token.

For more information on getting an OAuth2 token, see:
Authentication Course
Video: Get an OAuth2 Token
Get OAuth Token (api-key)

Step 2: Make an API request for temporary credentials

API Request

GET temporary credentials for the AWS S3 bucket with this API request:

https://geobigdata.io/s3creds/v1/prefix

The default duration is 3600 seconds (one hour). To change the duration of the temporary credentials, append ?duration=<new value> to the request:

https://geobigdata.io/s3creds/v1/prefix?duration=129600

  • See the overview section for minimum and maximum duration.
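As a sketch, the request above can be issued with curl. GBDX_TOKEN is a placeholder for the OAuth2 bearer token from Step 1 (the variable name is illustrative, not part of the API):

```shell
# Build the credentials URL; 3600 s is the default duration, and the
# service accepts an override via the ?duration= query parameter.
DURATION=3600
URL="https://geobigdata.io/s3creds/v1/prefix?duration=${DURATION}"
echo "$URL"

# With a real token exported as GBDX_TOKEN, the request would be:
# curl -s -H "Authorization: Bearer ${GBDX_TOKEN}" "$URL"
```

The curl line is commented out because it requires a valid token; every GBDX API request needs that Authorization header.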

Response

This is an example response with temporary credentials. We'll use these later in the lesson.

{
  "S3_secret_key": "JjnAy0NdO2WO5N7JgGRTuCoy3zqQdnSA4KLD9ogb",
  "prefix": "782839ja-2059-42de-ad70-03e6e5192fc6",
  "bucket": "gbd-customer-data",
  "S3_access_key": "ASIAIE3GWTJ7LLJLMCYB",
  "S3_session_token": "AQoDYXdzEBoa4ANP+zSjA4Pi6KLXiqw91T6oYhaJCMFmdtLcVcQcrGd1aIcy/3J8ZLfDmIkzsDWJhL8TvPvASaqxt/xYj8+SmlGNgnGH1jpSNwsDzCqTItlm6N5y8BZjCgSj3EKoyWW7XbTAAn+evMfMQEPlZM6onEdsYsm0CVx0DY8JnvTJBhA7I06/3g8XmSqOTxOfpqsYK5jt1JxseG956UOAWD35k34/r2BSQ+GKPpQ/drlcfPlQR/lDBopi8VejFh0Wq0GRUHg+yEJvZ1Ytrtm8R1MdMasXb3jVtMxm4SNH5/dVEP61yq9cA5B9UIl2LoFJYGx+fSnwRVaC0/1NjJzRNJmsR48Kyfaop1FNsKuCXWnGg1LWktnJRZft3vs+eaXQ2rvscex9cwxxg0Er9I9B1F0qD9ucHyrpxgRetMMymp3omIHMB3wcI+QCx39MKkBDCdpXNE3fCd0TaCbXX48XbJVaACCp60aNfvtkt7nRkyDsTx/gQ6GUpPiONxX8BKYLbsg6yvcXCyy6umAZBcOq+dYWxm5MSvIjJHFHbgS1+6xaJTkyAmtXRcJHzWwaUTmDe9Fh/qXA8aVu9NW2hf/aok61HqZqothqIEQeox7wI+21spXAh+uT+kT2YIDsnxRxR1GpMXkgzPLFtwU="
}

These credentials map to AWS values.

The mapping table in the next step shows when to use each one.

Step 3: Export temporary credentials for AWS CLI

This example assumes you have installed AWS CLI. To install it, see AWS Command Line Interface.

AWS CLI requires the following credential values:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN

The S3 Storage Service returns these values, but uses a different naming convention. This table shows how Amazon's values map to the credentials returned by the GBDX Storage Service.

AWS requires this value        Use this GBDX value
-----------------------        -------------------
AWS_ACCESS_KEY_ID              S3_access_key
AWS_SECRET_ACCESS_KEY          S3_secret_key
AWS_SESSION_TOKEN              S3_session_token

LINUX USERS To export on Linux, run the following commands to set the values:

export AWS_ACCESS_KEY_ID=[YOUR_S3_access_key]
export AWS_SECRET_ACCESS_KEY=[YOUR_S3_secret_key]
export AWS_SESSION_TOKEN=[YOUR_S3_session_token]

Example:

This example uses the credentials from the example response provided in step 2.

export AWS_ACCESS_KEY_ID=ASIAIE3GWTJ7LLJLMCYB
export AWS_SECRET_ACCESS_KEY=JjnAy0NdO2WO5N7JgGRTuCoy3zqQdnSA4KLD9ogb 
export AWS_SESSION_TOKEN=AQoDYXdzEBoa4ANP+zSjA4Pi6KLXiqw91T6oYhaJCMFmdtLcVcQcrGd1aIcy/3J8ZLfDmIkzsDWJhL8TvPvASaqxt/xYj8+SmlGNgnGH1jpSNwsDzCqTItlm6N5y8BZjCgSj3EKoyWW7XbTAAn+evMfMQEPlZM6onEdsYsm0CVx0DY8JnvTJBhA7I06/3g8XmSqOTxOfpqsYK5jt1JxseG956UOAWD35k34/r2BSQ+GKPpQ/drlcfPlQR/lDBopi8VejFh0Wq0GRUHg+yEJvZ1Ytrtm8R1MdMasXb3jVtMxm4SNH5/dVEP61yq9cA5B9UIl2LoFJYGx+fSnwRVaC0/1NjJzRNJmsR48Kyfaop1FNsKuCXWnGg1LWktnJRZft3vs+eaXQ2rvscex9cwxxg0Er9I9B1F0qD9ucHyrpxgRetMMymp3omIHMB3wcI+QCx39MKkBDCdpXNE3fCd0TaCbXX48XbJVaACCp60aNfvtkt7nRkyDsTx/gQ6GUpPiONxX8BKYLbsg6yvcXCyy6umAZBcOq+dYWxm5MSvIjJHFHbgS1+6xaJTkyAmtXRcJHzWwaUTmDe9Fh/qXA8aVu9NW2hf/aok61HqZqothqIEQeox7wI+21spXAh+uT+kT2YIDsnxRxR1GpMXkgzPLFtwU=
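Copying the three values by hand is error-prone. As a sketch, the export step can be scripted with jq (assumed installed; it is not part of AWS CLI or GBDX). Here the Step 2 response is saved to creds.json, a file name chosen for this example:

```shell
# Write a shortened example response to creds.json; in practice this
# file would hold the full JSON returned by the s3creds request.
cat > creds.json <<'EOF'
{"S3_access_key": "ASIAEXAMPLEKEY",
 "S3_secret_key": "examplesecret",
 "S3_session_token": "exampletoken"}
EOF

# Pull each GBDX field out of the JSON and export it under the
# AWS name from the mapping table above.
export AWS_ACCESS_KEY_ID="$(jq -r .S3_access_key creds.json)"
export AWS_SECRET_ACCESS_KEY="$(jq -r .S3_secret_key creds.json)"
export AWS_SESSION_TOKEN="$(jq -r .S3_session_token creds.json)"
echo "$AWS_ACCESS_KEY_ID"
```

Because the credentials expire, re-running one small script each hour is easier than re-pasting three long strings.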

WINDOWS USERS To export on Windows, run the following commands to set the values.

SET AWS_ACCESS_KEY_ID=[YOUR_S3_access_key]
SET AWS_SECRET_ACCESS_KEY=[YOUR_S3_secret_key]
SET AWS_SESSION_TOKEN=[YOUR_S3_session_token]

Example:

This example uses the credentials from the example response provided in step 2.

SET AWS_ACCESS_KEY_ID=ASIAIE3GWTJ7LLJLMCYB
SET AWS_SECRET_ACCESS_KEY=JjnAy0NdO2WO5N7JgGRTuCoy3zqQdnSA4KLD9ogb 
SET AWS_SESSION_TOKEN=AQoDYXdzEBoa4ANP+zSjA4Pi6KLXiqw91T6oYhaJCMFmdtLcVcQcrGd1aIcy/3J8ZLfDmIkzsDWJhL8TvPvASaqxt/xYj8+SmlGNgnGH1jpSNwsDzCqTItlm6N5y8BZjCgSj3EKoyWW7XbTAAn+evMfMQEPlZM6onEdsYsm0CVx0DY8JnvTJBhA7I06/3g8XmSqOTxOfpqsYK5jt1JxseG956UOAWD35k34/r2BSQ+GKPpQ/drlcfPlQR/lDBopi8VejFh0Wq0GRUHg+yEJvZ1Ytrtm8R1MdMasXb3jVtMxm4SNH5/dVEP61yq9cA5B9UIl2LoFJYGx+fSnwRVaC0/1NjJzRNJmsR48Kyfaop1FNsKuCXWnGg1LWktnJRZft3vs+eaXQ2rvscex9cwxxg0Er9I9B1F0qD9ucHyrpxgRetMMymp3omIHMB3wcI+QCx39MKkBDCdpXNE3fCd0TaCbXX48XbJVaACCp60aNfvtkt7nRkyDsTx/gQ6GUpPiONxX8BKYLbsg6yvcXCyy6umAZBcOq+dYWxm5MSvIjJHFHbgS1+6xaJTkyAmtXRcJHzWwaUTmDe9Fh/qXA8aVu9NW2hf/aok61HqZqothqIEQeox7wI+21spXAh+uT+kT2YIDsnxRxR1GpMXkgzPLFtwU=

Note: S3cmd can be used instead of AWS CLI to access data in your S3 bucket. To set up and use S3cmd, see http://s3tools.org/s3cmd.

Step 4: List contents in your S3 bucket

Now that you've set your temporary credentials for AWS CLI, you can access the contents of your S3 bucket.

Take the bucket and prefix values from Step 2 and use them in the AWS CLI commands (listing contents, downloading files, and so on).

  1. List contents:
    aws s3 ls s3://<bucket>/<prefix>/

Example:

 aws s3 ls s3://gbd-customer-data/782839ja-2059-42de-ad70-03e6e5192fc6/
        
                                   PRE 59652acc-6347-4a19-bf53-75458ef6cd07/
        2015-07-30 19:55:30          5 .info
  2. Download files:
    aws s3 cp s3://<bucket>/<prefix>/remote_text_file.txt .

Example:

 aws s3 cp s3://gbd-customer-data/f3c8f7a1-afae-49a6-9d40-d358d4ccfbd9/download.txt .
  3. For a list of S3 file commands, see the AWS Command Line Interface Reference page.
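The commands above can be sketched as one small script that builds the S3 URI once from the bucket and prefix values in the Step 2 response and reuses it:

```shell
# Example values taken from the Step 2 response; substitute your own.
BUCKET="gbd-customer-data"
PREFIX="782839ja-2059-42de-ad70-03e6e5192fc6"
S3_URI="s3://${BUCKET}/${PREFIX}/"
echo "$S3_URI"

# These require the temporary credentials exported in Step 3, so they
# are commented out here:
# aws s3 ls "$S3_URI"                     # list contents
# aws s3 cp "${S3_URI}download.txt" .     # download one file
```

Building the URI once avoids mistyping the long prefix in each command; remember the credentials expire after the duration set in Step 2.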