Amazon Simple Storage Service

2018/09/05 - Amazon Simple Storage Service - 2 updated api methods

Changes: Parquet input format support added for the SelectObjectContent API.

RestoreObject (updated)
Changes (request)
{'RestoreRequest': {'SelectParameters': {'InputSerialization': {'Parquet': {}}}}}

Restores an archived copy of an object back into Amazon S3.

See also: AWS API Documentation

Request Syntax

client.restore_object(
    Bucket='string',
    Key='string',
    VersionId='string',
    RestoreRequest={
        'Days': 123,
        'GlacierJobParameters': {
            'Tier': 'Standard'|'Bulk'|'Expedited'
        },
        'Type': 'SELECT',
        'Tier': 'Standard'|'Bulk'|'Expedited',
        'Description': 'string',
        'SelectParameters': {
            'InputSerialization': {
                'CSV': {
                    'FileHeaderInfo': 'USE'|'IGNORE'|'NONE',
                    'Comments': 'string',
                    'QuoteEscapeCharacter': 'string',
                    'RecordDelimiter': 'string',
                    'FieldDelimiter': 'string',
                    'QuoteCharacter': 'string',
                    'AllowQuotedRecordDelimiter': True|False
                },
                'CompressionType': 'NONE'|'GZIP'|'BZIP2',
                'JSON': {
                    'Type': 'DOCUMENT'|'LINES'
                },
                'Parquet': {}
            },
            'ExpressionType': 'SQL',
            'Expression': 'string',
            'OutputSerialization': {
                'CSV': {
                    'QuoteFields': 'ALWAYS'|'ASNEEDED',
                    'QuoteEscapeCharacter': 'string',
                    'RecordDelimiter': 'string',
                    'FieldDelimiter': 'string',
                    'QuoteCharacter': 'string'
                },
                'JSON': {
                    'RecordDelimiter': 'string'
                }
            }
        },
        'OutputLocation': {
            'S3': {
                'BucketName': 'string',
                'Prefix': 'string',
                'Encryption': {
                    'EncryptionType': 'AES256'|'aws:kms',
                    'KMSKeyId': 'string',
                    'KMSContext': 'string'
                },
                'CannedACL': 'private'|'public-read'|'public-read-write'|'authenticated-read'|'aws-exec-read'|'bucket-owner-read'|'bucket-owner-full-control',
                'AccessControlList': [
                    {
                        'Grantee': {
                            'DisplayName': 'string',
                            'EmailAddress': 'string',
                            'ID': 'string',
                            'Type': 'CanonicalUser'|'AmazonCustomerByEmail'|'Group',
                            'URI': 'string'
                        },
                        'Permission': 'FULL_CONTROL'|'WRITE'|'WRITE_ACP'|'READ'|'READ_ACP'
                    },
                ],
                'Tagging': {
                    'TagSet': [
                        {
                            'Key': 'string',
                            'Value': 'string'
                        },
                    ]
                },
                'UserMetadata': [
                    {
                        'Name': 'string',
                        'Value': 'string'
                    },
                ],
                'StorageClass': 'STANDARD'|'REDUCED_REDUNDANCY'|'STANDARD_IA'|'ONEZONE_IA'
            }
        }
    },
    RequestPayer='requester'
)
type Bucket

string

param Bucket

[REQUIRED]

type Key

string

param Key

[REQUIRED]

type VersionId

string

param VersionId

type RestoreRequest

dict

param RestoreRequest

Container for restore job parameters.

  • Days (integer) --

    Lifetime of the active copy in days. Do not use with restores that specify OutputLocation.

  • GlacierJobParameters (dict) --

    Glacier related parameters pertaining to this job. Do not use with restores that specify OutputLocation.

    • Tier (string) -- [REQUIRED]

      Glacier retrieval tier at which the restore will be processed.

  • Type (string) --

    Type of restore request.

  • Tier (string) --

    Glacier retrieval tier at which the restore will be processed.

  • Description (string) --

    The optional description for the job.

  • SelectParameters (dict) --

    Describes the parameters for Select job types.

    • InputSerialization (dict) -- [REQUIRED]

      Describes the serialization format of the object.

      • CSV (dict) --

        Describes the serialization of a CSV-encoded object.

        • FileHeaderInfo (string) --

          Describes the first line of input. Valid values: NONE, IGNORE, USE.

        • Comments (string) --

          Single character used to indicate a row should be ignored when present at the start of a row.

        • QuoteEscapeCharacter (string) --

          Single character used for escaping the quote character inside an already escaped value.

        • RecordDelimiter (string) --

          Value used to separate individual records.

        • FieldDelimiter (string) --

          Value used to separate individual fields in a record.

        • QuoteCharacter (string) --

          Value used for escaping where the field delimiter is part of the value.

        • AllowQuotedRecordDelimiter (boolean) --

          Specifies that CSV field values may contain quoted record delimiters and such records should be allowed. Default value is FALSE. Setting this value to TRUE may lower performance.

      • CompressionType (string) --

        Specifies object's compression format. Valid values: NONE, GZIP, BZIP2. Default Value: NONE.

      • JSON (dict) --

        Specifies JSON as object's input serialization format.

        • Type (string) --

          The type of JSON. Valid values: DOCUMENT, LINES.

      • Parquet (dict) --

        Specifies Parquet as object's input serialization format.

    • ExpressionType (string) -- [REQUIRED]

      The type of the provided expression (e.g., SQL).

    • Expression (string) -- [REQUIRED]

      The expression that is used to query the object.

    • OutputSerialization (dict) -- [REQUIRED]

      Describes how the results of the Select job are serialized.

      • CSV (dict) --

        Describes the serialization of CSV-encoded Select results.

        • QuoteFields (string) --

          Indicates whether or not all output fields should be quoted.

        • QuoteEscapeCharacter (string) --

          Single character used for escaping the quote character inside an already escaped value.

        • RecordDelimiter (string) --

          Value used to separate individual records.

        • FieldDelimiter (string) --

          Value used to separate individual fields in a record.

        • QuoteCharacter (string) --

          Value used for escaping where the field delimiter is part of the value.

      • JSON (dict) --

        Specifies JSON as request's output serialization format.

        • RecordDelimiter (string) --

          The value used to separate individual records in the output.

  • OutputLocation (dict) --

    Describes the location where the restore job's output is stored.

    • S3 (dict) --

      Describes an S3 location that will receive the results of the restore request.

      • BucketName (string) -- [REQUIRED]

        The name of the bucket where the restore results will be placed.

      • Prefix (string) -- [REQUIRED]

        The prefix that is prepended to the restore results for this request.

      • Encryption (dict) --

        Describes the server-side encryption that will be applied to the restore results.

        • EncryptionType (string) -- [REQUIRED]

          The server-side encryption algorithm used when storing job results in Amazon S3 (e.g., AES256, aws:kms).

        • KMSKeyId (string) --

          If the encryption type is aws:kms, this optional value specifies the AWS KMS key ID to use for encryption of job results.

        • KMSContext (string) --

          If the encryption type is aws:kms, this optional value can be used to specify the encryption context for the restore results.

      • CannedACL (string) --

        The canned ACL to apply to the restore results.

      • AccessControlList (list) --

        A list of grants that control access to the staged results.

        • (dict) --

          • Grantee (dict) --

            • DisplayName (string) --

              Screen name of the grantee.

            • EmailAddress (string) --

              Email address of the grantee.

            • ID (string) --

              The canonical user ID of the grantee.

            • Type (string) -- [REQUIRED]

              Type of grantee.

            • URI (string) --

              URI of the grantee group.

          • Permission (string) --

            Specifies the permission given to the grantee.

      • Tagging (dict) --

        The tag-set that is applied to the restore results.

        • TagSet (list) -- [REQUIRED]

          • (dict) --

            • Key (string) -- [REQUIRED]

              Name of the tag.

            • Value (string) -- [REQUIRED]

              Value of the tag.

      • UserMetadata (list) --

        A list of metadata to store with the restore results in S3.

        • (dict) --

          A metadata key-value pair to store with an object.

          • Name (string) --

          • Value (string) --

      • StorageClass (string) --

        The class of storage used to store the restore results.

type RequestPayer

string

param RequestPayer

Confirms that the requester knows that they will be charged for the request. Bucket owners need not specify this parameter in their requests. Documentation on downloading objects from Requester Pays buckets can be found at http://docs.aws.amazon.com/AmazonS3/latest/dev/ObjectsinRequesterPaysBuckets.html

rtype

dict

returns

Response Syntax

{
    'RequestCharged': 'requester',
    'RestoreOutputPath': 'string'
}

Response Structure

  • (dict) --

    • RequestCharged (string) --

      If present, indicates that the requester was successfully charged for the request.

    • RestoreOutputPath (string) --

      Indicates the path in the provided S3 output location where Select results will be restored to.
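
The RestoreObject request shape above can be illustrated with a short sketch. It assumes a boto3 S3 client is passed in by the caller; the helper names `build_select_restore_request` and `restore_with_select` are hypothetical and not part of the SDK. It builds a RestoreRequest of type SELECT that names Parquet as the input format and writes CSV results to an S3 output location, leaving out Days and GlacierJobParameters because the parameter notes say not to combine them with OutputLocation.

```python
def build_select_restore_request(expression, output_bucket, output_prefix,
                                 tier="Standard", description=None):
    """Build a RestoreRequest dict for a SELECT restore that reads Parquet.

    Hypothetical helper: only the dict keys come from the request syntax.
    """
    request = {
        "Type": "SELECT",
        "Tier": tier,  # 'Standard' | 'Bulk' | 'Expedited'
        "SelectParameters": {
            # Parquet takes no options, so an empty dict selects it.
            "InputSerialization": {"Parquet": {}},
            "ExpressionType": "SQL",
            "Expression": expression,
            # Every CSV output member is optional, so {} is a valid shape.
            "OutputSerialization": {"CSV": {}},
        },
        "OutputLocation": {
            "S3": {"BucketName": output_bucket, "Prefix": output_prefix},
        },
    }
    if description is not None:
        request["Description"] = description
    return request


def restore_with_select(client, bucket, key, restore_request):
    """Submit the restore and return RestoreOutputPath, if the response has one."""
    response = client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=restore_request,
    )
    return response.get("RestoreOutputPath")
```

With a real client this might be called as `restore_with_select(boto3.client("s3"), "archive-bucket", "data.parquet", request)`, where the bucket and key names are placeholders. Whether the service accepts a given combination of options is decided server-side and is not covered by this sketch.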

SelectObjectContent (updated)
Changes (request)
{'InputSerialization': {'Parquet': {}}}

This operation filters the contents of an Amazon S3 object based on a simple Structured Query Language (SQL) statement. In the request, along with the SQL expression, you must also specify the data serialization format of the object (JSON, CSV, or, as of this update, Parquet). Amazon S3 uses this format to parse object data into records and returns only the records that match the specified SQL expression. You must also specify the data serialization format for the response.

See also: AWS API Documentation

Request Syntax

client.select_object_content(
    Bucket='string',
    Key='string',
    SSECustomerAlgorithm='string',
    SSECustomerKey=b'bytes',
    SSECustomerKeyMD5='string',
    Expression='string',
    ExpressionType='SQL',
    RequestProgress={
        'Enabled': True|False
    },
    InputSerialization={
        'CSV': {
            'FileHeaderInfo': 'USE'|'IGNORE'|'NONE',
            'Comments': 'string',
            'QuoteEscapeCharacter': 'string',
            'RecordDelimiter': 'string',
            'FieldDelimiter': 'string',
            'QuoteCharacter': 'string',
            'AllowQuotedRecordDelimiter': True|False
        },
        'CompressionType': 'NONE'|'GZIP'|'BZIP2',
        'JSON': {
            'Type': 'DOCUMENT'|'LINES'
        },
        'Parquet': {}
    },
    OutputSerialization={
        'CSV': {
            'QuoteFields': 'ALWAYS'|'ASNEEDED',
            'QuoteEscapeCharacter': 'string',
            'RecordDelimiter': 'string',
            'FieldDelimiter': 'string',
            'QuoteCharacter': 'string'
        },
        'JSON': {
            'RecordDelimiter': 'string'
        }
    }
)
type Bucket

string

param Bucket

[REQUIRED]

The S3 Bucket.

type Key

string

param Key

[REQUIRED]

The Object Key.

type SSECustomerAlgorithm

string

param SSECustomerAlgorithm

The SSE Algorithm used to encrypt the object. For more information, see Server-Side Encryption (Using Customer-Provided Encryption Keys).

type SSECustomerKey

bytes

param SSECustomerKey

The SSE Customer Key. For more information, see Server-Side Encryption (Using Customer-Provided Encryption Keys).

type SSECustomerKeyMD5

string

param SSECustomerKeyMD5

The SSE Customer Key MD5. For more information, see Server-Side Encryption (Using Customer-Provided Encryption Keys).

type Expression

string

param Expression

[REQUIRED]

The expression that is used to query the object.

type ExpressionType

string

param ExpressionType

[REQUIRED]

The type of the provided expression (e.g., SQL).

type RequestProgress

dict

param RequestProgress

Specifies if periodic request progress information should be enabled.

  • Enabled (boolean) --

    Specifies whether periodic QueryProgress frames should be sent. Valid values: TRUE, FALSE. Default value: FALSE.

type InputSerialization

dict

param InputSerialization

[REQUIRED]

Describes the format of the data in the object that is being queried.

  • CSV (dict) --

    Describes the serialization of a CSV-encoded object.

    • FileHeaderInfo (string) --

      Describes the first line of input. Valid values: NONE, IGNORE, USE.

    • Comments (string) --

      Single character used to indicate a row should be ignored when present at the start of a row.

    • QuoteEscapeCharacter (string) --

      Single character used for escaping the quote character inside an already escaped value.

    • RecordDelimiter (string) --

      Value used to separate individual records.

    • FieldDelimiter (string) --

      Value used to separate individual fields in a record.

    • QuoteCharacter (string) --

      Value used for escaping where the field delimiter is part of the value.

    • AllowQuotedRecordDelimiter (boolean) --

      Specifies that CSV field values may contain quoted record delimiters and such records should be allowed. Default value is FALSE. Setting this value to TRUE may lower performance.

  • CompressionType (string) --

    Specifies object's compression format. Valid values: NONE, GZIP, BZIP2. Default Value: NONE.

  • JSON (dict) --

    Specifies JSON as object's input serialization format.

    • Type (string) --

      The type of JSON. Valid values: DOCUMENT, LINES.

  • Parquet (dict) --

    Specifies Parquet as object's input serialization format.

type OutputSerialization

dict

param OutputSerialization

[REQUIRED]

Describes the format of the data that you want Amazon S3 to return in response.

  • CSV (dict) --

    Describes the serialization of CSV-encoded Select results.

    • QuoteFields (string) --

      Indicates whether or not all output fields should be quoted.

    • QuoteEscapeCharacter (string) --

      Single character used for escaping the quote character inside an already escaped value.

    • RecordDelimiter (string) --

      Value used to separate individual records.

    • FieldDelimiter (string) --

      Value used to separate individual fields in a record.

    • QuoteCharacter (string) --

      Value used for escaping where the field delimiter is part of the value.

  • JSON (dict) --

    Specifies JSON as request's output serialization format.

    • RecordDelimiter (string) --

      The value used to separate individual records in the output.
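
The required parameters above can be assembled as keyword arguments before calling the client. The following is a minimal sketch; the helper name `parquet_select_kwargs` and its `output` argument are hypothetical conveniences, and only the dict keys come from the request syntax. It selects Parquet input and either JSON or CSV output, and leaves CompressionType unset so the documented default of NONE applies.

```python
def parquet_select_kwargs(bucket, key, expression, output="JSON"):
    """Return keyword arguments for select_object_content with Parquet input.

    Hypothetical helper: `output` picks between the two documented
    OutputSerialization formats.
    """
    if output == "JSON":
        # One JSON record per line in the returned payload.
        output_serialization = {"JSON": {"RecordDelimiter": "\n"}}
    elif output == "CSV":
        # All CSV output members are optional, so {} is a valid shape.
        output_serialization = {"CSV": {}}
    else:
        raise ValueError("output must be 'JSON' or 'CSV'")

    return {
        "Bucket": bucket,
        "Key": key,
        "Expression": expression,
        "ExpressionType": "SQL",
        # Parquet takes no options; the empty dict selects the format.
        "InputSerialization": {"Parquet": {}},
        "OutputSerialization": output_serialization,
    }
```

A caller holding an S3 client could then write `client.select_object_content(**parquet_select_kwargs("my-bucket", "data.parquet", "SELECT * FROM S3Object s"))`, where the bucket and key are placeholder names.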

rtype

dict

returns

The response of this operation contains an EventStream member. When iterated, the EventStream yields events based on the structure below, where only one of the top-level keys is present for any given event.

Response Syntax

{
    'Payload': EventStream({
        'Records': {
            'Payload': b'bytes'
        },
        'Stats': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Progress': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Cont': {},
        'End': {}
    })
}

Response Structure

  • (dict) --

    • Payload (EventStream) --

      • Records (dict) --

        The Records Event.

        • Payload (bytes) --

          A byte array containing one or more result records; a record may be split across consecutive Records events.

      • Stats (dict) --

        The Stats Event.

        • Details (dict) --

          The Stats event details.

          • BytesScanned (integer) --

            Total number of object bytes scanned.

          • BytesProcessed (integer) --

            Total number of uncompressed object bytes processed.

          • BytesReturned (integer) --

            Total number of bytes of records payload data returned.

      • Progress (dict) --

        The Progress Event.

        • Details (dict) --

          The Progress event details.

          • BytesScanned (integer) --

            Current number of object bytes scanned.

          • BytesProcessed (integer) --

            Current number of uncompressed object bytes processed.

          • BytesReturned (integer) --

            Current number of bytes of records payload data returned.

      • Cont (dict) --

        The Continuation Event.

      • End (dict) --

        The End Event.
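
The event structure above suggests a simple consumption loop. The sketch below assumes only that the stream is an iterable of dicts with exactly one top-level key each, as described; the function name `collect_select_results` is hypothetical. Because a Records payload may end mid-record, the chunks are joined before any decoding or line splitting.

```python
def collect_select_results(events):
    """Drain a SelectObjectContent event stream.

    Returns (payload_bytes, stats_details, saw_end). `events` is any
    iterable of event dicts, for example response['Payload'].
    """
    chunks = []
    stats = None
    saw_end = False

    for event in events:
        if "Records" in event:
            # Payloads can split a record, so keep raw bytes and join later.
            chunks.append(event["Records"]["Payload"])
        elif "Stats" in event:
            # Final totals: BytesScanned, BytesProcessed, BytesReturned.
            stats = event["Stats"]["Details"]
        elif "End" in event:
            # The End event marks a complete result set.
            saw_end = True
        # Progress and Cont events carry no result data and are skipped.

    return b"".join(chunks), stats, saw_end
```

With a real response this could be called as `collect_select_results(response["Payload"])`. Treating a missing End event as an incomplete result is a design choice of this sketch; callers may prefer to raise an error in that case.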