Amazon Simple Storage Service

2018/05/07 - Amazon Simple Storage Service - 1 updated api methods

Changes  Update s3 client to latest version

SelectObjectContent (updated)
Changes (response)
{'Payload': {'Progress': {'Details': {'BytesReturned': 'long'}},
             'Stats': {'Details': {'BytesReturned': 'long'}}}}

This operation filters the contents of an Amazon S3 object based on a simple Structured Query Language (SQL) statement. In the request, along with the SQL expression, you must also specify a data serialization format (JSON or CSV) of the object. Amazon S3 uses this to parse object data into records, and returns only records that match the specified SQL expression. You must also specify the data serialization format for the response.

See also: AWS API Documentation

Request Syntax

client.select_object_content(
    Bucket='string',
    Key='string',
    SSECustomerAlgorithm='string',
    SSECustomerKey='string',
    SSECustomerKeyMD5='string',
    Expression='string',
    ExpressionType='SQL',
    RequestProgress={
        'Enabled': True|False
    },
    InputSerialization={
        'CSV': {
            'FileHeaderInfo': 'USE'|'IGNORE'|'NONE',
            'Comments': 'string',
            'QuoteEscapeCharacter': 'string',
            'RecordDelimiter': 'string',
            'FieldDelimiter': 'string',
            'QuoteCharacter': 'string'
        },
        'CompressionType': 'NONE'|'GZIP',
        'JSON': {
            'Type': 'DOCUMENT'|'LINES'
        }
    },
    OutputSerialization={
        'CSV': {
            'QuoteFields': 'ALWAYS'|'ASNEEDED',
            'QuoteEscapeCharacter': 'string',
            'RecordDelimiter': 'string',
            'FieldDelimiter': 'string',
            'QuoteCharacter': 'string'
        },
        'JSON': {
            'RecordDelimiter': 'string'
        }
    }
)
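As a minimal sketch of how the required arguments fit together, the helper below assembles a request for a CSV object with a header row and asks for JSON records back. The helper name `build_select_request`, the bucket and key values, and the SQL text are hypothetical placeholders; only the parameter names come from the request syntax above.

```python
def build_select_request(bucket, key, expression):
    """Return keyword arguments for client.select_object_content().

    Hypothetical helper: it fills in only the required parameters and
    assumes CSV input with a header row and JSON output.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Expression": expression,
        "ExpressionType": "SQL",
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
    }


# Placeholder values for illustration only.
request = build_select_request(
    "example-bucket",
    "data/example.csv",
    "SELECT * FROM S3Object s",
)
```

With an S3 client in hand, the call would then be `client.select_object_content(**request)`.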
type Bucket:

string

param Bucket:

[REQUIRED] The S3 Bucket.

type Key:

string

param Key:

[REQUIRED] The Object Key.

type SSECustomerAlgorithm:

string

param SSECustomerAlgorithm:

The SSE Algorithm used to encrypt the object. For more information, go to Server-Side Encryption (Using Customer-Provided Encryption Keys).

type SSECustomerKey:

string

param SSECustomerKey:

The SSE Customer Key. For more information, go to Server-Side Encryption (Using Customer-Provided Encryption Keys).

type SSECustomerKeyMD5:

string

param SSECustomerKeyMD5:

The SSE Customer Key MD5. For more information, go to Server-Side Encryption (Using Customer-Provided Encryption Keys).
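The three SSE parameters are related: the MD5 value is a digest of the key. The sketch below derives all three from a raw key, assuming the key and its MD5 digest are both sent base64-encoded and that the algorithm is AES256 with a 32-byte key. The helper name `sse_customer_params` is hypothetical. Some SDK versions may perform this encoding themselves when only the key is supplied, so check before encoding twice.

```python
import base64
import hashlib


def sse_customer_params(raw_key: bytes) -> dict:
    """Build SSE-C arguments from a raw 256-bit key (hypothetical helper).

    Assumes the key and its MD5 digest are both sent base64-encoded and
    that the algorithm is AES256.
    """
    if len(raw_key) != 32:
        raise ValueError("expected a 32-byte key for AES256")
    digest = hashlib.md5(raw_key).digest()
    return {
        "SSECustomerAlgorithm": "AES256",
        "SSECustomerKey": base64.b64encode(raw_key).decode("ascii"),
        "SSECustomerKeyMD5": base64.b64encode(digest).decode("ascii"),
    }


# An all-zero key, used here for illustration only.
params = sse_customer_params(bytes(32))
```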

type Expression:

string

param Expression:

[REQUIRED] The expression that is used to query the object.

type ExpressionType:

string

param ExpressionType:

[REQUIRED] The type of the provided expression (e.g., SQL).

type RequestProgress:

dict

param RequestProgress:

Specifies if periodic request progress information should be enabled.

  • Enabled (boolean) -- Specifies whether periodic QueryProgress frames should be sent. Valid values: True, False. Default value: False.
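When the request is made with `RequestProgress={'Enabled': True}`, the response stream can carry Progress events alongside the records. The following sketch pulls the byte counters out of those events. The function name `report_progress` and the hand-written sample events are assumptions for illustration; the event layout follows the response structure documented for this operation.

```python
def report_progress(events):
    """Yield (scanned, processed, returned) for each Progress event."""
    for event in events:
        if "Progress" in event:
            details = event["Progress"]["Details"]
            yield (
                details["BytesScanned"],
                details["BytesProcessed"],
                details["BytesReturned"],
            )


# Hand-written sample events standing in for a real response payload.
sample_events = [
    {"Progress": {"Details": {"BytesScanned": 100,
                              "BytesProcessed": 100,
                              "BytesReturned": 10}}},
    {"Records": {"Payload": b"a,b\n"}},
    {"End": {}},
]
progress = list(report_progress(sample_events))
```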

type InputSerialization:

dict

param InputSerialization:

[REQUIRED] Describes the format of the data in the object that is being queried.

  • CSV (dict) -- Describes the serialization of a CSV-encoded object.

    • FileHeaderInfo (string) -- Describes the first line of input. Valid values: NONE, IGNORE, USE.

    • Comments (string) -- Single character used to indicate a row should be ignored when present at the start of a row.

    • QuoteEscapeCharacter (string) -- Single character used for escaping the quote character inside an already escaped value.

    • RecordDelimiter (string) -- Value used to separate individual records.

    • FieldDelimiter (string) -- Value used to separate individual fields in a record.

    • QuoteCharacter (string) -- Value used for escaping where the field delimiter is part of the value.

  • CompressionType (string) -- Specifies the object's compression format. Valid values: NONE, GZIP. Default Value: NONE.

  • JSON (dict) -- Specifies JSON as the object's input serialization format.

    • Type (string) -- The type of JSON. Valid values: DOCUMENT, LINES.
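Two illustrative `InputSerialization` values, sketched under the assumption that the object is either a gzip-compressed, semicolon-delimited CSV file or an uncompressed file with one JSON object per line. The variable names and the chosen delimiters are examples, not defaults stated by the source.

```python
# Gzip-compressed, semicolon-delimited CSV whose first line holds column names.
csv_input = {
    "CSV": {
        "FileHeaderInfo": "USE",
        "FieldDelimiter": ";",
        "RecordDelimiter": "\n",
        "QuoteCharacter": '"',
    },
    "CompressionType": "GZIP",
}

# Uncompressed JSON with one object per line.
json_lines_input = {
    "JSON": {"Type": "LINES"},
    "CompressionType": "NONE",
}
```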

type OutputSerialization:

dict

param OutputSerialization:

[REQUIRED] Describes the format of the data that you want Amazon S3 to return in response.

  • CSV (dict) -- Describes the serialization of CSV-encoded Select results.

    • QuoteFields (string) -- Indicates whether or not all output fields should be quoted.

    • QuoteEscapeCharacter (string) -- Single character used for escaping the quote character inside an already escaped value.

    • RecordDelimiter (string) -- Value used to separate individual records.

    • FieldDelimiter (string) -- Value used to separate individual fields in a record.

    • QuoteCharacter (string) -- Value used for escaping where the field delimiter is part of the value.

  • JSON (dict) -- Specifies JSON as the request's output serialization format.

    • RecordDelimiter (string) -- The value used to separate individual records in the output.
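The output format is chosen independently of the input format. As a rough sketch, a small helper can return one of the two `OutputSerialization` shapes; the name `output_serialization` and the delimiter choices are hypothetical.

```python
def output_serialization(fmt):
    """Return an OutputSerialization dict for 'csv' or 'json' (hypothetical helper)."""
    if fmt == "csv":
        return {
            "CSV": {
                "QuoteFields": "ASNEEDED",
                "FieldDelimiter": ",",
                "RecordDelimiter": "\n",
            }
        }
    if fmt == "json":
        return {"JSON": {"RecordDelimiter": "\n"}}
    raise ValueError(f"unsupported format: {fmt!r}")


csv_output = output_serialization("csv")
json_output = output_serialization("json")
```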

rtype:

dict

returns:

The response of this operation contains an :class:`.EventStream` member. When iterated, the :class:`.EventStream` yields events based on the structure below, where only one of the top-level keys is present for any given event.

Response Syntax

{
    'Payload': EventStream({
        'Records': {
            'Payload': b'bytes'
        },
        'Stats': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Progress': {
            'Details': {
                'BytesScanned': 123,
                'BytesProcessed': 123,
                'BytesReturned': 123
            }
        },
        'Cont': {},
        'End': {}
    })
}
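Because each event carries exactly one top-level key, a consumer can dispatch on that key. The sketch below gathers record bytes and the final statistics; the function name `collect_results` and the fake stream are assumptions, with plain dicts standing in for the events a real `EventStream` would yield.

```python
def collect_results(event_stream):
    """Gather record bytes and final statistics from an event stream.

    Each event is a dict with a single top-level key.
    """
    chunks = []
    stats = None
    for event in event_stream:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
        elif "Stats" in event:
            stats = event["Stats"]["Details"]
        elif "End" in event:
            break
    return b"".join(chunks), stats


# Plain dicts standing in for events from a real response.
fake_stream = [
    {"Records": {"Payload": b"1,alice\n"}},
    {"Records": {"Payload": b"2,bob\n"}},
    {"Stats": {"Details": {"BytesScanned": 40,
                           "BytesProcessed": 40,
                           "BytesReturned": 14}}},
    {"End": {}},
]
data, stats = collect_results(fake_stream)
```

With a real response, the argument would be `response['Payload']`.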

Response Structure

  • (dict) --

    • Payload (:class:`.EventStream`) --

      • Records (dict) -- The Records Event.

        • Payload (bytes) -- A byte array containing one or more result records; a record may be partial and continue in a later Records event.

      • Stats (dict) -- The Stats Event.

        • Details (dict) -- The Stats event details.

          • BytesScanned (integer) -- Total number of object bytes scanned.

          • BytesProcessed (integer) -- Total number of uncompressed object bytes processed.

          • BytesReturned (integer) -- Total number of bytes of records payload data returned.

      • Progress (dict) -- The Progress Event.

        • Details (dict) -- The Progress event details.

          • BytesScanned (integer) -- Current number of object bytes scanned.

          • BytesProcessed (integer) -- Current number of uncompressed object bytes processed.

          • BytesReturned (integer) -- Current number of bytes of records payload data returned.

      • Cont (dict) -- The Continuation Event.

      • End (dict) -- The End Event.
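Since a Records payload may end partway through a record, a consumer that needs whole records has to buffer across events. The following is a minimal sketch, assuming records are separated by a newline (the `RecordDelimiter` chosen in the request); the name `iter_records` and the fake stream are hypothetical.

```python
def iter_records(event_stream, delimiter=b"\n"):
    """Yield complete records, buffering partial ones across events."""
    buffer = b""
    for event in event_stream:
        if "Records" not in event:
            continue
        buffer += event["Records"]["Payload"]
        # Everything before the last delimiter is complete; keep the rest.
        *complete, buffer = buffer.split(delimiter)
        for record in complete:
            yield record
    if buffer:
        yield buffer  # trailing record without a final delimiter


# Plain dicts standing in for events; the second record is split in two.
fake_stream = [
    {"Records": {"Payload": b'{"id": 1}\n{"id"'}},
    {"Cont": {}},
    {"Records": {"Payload": b': 2}\n'}},
    {"End": {}},
]
records = list(iter_records(fake_stream))
```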