Amazon Athena

2019/11/25 - Amazon Athena - 2 updated api methods

Changes  This release adds additional query lifecycle metrics to the QueryExecutionStatistics object in GetQueryExecution response.

BatchGetQueryExecution (updated) Link ¶
Changes (response)
{'QueryExecutions': {'Statistics': {'QueryPlanningTimeInMillis': 'long',
                                    'QueryQueueTimeInMillis': 'long',
                                    'ServiceProcessingTimeInMillis': 'long',
                                    'TotalExecutionTimeInMillis': 'long'}}}

Returns the details of a single query execution or a list of up to 50 query executions, which you provide as an array of query execution ID strings. Requires you to have access to the workgroup in which the queries ran. To get a list of query execution IDs, use ListQueryExecutionsInput$WorkGroup. Query executions differ from named (saved) queries. Use BatchGetNamedQueryInput to get details about named queries.

See also: AWS API Documentation

Request Syntax

client.batch_get_query_execution(
    QueryExecutionIds=[
        'string',
    ]
)
type QueryExecutionIds

list

param QueryExecutionIds

[REQUIRED]

An array of query execution IDs.

  • (string) --

rtype

dict

returns

Response Syntax

{
    'QueryExecutions': [
        {
            'QueryExecutionId': 'string',
            'Query': 'string',
            'StatementType': 'DDL'|'DML'|'UTILITY',
            'ResultConfiguration': {
                'OutputLocation': 'string',
                'EncryptionConfiguration': {
                    'EncryptionOption': 'SSE_S3'|'SSE_KMS'|'CSE_KMS',
                    'KmsKey': 'string'
                }
            },
            'QueryExecutionContext': {
                'Database': 'string'
            },
            'Status': {
                'State': 'QUEUED'|'RUNNING'|'SUCCEEDED'|'FAILED'|'CANCELLED',
                'StateChangeReason': 'string',
                'SubmissionDateTime': datetime(2015, 1, 1),
                'CompletionDateTime': datetime(2015, 1, 1)
            },
            'Statistics': {
                'EngineExecutionTimeInMillis': 123,
                'DataScannedInBytes': 123,
                'DataManifestLocation': 'string',
                'TotalExecutionTimeInMillis': 123,
                'QueryQueueTimeInMillis': 123,
                'QueryPlanningTimeInMillis': 123,
                'ServiceProcessingTimeInMillis': 123
            },
            'WorkGroup': 'string'
        },
    ],
    'UnprocessedQueryExecutionIds': [
        {
            'QueryExecutionId': 'string',
            'ErrorCode': 'string',
            'ErrorMessage': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • QueryExecutions (list) --

      Information about a query execution.

      • (dict) --

        Information about a single instance of a query execution.

        • QueryExecutionId (string) --

          The unique identifier for each query execution.

        • Query (string) --

          The SQL query statements which the query execution ran.

        • StatementType (string) --

          The type of query statement that was run. DDL indicates DDL query statements. DML indicates DML (Data Manipulation Language) query statements, such as CREATE TABLE AS SELECT . UTILITY indicates query statements other than DDL and DML, such as SHOW CREATE TABLE , or DESCRIBE <table> .

        • ResultConfiguration (dict) --

          The location in Amazon S3 where query results were stored and the encryption option, if any, used for query results. These are known as "client-side settings". If workgroup settings override client-side settings, then the query uses the location for the query results and the encryption configuration that are specified for the workgroup.

          • OutputLocation (string) --

            The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/ . To run the query, you must specify the query results location using one of the ways: either for individual queries using either this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If none of them is set, Athena issues an error that no output location is provided. For more information, see Query Results. If workgroup settings override client-side settings, then the query uses the settings specified for the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

          • EncryptionConfiguration (dict) --

            If query results are encrypted in Amazon S3, indicates the encryption option used (for example, SSE-KMS or CSE-KMS ) and key information. This is a client-side setting. If workgroup settings override client-side settings, then the query uses the encryption configuration that is specified for the workgroup, and also uses the location for storing query results specified in the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration and Workgroup Settings Override Client-Side Settings.

            • EncryptionOption (string) --

              Indicates whether Amazon S3 server-side encryption with Amazon S3-managed keys ( SSE-S3 ), server-side encryption with KMS-managed keys ( SSE-KMS ), or client-side encryption with KMS-managed keys (CSE-KMS) is used.

              If a query runs in a workgroup and the workgroup overrides client-side settings, then the workgroup's setting for encryption is used. It specifies whether query results must be encrypted, for all queries that run in this workgroup.

            • KmsKey (string) --

              For SSE-KMS and CSE-KMS , this is the KMS key ARN or ID.

        • QueryExecutionContext (dict) --

          The database in which the query execution occurred.

          • Database (string) --

            The name of the database.

        • Status (dict) --

          The completion date, current state, submission time, and state change reason (if applicable) for the query execution.

          • State (string) --

            The state of query execution. QUEUED state is listed but is not used by Athena and is reserved for future use. RUNNING indicates that the query has been submitted to the service, and Athena will execute the query as soon as resources are available. SUCCEEDED indicates that the query completed without errors. FAILED indicates that the query experienced an error and did not complete processing. CANCELLED indicates that a user input interrupted query execution.

          • StateChangeReason (string) --

            Further detail about the status of the query.

          • SubmissionDateTime (datetime) --

            The date and time that the query was submitted.

          • CompletionDateTime (datetime) --

            The date and time that the query completed.

        • Statistics (dict) --

          The amount of data scanned during the query execution and the amount of time that it took to execute, and the type of statement that was run.

          • EngineExecutionTimeInMillis (integer) --

            The number of milliseconds that the query took to execute.

          • DataScannedInBytes (integer) --

            The number of bytes in the data that was queried.

          • DataManifestLocation (string) --

            The location and file name of a data manifest file. The manifest file is saved to the Athena query results location in Amazon S3. The manifest file tracks files that the query wrote to Amazon S3. If the query fails, the manifest file also tracks files that the query intended to write. The manifest is useful for identifying orphaned files resulting from a failed query. For more information, see Working with Query Results, Output Files, and Query History in the Amazon Athena User Guide .

          • TotalExecutionTimeInMillis (integer) --

            The number of milliseconds that Athena took to run the query.

          • QueryQueueTimeInMillis (integer) --

            The number of milliseconds that the query was in your query queue waiting for resources. Note that if transient errors occur, Athena might automatically add the query back to the queue.

          • QueryPlanningTimeInMillis (integer) --

            The number of milliseconds that Athena took to plan the query processing flow. This includes the time spent retrieving table partitions from the data source. Note that because the query engine performs the query planning, query planning time is a subset of engine processing time.

          • ServiceProcessingTimeInMillis (integer) --

            The number of milliseconds that Athena took to finalize and publish the query results after the query engine finished running the query.

        • WorkGroup (string) --

          The name of the workgroup in which the query ran.

    • UnprocessedQueryExecutionIds (list) --

      Information about the query executions that failed to run.

      • (dict) --

        Describes a query execution that failed to process.

        • QueryExecutionId (string) --

          The unique identifier of the query execution.

        • ErrorCode (string) --

          The error code returned when the query execution failed to process, if applicable.

        • ErrorMessage (string) --

          The error message returned when the query execution failed to process, if applicable.

GetQueryExecution (updated) Link ¶
Changes (response)
{'QueryExecution': {'Statistics': {'QueryPlanningTimeInMillis': 'long',
                                   'QueryQueueTimeInMillis': 'long',
                                   'ServiceProcessingTimeInMillis': 'long',
                                   'TotalExecutionTimeInMillis': 'long'}}}

Returns information about a single execution of a query if you have access to the workgroup in which the query ran. Each time a query executes, information about the query execution is saved with a unique ID.

See also: AWS API Documentation

Request Syntax

client.get_query_execution(
    QueryExecutionId='string'
)
type QueryExecutionId

string

param QueryExecutionId

[REQUIRED]

The unique ID of the query execution.

rtype

dict

returns

Response Syntax

{
    'QueryExecution': {
        'QueryExecutionId': 'string',
        'Query': 'string',
        'StatementType': 'DDL'|'DML'|'UTILITY',
        'ResultConfiguration': {
            'OutputLocation': 'string',
            'EncryptionConfiguration': {
                'EncryptionOption': 'SSE_S3'|'SSE_KMS'|'CSE_KMS',
                'KmsKey': 'string'
            }
        },
        'QueryExecutionContext': {
            'Database': 'string'
        },
        'Status': {
            'State': 'QUEUED'|'RUNNING'|'SUCCEEDED'|'FAILED'|'CANCELLED',
            'StateChangeReason': 'string',
            'SubmissionDateTime': datetime(2015, 1, 1),
            'CompletionDateTime': datetime(2015, 1, 1)
        },
        'Statistics': {
            'EngineExecutionTimeInMillis': 123,
            'DataScannedInBytes': 123,
            'DataManifestLocation': 'string',
            'TotalExecutionTimeInMillis': 123,
            'QueryQueueTimeInMillis': 123,
            'QueryPlanningTimeInMillis': 123,
            'ServiceProcessingTimeInMillis': 123
        },
        'WorkGroup': 'string'
    }
}

Response Structure

  • (dict) --

    • QueryExecution (dict) --

      Information about the query execution.

      • QueryExecutionId (string) --

        The unique identifier for each query execution.

      • Query (string) --

        The SQL query statements which the query execution ran.

      • StatementType (string) --

        The type of query statement that was run. DDL indicates DDL query statements. DML indicates DML (Data Manipulation Language) query statements, such as CREATE TABLE AS SELECT . UTILITY indicates query statements other than DDL and DML, such as SHOW CREATE TABLE , or DESCRIBE <table> .

      • ResultConfiguration (dict) --

        The location in Amazon S3 where query results were stored and the encryption option, if any, used for query results. These are known as "client-side settings". If workgroup settings override client-side settings, then the query uses the location for the query results and the encryption configuration that are specified for the workgroup.

        • OutputLocation (string) --

          The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/ . To run the query, you must specify the query results location using one of the ways: either for individual queries using either this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If none of them is set, Athena issues an error that no output location is provided. For more information, see Query Results. If workgroup settings override client-side settings, then the query uses the settings specified for the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

        • EncryptionConfiguration (dict) --

          If query results are encrypted in Amazon S3, indicates the encryption option used (for example, SSE-KMS or CSE-KMS ) and key information. This is a client-side setting. If workgroup settings override client-side settings, then the query uses the encryption configuration that is specified for the workgroup, and also uses the location for storing query results specified in the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration and Workgroup Settings Override Client-Side Settings.

          • EncryptionOption (string) --

            Indicates whether Amazon S3 server-side encryption with Amazon S3-managed keys ( SSE-S3 ), server-side encryption with KMS-managed keys ( SSE-KMS ), or client-side encryption with KMS-managed keys (CSE-KMS) is used.

            If a query runs in a workgroup and the workgroup overrides client-side settings, then the workgroup's setting for encryption is used. It specifies whether query results must be encrypted, for all queries that run in this workgroup.

          • KmsKey (string) --

            For SSE-KMS and CSE-KMS , this is the KMS key ARN or ID.

      • QueryExecutionContext (dict) --

        The database in which the query execution occurred.

        • Database (string) --

          The name of the database.

      • Status (dict) --

        The completion date, current state, submission time, and state change reason (if applicable) for the query execution.

        • State (string) --

          The state of query execution. QUEUED state is listed but is not used by Athena and is reserved for future use. RUNNING indicates that the query has been submitted to the service, and Athena will execute the query as soon as resources are available. SUCCEEDED indicates that the query completed without errors. FAILED indicates that the query experienced an error and did not complete processing. CANCELLED indicates that a user input interrupted query execution.

        • StateChangeReason (string) --

          Further detail about the status of the query.

        • SubmissionDateTime (datetime) --

          The date and time that the query was submitted.

        • CompletionDateTime (datetime) --

          The date and time that the query completed.

      • Statistics (dict) --

        The amount of data scanned during the query execution and the amount of time that it took to execute, and the type of statement that was run.

        • EngineExecutionTimeInMillis (integer) --

          The number of milliseconds that the query took to execute.

        • DataScannedInBytes (integer) --

          The number of bytes in the data that was queried.

        • DataManifestLocation (string) --

          The location and file name of a data manifest file. The manifest file is saved to the Athena query results location in Amazon S3. The manifest file tracks files that the query wrote to Amazon S3. If the query fails, the manifest file also tracks files that the query intended to write. The manifest is useful for identifying orphaned files resulting from a failed query. For more information, see Working with Query Results, Output Files, and Query History in the Amazon Athena User Guide .

        • TotalExecutionTimeInMillis (integer) --

          The number of milliseconds that Athena took to run the query.

        • QueryQueueTimeInMillis (integer) --

          The number of milliseconds that the query was in your query queue waiting for resources. Note that if transient errors occur, Athena might automatically add the query back to the queue.

        • QueryPlanningTimeInMillis (integer) --

          The number of milliseconds that Athena took to plan the query processing flow. This includes the time spent retrieving table partitions from the data source. Note that because the query engine performs the query planning, query planning time is a subset of engine processing time.

        • ServiceProcessingTimeInMillis (integer) --

          The number of milliseconds that Athena took to finalize and publish the query results after the query engine finished running the query.

      • WorkGroup (string) --

        The name of the workgroup in which the query ran.