Amazon Athena

2020/06/01 - Amazon Athena - 9 new 3 updated api methods

Changes: This release adds support for connecting Athena to your own Apache Hive Metastores in addition to the AWS Glue Data Catalog. For more information, see https://docs.aws.amazon.com/athena/latest/ug/connect-to-data-source-hive.html

ListDataCatalogs (new)

Lists the data catalogs in the current AWS account.

See also: AWS API Documentation

Request Syntax

client.list_data_catalogs(
    NextToken='string',
    MaxResults=123
)
type NextToken

string

param NextToken

A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.

type MaxResults

integer

param MaxResults

Specifies the maximum number of data catalogs to return.

rtype

dict

returns

Response Syntax

{
    'DataCatalogsSummary': [
        {
            'CatalogName': 'string',
            'Type': 'LAMBDA'|'GLUE'|'HIVE'
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • DataCatalogsSummary (list) --

      A summary list of data catalogs.

      • (dict) --

        The summary information for the data catalog, which includes its name and type.

        • CatalogName (string) --

          The name of the data catalog.

        • Type (string) --

          The data catalog type.

    • NextToken (string) --

      A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.
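
The NextToken handshake described above can be sketched as a small generator. This is a minimal illustration, not part of the release: `iter_data_catalogs` and `page_size` are hypothetical names, and `client` is assumed to be a boto3 Athena client (or any object exposing the same `list_data_catalogs` method).

```python
def iter_data_catalogs(client, page_size=None):
    """Yield each catalog summary dict, following NextToken across pages."""
    kwargs = {}
    if page_size is not None:
        kwargs["MaxResults"] = page_size
    while True:
        page = client.list_data_catalogs(**kwargs)
        yield from page.get("DataCatalogsSummary", [])
        token = page.get("NextToken")
        if not token:
            return
        # Pass the token from this page to request the next one.
        kwargs["NextToken"] = token


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   for summary in iter_data_catalogs(boto3.client("athena")):
#       print(summary["CatalogName"], summary["Type"])
```

NextToken is left out of the first request rather than sent as None, since the parameter is typed as a string.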

ListTableMetadata (new)

Lists the metadata for the tables in the specified data catalog database.

See also: AWS API Documentation

Request Syntax

client.list_table_metadata(
    CatalogName='string',
    DatabaseName='string',
    Expression='string',
    NextToken='string',
    MaxResults=123
)
type CatalogName

string

param CatalogName

[REQUIRED]

The name of the data catalog for which table metadata should be returned.

type DatabaseName

string

param DatabaseName

[REQUIRED]

The name of the database for which table metadata should be returned.

type Expression

string

param Expression

A regex filter that pattern-matches table names. If no expression is supplied, metadata for all tables is listed.

type NextToken

string

param NextToken

A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.

type MaxResults

integer

param MaxResults

Specifies the maximum number of results to return.

rtype

dict

returns

Response Syntax

{
    'TableMetadataList': [
        {
            'Name': 'string',
            'CreateTime': datetime(2015, 1, 1),
            'LastAccessTime': datetime(2015, 1, 1),
            'TableType': 'string',
            'Columns': [
                {
                    'Name': 'string',
                    'Type': 'string',
                    'Comment': 'string'
                },
            ],
            'PartitionKeys': [
                {
                    'Name': 'string',
                    'Type': 'string',
                    'Comment': 'string'
                },
            ],
            'Parameters': {
                'string': 'string'
            }
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • TableMetadataList (list) --

      A list of table metadata.

      • (dict) --

        Contains metadata for a table.

        • Name (string) --

          The name of the table.

        • CreateTime (datetime) --

          The time that the table was created.

        • LastAccessTime (datetime) --

          The last time the table was accessed.

        • TableType (string) --

          The type of table. In Athena, only EXTERNAL_TABLE is supported.

        • Columns (list) --

          A list of the columns in the table.

          • (dict) --

            Contains metadata for a column in a table.

            • Name (string) --

              The name of the column.

            • Type (string) --

              The data type of the column.

            • Comment (string) --

              Optional information about the column.

        • PartitionKeys (list) --

          A list of the partition keys in the table.

          • (dict) --

            Contains metadata for a column in a table.

            • Name (string) --

              The name of the column.

            • Type (string) --

              The data type of the column.

            • Comment (string) --

              Optional information about the column.

        • Parameters (dict) --

          A set of custom key/value pairs for table properties.

          • (string) --

            • (string) --

    • NextToken (string) --

      A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.
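
As a rough sketch of how the Expression filter and the TableMetadataList response fit together, the helper below collects column names per table. The name `table_columns` is hypothetical, and `client` is assumed to behave like a boto3 Athena client.

```python
def table_columns(client, catalog, database, expression=None):
    """Map each matching table name to its column names (partition keys last)."""
    kwargs = {"CatalogName": catalog, "DatabaseName": database}
    if expression is not None:
        # Optional regex applied to table names by the service.
        kwargs["Expression"] = expression
    result = {}
    while True:
        page = client.list_table_metadata(**kwargs)
        for table in page.get("TableMetadataList", []):
            columns = [c["Name"] for c in table.get("Columns", [])]
            partitions = [p["Name"] for p in table.get("PartitionKeys", [])]
            result[table["Name"]] = columns + partitions
        token = page.get("NextToken")
        if not token:
            return result
        kwargs["NextToken"] = token
```

Columns and PartitionKeys are read with `.get` because a table may have no partition keys, in which case the key might be absent or empty.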

UpdateDataCatalog (new)

Updates the data catalog that has the specified name.

See also: AWS API Documentation

Request Syntax

client.update_data_catalog(
    Name='string',
    Type='LAMBDA'|'GLUE'|'HIVE',
    Description='string',
    Parameters={
        'string': 'string'
    }
)
type Name

string

param Name

[REQUIRED]

The name of the data catalog to update. The catalog name must be unique for the AWS account and can use a maximum of 128 alphanumeric, underscore, at sign, or hyphen characters.

type Type

string

param Type

[REQUIRED]

Specifies the type of data catalog to update. Specify LAMBDA for a federated catalog, GLUE for AWS Glue Catalog, or HIVE for an external hive metastore.

type Description

string

param Description

New or modified text that describes the data catalog.

type Parameters

dict

param Parameters

Specifies the Lambda function or functions to use for updating the data catalog. This is a mapping whose values depend on the catalog type.

  • For the HIVE data catalog type, use the following syntax. The metadata-function parameter is required. The sdk-version parameter is optional and defaults to the currently supported version. metadata-function=lambda_arn, sdk-version=version_number

  • For the LAMBDA data catalog type, use one of the following sets of required parameters, but not both.

    • If you have one Lambda function that processes metadata and another for reading the actual data, use the following syntax. Both parameters are required. metadata-function=lambda_arn, record-function=lambda_arn

    • If you have a composite Lambda function that processes both metadata and data, use the following syntax to specify your Lambda function. function=lambda_arn

  • The GLUE type has no parameters.

  • (string) --

    • (string) --

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --
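
The per-type rules for the Parameters mapping can be checked locally before calling update_data_catalog. The following is only a sketch of those rules as stated above; `validate_catalog_parameters` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
def validate_catalog_parameters(catalog_type, parameters):
    """Return a copy of parameters if its keys fit the rules for catalog_type."""
    keys = set(parameters)
    if catalog_type == "HIVE":
        # metadata-function is required, sdk-version is optional.
        ok = "metadata-function" in keys and keys <= {"metadata-function", "sdk-version"}
    elif catalog_type == "LAMBDA":
        # Either the split pair or the composite function, but not both.
        ok = keys == {"metadata-function", "record-function"} or keys == {"function"}
    elif catalog_type == "GLUE":
        ok = not keys
    else:
        raise ValueError(f"unknown catalog type: {catalog_type!r}")
    if not ok:
        raise ValueError(f"parameters {sorted(keys)} do not fit type {catalog_type}")
    return dict(parameters)


# Usage (needs boto3 and AWS credentials; the ARN is a placeholder):
#   params = validate_catalog_parameters("HIVE", {"metadata-function": lambda_arn})
#   client.update_data_catalog(Name="my_hms", Type="HIVE", Parameters=params)
```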

ListDatabases (new)

Lists the databases in the specified data catalog.

See also: AWS API Documentation

Request Syntax

client.list_databases(
    CatalogName='string',
    NextToken='string',
    MaxResults=123
)
type CatalogName

string

param CatalogName

[REQUIRED]

The name of the data catalog that contains the databases to return.

type NextToken

string

param NextToken

A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.

type MaxResults

integer

param MaxResults

Specifies the maximum number of results to return.

rtype

dict

returns

Response Syntax

{
    'DatabaseList': [
        {
            'Name': 'string',
            'Description': 'string',
            'Parameters': {
                'string': 'string'
            }
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • DatabaseList (list) --

      A list of databases from a data catalog.

      • (dict) --

        Contains metadata information for a database in a data catalog.

        • Name (string) --

          The name of the database.

        • Description (string) --

          An optional description of the database.

        • Parameters (dict) --

          A set of custom key/value pairs.

          • (string) --

            • (string) --

    • NextToken (string) --

      A token generated by the Athena service that specifies where to continue pagination if a previous request was truncated. To obtain the next set of pages, pass in the NextToken from the response object of the previous page call.
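
One way to collect database names is to lean on the SDK's paginator when one is registered and fall back to following NextToken by hand otherwise. This sketch assumes `client` is a boto3 client (`can_paginate` and `get_paginator` are standard client methods); whether a paginator exists for list_databases depends on the installed SDK version, which is why it is checked at run time. `database_names` and `_manual_pages` are hypothetical names.

```python
def _manual_pages(client, catalog):
    """Fallback: request pages one by one, following NextToken."""
    kwargs = {"CatalogName": catalog}
    while True:
        page = client.list_databases(**kwargs)
        yield page
        if not page.get("NextToken"):
            return
        kwargs["NextToken"] = page["NextToken"]


def database_names(client, catalog):
    """Return the names of all databases in the given data catalog."""
    if client.can_paginate("list_databases"):
        pages = client.get_paginator("list_databases").paginate(CatalogName=catalog)
    else:
        pages = _manual_pages(client, catalog)
    return [db["Name"] for page in pages for db in page.get("DatabaseList", [])]
```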

GetDataCatalog (new)

Returns the specified data catalog.

See also: AWS API Documentation

Request Syntax

client.get_data_catalog(
    Name='string'
)
type Name

string

param Name

[REQUIRED]

The name of the data catalog to return.

rtype

dict

returns

Response Syntax

{
    'DataCatalog': {
        'Name': 'string',
        'Description': 'string',
        'Type': 'LAMBDA'|'GLUE'|'HIVE',
        'Parameters': {
            'string': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • DataCatalog (dict) --

      The data catalog returned.

      • Name (string) --

        The name of the data catalog. The catalog name must be unique for the AWS account and can use a maximum of 128 alphanumeric, underscore, at sign, or hyphen characters.

      • Description (string) --

        An optional description of the data catalog.

      • Type (string) --

        The type of data catalog: LAMBDA for a federated catalog, GLUE for AWS Glue Catalog, or HIVE for an external hive metastore.

      • Parameters (dict) --

        Specifies the Lambda function or functions to use for the data catalog. This is a mapping whose values depend on the catalog type.

        • For the HIVE data catalog type, use the following syntax. The metadata-function parameter is required. The sdk-version parameter is optional and defaults to the currently supported version. metadata-function=lambda_arn, sdk-version=version_number

        • For the LAMBDA data catalog type, use one of the following sets of required parameters, but not both.

          • If you have one Lambda function that processes metadata and another for reading the actual data, use the following syntax. Both parameters are required. metadata-function=lambda_arn, record-function=lambda_arn

          • If you have a composite Lambda function that processes both metadata and data, use the following syntax to specify your Lambda function. function=lambda_arn

        • The GLUE type has no parameters.

        • (string) --

          • (string) --
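
To illustrate reading the DataCatalog structure, the sketch below pulls out the catalog type and whichever Lambda function entries are present in Parameters. `catalog_lambda_arns` is a hypothetical helper; the key names are the ones listed above.

```python
LAMBDA_KEYS = ("function", "metadata-function", "record-function")


def catalog_lambda_arns(client, name):
    """Return (catalog type, {parameter key: Lambda ARN}) for one data catalog."""
    catalog = client.get_data_catalog(Name=name)["DataCatalog"]
    # Parameters may be missing or empty, for example for a GLUE catalog.
    parameters = catalog.get("Parameters") or {}
    arns = {key: parameters[key] for key in LAMBDA_KEYS if key in parameters}
    return catalog["Type"], arns
```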

GetTableMetadata (new)

Returns table metadata for the specified catalog, database, and table.

See also: AWS API Documentation

Request Syntax

client.get_table_metadata(
    CatalogName='string',
    DatabaseName='string',
    TableName='string'
)
type CatalogName

string

param CatalogName

[REQUIRED]

The name of the data catalog that contains the database and table metadata to return.

type DatabaseName

string

param DatabaseName

[REQUIRED]

The name of the database that contains the table metadata to return.

type TableName

string

param TableName

[REQUIRED]

The name of the table for which metadata is returned.

rtype

dict

returns

Response Syntax

{
    'TableMetadata': {
        'Name': 'string',
        'CreateTime': datetime(2015, 1, 1),
        'LastAccessTime': datetime(2015, 1, 1),
        'TableType': 'string',
        'Columns': [
            {
                'Name': 'string',
                'Type': 'string',
                'Comment': 'string'
            },
        ],
        'PartitionKeys': [
            {
                'Name': 'string',
                'Type': 'string',
                'Comment': 'string'
            },
        ],
        'Parameters': {
            'string': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • TableMetadata (dict) --

      An object that contains table metadata.

      • Name (string) --

        The name of the table.

      • CreateTime (datetime) --

        The time that the table was created.

      • LastAccessTime (datetime) --

        The last time the table was accessed.

      • TableType (string) --

        The type of table. In Athena, only EXTERNAL_TABLE is supported.

      • Columns (list) --

        A list of the columns in the table.

        • (dict) --

          Contains metadata for a column in a table.

          • Name (string) --

            The name of the column.

          • Type (string) --

            The data type of the column.

          • Comment (string) --

            Optional information about the column.

      • PartitionKeys (list) --

        A list of the partition keys in the table.

        • (dict) --

          Contains metadata for a column in a table.

          • Name (string) --

            The name of the column.

          • Type (string) --

            The data type of the column.

          • Comment (string) --

            Optional information about the column.

      • Parameters (dict) --

        A set of custom key/value pairs for table properties.

        • (string) --

          • (string) --
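
A short sketch of flattening the TableMetadata structure into rows, with a flag marking partition keys. `describe_table` is a hypothetical name, and `client` is assumed to be a boto3 Athena client or a compatible stand-in.

```python
def describe_table(client, catalog, database, table):
    """Return (name, type, is_partition_key) tuples for every column of a table."""
    metadata = client.get_table_metadata(
        CatalogName=catalog, DatabaseName=database, TableName=table
    )["TableMetadata"]
    rows = [(c["Name"], c.get("Type", ""), False) for c in metadata.get("Columns", [])]
    rows += [(p["Name"], p.get("Type", ""), True) for p in metadata.get("PartitionKeys", [])]
    return rows


# Usage (needs boto3 and AWS credentials):
#   for name, data_type, is_partition in describe_table(client, "hms", "default", "events"):
#       print(name, data_type, "(partition key)" if is_partition else "")
```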

GetDatabase (new)

Returns a database object for the specified database and data catalog.

See also: AWS API Documentation

Request Syntax

client.get_database(
    CatalogName='string',
    DatabaseName='string'
)
type CatalogName

string

param CatalogName

[REQUIRED]

The name of the data catalog that contains the database to return.

type DatabaseName

string

param DatabaseName

[REQUIRED]

The name of the database to return.

rtype

dict

returns

Response Syntax

{
    'Database': {
        'Name': 'string',
        'Description': 'string',
        'Parameters': {
            'string': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • Database (dict) --

      The database returned.

      • Name (string) --

        The name of the database.

      • Description (string) --

        An optional description of the database.

      • Parameters (dict) --

        A set of custom key/value pairs.

        • (string) --

          • (string) --

DeleteDataCatalog (new)

Deletes a data catalog.

See also: AWS API Documentation

Request Syntax

client.delete_data_catalog(
    Name='string'
)
type Name

string

param Name

[REQUIRED]

The name of the data catalog to delete.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --

CreateDataCatalog (new)

Creates (registers) a data catalog with the specified name and properties. Catalogs created are visible to all users of the same AWS account.

See also: AWS API Documentation

Request Syntax

client.create_data_catalog(
    Name='string',
    Type='LAMBDA'|'GLUE'|'HIVE',
    Description='string',
    Parameters={
        'string': 'string'
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
type Name

string

param Name

[REQUIRED]

The name of the data catalog to create. The catalog name must be unique for the AWS account and can use a maximum of 128 alphanumeric, underscore, at sign, or hyphen characters.

type Type

string

param Type

[REQUIRED]

The type of data catalog to create: LAMBDA for a federated catalog, GLUE for AWS Glue Catalog, or HIVE for an external hive metastore.

type Description

string

param Description

A description of the data catalog to be created.

type Parameters

dict

param Parameters

Specifies the Lambda function or functions to use for creating the data catalog. This is a mapping whose values depend on the catalog type.

  • For the HIVE data catalog type, use the following syntax. The metadata-function parameter is required. The sdk-version parameter is optional and defaults to the currently supported version. metadata-function=lambda_arn, sdk-version=version_number

  • For the LAMBDA data catalog type, use one of the following sets of required parameters, but not both.

    • If you have one Lambda function that processes metadata and another for reading the actual data, use the following syntax. Both parameters are required. metadata-function=lambda_arn, record-function=lambda_arn

    • If you have a composite Lambda function that processes both metadata and data, use the following syntax to specify your Lambda function. function=lambda_arn

  • The GLUE type has no parameters.

  • (string) --

    • (string) --

type Tags

list

param Tags

A list of comma-separated tags to add to the data catalog that is created.

  • (dict) --

    A label that you assign to a resource. In Athena, a resource can be a workgroup or data catalog. Each tag consists of a key and an optional value, both of which you define. For example, you can use tags to categorize Athena workgroups or data catalogs by purpose, owner, or environment. Use a consistent set of tag keys to make it easier to search and filter workgroups or data catalogs in your account. For best practices, see Tagging Best Practices. Tag keys can be from 1 to 128 UTF-8 Unicode characters, and tag values can be from 0 to 256 UTF-8 Unicode characters. Tags can use letters and numbers representable in UTF-8, and the following characters: + - = . _ : / @. Tag keys and values are case-sensitive. Tag keys must be unique per resource. If you specify more than one tag, separate them by commas.

    • Key (string) --

      A tag key. The tag key length is from 1 to 128 Unicode characters in UTF-8. You can use letters and numbers representable in UTF-8, and the following characters: + - = . _ : / @. Tag keys are case-sensitive and must be unique per resource.

    • Value (string) --

      A tag value. The tag value length is from 0 to 256 Unicode characters in UTF-8. You can use letters and numbers representable in UTF-8, and the following characters: + - = . _ : / @. Tag values are case-sensitive.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --
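
The tag constraints above can be checked client-side before calling create_data_catalog. This is a sketch under stated assumptions: `build_tags` is a hypothetical helper, lengths are counted in Unicode code points, and any character not explicitly listed above (including whitespace) is rejected.

```python
ALLOWED_PUNCTUATION = set("+-=._:/@")


def _valid_text(text, min_len, max_len):
    """Check length and character set as described for tag keys and values."""
    return min_len <= len(text) <= max_len and all(
        ch.isalnum() or ch in ALLOWED_PUNCTUATION for ch in text
    )


def build_tags(tag_map):
    """Turn {'key': 'value'} into the Tags list; dict keys are unique by construction."""
    tags = []
    for key, value in tag_map.items():
        if not _valid_text(key, 1, 128):
            raise ValueError(f"invalid tag key: {key!r}")
        if not _valid_text(value, 0, 256):
            raise ValueError(f"invalid tag value: {value!r}")
        tags.append({"Key": key, "Value": value})
    return tags


# Usage (needs boto3 and AWS credentials; the ARN is a placeholder):
#   client.create_data_catalog(
#       Name="my_hms", Type="HIVE",
#       Parameters={"metadata-function": lambda_arn},
#       Tags=build_tags({"env": "prod"}),
#   )
```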

BatchGetQueryExecution (updated)
Changes (response)
{'QueryExecutions': {'QueryExecutionContext': {'Catalog': 'string'}}}

Returns the details of a single query execution or a list of up to 50 query executions, which you provide as an array of query execution ID strings. Requires you to have access to the workgroup in which the queries ran. To get a list of query execution IDs, use ListQueryExecutionsInput$WorkGroup. Query executions differ from named (saved) queries. Use BatchGetNamedQueryInput to get details about named queries.

See also: AWS API Documentation

Request Syntax

client.batch_get_query_execution(
    QueryExecutionIds=[
        'string',
    ]
)
type QueryExecutionIds

list

param QueryExecutionIds

[REQUIRED]

An array of query execution IDs.

  • (string) --

rtype

dict

returns

Response Syntax

{
    'QueryExecutions': [
        {
            'QueryExecutionId': 'string',
            'Query': 'string',
            'StatementType': 'DDL'|'DML'|'UTILITY',
            'ResultConfiguration': {
                'OutputLocation': 'string',
                'EncryptionConfiguration': {
                    'EncryptionOption': 'SSE_S3'|'SSE_KMS'|'CSE_KMS',
                    'KmsKey': 'string'
                }
            },
            'QueryExecutionContext': {
                'Database': 'string',
                'Catalog': 'string'
            },
            'Status': {
                'State': 'QUEUED'|'RUNNING'|'SUCCEEDED'|'FAILED'|'CANCELLED',
                'StateChangeReason': 'string',
                'SubmissionDateTime': datetime(2015, 1, 1),
                'CompletionDateTime': datetime(2015, 1, 1)
            },
            'Statistics': {
                'EngineExecutionTimeInMillis': 123,
                'DataScannedInBytes': 123,
                'DataManifestLocation': 'string',
                'TotalExecutionTimeInMillis': 123,
                'QueryQueueTimeInMillis': 123,
                'QueryPlanningTimeInMillis': 123,
                'ServiceProcessingTimeInMillis': 123
            },
            'WorkGroup': 'string'
        },
    ],
    'UnprocessedQueryExecutionIds': [
        {
            'QueryExecutionId': 'string',
            'ErrorCode': 'string',
            'ErrorMessage': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • QueryExecutions (list) --

      Information about a query execution.

      • (dict) --

        Information about a single instance of a query execution.

        • QueryExecutionId (string) --

          The unique identifier for each query execution.

        • Query (string) --

          The SQL query statements which the query execution ran.

        • StatementType (string) --

          The type of query statement that was run. DDL indicates DDL query statements. DML indicates DML (Data Manipulation Language) query statements, such as CREATE TABLE AS SELECT. UTILITY indicates query statements other than DDL and DML, such as SHOW CREATE TABLE or DESCRIBE <table>.

        • ResultConfiguration (dict) --

          The location in Amazon S3 where query results were stored and the encryption option, if any, used for query results. These are known as "client-side settings". If workgroup settings override client-side settings, then the query uses the location for the query results and the encryption configuration that are specified for the workgroup.

          • OutputLocation (string) --

            The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/. To run the query, you must specify the query results location in one of two ways: for individual queries, using this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If neither is set, Athena issues an error that no output location is provided. For more information, see Query Results. If workgroup settings override client-side settings, then the query uses the settings specified for the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

          • EncryptionConfiguration (dict) --

            If query results are encrypted in Amazon S3, indicates the encryption option used (for example, SSE-KMS or CSE-KMS) and key information. This is a client-side setting. If workgroup settings override client-side settings, then the query uses the encryption configuration that is specified for the workgroup, and also uses the location for storing query results specified in the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration and Workgroup Settings Override Client-Side Settings.

            • EncryptionOption (string) --

              Indicates whether Amazon S3 server-side encryption with Amazon S3-managed keys (SSE-S3), server-side encryption with KMS-managed keys (SSE-KMS), or client-side encryption with KMS-managed keys (CSE-KMS) is used.

              If a query runs in a workgroup and the workgroup overrides client-side settings, then the workgroup's setting for encryption is used. It specifies whether query results must be encrypted, for all queries that run in this workgroup.

            • KmsKey (string) --

              For SSE-KMS and CSE-KMS, this is the KMS key ARN or ID.

        • QueryExecutionContext (dict) --

          The database in which the query execution occurred.

          • Database (string) --

            The name of the database used in the query execution.

          • Catalog (string) --

            The name of the data catalog used in the query execution.

        • Status (dict) --

          The completion date, current state, submission time, and state change reason (if applicable) for the query execution.

          • State (string) --

            The state of query execution. QUEUED indicates that the query has been submitted to the service, and Athena will execute the query as soon as resources are available. RUNNING indicates that the query is in execution phase. SUCCEEDED indicates that the query completed without errors. FAILED indicates that the query experienced an error and did not complete processing. CANCELLED indicates that a user input interrupted query execution.

            Note

            Athena automatically retries your queries in cases of certain transient errors. As a result, you may see the query state transition from RUNNING or FAILED to QUEUED.

          • StateChangeReason (string) --

            Further detail about the status of the query.

          • SubmissionDateTime (datetime) --

            The date and time that the query was submitted.

          • CompletionDateTime (datetime) --

            The date and time that the query completed.

        • Statistics (dict) --

          Query execution statistics, such as the amount of data scanned, the amount of time that the query took to process, and the type of statement that was run.

          • EngineExecutionTimeInMillis (integer) --

            The number of milliseconds that the query took to execute.

          • DataScannedInBytes (integer) --

            The number of bytes in the data that was queried.

          • DataManifestLocation (string) --

            The location and file name of a data manifest file. The manifest file is saved to the Athena query results location in Amazon S3. The manifest file tracks files that the query wrote to Amazon S3. If the query fails, the manifest file also tracks files that the query intended to write. The manifest is useful for identifying orphaned files resulting from a failed query. For more information, see Working with Query Results, Output Files, and Query History in the Amazon Athena User Guide.

          • TotalExecutionTimeInMillis (integer) --

            The number of milliseconds that Athena took to run the query.

          • QueryQueueTimeInMillis (integer) --

            The number of milliseconds that the query was in your query queue waiting for resources. Note that if transient errors occur, Athena might automatically add the query back to the queue.

          • QueryPlanningTimeInMillis (integer) --

            The number of milliseconds that Athena took to plan the query processing flow. This includes the time spent retrieving table partitions from the data source. Note that because the query engine performs the query planning, query planning time is a subset of engine processing time.

          • ServiceProcessingTimeInMillis (integer) --

            The number of milliseconds that Athena took to finalize and publish the query results after the query engine finished running the query.

        • WorkGroup (string) --

          The name of the workgroup in which the query ran.

    • UnprocessedQueryExecutionIds (list) --

      Information about the query executions that failed to run.

      • (dict) --

        Describes a query execution that failed to process.

        • QueryExecutionId (string) --

          The unique identifier of the query execution.

        • ErrorCode (string) --

          The error code returned when the query execution failed to process, if applicable.

        • ErrorMessage (string) --

          The error message returned when the query execution failed to process, if applicable.
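
Because a single call accepts up to 50 query execution IDs, longer lists have to be split into batches. The sketch below does that and then reads the new Catalog field from QueryExecutionContext; `batch_get_executions` and `catalogs_used` are hypothetical names, and `client` is assumed to be a boto3 Athena client or a compatible stand-in.

```python
def batch_get_executions(client, execution_ids, batch_size=50):
    """Fetch executions in chunks; return (executions, unprocessed) lists."""
    executions, unprocessed = [], []
    for start in range(0, len(execution_ids), batch_size):
        chunk = execution_ids[start:start + batch_size]
        response = client.batch_get_query_execution(QueryExecutionIds=chunk)
        executions.extend(response.get("QueryExecutions", []))
        unprocessed.extend(response.get("UnprocessedQueryExecutionIds", []))
    return executions, unprocessed


def catalogs_used(executions):
    """Collect the distinct data catalog names the executions ran against."""
    names = (e.get("QueryExecutionContext", {}).get("Catalog") for e in executions)
    return {name for name in names if name}
```

Entries returned under UnprocessedQueryExecutionIds are kept separate so the caller can inspect ErrorCode and ErrorMessage instead of silently losing them.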

GetQueryExecution (updated)
Changes (response)
{'QueryExecution': {'QueryExecutionContext': {'Catalog': 'string'}}}

Returns information about a single execution of a query if you have access to the workgroup in which the query ran. Each time a query executes, information about the query execution is saved with a unique ID.

See also: AWS API Documentation

Request Syntax

client.get_query_execution(
    QueryExecutionId='string'
)
type QueryExecutionId

string

param QueryExecutionId

[REQUIRED]

The unique ID of the query execution.

rtype

dict

returns

Response Syntax

{
    'QueryExecution': {
        'QueryExecutionId': 'string',
        'Query': 'string',
        'StatementType': 'DDL'|'DML'|'UTILITY',
        'ResultConfiguration': {
            'OutputLocation': 'string',
            'EncryptionConfiguration': {
                'EncryptionOption': 'SSE_S3'|'SSE_KMS'|'CSE_KMS',
                'KmsKey': 'string'
            }
        },
        'QueryExecutionContext': {
            'Database': 'string',
            'Catalog': 'string'
        },
        'Status': {
            'State': 'QUEUED'|'RUNNING'|'SUCCEEDED'|'FAILED'|'CANCELLED',
            'StateChangeReason': 'string',
            'SubmissionDateTime': datetime(2015, 1, 1),
            'CompletionDateTime': datetime(2015, 1, 1)
        },
        'Statistics': {
            'EngineExecutionTimeInMillis': 123,
            'DataScannedInBytes': 123,
            'DataManifestLocation': 'string',
            'TotalExecutionTimeInMillis': 123,
            'QueryQueueTimeInMillis': 123,
            'QueryPlanningTimeInMillis': 123,
            'ServiceProcessingTimeInMillis': 123
        },
        'WorkGroup': 'string'
    }
}
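
Given the State values in the response syntax above, a caller will typically poll until the query stops changing state. The following is a minimal sketch, assuming SUCCEEDED, FAILED, and CANCELLED can be treated as final; `wait_for_query`, `poll_seconds`, `max_polls`, and the injectable `sleep` are hypothetical names chosen for illustration.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(client, execution_id, poll_seconds=1.0, max_polls=300, sleep=time.sleep):
    """Poll get_query_execution until the query reaches a terminal state."""
    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=execution_id)["QueryExecution"]
        if execution["Status"]["State"] in TERMINAL_STATES:
            return execution
        # QUEUED or RUNNING: wait before asking again.
        sleep(poll_seconds)
    raise TimeoutError(f"query {execution_id} not finished after {max_polls} polls")


# Usage (needs boto3 and AWS credentials):
#   done = wait_for_query(client, execution_id)
#   print(done["Status"]["State"], done["QueryExecutionContext"].get("Catalog"))
```

The `sleep` argument exists only so the loop can be exercised without real delays; in normal use the default `time.sleep` applies.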

Response Structure

  • (dict) --

    • QueryExecution (dict) --

      Information about the query execution.

      • QueryExecutionId (string) --

        The unique identifier for each query execution.

      • Query (string) --

        The SQL query statements which the query execution ran.

      • StatementType (string) --

        The type of query statement that was run. DDL indicates DDL query statements. DML indicates DML (Data Manipulation Language) query statements, such as CREATE TABLE AS SELECT. UTILITY indicates query statements other than DDL and DML, such as SHOW CREATE TABLE or DESCRIBE <table>.

      • ResultConfiguration (dict) --

        The location in Amazon S3 where query results were stored and the encryption option, if any, used for query results. These are known as "client-side settings". If workgroup settings override client-side settings, then the query uses the location for the query results and the encryption configuration that are specified for the workgroup.

        • OutputLocation (string) --

          The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/. To run the query, you must specify the query results location in one of two ways: for individual queries, using this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If neither is set, Athena issues an error that no output location is provided. For more information, see Query Results. If workgroup settings override client-side settings, then the query uses the settings specified for the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

        • EncryptionConfiguration (dict) --

          If query results are encrypted in Amazon S3, indicates the encryption option used (for example, SSE-KMS or CSE-KMS) and key information. This is a client-side setting. If workgroup settings override client-side settings, then the query uses the encryption configuration that is specified for the workgroup, and also uses the location for storing query results specified in the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration and Workgroup Settings Override Client-Side Settings.

          • EncryptionOption (string) --

            Indicates whether Amazon S3 server-side encryption with Amazon S3-managed keys (SSE-S3), server-side encryption with KMS-managed keys (SSE-KMS), or client-side encryption with KMS-managed keys (CSE-KMS) is used.

            If a query runs in a workgroup and the workgroup overrides client-side settings, then the workgroup's setting for encryption is used. It specifies whether query results must be encrypted, for all queries that run in this workgroup.

          • KmsKey (string) --

            For SSE-KMS and CSE-KMS, this is the KMS key ARN or ID.

      • QueryExecutionContext (dict) --

        The database and data catalog in which the query execution occurred.

        • Database (string) --

          The name of the database used in the query execution.

        • Catalog (string) --

          The name of the data catalog used in the query execution.

      • Status (dict) --

        The completion date, current state, submission time, and state change reason (if applicable) for the query execution.

        • State (string) --

          The state of query execution. QUEUED indicates that the query has been submitted to the service, and Athena will execute the query as soon as resources are available. RUNNING indicates that the query is in execution phase. SUCCEEDED indicates that the query completed without errors. FAILED indicates that the query experienced an error and did not complete processing. CANCELLED indicates that a user input interrupted query execution.

          Note

          Athena automatically retries your queries in cases of certain transient errors. As a result, you may see the query state transition from RUNNING or FAILED to QUEUED .

        • StateChangeReason (string) --

          Further detail about the status of the query.

        • SubmissionDateTime (datetime) --

          The date and time that the query was submitted.

        • CompletionDateTime (datetime) --

          The date and time that the query completed.

      • Statistics (dict) --

        Query execution statistics, such as the amount of data scanned, the amount of time that the query took to process, and the type of statement that was run.

        • EngineExecutionTimeInMillis (integer) --

          The number of milliseconds that the query took to execute.

        • DataScannedInBytes (integer) --

          The number of bytes in the data that was queried.

        • DataManifestLocation (string) --

          The location and file name of a data manifest file. The manifest file is saved to the Athena query results location in Amazon S3. The manifest file tracks files that the query wrote to Amazon S3. If the query fails, the manifest file also tracks files that the query intended to write. The manifest is useful for identifying orphaned files resulting from a failed query. For more information, see Working with Query Results, Output Files, and Query History in the Amazon Athena User Guide .

        • TotalExecutionTimeInMillis (integer) --

          The number of milliseconds that Athena took to run the query.

        • QueryQueueTimeInMillis (integer) --

          The number of milliseconds that the query was in your query queue waiting for resources. Note that if transient errors occur, Athena might automatically add the query back to the queue.

        • QueryPlanningTimeInMillis (integer) --

          The number of milliseconds that Athena took to plan the query processing flow. This includes the time spent retrieving table partitions from the data source. Note that because the query engine performs the query planning, query planning time is a subset of engine processing time.

        • ServiceProcessingTimeInMillis (integer) --

          The number of milliseconds that Athena took to finalize and publish the query results after the query engine finished running the query.

      • WorkGroup (string) --

        The name of the workgroup in which the query ran.
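
As a rough illustration of how the Status and Statistics fields above might be read, the sketch below condenses a QueryExecution dictionary into a few human-friendly values. The helper name summarize_query_execution, the choice of terminal states, and the sample values are assumptions made for this example; only the field names come from the structure above.

```python
# States treated as final in this sketch. This is an assumption: the note above
# says Athena may move a query from RUNNING or FAILED back to QUEUED on
# transient errors, so a careful caller may want to re-check FAILED.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def summarize_query_execution(query_execution):
    """Condense a QueryExecution dict into a small summary (hypothetical helper)."""
    status = query_execution.get("Status", {})
    stats = query_execution.get("Statistics", {})
    state = status.get("State")
    return {
        "state": state,
        "finished": state in TERMINAL_STATES,
        "reason": status.get("StateChangeReason"),
        # Convert bytes to megabytes and milliseconds to seconds for display.
        "scanned_mb": stats.get("DataScannedInBytes", 0) / 1_000_000,
        "queue_seconds": stats.get("QueryQueueTimeInMillis", 0) / 1000,
        "engine_seconds": stats.get("EngineExecutionTimeInMillis", 0) / 1000,
        "total_seconds": stats.get("TotalExecutionTimeInMillis", 0) / 1000,
    }


# Made-up sample shaped like the response structure above.
sample = {
    "Status": {"State": "SUCCEEDED"},
    "Statistics": {
        "DataScannedInBytes": 2_500_000,
        "QueryQueueTimeInMillis": 1500,
        "EngineExecutionTimeInMillis": 4000,
        "TotalExecutionTimeInMillis": 6000,
    },
}
summary = summarize_query_execution(sample)
```

Missing keys fall back to zero or None here because not every field is guaranteed to be populated while a query is still queued or running.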

StartQueryExecution (updated) Link ¶
Changes (request)
{'QueryExecutionContext': {'Catalog': 'string'}}

Runs the SQL query statements contained in the Query . Requires you to have access to the workgroup in which the query ran. Running queries against an external catalog requires GetDataCatalog permission to the catalog. For code samples using the AWS SDK for Java, see Examples and Code Samples in the Amazon Athena User Guide .

See also: AWS API Documentation

Request Syntax

client.start_query_execution(
    QueryString='string',
    ClientRequestToken='string',
    QueryExecutionContext={
        'Database': 'string',
        'Catalog': 'string'
    },
    ResultConfiguration={
        'OutputLocation': 'string',
        'EncryptionConfiguration': {
            'EncryptionOption': 'SSE_S3'|'SSE_KMS'|'CSE_KMS',
            'KmsKey': 'string'
        }
    },
    WorkGroup='string'
)
type QueryString

string

param QueryString

[REQUIRED]

The SQL query statements to be executed.

type ClientRequestToken

string

param ClientRequestToken

A unique case-sensitive string used to ensure the request to create the query is idempotent (executes only once). If another StartQueryExecution request with the same token is received, the same response is returned and another query is not created. If a parameter has changed, for example, the QueryString , an error is returned.

Warning

This token is listed as not required because AWS SDKs (for example the AWS SDK for Java) auto-generate the token for users. If you are not using the AWS SDK or the AWS CLI, you must provide this token or the action will fail.

This field is autopopulated if not provided.
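
One way to use the token when retrying a request yourself is sketched below: generate the token once per logical query and reuse it on every retry, so that a repeated request is recognised as the same one. The helper names and the SELECT 1 query are hypothetical, and using a UUID is simply a convenient way to get a unique string, not something the API mandates.

```python
import uuid


def new_client_request_token():
    """Return a unique, case-sensitive token for one logical query request."""
    return str(uuid.uuid4())


def build_start_request(query_string, token):
    """Assemble keyword arguments for start_query_execution (hypothetical helper)."""
    return {"QueryString": query_string, "ClientRequestToken": token}


# Generate the token once, then reuse it for the original call and any retry.
token = new_client_request_token()
first_attempt = build_start_request("SELECT 1", token)
retry_attempt = build_start_request("SELECT 1", token)

# Usage with a real client would look like:
#   client.start_query_execution(**first_attempt)
```

Changing QueryString while keeping the same token would, per the description above, result in an error rather than a new query.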

type QueryExecutionContext

dict

param QueryExecutionContext

The database and data catalog within which the query executes.

  • Database (string) --

    The name of the database used in the query execution.

  • Catalog (string) --

    The name of the data catalog used in the query execution.

type ResultConfiguration

dict

param ResultConfiguration

Specifies information about where and how to save the results of the query execution. If the query runs in a workgroup, then the workgroup's settings may override query settings. This affects the query results location. The workgroup settings override is specified in EnforceWorkGroupConfiguration (true/false) in the WorkGroupConfiguration. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

  • OutputLocation (string) --

    The location in Amazon S3 where your query results are stored, such as s3://path/to/query/bucket/ . To run the query, you must specify the query results location in one of two ways: either for individual queries using this setting (client-side), or in the workgroup, using WorkGroupConfiguration. If neither is set, Athena issues an error that no output location is provided. For more information, see Query Results. If workgroup settings override client-side settings, then the query uses the settings specified for the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration.

  • EncryptionConfiguration (dict) --

    If query results are encrypted in Amazon S3, indicates the encryption option used (for example, SSE-KMS or CSE-KMS ) and key information. This is a client-side setting. If workgroup settings override client-side settings, then the query uses the encryption configuration that is specified for the workgroup, and also uses the location for storing query results specified in the workgroup. See WorkGroupConfiguration$EnforceWorkGroupConfiguration and Workgroup Settings Override Client-Side Settings.

    • EncryptionOption (string) -- [REQUIRED]

      Indicates whether Amazon S3 server-side encryption with Amazon S3-managed keys ( SSE-S3 ), server-side encryption with KMS-managed keys ( SSE-KMS ), or client-side encryption with KMS-managed keys (CSE-KMS) is used.

      If a query runs in a workgroup and the workgroup overrides client-side settings, then the workgroup's setting for encryption is used. It specifies whether query results must be encrypted, for all queries that run in this workgroup.

    • KmsKey (string) --

      For SSE-KMS and CSE-KMS , this is the KMS key ARN or ID.
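
The ResultConfiguration structure above can be assembled with a small helper such as the following sketch. The function name, the example bucket, and the example key alias are hypothetical. Rejecting a missing KmsKey for SSE_KMS and CSE_KMS is an assumption of this sketch, based on the KmsKey description above, rather than documented client behaviour.

```python
ENCRYPTION_OPTIONS = {"SSE_S3", "SSE_KMS", "CSE_KMS"}
KMS_OPTIONS = {"SSE_KMS", "CSE_KMS"}


def build_result_configuration(output_location, encryption_option=None, kms_key=None):
    """Build a ResultConfiguration dict for start_query_execution (hypothetical helper)."""
    config = {"OutputLocation": output_location}
    if encryption_option is None:
        return config
    if encryption_option not in ENCRYPTION_OPTIONS:
        raise ValueError(f"unknown EncryptionOption: {encryption_option!r}")
    # EncryptionOption is required whenever EncryptionConfiguration is present.
    encryption = {"EncryptionOption": encryption_option}
    if encryption_option in KMS_OPTIONS:
        # Assumption of this sketch: fail early instead of sending a KMS option without a key.
        if not kms_key:
            raise ValueError(f"{encryption_option} needs a KMS key ARN or ID")
        encryption["KmsKey"] = kms_key
    config["EncryptionConfiguration"] = encryption
    return config


# Example values are placeholders, not real resources.
plain = build_result_configuration("s3://example-bucket/athena-results/")
encrypted = build_result_configuration(
    "s3://example-bucket/athena-results/", "SSE_KMS", "alias/example-key"
)
```

Keep in mind that if the workgroup enforces its own configuration, these client-side values are overridden regardless of what is passed.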

type WorkGroup

string

param WorkGroup

The name of the workgroup in which the query is being started.

rtype

dict

returns

Response Syntax

{
    'QueryExecutionId': 'string'
}

Response Structure

  • (dict) --

    • QueryExecutionId (string) --

      The unique ID of the query that ran as a result of this request.
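
Putting the request and response together, a minimal sketch of starting a query against a named catalog and waiting for it to finish might look like the following. The function name run_query, the polling interval, and the poll limit are assumptions made for illustration. The client is passed in as an argument so the function can be exercised without AWS access.

```python
import time

FINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def run_query(client, query_string, database, catalog, output_location,
              poll_seconds=1.0, max_polls=60):
    """Start a query and poll until it reaches a final state (hypothetical helper)."""
    started = client.start_query_execution(
        QueryString=query_string,
        QueryExecutionContext={"Database": database, "Catalog": catalog},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_execution_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_execution_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in FINAL_STATES:
            return query_execution_id, state
        # QUEUED or RUNNING: wait before asking again.
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_execution_id} did not finish in time")


# Usage with a real client (placeholder names, requires AWS credentials):
#   import boto3
#   client = boto3.client("athena")
#   run_query(client, "SELECT 1", "default", "AwsDataCatalog",
#             "s3://example-bucket/athena-results/")
```

A fixed polling interval keeps the sketch short; a production caller would probably prefer exponential backoff and would inspect StateChangeReason when the final state is FAILED.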