AWS Glue

2022/03/18 - AWS Glue - 9 new, 3 updated API methods

Changes  Added 9 new APIs for AWS Glue Interactive Sessions: ListSessions, StopSession, CreateSession, GetSession, DeleteSession, RunStatement, GetStatement, ListStatements, CancelStatement

DeleteSession (new)

Deletes the session.

See also: AWS API Documentation

Request Syntax

client.delete_session(
    Id='string',
    RequestOrigin='string'
)
type Id:

string

param Id:

[REQUIRED]

The ID of the session to be deleted.

type RequestOrigin:

string

param RequestOrigin:

The name of the origin of the delete session request.

rtype:

dict

returns:

Response Syntax

{
    'Id': 'string'
}

Response Structure

  • (dict) --

    • Id (string) --

      Returns the ID of the deleted session.
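As a minimal sketch, the call can be wrapped so that the optional RequestOrigin is sent only when the caller supplies one. The helper name delete_glue_session is hypothetical, and client is assumed to be a boto3 Glue client, for example the result of boto3.client("glue").

```python
def delete_glue_session(client, session_id, request_origin=None):
    """Delete an interactive session and return the ID echoed by the service.

    `client` is assumed to be a boto3 Glue client; this helper is illustrative.
    """
    kwargs = {"Id": session_id}
    # RequestOrigin is optional, so include it only when a value is given.
    if request_origin is not None:
        kwargs["RequestOrigin"] = request_origin
    response = client.delete_session(**kwargs)
    return response["Id"]
```

The returned value is the Id field of the response, which can be compared with the requested ID as a sanity check.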

CancelStatement (new)

Cancels the statement.

See also: AWS API Documentation

Request Syntax

client.cancel_statement(
    SessionId='string',
    Id=123,
    RequestOrigin='string'
)
type SessionId:

string

param SessionId:

[REQUIRED]

The Session ID of the statement to be cancelled.

type Id:

integer

param Id:

[REQUIRED]

The ID of the statement to be cancelled.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request to cancel the statement.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
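Because the response is an empty dict, a caller may want to check the statement state before cancelling. The following is a sketch only: the helper name cancel_if_active is hypothetical, client is assumed to be a boto3 Glue client, and treating WAITING and RUNNING as the only cancellable states is an assumption of this example rather than something the API description states.

```python
# States in which a statement is assumed to still be worth cancelling.
# This set is an assumption made for the sketch.
ACTIVE_STATES = {"WAITING", "RUNNING"}


def cancel_if_active(client, session_id, statement_id):
    """Send CancelStatement only when GetStatement reports an active state.

    Returns True if a cancel request was sent, False otherwise.
    """
    statement = client.get_statement(SessionId=session_id, Id=statement_id)["Statement"]
    if statement.get("State") not in ACTIVE_STATES:
        return False
    # CancelStatement returns an empty dict, so there is nothing to inspect.
    client.cancel_statement(SessionId=session_id, Id=statement_id)
    return True
```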

ListStatements (new)

Lists statements for the session.

See also: AWS API Documentation

Request Syntax

client.list_statements(
    SessionId='string',
    RequestOrigin='string',
    NextToken='string'
)
type SessionId:

string

param SessionId:

[REQUIRED]

The Session ID of the statements.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request to list statements.

type NextToken:

string

param NextToken:

rtype:

dict

returns:

Response Syntax

{
    'Statements': [
        {
            'Id': 123,
            'Code': 'string',
            'State': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
            'Output': {
                'Data': {
                    'TextPlain': 'string'
                },
                'ExecutionCount': 123,
                'Status': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
                'ErrorName': 'string',
                'ErrorValue': 'string',
                'Traceback': [
                    'string',
                ]
            },
            'Progress': 123.0,
            'StartedOn': 123,
            'CompletedOn': 123
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • Statements (list) --

      Returns the list of statements.

      • (dict) --

        The statement or request for a particular action to occur in a session.

        • Id (integer) --

          The ID of the statement.

        • Code (string) --

          The execution code of the statement.

        • State (string) --

          The state while the request is actioned.

        • Output (dict) --

          The output in JSON.

          • Data (dict) --

            The code execution output.

            • TextPlain (string) --

              The code execution output in text format.

          • ExecutionCount (integer) --

            The execution count of the output.

          • Status (string) --

            The status of the code execution output.

          • ErrorName (string) --

            The name of the error in the output.

          • ErrorValue (string) --

            The error value of the output.

          • Traceback (list) --

            The traceback of the output.

            • (string) --

        • Progress (float) --

          The code execution progress.

        • StartedOn (integer) --

          The Unix time and date that the job definition was started.

        • CompletedOn (integer) --

          The Unix time and date that the job definition was completed.

    • NextToken (string) --
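The NextToken fields are not described above. The sketch below assumes they follow the usual continuation-token pattern, where the NextToken from one response is passed to the next request until it is absent. The helper name iter_statements is hypothetical, and client is assumed to be a boto3 Glue client.

```python
def iter_statements(client, session_id):
    """Yield every statement in a session, following NextToken across pages.

    Assumes NextToken behaves as a continuation token (an assumption here).
    """
    kwargs = {"SessionId": session_id}
    while True:
        page = client.list_statements(**kwargs)
        yield from page.get("Statements", [])
        token = page.get("NextToken")
        if not token:
            break
        # Feed the token back in to request the following page.
        kwargs["NextToken"] = token
```

Iterating lazily keeps memory use flat for sessions with many statements. Callers that need a list can wrap the call in list().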

CreateSession (new)

Creates a new session.

See also: AWS API Documentation

Request Syntax

client.create_session(
    Id='string',
    Description='string',
    Role='string',
    Command={
        'Name': 'string',
        'PythonVersion': 'string'
    },
    Timeout=123,
    IdleTimeout=123,
    DefaultArguments={
        'string': 'string'
    },
    Connections={
        'Connections': [
            'string',
        ]
    },
    MaxCapacity=123.0,
    NumberOfWorkers=123,
    WorkerType='Standard'|'G.1X'|'G.2X',
    SecurityConfiguration='string',
    GlueVersion='string',
    Tags={
        'string': 'string'
    },
    RequestOrigin='string'
)
type Id:

string

param Id:

[REQUIRED]

The ID of the session request.

type Description:

string

param Description:

The description of the session.

type Role:

string

param Role:

[REQUIRED]

The IAM role ARN.

type Command:

dict

param Command:

[REQUIRED]

The SessionCommand that runs the job.

  • Name (string) --

    Specifies the name of the SessionCommand. Can be 'glueetl' or 'gluestreaming'.

  • PythonVersion (string) --

    Specifies the Python version. The Python version indicates the version supported for jobs of type Spark.

type Timeout:

integer

param Timeout:

The number of seconds before the request times out.

type IdleTimeout:

integer

param IdleTimeout:

The number of seconds when idle before the request times out.

type DefaultArguments:

dict

param DefaultArguments:

A map array of key-value pairs. Max is 75 pairs.

  • (string) --

    • (string) --

type Connections:

dict

param Connections:

The number of connections to use for the session.

  • Connections (list) --

    A list of connections used by the job.

    • (string) --

type MaxCapacity:

float

param MaxCapacity:

The number of AWS Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

type NumberOfWorkers:

integer

param NumberOfWorkers:

The number of workers to use for the session.

type WorkerType:

string

param WorkerType:

The worker type. Can be one of G.1X, G.2X, or Standard.

type SecurityConfiguration:

string

param SecurityConfiguration:

The name of the SecurityConfiguration structure to be used with the session.

type GlueVersion:

string

param GlueVersion:

The Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The GlueVersion must be greater than 2.0.

type Tags:

dict

param Tags:

The map of key-value pairs (tags) belonging to the session.

  • (string) --

    • (string) --

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Session': {
        'Id': 'string',
        'CreatedOn': datetime(2015, 1, 1),
        'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
        'ErrorMessage': 'string',
        'Description': 'string',
        'Role': 'string',
        'Command': {
            'Name': 'string',
            'PythonVersion': 'string'
        },
        'DefaultArguments': {
            'string': 'string'
        },
        'Connections': {
            'Connections': [
                'string',
            ]
        },
        'Progress': 123.0,
        'MaxCapacity': 123.0,
        'SecurityConfiguration': 'string',
        'GlueVersion': 'string'
    }
}

Response Structure

  • (dict) --

    • Session (dict) --

      Returns the session object in the response.

      • Id (string) --

        The ID of the session.

      • CreatedOn (datetime) --

        The time and date when the session was created.

      • Status (string) --

        The session status.

      • ErrorMessage (string) --

        The error message displayed during the session.

      • Description (string) --

        The description of the session.

      • Role (string) --

        The name or Amazon Resource Name (ARN) of the IAM role associated with the Session.

      • Command (dict) --

        The command object. See SessionCommand.

        • Name (string) --

          Specifies the name of the SessionCommand. Can be 'glueetl' or 'gluestreaming'.

        • PythonVersion (string) --

          Specifies the Python version. The Python version indicates the version supported for jobs of type Spark.

      • DefaultArguments (dict) --

        A map array of key-value pairs. Max is 75 pairs.

        • (string) --

          • (string) --

      • Connections (dict) --

        The number of connections used for the session.

        • Connections (list) --

          A list of connections used by the job.

          • (string) --

      • Progress (float) --

        The code execution progress of the session.

      • MaxCapacity (float) --

        The number of AWS Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

      • SecurityConfiguration (string) --

        The name of the SecurityConfiguration structure to be used with the session.

      • GlueVersion (string) --

        The Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The GlueVersion must be greater than 2.0.
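With this many optional parameters, it can help to assemble the request separately from sending it. The sketch below builds the keyword arguments for create_session from the three required fields and drops any optional value left as None. The helper name, the example session ID, the role ARN, and the default PythonVersion of "3" are all illustrative assumptions.

```python
# Command names listed in the parameter description above.
ALLOWED_COMMAND_NAMES = {"glueetl", "gluestreaming"}


def build_create_session_request(session_id, role_arn, command_name="glueetl",
                                 python_version="3", **optional):
    """Assemble keyword arguments for client.create_session.

    Optional parameters whose value is None are left out of the request.
    The default python_version is an assumption made for this sketch.
    """
    if command_name not in ALLOWED_COMMAND_NAMES:
        raise ValueError(f"unsupported command name: {command_name!r}")
    request = {
        "Id": session_id,
        "Role": role_arn,
        "Command": {"Name": command_name, "PythonVersion": python_version},
    }
    request.update({key: value for key, value in optional.items() if value is not None})
    return request


# Placeholder values only; substitute a real role ARN and session ID.
example_request = build_create_session_request(
    "example-session",
    "arn:aws:iam::123456789012:role/ExampleGlueRole",
    GlueVersion="3.0",
    WorkerType="G.1X",
    NumberOfWorkers=2,
    IdleTimeout=None,  # dropped because it is None
)
```

The result can then be passed as client.create_session(**example_request), where client is assumed to be a boto3 Glue client.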

GetStatement (new)

Retrieves the statement.

See also: AWS API Documentation

Request Syntax

client.get_statement(
    SessionId='string',
    Id=123,
    RequestOrigin='string'
)
type SessionId:

string

param SessionId:

[REQUIRED]

The Session ID of the statement.

type Id:

integer

param Id:

[REQUIRED]

The Id of the statement.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Statement': {
        'Id': 123,
        'Code': 'string',
        'State': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
        'Output': {
            'Data': {
                'TextPlain': 'string'
            },
            'ExecutionCount': 123,
            'Status': 'WAITING'|'RUNNING'|'AVAILABLE'|'CANCELLING'|'CANCELLED'|'ERROR',
            'ErrorName': 'string',
            'ErrorValue': 'string',
            'Traceback': [
                'string',
            ]
        },
        'Progress': 123.0,
        'StartedOn': 123,
        'CompletedOn': 123
    }
}

Response Structure

  • (dict) --

    • Statement (dict) --

      Returns the statement.

      • Id (integer) --

        The ID of the statement.

      • Code (string) --

        The execution code of the statement.

      • State (string) --

        The state while the request is actioned.

      • Output (dict) --

        The output in JSON.

        • Data (dict) --

          The code execution output.

          • TextPlain (string) --

            The code execution output in text format.

        • ExecutionCount (integer) --

          The execution count of the output.

        • Status (string) --

          The status of the code execution output.

        • ErrorName (string) --

          The name of the error in the output.

        • ErrorValue (string) --

          The error value of the output.

        • Traceback (list) --

          The traceback of the output.

          • (string) --

      • Progress (float) --

        The code execution progress.

      • StartedOn (integer) --

        The Unix time and date that the job definition was started.

      • CompletedOn (integer) --

        The Unix time and date that the job definition was completed.
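A statement moves through several states, so callers typically poll GetStatement until it settles. The sketch below treats AVAILABLE, CANCELLED, and ERROR as terminal, which is an assumption of this example, and reads the text output from Output.Data.TextPlain. The helper names wait_for_statement and statement_text are hypothetical, and client is assumed to be a boto3 Glue client.

```python
import time

# States after which the statement is assumed not to change any further.
TERMINAL_STATES = {"AVAILABLE", "CANCELLED", "ERROR"}


def wait_for_statement(client, session_id, statement_id, poll_seconds=1.0, max_polls=60):
    """Poll GetStatement until the statement reaches a terminal state."""
    for _ in range(max_polls):
        statement = client.get_statement(SessionId=session_id, Id=statement_id)["Statement"]
        if statement.get("State") in TERMINAL_STATES:
            return statement
        time.sleep(poll_seconds)
    raise TimeoutError(f"statement {statement_id} did not finish after {max_polls} polls")


def statement_text(statement):
    """Return the plain-text output, or raise if the output carries an error name."""
    output = statement.get("Output") or {}
    if output.get("ErrorName"):
        raise RuntimeError(f"{output['ErrorName']}: {output.get('ErrorValue', '')}")
    return (output.get("Data") or {}).get("TextPlain", "")
```

The max_polls bound keeps the loop from running indefinitely if the statement never leaves an active state.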

RunStatement (new)

Executes the statement.

See also: AWS API Documentation

Request Syntax

client.run_statement(
    SessionId='string',
    Code='string',
    RequestOrigin='string'
)
type SessionId:

string

param SessionId:

[REQUIRED]

The Session Id of the statement to be run.

type Code:

string

param Code:

[REQUIRED]

The statement code to be run.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Id': 123
}

Response Structure

  • (dict) --

    • Id (integer) --

      Returns the Id of the statement that was run.
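RunStatement returns only the statement ID, so a caller that submits several snippets needs to keep the IDs for later lookups. A minimal sketch follows; the helper name run_statements is hypothetical and client is assumed to be a boto3 Glue client.

```python
def run_statements(client, session_id, snippets):
    """Submit each code snippet in order and return the resulting statement IDs."""
    statement_ids = []
    for code in snippets:
        response = client.run_statement(SessionId=session_id, Code=code)
        # The response carries only the integer Id of the new statement.
        statement_ids.append(response["Id"])
    return statement_ids
```

Each returned ID can be passed as the Id argument of get_statement or cancel_statement for the same session.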

GetSession (new)

Retrieves the session.

See also: AWS API Documentation

Request Syntax

client.get_session(
    Id='string',
    RequestOrigin='string'
)
type Id:

string

param Id:

[REQUIRED]

The ID of the session.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Session': {
        'Id': 'string',
        'CreatedOn': datetime(2015, 1, 1),
        'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
        'ErrorMessage': 'string',
        'Description': 'string',
        'Role': 'string',
        'Command': {
            'Name': 'string',
            'PythonVersion': 'string'
        },
        'DefaultArguments': {
            'string': 'string'
        },
        'Connections': {
            'Connections': [
                'string',
            ]
        },
        'Progress': 123.0,
        'MaxCapacity': 123.0,
        'SecurityConfiguration': 'string',
        'GlueVersion': 'string'
    }
}

Response Structure

  • (dict) --

    • Session (dict) --

      The session object is returned in the response.

      • Id (string) --

        The ID of the session.

      • CreatedOn (datetime) --

        The time and date when the session was created.

      • Status (string) --

        The session status.

      • ErrorMessage (string) --

        The error message displayed during the session.

      • Description (string) --

        The description of the session.

      • Role (string) --

        The name or Amazon Resource Name (ARN) of the IAM role associated with the Session.

      • Command (dict) --

        The command object. See SessionCommand.

        • Name (string) --

          Specifies the name of the SessionCommand. Can be 'glueetl' or 'gluestreaming'.

        • PythonVersion (string) --

          Specifies the Python version. The Python version indicates the version supported for jobs of type Spark.

      • DefaultArguments (dict) --

        A map array of key-value pairs. Max is 75 pairs.

        • (string) --

          • (string) --

      • Connections (dict) --

        The number of connections used for the session.

        • Connections (list) --

          A list of connections used by the job.

          • (string) --

      • Progress (float) --

        The code execution progress of the session.

      • MaxCapacity (float) --

        The number of AWS Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

      • SecurityConfiguration (string) --

        The name of the SecurityConfiguration structure to be used with the session.

      • GlueVersion (string) --

        The Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The GlueVersion must be greater than 2.0.
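A newly created session starts in PROVISIONING, so a caller usually waits for READY before running statements. The sketch below polls GetSession and raises if the session lands in any other non-provisioning status. Treating FAILED, TIMEOUT, STOPPING, and STOPPED as failures for this purpose is an assumption of the example. The helper name wait_until_ready is hypothetical, and client is assumed to be a boto3 Glue client.

```python
import time

# Statuses from which the session is assumed never to become READY.
FAILURE_STATUSES = {"FAILED", "TIMEOUT", "STOPPING", "STOPPED"}


def wait_until_ready(client, session_id, poll_seconds=5.0, max_polls=60):
    """Poll GetSession until Status is READY and return the session dict."""
    for _ in range(max_polls):
        session = client.get_session(Id=session_id)["Session"]
        status = session.get("Status")
        if status == "READY":
            return session
        if status in FAILURE_STATUSES:
            message = session.get("ErrorMessage", "")
            raise RuntimeError(f"session {session_id} is {status}: {message}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"session {session_id} not READY after {max_polls} polls")
```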

StopSession (new)

Stops the session.

See also: AWS API Documentation

Request Syntax

client.stop_session(
    Id='string',
    RequestOrigin='string'
)
type Id:

string

param Id:

[REQUIRED]

The ID of the session to be stopped.

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Id': 'string'
}

Response Structure

  • (dict) --

    • Id (string) --

      Returns the Id of the stopped session.
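One possible use is to stop sessions that are sitting in READY. The sketch below takes session dicts shaped like the Session structure documented on this page and calls StopSession for each one whose Status is READY. The helper name stop_ready_sessions is hypothetical, client is assumed to be a boto3 Glue client, and limiting the stop to READY sessions is a choice made for this example.

```python
def stop_ready_sessions(client, sessions):
    """Stop each session dict whose Status is READY and return the stopped IDs."""
    stopped = []
    for session in sessions:
        if session.get("Status") != "READY":
            continue  # leave sessions in any other status untouched
        response = client.stop_session(Id=session["Id"])
        stopped.append(response["Id"])
    return stopped
```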

ListSessions (new)

Retrieves a list of sessions.

See also: AWS API Documentation

Request Syntax

client.list_sessions(
    NextToken='string',
    MaxResults=123,
    Tags={
        'string': 'string'
    },
    RequestOrigin='string'
)
type NextToken:

string

param NextToken:

The token for the next set of results, or null if there are no more results.

type MaxResults:

integer

param MaxResults:

The maximum number of results.

type Tags:

dict

param Tags:

Tags belonging to the session.

  • (string) --

    • (string) --

type RequestOrigin:

string

param RequestOrigin:

The origin of the request.

rtype:

dict

returns:

Response Syntax

{
    'Ids': [
        'string',
    ],
    'Sessions': [
        {
            'Id': 'string',
            'CreatedOn': datetime(2015, 1, 1),
            'Status': 'PROVISIONING'|'READY'|'FAILED'|'TIMEOUT'|'STOPPING'|'STOPPED',
            'ErrorMessage': 'string',
            'Description': 'string',
            'Role': 'string',
            'Command': {
                'Name': 'string',
                'PythonVersion': 'string'
            },
            'DefaultArguments': {
                'string': 'string'
            },
            'Connections': {
                'Connections': [
                    'string',
                ]
            },
            'Progress': 123.0,
            'MaxCapacity': 123.0,
            'SecurityConfiguration': 'string',
            'GlueVersion': 'string'
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • Ids (list) --

      Returns the IDs of the sessions.

      • (string) --

    • Sessions (list) --

      Returns the session object.

      • (dict) --

        The period in which a remote Spark runtime environment is running.

        • Id (string) --

          The ID of the session.

        • CreatedOn (datetime) --

          The time and date when the session was created.

        • Status (string) --

          The session status.

        • ErrorMessage (string) --

          The error message displayed during the session.

        • Description (string) --

          The description of the session.

        • Role (string) --

          The name or Amazon Resource Name (ARN) of the IAM role associated with the Session.

        • Command (dict) --

          The command object. See SessionCommand.

          • Name (string) --

            Specifies the name of the SessionCommand. Can be 'glueetl' or 'gluestreaming'.

          • PythonVersion (string) --

            Specifies the Python version. The Python version indicates the version supported for jobs of type Spark.

        • DefaultArguments (dict) --

          A map array of key-value pairs. Max is 75 pairs.

          • (string) --

            • (string) --

        • Connections (dict) --

          The number of connections used for the session.

          • Connections (list) --

            A list of connections used by the job.

            • (string) --

        • Progress (float) --

          The code execution progress of the session.

        • MaxCapacity (float) --

          The number of AWS Glue data processing units (DPUs) that can be allocated when the job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB memory.

        • SecurityConfiguration (string) --

          The name of the SecurityConfiguration structure to be used with the session.

        • GlueVersion (string) --

          The Glue version determines the versions of Apache Spark and Python that AWS Glue supports. The GlueVersion must be greater than 2.0.

    • NextToken (string) --

      The token for the next set of results, or null if there are no more results.
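Because the response is paged through NextToken, collecting every session means repeating the call until no token is returned. The sketch below does that and groups session IDs by Status. The helper name session_ids_by_status and its page_size argument are hypothetical, and client is assumed to be a boto3 Glue client.

```python
from collections import defaultdict


def session_ids_by_status(client, tags=None, page_size=None):
    """Page through ListSessions and group session IDs by their Status."""
    kwargs = {}
    if tags:
        kwargs["Tags"] = tags
    if page_size:
        kwargs["MaxResults"] = page_size
    grouped = defaultdict(list)
    while True:
        page = client.list_sessions(**kwargs)
        for session in page.get("Sessions", []):
            grouped[session.get("Status")].append(session["Id"])
        token = page.get("NextToken")
        if not token:
            return dict(grouped)
        # Pass the token back to fetch the next set of results.
        kwargs["NextToken"] = token
```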

GetUnfilteredPartitionMetadata (updated)
Changes (request)
{'AuditContext': {'AllColumnsRequested': 'boolean',
                  'RequestedColumns': ['string']}}

See also: AWS API Documentation

Request Syntax

client.get_unfiltered_partition_metadata(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    AuditContext={
        'AdditionalAuditContext': 'string',
        'RequestedColumns': [
            'string',
        ],
        'AllColumnsRequested': True|False
    },
    SupportedPermissionTypes=[
        'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION',
    ]
)
type CatalogId:

string

param CatalogId:

[REQUIRED]

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

type TableName:

string

param TableName:

[REQUIRED]

type PartitionValues:

list

param PartitionValues:

[REQUIRED]

  • (string) --

type AuditContext:

dict

param AuditContext:

A structure containing information for audit.

  • AdditionalAuditContext (string) --

    The context for the audit.

  • RequestedColumns (list) --

    The requested columns for audit.

    • (string) --

  • AllColumnsRequested (boolean) --

    Indicates whether all columns were requested for audit.

type SupportedPermissionTypes:

list

param SupportedPermissionTypes:

[REQUIRED]

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'Partition': {
        'Values': [
            'string',
        ],
        'DatabaseName': 'string',
        'TableName': 'string',
        'CreationTime': datetime(2015, 1, 1),
        'LastAccessTime': datetime(2015, 1, 1),
        'StorageDescriptor': {
            'Columns': [
                {
                    'Name': 'string',
                    'Type': 'string',
                    'Comment': 'string',
                    'Parameters': {
                        'string': 'string'
                    }
                },
            ],
            'Location': 'string',
            'AdditionalLocations': [
                'string',
            ],
            'InputFormat': 'string',
            'OutputFormat': 'string',
            'Compressed': True|False,
            'NumberOfBuckets': 123,
            'SerdeInfo': {
                'Name': 'string',
                'SerializationLibrary': 'string',
                'Parameters': {
                    'string': 'string'
                }
            },
            'BucketColumns': [
                'string',
            ],
            'SortColumns': [
                {
                    'Column': 'string',
                    'SortOrder': 123
                },
            ],
            'Parameters': {
                'string': 'string'
            },
            'SkewedInfo': {
                'SkewedColumnNames': [
                    'string',
                ],
                'SkewedColumnValues': [
                    'string',
                ],
                'SkewedColumnValueLocationMaps': {
                    'string': 'string'
                }
            },
            'StoredAsSubDirectories': True|False,
            'SchemaReference': {
                'SchemaId': {
                    'SchemaArn': 'string',
                    'SchemaName': 'string',
                    'RegistryName': 'string'
                },
                'SchemaVersionId': 'string',
                'SchemaVersionNumber': 123
            }
        },
        'Parameters': {
            'string': 'string'
        },
        'LastAnalyzedTime': datetime(2015, 1, 1),
        'CatalogId': 'string'
    },
    'AuthorizedColumns': [
        'string',
    ],
    'IsRegisteredWithLakeFormation': True|False
}

Response Structure

  • (dict) --

    • Partition (dict) --

      Represents a slice of table data.

      • Values (list) --

        The values of the partition.

        • (string) --

      • DatabaseName (string) --

        The name of the catalog database in which to create the partition.

      • TableName (string) --

        The name of the database table in which to create the partition.

      • CreationTime (datetime) --

        The time at which the partition was created.

      • LastAccessTime (datetime) --

        The last time at which the partition was accessed.

      • StorageDescriptor (dict) --

        Provides information about the physical location where the partition is stored.

        • Columns (list) --

          A list of the Columns in the table.

          • (dict) --

            A column in a Table.

            • Name (string) --

              The name of the Column.

            • Type (string) --

              The data type of the Column.

            • Comment (string) --

              A free-form text comment.

            • Parameters (dict) --

              These key-value pairs define properties associated with the column.

              • (string) --

                • (string) --

        • Location (string) --

          The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

        • AdditionalLocations (list) --

          • (string) --

        • InputFormat (string) --

          The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

        • OutputFormat (string) --

          The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

        • Compressed (boolean) --

          True if the data in the table is compressed, or False if not.

        • NumberOfBuckets (integer) --

          Must be specified if the table contains any dimension columns.

        • SerdeInfo (dict) --

          The serialization/deserialization (SerDe) information.

          • Name (string) --

            Name of the SerDe.

          • SerializationLibrary (string) --

            Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

          • Parameters (dict) --

            These key-value pairs define initialization parameters for the SerDe.

            • (string) --

              • (string) --

        • BucketColumns (list) --

          A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

          • (string) --

        • SortColumns (list) --

          A list specifying the sort order of each bucket in the table.

          • (dict) --

            Specifies the sort order of a sorted column.

            • Column (string) --

              The name of the column.

            • SortOrder (integer) --

              Indicates that the column is sorted in ascending order (== 1), or in descending order (== 0).

        • Parameters (dict) --

          The user-supplied properties in key-value form.

          • (string) --

            • (string) --

        • SkewedInfo (dict) --

          The information about values that appear frequently in a column (skewed values).

          • SkewedColumnNames (list) --

            A list of names of columns that contain skewed values.

            • (string) --

          • SkewedColumnValues (list) --

            A list of values that appear so frequently as to be considered skewed.

            • (string) --

          • SkewedColumnValueLocationMaps (dict) --

            A mapping of skewed values to the columns that contain them.

            • (string) --

              • (string) --

        • StoredAsSubDirectories (boolean) --

          True if the table data is stored in subdirectories, or False if not.

        • SchemaReference (dict) --

          An object that references a schema stored in the Glue Schema Registry.

          When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

          • SchemaId (dict) --

            A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

            • SchemaArn (string) --

              The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

            • SchemaName (string) --

              The name of the schema. One of SchemaArn or SchemaName has to be provided.

            • RegistryName (string) --

              The name of the schema registry that contains the schema.

          • SchemaVersionId (string) --

            The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

          • SchemaVersionNumber (integer) --

            The version number of the schema.

      • Parameters (dict) --

        These key-value pairs define partition parameters.

        • (string) --

          • (string) --

      • LastAnalyzedTime (datetime) --

        The last time at which column statistics were computed for this partition.

      • CatalogId (string) --

        The ID of the Data Catalog in which the partition resides.

    • AuthorizedColumns (list) --

      • (string) --

    • IsRegisteredWithLakeFormation (boolean) --
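The AuditContext fields added to the request in this update, RequestedColumns and AllColumnsRequested, can be assembled with a small helper. The sketch below assumes the two are used as alternatives: an explicit column list sets AllColumnsRequested to False, and no list sets it to True. That interpretation and the helper name build_audit_context are assumptions of this example, not statements from the API description.

```python
def build_audit_context(requested_columns=None, additional_context=None):
    """Build an AuditContext dict for the unfiltered-metadata requests.

    Assumes RequestedColumns and AllColumnsRequested are alternatives, which
    is an interpretation made for this sketch.
    """
    context = {}
    if additional_context is not None:
        context["AdditionalAuditContext"] = additional_context
    if requested_columns:
        context["RequestedColumns"] = list(requested_columns)
        context["AllColumnsRequested"] = False
    else:
        context["AllColumnsRequested"] = True
    return context
```

The resulting dict would be passed as the AuditContext argument of get_unfiltered_partition_metadata alongside the required CatalogId, DatabaseName, TableName, PartitionValues, and SupportedPermissionTypes.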

GetUnfilteredPartitionsMetadata (updated)
Changes (request)
{'AuditContext': {'AllColumnsRequested': 'boolean',
                  'RequestedColumns': ['string']}}

See also: AWS API Documentation

Request Syntax

client.get_unfiltered_partitions_metadata(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    Expression='string',
    AuditContext={
        'AdditionalAuditContext': 'string',
        'RequestedColumns': [
            'string',
        ],
        'AllColumnsRequested': True|False
    },
    SupportedPermissionTypes=[
        'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION',
    ],
    NextToken='string',
    Segment={
        'SegmentNumber': 123,
        'TotalSegments': 123
    },
    MaxResults=123
)
type CatalogId:

string

param CatalogId:

[REQUIRED]

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

type TableName:

string

param TableName:

[REQUIRED]

type Expression:

string

param Expression:

type AuditContext:

dict

param AuditContext:

A structure containing information for audit.

  • AdditionalAuditContext (string) --

    The context for the audit.

  • RequestedColumns (list) --

    The requested columns for audit.

    • (string) --

  • AllColumnsRequested (boolean) --

    Indicates whether all columns were requested for audit.

type SupportedPermissionTypes:

list

param SupportedPermissionTypes:

[REQUIRED]

  • (string) --

type NextToken:

string

param NextToken:

type Segment:

dict

param Segment:

Defines a non-overlapping region of a table's partitions, allowing multiple requests to be run in parallel.

  • SegmentNumber (integer) -- [REQUIRED]

    The zero-based index number of the segment. For example, if the total number of segments is 4, SegmentNumber values range from 0 through 3.

  • TotalSegments (integer) -- [REQUIRED]

    The total number of segments.
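As a small illustration of the zero-based numbering, the following sketch generates one Segment dict per segment. The helper name make_segments is hypothetical.

```python
def make_segments(total_segments):
    """Return one Segment dict per zero-based segment number."""
    if total_segments < 1:
        raise ValueError("total_segments must be at least 1")
    return [
        {"SegmentNumber": number, "TotalSegments": total_segments}
        for number in range(total_segments)
    ]


# With four segments, SegmentNumber runs from 0 through 3.
segments = make_segments(4)
```

Each dict can be supplied as the Segment argument of a separate get_unfiltered_partitions_metadata call, so that the requests cover non-overlapping regions and can run in parallel.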

type MaxResults:

integer

param MaxResults:

rtype:

dict

returns:

Response Syntax

{
    'UnfilteredPartitions': [
        {
            'Partition': {
                'Values': [
                    'string',
                ],
                'DatabaseName': 'string',
                'TableName': 'string',
                'CreationTime': datetime(2015, 1, 1),
                'LastAccessTime': datetime(2015, 1, 1),
                'StorageDescriptor': {
                    'Columns': [
                        {
                            'Name': 'string',
                            'Type': 'string',
                            'Comment': 'string',
                            'Parameters': {
                                'string': 'string'
                            }
                        },
                    ],
                    'Location': 'string',
                    'AdditionalLocations': [
                        'string',
                    ],
                    'InputFormat': 'string',
                    'OutputFormat': 'string',
                    'Compressed': True|False,
                    'NumberOfBuckets': 123,
                    'SerdeInfo': {
                        'Name': 'string',
                        'SerializationLibrary': 'string',
                        'Parameters': {
                            'string': 'string'
                        }
                    },
                    'BucketColumns': [
                        'string',
                    ],
                    'SortColumns': [
                        {
                            'Column': 'string',
                            'SortOrder': 123
                        },
                    ],
                    'Parameters': {
                        'string': 'string'
                    },
                    'SkewedInfo': {
                        'SkewedColumnNames': [
                            'string',
                        ],
                        'SkewedColumnValues': [
                            'string',
                        ],
                        'SkewedColumnValueLocationMaps': {
                            'string': 'string'
                        }
                    },
                    'StoredAsSubDirectories': True|False,
                    'SchemaReference': {
                        'SchemaId': {
                            'SchemaArn': 'string',
                            'SchemaName': 'string',
                            'RegistryName': 'string'
                        },
                        'SchemaVersionId': 'string',
                        'SchemaVersionNumber': 123
                    }
                },
                'Parameters': {
                    'string': 'string'
                },
                'LastAnalyzedTime': datetime(2015, 1, 1),
                'CatalogId': 'string'
            },
            'AuthorizedColumns': [
                'string',
            ],
            'IsRegisteredWithLakeFormation': True|False
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • UnfilteredPartitions (list) --

      • (dict) --

        • Partition (dict) --

          Represents a slice of table data.

          • Values (list) --

            The values of the partition.

            • (string) --

          • DatabaseName (string) --

            The name of the catalog database that contains the partition.

          • TableName (string) --

            The name of the database table that contains the partition.

          • CreationTime (datetime) --

            The time at which the partition was created.

          • LastAccessTime (datetime) --

            The last time at which the partition was accessed.

          • StorageDescriptor (dict) --

            Provides information about the physical location where the partition is stored.

            • Columns (list) --

              A list of the Columns in the table.

              • (dict) --

                A column in a Table.

                • Name (string) --

                  The name of the Column.

                • Type (string) --

                  The data type of the Column.

                • Comment (string) --

                  A free-form text comment.

                • Parameters (dict) --

                  These key-value pairs define properties associated with the column.

                  • (string) --

                    • (string) --

            • Location (string) --

              The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

            • AdditionalLocations (list) --

              • (string) --

            • InputFormat (string) --

              The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

            • OutputFormat (string) --

              The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

            • Compressed (boolean) --

              True if the data in the table is compressed, or False if not.

            • NumberOfBuckets (integer) --

              Must be specified if the table contains any dimension columns.

            • SerdeInfo (dict) --

              The serialization/deserialization (SerDe) information.

              • Name (string) --

                Name of the SerDe.

              • SerializationLibrary (string) --

                Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

              • Parameters (dict) --

                These key-value pairs define initialization parameters for the SerDe.

                • (string) --

                  • (string) --

            • BucketColumns (list) --

              A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

              • (string) --

            • SortColumns (list) --

              A list specifying the sort order of each bucket in the table.

              • (dict) --

                Specifies the sort order of a sorted column.

                • Column (string) --

                  The name of the column.

                • SortOrder (integer) --

                  Indicates that the column is sorted in ascending order (== 1), or in descending order (== 0).

            • Parameters (dict) --

              The user-supplied properties in key-value form.

              • (string) --

                • (string) --

            • SkewedInfo (dict) --

              The information about values that appear frequently in a column (skewed values).

              • SkewedColumnNames (list) --

                A list of names of columns that contain skewed values.

                • (string) --

              • SkewedColumnValues (list) --

                A list of values that appear so frequently as to be considered skewed.

                • (string) --

              • SkewedColumnValueLocationMaps (dict) --

                A mapping of skewed values to the columns that contain them.

                • (string) --

                  • (string) --

            • StoredAsSubDirectories (boolean) --

              True if the table data is stored in subdirectories, or False if not.

            • SchemaReference (dict) --

              An object that references a schema stored in the Glue Schema Registry.

              When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

              • SchemaId (dict) --

                A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

                • SchemaArn (string) --

                  The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

                • SchemaName (string) --

                  The name of the schema. One of SchemaArn or SchemaName has to be provided.

                • RegistryName (string) --

                  The name of the schema registry that contains the schema.

              • SchemaVersionId (string) --

                The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

              • SchemaVersionNumber (integer) --

                The version number of the schema.

          • Parameters (dict) --

            These key-value pairs define partition parameters.

            • (string) --

              • (string) --

          • LastAnalyzedTime (datetime) --

            The last time at which column statistics were computed for this partition.

          • CatalogId (string) --

            The ID of the Data Catalog in which the partition resides.

        • AuthorizedColumns (list) --

          • (string) --

        • IsRegisteredWithLakeFormation (boolean) --

    • NextToken (string) --
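The NextToken field suggests that results can span several pages. A rough sketch of following it is shown below. It assumes the request accepts a NextToken keyword, which this section does not show, and it takes the page-fetching callable as an argument rather than naming a client method. The names iter_unfiltered_partitions and fetch_page are hypothetical.

```python
def iter_unfiltered_partitions(fetch_page, **request):
    """Yield every UnfilteredPartitions entry across all pages.

    fetch_page is any callable that accepts the request keyword
    arguments and returns a dict shaped like the Response Syntax above.
    """
    kwargs = dict(request)
    while True:
        page = fetch_page(**kwargs)
        for entry in page.get("UnfilteredPartitions", []):
            yield entry
        token = page.get("NextToken")
        if not token:
            break
        # Assumption: the next page is requested by passing NextToken back.
        kwargs["NextToken"] = token
```

With a configured Glue client, fetch_page would be the bound client method for this operation, and the remaining keyword arguments would be its request parameters.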

GetUnfilteredTableMetadata (updated)
Changes (request)
{'AuditContext': {'AllColumnsRequested': 'boolean',
                  'RequestedColumns': ['string']}}

See also: AWS API Documentation

Request Syntax

client.get_unfiltered_table_metadata(
    CatalogId='string',
    DatabaseName='string',
    Name='string',
    AuditContext={
        'AdditionalAuditContext': 'string',
        'RequestedColumns': [
            'string',
        ],
        'AllColumnsRequested': True|False
    },
    SupportedPermissionTypes=[
        'COLUMN_PERMISSION'|'CELL_FILTER_PERMISSION',
    ]
)
type CatalogId:

string

param CatalogId:

[REQUIRED]

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

type Name:

string

param Name:

[REQUIRED]

type AuditContext:

dict

param AuditContext:

A structure containing information for audit.

  • AdditionalAuditContext (string) --

    The context for the audit.

  • RequestedColumns (list) --

    The requested columns for audit.

    • (string) --

  • AllColumnsRequested (boolean) --

    Whether all columns are requested for audit.

type SupportedPermissionTypes:

list

param SupportedPermissionTypes:

[REQUIRED]

  • (string) --
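To illustrate how the parameters above fit together, here is a small sketch that assembles the keyword arguments for get_unfiltered_table_metadata. The helper name, the account ID, and the database, table, and column names are all made up for the example. Setting AllColumnsRequested to False when specific columns are listed is an assumption, since the source does not spell out how the two fields interact.

```python
def build_table_metadata_request(catalog_id, database, table, columns=None):
    """Build keyword arguments for get_unfiltered_table_metadata."""
    if columns:
        # Assumption: listing columns means not all columns are requested.
        audit = {"RequestedColumns": list(columns), "AllColumnsRequested": False}
    else:
        audit = {"AllColumnsRequested": True}
    return {
        "CatalogId": catalog_id,
        "DatabaseName": database,
        "Name": table,
        "AuditContext": audit,
        "SupportedPermissionTypes": ["COLUMN_PERMISSION"],
    }


request = build_table_metadata_request(
    "123456789012", "sales", "orders", ["order_id", "total"]
)
# With a configured client, the call would look like:
# boto3.client("glue").get_unfiltered_table_metadata(**request)
```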

rtype:

dict

returns:

Response Syntax

{
    'Table': {
        'Name': 'string',
        'DatabaseName': 'string',
        'Description': 'string',
        'Owner': 'string',
        'CreateTime': datetime(2015, 1, 1),
        'UpdateTime': datetime(2015, 1, 1),
        'LastAccessTime': datetime(2015, 1, 1),
        'LastAnalyzedTime': datetime(2015, 1, 1),
        'Retention': 123,
        'StorageDescriptor': {
            'Columns': [
                {
                    'Name': 'string',
                    'Type': 'string',
                    'Comment': 'string',
                    'Parameters': {
                        'string': 'string'
                    }
                },
            ],
            'Location': 'string',
            'AdditionalLocations': [
                'string',
            ],
            'InputFormat': 'string',
            'OutputFormat': 'string',
            'Compressed': True|False,
            'NumberOfBuckets': 123,
            'SerdeInfo': {
                'Name': 'string',
                'SerializationLibrary': 'string',
                'Parameters': {
                    'string': 'string'
                }
            },
            'BucketColumns': [
                'string',
            ],
            'SortColumns': [
                {
                    'Column': 'string',
                    'SortOrder': 123
                },
            ],
            'Parameters': {
                'string': 'string'
            },
            'SkewedInfo': {
                'SkewedColumnNames': [
                    'string',
                ],
                'SkewedColumnValues': [
                    'string',
                ],
                'SkewedColumnValueLocationMaps': {
                    'string': 'string'
                }
            },
            'StoredAsSubDirectories': True|False,
            'SchemaReference': {
                'SchemaId': {
                    'SchemaArn': 'string',
                    'SchemaName': 'string',
                    'RegistryName': 'string'
                },
                'SchemaVersionId': 'string',
                'SchemaVersionNumber': 123
            }
        },
        'PartitionKeys': [
            {
                'Name': 'string',
                'Type': 'string',
                'Comment': 'string',
                'Parameters': {
                    'string': 'string'
                }
            },
        ],
        'ViewOriginalText': 'string',
        'ViewExpandedText': 'string',
        'TableType': 'string',
        'Parameters': {
            'string': 'string'
        },
        'CreatedBy': 'string',
        'IsRegisteredWithLakeFormation': True|False,
        'TargetTable': {
            'CatalogId': 'string',
            'DatabaseName': 'string',
            'Name': 'string'
        },
        'CatalogId': 'string',
        'VersionId': 'string'
    },
    'AuthorizedColumns': [
        'string',
    ],
    'IsRegisteredWithLakeFormation': True|False,
    'CellFilters': [
        {
            'ColumnName': 'string',
            'RowFilterExpression': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • Table (dict) --

      Represents a collection of related data organized in columns and rows.

      • Name (string) --

        The table name. For Hive compatibility, this must be entirely lowercase.

      • DatabaseName (string) --

        The name of the database where the table metadata resides. For Hive compatibility, this must be all lowercase.

      • Description (string) --

        A description of the table.

      • Owner (string) --

        The owner of the table.

      • CreateTime (datetime) --

        The time when the table definition was created in the Data Catalog.

      • UpdateTime (datetime) --

        The last time that the table was updated.

      • LastAccessTime (datetime) --

        The last time that the table was accessed. This is usually taken from HDFS, and might not be reliable.

      • LastAnalyzedTime (datetime) --

        The last time that column statistics were computed for this table.

      • Retention (integer) --

        The retention time for this table.

      • StorageDescriptor (dict) --

        A storage descriptor containing information about the physical storage of this table.

        • Columns (list) --

          A list of the Columns in the table.

          • (dict) --

            A column in a Table.

            • Name (string) --

              The name of the Column.

            • Type (string) --

              The data type of the Column.

            • Comment (string) --

              A free-form text comment.

            • Parameters (dict) --

              These key-value pairs define properties associated with the column.

              • (string) --

                • (string) --

        • Location (string) --

          The physical location of the table. By default, this takes the form of the warehouse location, followed by the database location in the warehouse, followed by the table name.

        • AdditionalLocations (list) --

          • (string) --

        • InputFormat (string) --

          The input format: SequenceFileInputFormat (binary), or TextInputFormat, or a custom format.

        • OutputFormat (string) --

          The output format: SequenceFileOutputFormat (binary), or IgnoreKeyTextOutputFormat, or a custom format.

        • Compressed (boolean) --

          True if the data in the table is compressed, or False if not.

        • NumberOfBuckets (integer) --

          Must be specified if the table contains any dimension columns.

        • SerdeInfo (dict) --

          The serialization/deserialization (SerDe) information.

          • Name (string) --

            Name of the SerDe.

          • SerializationLibrary (string) --

            Usually the class that implements the SerDe. An example is org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe.

          • Parameters (dict) --

            These key-value pairs define initialization parameters for the SerDe.

            • (string) --

              • (string) --

        • BucketColumns (list) --

          A list of reducer grouping columns, clustering columns, and bucketing columns in the table.

          • (string) --

        • SortColumns (list) --

          A list specifying the sort order of each bucket in the table.

          • (dict) --

            Specifies the sort order of a sorted column.

            • Column (string) --

              The name of the column.

            • SortOrder (integer) --

              Indicates that the column is sorted in ascending order (== 1), or in descending order (== 0).

        • Parameters (dict) --

          The user-supplied properties in key-value form.

          • (string) --

            • (string) --

        • SkewedInfo (dict) --

          The information about values that appear frequently in a column (skewed values).

          • SkewedColumnNames (list) --

            A list of names of columns that contain skewed values.

            • (string) --

          • SkewedColumnValues (list) --

            A list of values that appear so frequently as to be considered skewed.

            • (string) --

          • SkewedColumnValueLocationMaps (dict) --

            A mapping of skewed values to the columns that contain them.

            • (string) --

              • (string) --

        • StoredAsSubDirectories (boolean) --

          True if the table data is stored in subdirectories, or False if not.

        • SchemaReference (dict) --

          An object that references a schema stored in the Glue Schema Registry.

          When creating a table, you can pass an empty list of columns for the schema, and instead use a schema reference.

          • SchemaId (dict) --

            A structure that contains schema identity fields. Either this or the SchemaVersionId has to be provided.

            • SchemaArn (string) --

              The Amazon Resource Name (ARN) of the schema. One of SchemaArn or SchemaName has to be provided.

            • SchemaName (string) --

              The name of the schema. One of SchemaArn or SchemaName has to be provided.

            • RegistryName (string) --

              The name of the schema registry that contains the schema.

          • SchemaVersionId (string) --

            The unique ID assigned to a version of the schema. Either this or the SchemaId has to be provided.

          • SchemaVersionNumber (integer) --

            The version number of the schema.

      • PartitionKeys (list) --

        A list of columns by which the table is partitioned. Only primitive types are supported as partition keys.

        When you create a table used by Amazon Athena, and you do not specify any partitionKeys, you must at least set the value of partitionKeys to an empty list. For example:

        "PartitionKeys": []

        • (dict) --

          A column in a Table.

          • Name (string) --

            The name of the Column.

          • Type (string) --

            The data type of the Column.

          • Comment (string) --

            A free-form text comment.

          • Parameters (dict) --

            These key-value pairs define properties associated with the column.

            • (string) --

              • (string) --

      • ViewOriginalText (string) --

        If the table is a view, the original text of the view; otherwise null.

      • ViewExpandedText (string) --

        If the table is a view, the expanded text of the view; otherwise null.

      • TableType (string) --

        The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).

      • Parameters (dict) --

        These key-value pairs define properties associated with the table.

        • (string) --

          • (string) --

      • CreatedBy (string) --

        The person or entity who created the table.

      • IsRegisteredWithLakeFormation (boolean) --

        Indicates whether the table has been registered with Lake Formation.

      • TargetTable (dict) --

        A TableIdentifier structure that describes a target table for resource linking.

        • CatalogId (string) --

          The ID of the Data Catalog in which the table resides.

        • DatabaseName (string) --

          The name of the catalog database that contains the target table.

        • Name (string) --

          The name of the target table.

      • CatalogId (string) --

        The ID of the Data Catalog in which the table resides.

      • VersionId (string) --

    • AuthorizedColumns (list) --

      • (string) --

    • IsRegisteredWithLakeFormation (boolean) --

    • CellFilters (list) --

      • (dict) --

        • ColumnName (string) --

        • RowFilterExpression (string) --
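The top-level response fields can be pulled out with a short helper such as the sketch below. The source leaves AuthorizedColumns and CellFilters undescribed, so the function only collects them without interpreting them. The name summarize_table_access and the sample values are hypothetical.

```python
def summarize_table_access(response):
    """Collect the top-level fields of a get_unfiltered_table_metadata response."""
    table = response.get("Table", {})
    cell_filters = [
        (item.get("ColumnName"), item.get("RowFilterExpression"))
        for item in response.get("CellFilters", [])
    ]
    return {
        "table": table.get("Name"),
        "database": table.get("DatabaseName"),
        "authorized_columns": list(response.get("AuthorizedColumns", [])),
        "registered_with_lake_formation": response.get(
            "IsRegisteredWithLakeFormation", False
        ),
        "cell_filters": cell_filters,
    }


# A hand-written sample shaped like the Response Syntax above.
sample_response = {
    "Table": {"Name": "orders", "DatabaseName": "sales"},
    "AuthorizedColumns": ["order_id", "total"],
    "IsRegisteredWithLakeFormation": True,
    "CellFilters": [{"ColumnName": "total", "RowFilterExpression": "total > 0"}],
}
summary = summarize_table_access(sample_response)
```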