AWS Glue

2025/05/22 - AWS Glue - 3 updated api methods

Changes  This release adds an optional ConversionSpec parameter to the IntegrationPartition structure used by the CreateIntegrationTableProperties API. The parameter specifies the column transformation to apply to columns that are used for timestamp-based partitioning.

CreateIntegrationTableProperties (updated)
Changes (request)
{'TargetTableConfig': {'PartitionSpec': {'ConversionSpec': 'string'}}}

This API is used to provide optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for the source and target tables. To set both source and target properties, invoke the same API twice: once with the Glue connection ARN as ResourceArn together with SourceTableConfig, and once with the Glue database ARN as ResourceArn together with TargetTableConfig.

See also: AWS API Documentation

Request Syntax

client.create_integration_table_properties(
    ResourceArn='string',
    TableName='string',
    SourceTableConfig={
        'Fields': [
            'string',
        ],
        'FilterPredicate': 'string',
        'PrimaryKey': [
            'string',
        ],
        'RecordUpdateField': 'string'
    },
    TargetTableConfig={
        'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
        'PartitionSpec': [
            {
                'FieldName': 'string',
                'FunctionSpec': 'string',
                'ConversionSpec': 'string'
            },
        ],
        'TargetTableName': 'string'
    }
)
type ResourceArn:

string

param ResourceArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the target table for which to create integration table properties. Currently, this API only supports creating integration table properties for target tables, and the provided ARN should be the ARN of the target table in the Glue Data Catalog. Support for creating integration table properties for source connections (using the connection ARN) is not yet implemented and will be added in a future release.

type TableName:

string

param TableName:

[REQUIRED]

The name of the table to be replicated.

type SourceTableConfig:

dict

param SourceTableConfig:

A structure for the source table configuration. See the SourceTableConfig structure for the list of supported source properties.

  • Fields (list) --

    A list of fields used for column-level filtering. Currently unsupported.

    • (string) --

  • FilterPredicate (string) --

    A condition clause used for row-level filtering. Currently unsupported.

  • PrimaryKey (list) --

    Provide the primary key set for this table. Currently supported specifically for SAP EntityOf entities upon request. Contact Amazon Web Services Support to make this feature available.

    • (string) --

  • RecordUpdateField (string) --

    Incremental pull timestamp-based field. Currently unsupported.

type TargetTableConfig:

dict

param TargetTableConfig:

A structure for the target table configuration.

  • UnnestSpec (string) --

    Specifies how nested objects are flattened to top-level elements. Valid values are: "TOPLEVEL", "FULL", or "NOUNNEST".

  • PartitionSpec (list) --

    Determines the file layout on the target.

    • (dict) --

      A structure that describes how data is partitioned on the target.

      • FieldName (string) --

        The field name used to partition data on the target. Avoid using columns that have unique values for each row (for example, LastModifiedTimestamp, SystemModTimeStamp) as the partition column. These columns are not suitable for partitioning because they create a large number of small partitions, which can lead to performance issues.

      • FunctionSpec (string) --

        Specifies the function used to partition data on the target. The only accepted value for this parameter is 'identity' (string). The 'identity' function ensures that the data partitioning on the target follows the same scheme as the source. In other words, the partitioning structure of the source data is preserved in the target destination.

      • ConversionSpec (string) --

        Specifies the timestamp format of the source data. Valid values are:

        • epoch_sec - Unix epoch timestamp in seconds

        • epoch_milli - Unix epoch timestamp in milliseconds

        • iso - ISO 8601 formatted timestamp

  • TargetTableName (string) --

    The optional name of a target table.
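
The three ConversionSpec values name three ways a source column can encode the same instant. The sketch below only illustrates what each format looks like; it is not how Glue performs the conversion, and the helper name parse_source_timestamp is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_source_timestamp(value, conversion_spec):
    """Interpret a raw source value according to a ConversionSpec name.

    Hypothetical helper for illustration only; Glue's own conversion
    logic is not described in this document.
    """
    if conversion_spec == "epoch_sec":
        return EPOCH + timedelta(seconds=int(value))
    if conversion_spec == "epoch_milli":
        return EPOCH + timedelta(milliseconds=int(value))
    if conversion_spec == "iso":
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        return datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    raise ValueError(f"Unsupported ConversionSpec: {conversion_spec!r}")


# The same instant expressed in each of the three formats.
from_seconds = parse_source_timestamp(1747872000, "epoch_sec")
from_millis = parse_source_timestamp(1747872000000, "epoch_milli")
from_iso = parse_source_timestamp("2025-05-22T00:00:00Z", "iso")
```

All three calls yield the same UTC datetime, which is why the source format has to be declared before a column can be used for timestamp-based partitioning.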

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
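
A minimal sketch of assembling a CreateIntegrationTableProperties request that uses the new ConversionSpec field. The helper build_target_table_request, the ARN, the table name, and the field name are placeholders invented for illustration; the allowed values are taken from the parameter descriptions above. The request is only built here, and the actual call is left as a comment because it needs AWS credentials.

```python
# Allowed values as listed in the parameter descriptions above.
VALID_UNNEST_SPECS = {"TOPLEVEL", "FULL", "NOUNNEST"}
VALID_CONVERSION_SPECS = {"epoch_sec", "epoch_milli", "iso"}


def build_target_table_request(resource_arn, table_name, partition_field,
                               conversion_spec, unnest_spec="TOPLEVEL"):
    """Build keyword arguments for create_integration_table_properties.

    Hypothetical helper: it validates the enum-like strings locally so that
    obvious typos fail before any request is sent.
    """
    if unnest_spec not in VALID_UNNEST_SPECS:
        raise ValueError(f"Invalid UnnestSpec: {unnest_spec!r}")
    if conversion_spec not in VALID_CONVERSION_SPECS:
        raise ValueError(f"Invalid ConversionSpec: {conversion_spec!r}")
    return {
        "ResourceArn": resource_arn,
        "TableName": table_name,
        "TargetTableConfig": {
            "UnnestSpec": unnest_spec,
            "PartitionSpec": [
                {
                    "FieldName": partition_field,
                    # 'identity' is the only accepted FunctionSpec value.
                    "FunctionSpec": "identity",
                    "ConversionSpec": conversion_spec,
                }
            ],
        },
    }


create_request = build_target_table_request(
    resource_arn="arn:aws:glue:us-east-1:123456789012:database/example_db",  # placeholder ARN
    table_name="orders",           # placeholder table name
    partition_field="created_at",  # placeholder field name
    conversion_spec="epoch_milli",
)

# With credentials configured, the request could be sent like this:
# import boto3
# boto3.client("glue").create_integration_table_properties(**create_request)
```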

GetIntegrationTableProperties (updated)
Changes (response)
{'TargetTableConfig': {'PartitionSpec': {'ConversionSpec': 'string'}}}

This API is used to retrieve optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for the source and target tables.

See also: AWS API Documentation

Request Syntax

client.get_integration_table_properties(
    ResourceArn='string',
    TableName='string'
)
type ResourceArn:

string

param ResourceArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the target table for which to retrieve integration table properties. Currently, this API only supports retrieving properties for target tables, and the provided ARN should be the ARN of the target table in the Glue Data Catalog. Support for retrieving integration table properties for source connections (using the connection ARN) is not yet implemented and will be added in a future release.

type TableName:

string

param TableName:

[REQUIRED]

The name of the table to be replicated.

rtype:

dict

returns:

Response Syntax

{
    'ResourceArn': 'string',
    'TableName': 'string',
    'SourceTableConfig': {
        'Fields': [
            'string',
        ],
        'FilterPredicate': 'string',
        'PrimaryKey': [
            'string',
        ],
        'RecordUpdateField': 'string'
    },
    'TargetTableConfig': {
        'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
        'PartitionSpec': [
            {
                'FieldName': 'string',
                'FunctionSpec': 'string',
                'ConversionSpec': 'string'
            },
        ],
        'TargetTableName': 'string'
    }
}

Response Structure

  • (dict) --

    • ResourceArn (string) --

      The Amazon Resource Name (ARN) of the target table for which to retrieve integration table properties. Currently, this API only supports retrieving properties for target tables, and the provided ARN should be the ARN of the target table in the Glue Data Catalog. Support for retrieving integration table properties for source connections (using the connection ARN) is not yet implemented and will be added in a future release.

    • TableName (string) --

      The name of the table to be replicated.

    • SourceTableConfig (dict) --

      A structure for the source table configuration.

      • Fields (list) --

        A list of fields used for column-level filtering. Currently unsupported.

        • (string) --

      • FilterPredicate (string) --

        A condition clause used for row-level filtering. Currently unsupported.

      • PrimaryKey (list) --

        Provide the primary key set for this table. Currently supported specifically for SAP EntityOf entities upon request. Contact Amazon Web Services Support to make this feature available.

        • (string) --

      • RecordUpdateField (string) --

        Incremental pull timestamp-based field. Currently unsupported.

    • TargetTableConfig (dict) --

      A structure for the target table configuration.

      • UnnestSpec (string) --

        Specifies how nested objects are flattened to top-level elements. Valid values are: "TOPLEVEL", "FULL", or "NOUNNEST".

      • PartitionSpec (list) --

        Determines the file layout on the target.

        • (dict) --

          A structure that describes how data is partitioned on the target.

          • FieldName (string) --

            The field name used to partition data on the target. Avoid using columns that have unique values for each row (for example, LastModifiedTimestamp, SystemModTimeStamp) as the partition column. These columns are not suitable for partitioning because they create a large number of small partitions, which can lead to performance issues.

          • FunctionSpec (string) --

            Specifies the function used to partition data on the target. The only accepted value for this parameter is 'identity' (string). The 'identity' function ensures that the data partitioning on the target follows the same scheme as the source. In other words, the partitioning structure of the source data is preserved in the target destination.

          • ConversionSpec (string) --

            Specifies the timestamp format of the source data. Valid values are:

            • epoch_sec - Unix epoch timestamp in seconds

            • epoch_milli - Unix epoch timestamp in milliseconds

            • iso - ISO 8601 formatted timestamp

      • TargetTableName (string) --

        The optional name of a target table.
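
A short sketch of reading the partition settings, including the new ConversionSpec field, out of a GetIntegrationTableProperties response. The sample_response dictionary is hand-written to match the response syntax above and does not come from a real call; summarize_partition_spec is a hypothetical helper.

```python
def summarize_partition_spec(response):
    """Return (FieldName, FunctionSpec, ConversionSpec) tuples from a response.

    Uses .get() throughout because TargetTableConfig and its members are
    optional and may be absent.
    """
    target = response.get("TargetTableConfig", {})
    return [
        (part.get("FieldName"), part.get("FunctionSpec"), part.get("ConversionSpec"))
        for part in target.get("PartitionSpec", [])
    ]


# Hand-written sample shaped like the response syntax above (placeholder values).
sample_response = {
    "ResourceArn": "arn:aws:glue:us-east-1:123456789012:database/example_db",
    "TableName": "orders",
    "TargetTableConfig": {
        "UnnestSpec": "FULL",
        "PartitionSpec": [
            {"FieldName": "created_at", "FunctionSpec": "identity", "ConversionSpec": "iso"},
        ],
        "TargetTableName": "orders_replica",
    },
}

partitions = summarize_partition_spec(sample_response)

# With credentials configured, a real response could be obtained like this:
# import boto3
# response = boto3.client("glue").get_integration_table_properties(
#     ResourceArn=sample_response["ResourceArn"], TableName="orders")
```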

UpdateIntegrationTableProperties (updated)
Changes (request)
{'TargetTableConfig': {'PartitionSpec': {'ConversionSpec': 'string'}}}

This API is used to provide optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for the source and target tables. To set both source and target properties, invoke the same API twice: once with the Glue connection ARN as ResourceArn together with SourceTableConfig, and once with the Glue database ARN as ResourceArn together with TargetTableConfig.

The override is reflected across all integrations that use the same ResourceArn and source table.

See also: AWS API Documentation

Request Syntax

client.update_integration_table_properties(
    ResourceArn='string',
    TableName='string',
    SourceTableConfig={
        'Fields': [
            'string',
        ],
        'FilterPredicate': 'string',
        'PrimaryKey': [
            'string',
        ],
        'RecordUpdateField': 'string'
    },
    TargetTableConfig={
        'UnnestSpec': 'TOPLEVEL'|'FULL'|'NOUNNEST',
        'PartitionSpec': [
            {
                'FieldName': 'string',
                'FunctionSpec': 'string',
                'ConversionSpec': 'string'
            },
        ],
        'TargetTableName': 'string'
    }
)
type ResourceArn:

string

param ResourceArn:

[REQUIRED]

The connection ARN of the source, or the database ARN of the target.

type TableName:

string

param TableName:

[REQUIRED]

The name of the table to be replicated.

type SourceTableConfig:

dict

param SourceTableConfig:

A structure for the source table configuration.

  • Fields (list) --

    A list of fields used for column-level filtering. Currently unsupported.

    • (string) --

  • FilterPredicate (string) --

    A condition clause used for row-level filtering. Currently unsupported.

  • PrimaryKey (list) --

    Provide the primary key set for this table. Currently supported specifically for SAP EntityOf entities upon request. Contact Amazon Web Services Support to make this feature available.

    • (string) --

  • RecordUpdateField (string) --

    Incremental pull timestamp-based field. Currently unsupported.

type TargetTableConfig:

dict

param TargetTableConfig:

A structure for the target table configuration.

  • UnnestSpec (string) --

    Specifies how nested objects are flattened to top-level elements. Valid values are: "TOPLEVEL", "FULL", or "NOUNNEST".

  • PartitionSpec (list) --

    Determines the file layout on the target.

    • (dict) --

      A structure that describes how data is partitioned on the target.

      • FieldName (string) --

        The field name used to partition data on the target. Avoid using columns that have unique values for each row (for example, LastModifiedTimestamp, SystemModTimeStamp) as the partition column. These columns are not suitable for partitioning because they create a large number of small partitions, which can lead to performance issues.

      • FunctionSpec (string) --

        Specifies the function used to partition data on the target. The only accepted value for this parameter is 'identity' (string). The 'identity' function ensures that the data partitioning on the target follows the same scheme as the source. In other words, the partitioning structure of the source data is preserved in the target destination.

      • ConversionSpec (string) --

        Specifies the timestamp format of the source data. Valid values are:

        • epoch_sec - Unix epoch timestamp in seconds

        • epoch_milli - Unix epoch timestamp in milliseconds

        • iso - ISO 8601 formatted timestamp

  • TargetTableName (string) --

    The optional name of a target table.
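
The FieldName guidance above warns against partition columns whose values are unique per row. One rough way to screen a candidate column before choosing it is to measure how many distinct values appear in a sample. The helper distinct_ratio and the sample data are hypothetical; this check is not part of the Glue API, and what counts as too high a ratio is a judgment call.

```python
def distinct_ratio(values):
    """Fraction of sampled values that are distinct (1.0 means every row is unique)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Made-up sample rows for two candidate partition columns.
last_modified = [
    "2025-05-22T00:00:01Z",
    "2025-05-22T00:00:02Z",
    "2025-05-22T00:00:03Z",
    "2025-05-22T00:00:04Z",
]
region = ["us", "us", "eu", "eu"]

unique_per_row = distinct_ratio(last_modified)  # every value differs
shared_values = distinct_ratio(region)          # values repeat across rows
```

A ratio close to 1.0 suggests that the column would produce roughly one partition per row, which is the situation the FieldName description advises against.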

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
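
The description of UpdateIntegrationTableProperties says that source and target overrides are set through separate invocations with different ResourceArn values. A minimal sketch of preparing that pair of requests follows. The helper build_update_requests, both ARNs, and the configuration values are placeholders; PrimaryKey is used for the source side only because it is the one source property not marked as currently unsupported, and it is available only upon request.

```python
def build_update_requests(connection_arn, database_arn, table_name,
                          source_config, target_config):
    """Return two request dicts: one for the source side, one for the target side.

    Hypothetical helper: the source request pairs the connection ARN with
    SourceTableConfig, and the target request pairs the database ARN with
    TargetTableConfig.
    """
    return [
        {
            "ResourceArn": connection_arn,
            "TableName": table_name,
            "SourceTableConfig": source_config,
        },
        {
            "ResourceArn": database_arn,
            "TableName": table_name,
            "TargetTableConfig": target_config,
        },
    ]


update_requests = build_update_requests(
    connection_arn="arn:aws:glue:us-east-1:123456789012:connection/example_conn",  # placeholder
    database_arn="arn:aws:glue:us-east-1:123456789012:database/example_db",        # placeholder
    table_name="orders",
    source_config={"PrimaryKey": ["order_id"]},
    target_config={
        "PartitionSpec": [
            {"FieldName": "created_at", "FunctionSpec": "identity", "ConversionSpec": "epoch_sec"},
        ],
    },
)

# With credentials configured, each request could be sent in turn:
# import boto3
# glue = boto3.client("glue")
# for request in update_requests:
#     glue.update_integration_table_properties(**request)
```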