AWS Glue

2025/05/16 - AWS Glue - 5 updated api methods

Changes  This release adds (1) Excel as an S3 source type, and XML and Tableau's Hyper as S3 sink types, (2) a target number of partitions parameter for S3 sinks, and (3) new compression types for CSV/JSON and Parquet S3 sinks.

BatchGetJobs (updated)
Changes (response)
{'Jobs': {'CodeGenConfigurationNodes': {'S3DeltaDirectTarget': {'Format': {'hyper',
                                                                           'iceberg',
                                                                           'xml'},
                                                                'NumberTargetPartitions': 'string'},
                                        'S3DirectTarget': {'Format': {'hyper',
                                                                      'iceberg',
                                                                      'xml'},
                                                           'NumberTargetPartitions': 'string'},
                                        'S3ExcelSource': {'AdditionalOptions': {'BoundedFiles': 'long',
                                                                                'BoundedSize': 'long',
                                                                                'EnableSamplePath': 'boolean',
                                                                                'SamplePath': 'string'},
                                                          'CompressionType': 'snappy '
                                                                             '| '
                                                                             'lzo '
                                                                             '| '
                                                                             'gzip '
                                                                             '| '
                                                                             'brotli '
                                                                             '| '
                                                                             'lz4 '
                                                                             '| '
                                                                             'uncompressed '
                                                                             '| '
                                                                             'none',
                                                          'Exclusions': ['string'],
                                                          'GroupFiles': 'string',
                                                          'GroupSize': 'string',
                                                          'MaxBand': 'integer',
                                                          'MaxFilesInBand': 'integer',
                                                          'Name': 'string',
                                                          'NumberRows': 'long',
                                                          'OutputSchemas': [{'Columns': [{'Name': 'string',
                                                                                          'Type': 'string'}]}],
                                                          'Paths': ['string'],
                                                          'Recurse': 'boolean',
                                                          'SkipFooter': 'integer'},
                                        'S3GlueParquetTarget': {'Compression': {'brotli',
                                                                                'lz4'},
                                                                'NumberTargetPartitions': 'string'},
                                        'S3HudiDirectTarget': {'Format': {'hyper',
                                                                          'iceberg',
                                                                          'xml'},
                                                               'NumberTargetPartitions': 'string'},
                                        'S3HyperDirectTarget': {'Compression': 'uncompressed',
                                                                'Inputs': ['string'],
                                                                'Name': 'string',
                                                                'PartitionKeys': [['string']],
                                                                'Path': 'string',
                                                                'SchemaChangePolicy': {'Database': 'string',
                                                                                       'EnableUpdateCatalog': 'boolean',
                                                                                       'Table': 'string',
                                                                                       'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                         '| '
                                                                                                         'LOG'}},
                                        'S3IcebergDirectTarget': {'AdditionalOptions': {'string': 'string'},
                                                                  'Compression': 'gzip '
                                                                                 '| '
                                                                                 'lzo '
                                                                                 '| '
                                                                                 'uncompressed '
                                                                                 '| '
                                                                                 'snappy',
                                                                  'Format': 'json '
                                                                            '| '
                                                                            'csv '
                                                                            '| '
                                                                            'avro '
                                                                            '| '
                                                                            'orc '
                                                                            '| '
                                                                            'parquet '
                                                                            '| '
                                                                            'hudi '
                                                                            '| '
                                                                            'delta '
                                                                            '| '
                                                                            'iceberg '
                                                                            '| '
                                                                            'hyper '
                                                                            '| '
                                                                            'xml',
                                                                  'Inputs': ['string'],
                                                                  'Name': 'string',
                                                                  'NumberTargetPartitions': 'string',
                                                                  'PartitionKeys': [['string']],
                                                                  'Path': 'string',
                                                                  'SchemaChangePolicy': {'Database': 'string',
                                                                                         'EnableUpdateCatalog': 'boolean',
                                                                                         'Table': 'string',
                                                                                         'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                           '| '
                                                                                                           'LOG'}},
                                        'S3ParquetSource': {'CompressionType': {'brotli',
                                                                                'lz4'}}}}}

Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.
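The ListJobs-then-BatchGetJobs flow described above can be sketched as follows. This is a minimal illustration, not the documented usage: `client` is assumed to be a Glue client (for example, one created with `boto3.client("glue")`), the helper names `iter_job_names` and `fetch_jobs` are hypothetical, and the `batch_size` default of 25 is an arbitrary choice for the example rather than a limit stated on this page.

```python
from typing import Any, Dict, Iterator, List


def iter_job_names(client: Any) -> Iterator[str]:
    """Yield job names from ListJobs, following NextToken until exhausted."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = client.list_jobs(**kwargs)
        yield from page.get("JobNames", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def fetch_jobs(client: Any, batch_size: int = 25) -> List[Dict[str, Any]]:
    """Fetch job definitions in batches via BatchGetJobs.

    batch_size is an assumption made for this sketch, not a documented limit.
    """
    names = list(iter_job_names(client))
    jobs: List[Dict[str, Any]] = []
    for start in range(0, len(names), batch_size):
        chunk = names[start:start + batch_size]
        response = client.batch_get_jobs(JobNames=chunk)
        jobs.extend(response.get("Jobs", []))
    return jobs
```

Each returned job dict may carry a `CodeGenConfigurationNodes` map, which is where the fields listed in the change summary above would appear.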

See also: AWS API Documentation

Request Syntax

client.batch_get_jobs(
    JobNames=[
        'string',
    ]
)

Parameters

  • JobNames (list) -- [REQUIRED]

    A list of job names, which might be the names returned from the ListJobs operation.

    • (string) --

Return type: dict

Returns:

Response Syntax

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Response Structure

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

CreateJob (updated)
Changes (request)
{'CodeGenConfigurationNodes': {'S3DeltaDirectTarget': {'Format': {'hyper',
                                                                  'iceberg',
                                                                  'xml'},
                                                       'NumberTargetPartitions': 'string'},
                               'S3DirectTarget': {'Format': {'hyper',
                                                             'iceberg',
                                                             'xml'},
                                                  'NumberTargetPartitions': 'string'},
                               'S3ExcelSource': {'AdditionalOptions': {'BoundedFiles': 'long',
                                                                       'BoundedSize': 'long',
                                                                       'EnableSamplePath': 'boolean',
                                                                       'SamplePath': 'string'},
                                                 'CompressionType': 'snappy | '
                                                                    'lzo | '
                                                                    'gzip | '
                                                                    'brotli | '
                                                                    'lz4 | '
                                                                    'uncompressed '
                                                                    '| none',
                                                 'Exclusions': ['string'],
                                                 'GroupFiles': 'string',
                                                 'GroupSize': 'string',
                                                 'MaxBand': 'integer',
                                                 'MaxFilesInBand': 'integer',
                                                 'Name': 'string',
                                                 'NumberRows': 'long',
                                                 'OutputSchemas': [{'Columns': [{'Name': 'string',
                                                                                 'Type': 'string'}]}],
                                                 'Paths': ['string'],
                                                 'Recurse': 'boolean',
                                                 'SkipFooter': 'integer'},
                               'S3GlueParquetTarget': {'Compression': {'brotli',
                                                                       'lz4'},
                                                       'NumberTargetPartitions': 'string'},
                               'S3HudiDirectTarget': {'Format': {'hyper',
                                                                 'iceberg',
                                                                 'xml'},
                                                      'NumberTargetPartitions': 'string'},
                               'S3HyperDirectTarget': {'Compression': 'uncompressed',
                                                       'Inputs': ['string'],
                                                       'Name': 'string',
                                                       'PartitionKeys': [['string']],
                                                       'Path': 'string',
                                                       'SchemaChangePolicy': {'Database': 'string',
                                                                              'EnableUpdateCatalog': 'boolean',
                                                                              'Table': 'string',
                                                                              'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                '| '
                                                                                                'LOG'}},
                               'S3IcebergDirectTarget': {'AdditionalOptions': {'string': 'string'},
                                                         'Compression': 'gzip '
                                                                        '| lzo '
                                                                        '| '
                                                                        'uncompressed '
                                                                        '| '
                                                                        'snappy',
                                                         'Format': 'json | csv '
                                                                   '| avro | '
                                                                   'orc | '
                                                                   'parquet | '
                                                                   'hudi | '
                                                                   'delta | '
                                                                   'iceberg | '
                                                                   'hyper | '
                                                                   'xml',
                                                         'Inputs': ['string'],
                                                         'Name': 'string',
                                                         'NumberTargetPartitions': 'string',
                                                         'PartitionKeys': [['string']],
                                                         'Path': 'string',
                                                         'SchemaChangePolicy': {'Database': 'string',
                                                                                'EnableUpdateCatalog': 'boolean',
                                                                                'Table': 'string',
                                                                                'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                  '| '
                                                                                                  'LOG'}},
                               'S3ParquetSource': {'CompressionType': {'brotli',
                                                                       'lz4'}}}}

Creates a new job definition.
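To make the request diff above more concrete, here is a hedged sketch that builds a `CodeGenConfigurationNodes` map wiring the new `S3ExcelSource` to the new `S3HyperDirectTarget`. The field names come from the diff; the node IDs, node names, helper names, and the `glueetl` command name are assumptions for illustration. This page does not say which fields are required, so the service may need more than is shown here. The code only builds the keyword arguments and does not call AWS.

```python
from typing import Any, Dict


def excel_to_hyper_nodes(source_path: str, target_path: str) -> Dict[str, Dict[str, Any]]:
    """Build a two-node graph: Excel files on S3 -> Hyper files on S3."""
    source_id = "node-excel-source"  # hypothetical node ID
    target_id = "node-hyper-target"  # hypothetical node ID
    return {
        source_id: {
            "S3ExcelSource": {
                "Name": "ExcelSource",
                "Paths": [source_path],
                "Recurse": True,
            }
        },
        target_id: {
            "S3HyperDirectTarget": {
                "Name": "HyperTarget",
                "Inputs": [source_id],  # consume the source node's output
                "Path": target_path,
                # The diff lists 'uncompressed' as the only Compression value.
                "Compression": "uncompressed",
            }
        },
    }


def create_job_kwargs(name: str, role: str, script_location: str,
                      nodes: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Assemble keyword arguments for client.create_job (assumed minimal set)."""
    return {
        "Name": name,
        "Role": role,
        "Command": {"Name": "glueetl", "ScriptLocation": script_location},
        "CodeGenConfigurationNodes": nodes,
    }
```

With a Glue client in hand, the result could be passed as `client.create_job(**kwargs)`; the response contains the job `Name`.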

See also: AWS API Documentation

Request Syntax

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Parameters

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Return type: dict

Returns:

Response Syntax

{
    'Name': 'string'
}

Response Structure

  • (dict) --

    • Name (string) --

      The unique name that was provided for this job definition.

GetJob (updated)
Changes (response)
{'Job': {'CodeGenConfigurationNodes': {'S3DeltaDirectTarget': {'Format': {'hyper',
                                                                          'iceberg',
                                                                          'xml'},
                                                               'NumberTargetPartitions': 'string'},
                                       'S3DirectTarget': {'Format': {'hyper',
                                                                     'iceberg',
                                                                     'xml'},
                                                          'NumberTargetPartitions': 'string'},
                                       'S3ExcelSource': {'AdditionalOptions': {'BoundedFiles': 'long',
                                                                               'BoundedSize': 'long',
                                                                               'EnableSamplePath': 'boolean',
                                                                               'SamplePath': 'string'},
                                                         'CompressionType': 'snappy '
                                                                            '| '
                                                                            'lzo '
                                                                            '| '
                                                                            'gzip '
                                                                            '| '
                                                                            'brotli '
                                                                            '| '
                                                                            'lz4 '
                                                                            '| '
                                                                            'uncompressed '
                                                                            '| '
                                                                            'none',
                                                         'Exclusions': ['string'],
                                                         'GroupFiles': 'string',
                                                         'GroupSize': 'string',
                                                         'MaxBand': 'integer',
                                                         'MaxFilesInBand': 'integer',
                                                         'Name': 'string',
                                                         'NumberRows': 'long',
                                                         'OutputSchemas': [{'Columns': [{'Name': 'string',
                                                                                         'Type': 'string'}]}],
                                                         'Paths': ['string'],
                                                         'Recurse': 'boolean',
                                                         'SkipFooter': 'integer'},
                                       'S3GlueParquetTarget': {'Compression': {'brotli',
                                                                               'lz4'},
                                                               'NumberTargetPartitions': 'string'},
                                       'S3HudiDirectTarget': {'Format': {'hyper',
                                                                         'iceberg',
                                                                         'xml'},
                                                              'NumberTargetPartitions': 'string'},
                                       'S3HyperDirectTarget': {'Compression': 'uncompressed',
                                                               'Inputs': ['string'],
                                                               'Name': 'string',
                                                               'PartitionKeys': [['string']],
                                                               'Path': 'string',
                                                               'SchemaChangePolicy': {'Database': 'string',
                                                                                      'EnableUpdateCatalog': 'boolean',
                                                                                      'Table': 'string',
                                                                                      'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                        '| '
                                                                                                        'LOG'}},
                                       'S3IcebergDirectTarget': {'AdditionalOptions': {'string': 'string'},
                                                                 'Compression': 'gzip '
                                                                                '| '
                                                                                'lzo '
                                                                                '| '
                                                                                'uncompressed '
                                                                                '| '
                                                                                'snappy',
                                                                 'Format': 'json '
                                                                           '| '
                                                                           'csv '
                                                                           '| '
                                                                           'avro '
                                                                           '| '
                                                                           'orc '
                                                                           '| '
                                                                           'parquet '
                                                                           '| '
                                                                           'hudi '
                                                                           '| '
                                                                           'delta '
                                                                           '| '
                                                                           'iceberg '
                                                                           '| '
                                                                           'hyper '
                                                                           '| '
                                                                           'xml',
                                                                 'Inputs': ['string'],
                                                                 'Name': 'string',
                                                                 'NumberTargetPartitions': 'string',
                                                                 'PartitionKeys': [['string']],
                                                                 'Path': 'string',
                                                                 'SchemaChangePolicy': {'Database': 'string',
                                                                                        'EnableUpdateCatalog': 'boolean',
                                                                                        'Table': 'string',
                                                                                        'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                          '| '
                                                                                                          'LOG'}},
                                       'S3ParquetSource': {'CompressionType': {'brotli',
                                                                               'lz4'}}}}}

Retrieves an existing job definition.
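As an illustration of reading the new response fields, the sketch below scans a job's `CodeGenConfigurationNodes` for any node that sets `NumberTargetPartitions`. It assumes the map is keyed by node ID, with each value holding a single node-type key such as `S3DirectTarget`; the helper name `target_partitions` is hypothetical.

```python
from typing import Any, Dict


def target_partitions(job: Dict[str, Any]) -> Dict[str, str]:
    """Map node ID -> NumberTargetPartitions for every node that sets it."""
    found: Dict[str, str] = {}
    nodes = job.get("CodeGenConfigurationNodes") or {}
    for node_id, node in nodes.items():
        for config in node.values():
            if isinstance(config, dict) and "NumberTargetPartitions" in config:
                # The diff types this field as a string, so it is not converted.
                found[node_id] = config["NumberTargetPartitions"]
    return found
```

Given a Glue client, this could be applied as `target_partitions(client.get_job(JobName=...)["Job"])`.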

See also: AWS API Documentation

Request Syntax

client.get_job(
    JobName='string'
)

Parameters

  • JobName (string) -- [REQUIRED]

    The name of the job definition to retrieve.

Return type: dict

Returns:

Response Syntax

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Response Structure

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

GetJobs (updated)
Changes (response)
{'Jobs': {'CodeGenConfigurationNodes': {'S3DeltaDirectTarget': {'Format': {'hyper',
                                                                           'iceberg',
                                                                           'xml'},
                                                                'NumberTargetPartitions': 'string'},
                                        'S3DirectTarget': {'Format': {'hyper',
                                                                      'iceberg',
                                                                      'xml'},
                                                           'NumberTargetPartitions': 'string'},
                                        'S3ExcelSource': {'AdditionalOptions': {'BoundedFiles': 'long',
                                                                                'BoundedSize': 'long',
                                                                                'EnableSamplePath': 'boolean',
                                                                                'SamplePath': 'string'},
                                                          'CompressionType': 'snappy '
                                                                             '| '
                                                                             'lzo '
                                                                             '| '
                                                                             'gzip '
                                                                             '| '
                                                                             'brotli '
                                                                             '| '
                                                                             'lz4 '
                                                                             '| '
                                                                             'uncompressed '
                                                                             '| '
                                                                             'none',
                                                          'Exclusions': ['string'],
                                                          'GroupFiles': 'string',
                                                          'GroupSize': 'string',
                                                          'MaxBand': 'integer',
                                                          'MaxFilesInBand': 'integer',
                                                          'Name': 'string',
                                                          'NumberRows': 'long',
                                                          'OutputSchemas': [{'Columns': [{'Name': 'string',
                                                                                          'Type': 'string'}]}],
                                                          'Paths': ['string'],
                                                          'Recurse': 'boolean',
                                                          'SkipFooter': 'integer'},
                                        'S3GlueParquetTarget': {'Compression': {'brotli',
                                                                                'lz4'},
                                                                'NumberTargetPartitions': 'string'},
                                        'S3HudiDirectTarget': {'Format': {'hyper',
                                                                          'iceberg',
                                                                          'xml'},
                                                               'NumberTargetPartitions': 'string'},
                                        'S3HyperDirectTarget': {'Compression': 'uncompressed',
                                                                'Inputs': ['string'],
                                                                'Name': 'string',
                                                                'PartitionKeys': [['string']],
                                                                'Path': 'string',
                                                                'SchemaChangePolicy': {'Database': 'string',
                                                                                       'EnableUpdateCatalog': 'boolean',
                                                                                       'Table': 'string',
                                                                                       'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                         '| '
                                                                                                         'LOG'}},
                                        'S3IcebergDirectTarget': {'AdditionalOptions': {'string': 'string'},
                                                                  'Compression': 'gzip '
                                                                                 '| '
                                                                                 'lzo '
                                                                                 '| '
                                                                                 'uncompressed '
                                                                                 '| '
                                                                                 'snappy',
                                                                  'Format': 'json '
                                                                            '| '
                                                                            'csv '
                                                                            '| '
                                                                            'avro '
                                                                            '| '
                                                                            'orc '
                                                                            '| '
                                                                            'parquet '
                                                                            '| '
                                                                            'hudi '
                                                                            '| '
                                                                            'delta '
                                                                            '| '
                                                                            'iceberg '
                                                                            '| '
                                                                            'hyper '
                                                                            '| '
                                                                            'xml',
                                                                  'Inputs': ['string'],
                                                                  'Name': 'string',
                                                                  'NumberTargetPartitions': 'string',
                                                                  'PartitionKeys': [['string']],
                                                                  'Path': 'string',
                                                                  'SchemaChangePolicy': {'Database': 'string',
                                                                                         'EnableUpdateCatalog': 'boolean',
                                                                                         'Table': 'string',
                                                                                         'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                           '| '
                                                                                                           'LOG'}},
                                        'S3ParquetSource': {'CompressionType': {'brotli',
                                                                                'lz4'}}}}}

Retrieves all current job definitions.

See also: AWS API Documentation

Request Syntax

client.get_jobs(
    NextToken='string',
    MaxResults=123
)

Parameters

  • NextToken (string) --

    A continuation token, if this is a continuation call.

  • MaxResults (integer) --

    The maximum size of the response.

rtype:

dict

returns:

Response Syntax

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Response Structure

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation
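
The NextToken and MaxResults parameters imply a paging loop: call get_jobs, read the returned jobs, and repeat with the returned token until none comes back. The following is a minimal sketch of that loop, not an official pattern. The helper name iter_jobs and the default page size are illustrative assumptions, and the sketch assumes the response carries its results under a 'Jobs' key and the continuation token under 'NextToken'.

```python
def iter_jobs(client, page_size=100):
    """Yield every job definition, following NextToken until it is absent.

    `client` is expected to behave like boto3.client("glue"); `page_size`
    is a hypothetical default, passed through as MaxResults.
    """
    kwargs = {"MaxResults": page_size}
    while True:
        response = client.get_jobs(**kwargs)
        # Assumption: job definitions are returned under the 'Jobs' key.
        yield from response.get("Jobs", [])
        token = response.get("NextToken")
        if not token:
            break  # no continuation token means this was the last page
        kwargs["NextToken"] = token


# Example usage (requires boto3 and AWS credentials, so it is left commented):
# import boto3
# for job in iter_jobs(boto3.client("glue")):
#     print(job["Name"])
```

boto3 also offers built-in paginators through client.get_paginator, which may be preferable to a hand-written loop where one is available for this operation.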

UpdateJob (updated)
Changes (request)
{'JobUpdate': {'CodeGenConfigurationNodes': {'S3DeltaDirectTarget': {'Format': {'hyper',
                                                                                'iceberg',
                                                                                'xml'},
                                                                     'NumberTargetPartitions': 'string'},
                                             'S3DirectTarget': {'Format': {'hyper',
                                                                           'iceberg',
                                                                           'xml'},
                                                                'NumberTargetPartitions': 'string'},
                                             'S3ExcelSource': {'AdditionalOptions': {'BoundedFiles': 'long',
                                                                                     'BoundedSize': 'long',
                                                                                     'EnableSamplePath': 'boolean',
                                                                                     'SamplePath': 'string'},
                                                               'CompressionType': 'snappy '
                                                                                  '| '
                                                                                  'lzo '
                                                                                  '| '
                                                                                  'gzip '
                                                                                  '| '
                                                                                  'brotli '
                                                                                  '| '
                                                                                  'lz4 '
                                                                                  '| '
                                                                                  'uncompressed '
                                                                                  '| '
                                                                                  'none',
                                                               'Exclusions': ['string'],
                                                               'GroupFiles': 'string',
                                                               'GroupSize': 'string',
                                                               'MaxBand': 'integer',
                                                               'MaxFilesInBand': 'integer',
                                                               'Name': 'string',
                                                               'NumberRows': 'long',
                                                               'OutputSchemas': [{'Columns': [{'Name': 'string',
                                                                                               'Type': 'string'}]}],
                                                               'Paths': ['string'],
                                                               'Recurse': 'boolean',
                                                               'SkipFooter': 'integer'},
                                             'S3GlueParquetTarget': {'Compression': {'brotli',
                                                                                     'lz4'},
                                                                     'NumberTargetPartitions': 'string'},
                                             'S3HudiDirectTarget': {'Format': {'hyper',
                                                                               'iceberg',
                                                                               'xml'},
                                                                    'NumberTargetPartitions': 'string'},
                                             'S3HyperDirectTarget': {'Compression': 'uncompressed',
                                                                     'Inputs': ['string'],
                                                                     'Name': 'string',
                                                                     'PartitionKeys': [['string']],
                                                                     'Path': 'string',
                                                                     'SchemaChangePolicy': {'Database': 'string',
                                                                                            'EnableUpdateCatalog': 'boolean',
                                                                                            'Table': 'string',
                                                                                            'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                              '| '
                                                                                                              'LOG'}},
                                             'S3IcebergDirectTarget': {'AdditionalOptions': {'string': 'string'},
                                                                       'Compression': 'gzip '
                                                                                      '| '
                                                                                      'lzo '
                                                                                      '| '
                                                                                      'uncompressed '
                                                                                      '| '
                                                                                      'snappy',
                                                                       'Format': 'json '
                                                                                 '| '
                                                                                 'csv '
                                                                                 '| '
                                                                                 'avro '
                                                                                 '| '
                                                                                 'orc '
                                                                                 '| '
                                                                                 'parquet '
                                                                                 '| '
                                                                                 'hudi '
                                                                                 '| '
                                                                                 'delta '
                                                                                 '| '
                                                                                 'iceberg '
                                                                                 '| '
                                                                                 'hyper '
                                                                                 '| '
                                                                                 'xml',
                                                                       'Inputs': ['string'],
                                                                       'Name': 'string',
                                                                       'NumberTargetPartitions': 'string',
                                                                       'PartitionKeys': [['string']],
                                                                       'Path': 'string',
                                                                       'SchemaChangePolicy': {'Database': 'string',
                                                                                              'EnableUpdateCatalog': 'boolean',
                                                                                              'Table': 'string',
                                                                                              'UpdateBehavior': 'UPDATE_IN_DATABASE '
                                                                                                                '| '
                                                                                                                'LOG'}},
                                             'S3ParquetSource': {'CompressionType': {'brotli',
                                                                                     'lz4'}}}}}
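
To make the S3IcebergDirectTarget shape in the diff above more concrete, here is a small sketch that builds one such node and checks the enumerated values before sending them. The helper name iceberg_direct_target, its defaults, and the choice of which keys to always include are assumptions for illustration. Only the key names and the allowed Compression and Format values are taken from the change listing. Note that NumberTargetPartitions is typed as a string, so the sketch converts integers.

```python
# Allowed values copied from the change listing above.
ICEBERG_COMPRESSIONS = {"gzip", "lzo", "uncompressed", "snappy"}
ICEBERG_FORMATS = {
    "json", "csv", "avro", "orc", "parquet",
    "hudi", "delta", "iceberg", "hyper", "xml",
}


def iceberg_direct_target(name, inputs, path, *, compression="snappy",
                          fmt="parquet", number_target_partitions=None,
                          partition_keys=None):
    """Return a node dict holding one S3IcebergDirectTarget definition.

    Hypothetical helper: the defaults are illustrative, not service defaults.
    """
    if compression not in ICEBERG_COMPRESSIONS:
        raise ValueError(f"unsupported Compression: {compression!r}")
    if fmt not in ICEBERG_FORMATS:
        raise ValueError(f"unsupported Format: {fmt!r}")

    node = {
        "Name": name,
        "Inputs": list(inputs),
        "Path": path,
        "Compression": compression,
        "Format": fmt,
    }
    if number_target_partitions is not None:
        # The API types this field as a string, not an integer.
        node["NumberTargetPartitions"] = str(number_target_partitions)
    if partition_keys:
        # PartitionKeys is a list of lists of strings.
        node["PartitionKeys"] = [list(keys) for keys in partition_keys]
    return {"S3IcebergDirectTarget": node}
```

A node built this way would sit as one value in the CodeGenConfigurationNodes map of a JobUpdate, keyed by a node identifier of your choosing.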

Updates an existing job definition. The previous job definition is completely overwritten by this information.

See also: AWS API Documentation

Request Syntax

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

Parameters

# This section is too large to render.
# Please see the AWS API Documentation linked below.

AWS API Documentation

rtype:

dict

returns:

Response Syntax

{
    'JobName': 'string'
}

Response Structure

  • (dict) --

    • JobName (string) --

      Returns the name of the updated job definition.
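
Because UpdateJob completely overwrites the previous job definition, changing a single setting usually means reading the current definition first and sending it back with the change applied. The sketch below shows one way to do that under stated assumptions. The helper names and the ASSUMED_READ_ONLY tuple are illustrative: the tuple lists fields assumed to be returned by get_job but not accepted in JobUpdate, and should be checked against the JobUpdate shape of your SDK version. Dropping MaxCapacity when WorkerType is present is likewise an assumption about how the two sizing styles conflict.

```python
import copy

# Assumption: these keys appear in the get_job response but are not valid
# members of JobUpdate. Verify against your SDK version before relying on it.
ASSUMED_READ_ONLY = ("Name", "CreatedOn", "LastModifiedOn", "AllocatedCapacity")


def build_job_update(job, changes):
    """Copy an existing job definition into a JobUpdate dict and apply changes."""
    update = {
        key: copy.deepcopy(value)
        for key, value in job.items()
        if key not in ASSUMED_READ_ONLY
    }
    # Assumption: MaxCapacity cannot be sent together with WorkerType.
    if "WorkerType" in update:
        update.pop("MaxCapacity", None)
    update.update(changes)
    return update


def update_job_preserving(client, job_name, changes):
    """Read the current definition, merge `changes`, and call update_job."""
    job = client.get_job(JobName=job_name)["Job"]
    return client.update_job(
        JobName=job_name,
        JobUpdate=build_job_update(job, changes),
    )


# Example usage (requires boto3 and AWS credentials, so it is left commented):
# import boto3
# response = update_job_preserving(boto3.client("glue"), "my-job", {"Timeout": 120})
# print(response["JobName"])
```

The response contains only JobName, so confirming that the change took effect requires a follow-up get_job call.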