AWS Glue

2020/06/25 - AWS Glue - 6 new api methods

Changes: Update glue client to latest version

DeleteColumnStatisticsForTable (new)

Deletes table statistics of columns.

See also: AWS API Documentation

Request Syntax

client.delete_column_statistics_for_table(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    ColumnName='string'
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the partitions' table.

type ColumnName:

string

param ColumnName:

[REQUIRED]

The name of the column.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
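
A minimal sketch of how the request above might be assembled. The helper names and the database, table, and column names are hypothetical, and `client` is assumed to be a Glue client such as the one returned by `boto3.client('glue')`. CatalogId is left out unless supplied, so the account default described above applies.

```python
def build_delete_table_stats_kwargs(database, table, column, catalog_id=None):
    """Assemble keyword arguments for delete_column_statistics_for_table."""
    kwargs = {
        'DatabaseName': database,   # required
        'TableName': table,         # required
        'ColumnName': column,       # required
    }
    if catalog_id is not None:
        # Optional: omitted so the AWS account ID is used by default.
        kwargs['CatalogId'] = catalog_id
    return kwargs


def delete_table_column_stats(client, database, table, column, catalog_id=None):
    """Issue the delete call; the documented response is an empty dict."""
    kwargs = build_delete_table_stats_kwargs(database, table, column, catalog_id)
    return client.delete_column_statistics_for_table(**kwargs)


# Hypothetical names, for illustration only.
kwargs = build_delete_table_stats_kwargs('sales_db', 'orders', 'price')
```

Because ColumnName is a single string, removing statistics for several columns takes one call per column.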

UpdateColumnStatisticsForTable (new)

Creates or updates table statistics of columns.

See also: AWS API Documentation

Request Syntax

client.update_column_statistics_for_table(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    ColumnStatisticsList=[
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ]
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the partitions' table.

type ColumnStatisticsList:

list

param ColumnStatisticsList:

[REQUIRED]

A list of the column statistics.

  • (dict) --

    Defines column statistics.

    • ColumnName (string) -- [REQUIRED]

      The name of the column.

    • ColumnType (string) -- [REQUIRED]

      The type of the column.

    • AnalyzedTime (datetime) -- [REQUIRED]

      The analyzed time of the column statistics.

    • StatisticsData (dict) -- [REQUIRED]

      The statistics of the column.

      • Type (string) -- [REQUIRED]

        The type of column statistics data.

      • BooleanColumnStatisticsData (dict) --

        Boolean Column Statistics Data.

        • NumberOfTrues (integer) -- [REQUIRED]

          Number of true values.

        • NumberOfFalses (integer) -- [REQUIRED]

          Number of false values.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

      • DateColumnStatisticsData (dict) --

        Date Column Statistics Data.

        • MinimumValue (datetime) --

          Minimum value of the column.

        • MaximumValue (datetime) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • DecimalColumnStatisticsData (dict) --

        Decimal Column Statistics Data.

        • MinimumValue (dict) --

          Minimum value of the column.

          • UnscaledValue (bytes) -- [REQUIRED]

            The unscaled numeric value.

          • Scale (integer) -- [REQUIRED]

            The scale that determines where the decimal point falls in the unscaled value.

        • MaximumValue (dict) --

          Maximum value of the column.

          • UnscaledValue (bytes) -- [REQUIRED]

            The unscaled numeric value.

          • Scale (integer) -- [REQUIRED]

            The scale that determines where the decimal point falls in the unscaled value.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • DoubleColumnStatisticsData (dict) --

        Double Column Statistics Data.

        • MinimumValue (float) --

          Minimum value of the column.

        • MaximumValue (float) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • LongColumnStatisticsData (dict) --

        Long Column Statistics Data.

        • MinimumValue (integer) --

          Minimum value of the column.

        • MaximumValue (integer) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • StringColumnStatisticsData (dict) --

        String Column Statistics Data.

        • MaximumLength (integer) -- [REQUIRED]

          Maximum length of the column.

        • AverageLength (float) -- [REQUIRED]

          Average length of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • BinaryColumnStatisticsData (dict) --

        Binary Column Statistics Data.

        • MaximumLength (integer) -- [REQUIRED]

          Maximum length of the column.

        • AverageLength (float) -- [REQUIRED]

          Average length of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.
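
As a rough illustration of the structure above, the sketch below builds one ColumnStatisticsList entry for a LONG column. The helper name, the column name, and the `'bigint'` ColumnType are hypothetical. It also assumes that only the typed block matching Type needs to be populated, since the typed blocks are not marked as required.

```python
from datetime import datetime, timezone


def long_column_statistics(name, column_type, minimum, maximum, nulls, distinct,
                           analyzed_time=None):
    """Build one ColumnStatisticsList entry for a LONG-typed column."""
    return {
        'ColumnName': name,
        'ColumnType': column_type,
        # AnalyzedTime is required; default to "now" in UTC if not given.
        'AnalyzedTime': analyzed_time or datetime.now(timezone.utc),
        'StatisticsData': {
            'Type': 'LONG',
            'LongColumnStatisticsData': {
                'MinimumValue': minimum,
                'MaximumValue': maximum,
                'NumberOfNulls': nulls,               # required
                'NumberOfDistinctValues': distinct,   # required
            },
        },
    }


# Hypothetical values, for illustration only.
entry = long_column_statistics(
    'order_id', 'bigint', 1, 98231, 0, 98231,
    analyzed_time=datetime(2020, 6, 25, tzinfo=timezone.utc),
)

# With a Glue client (assumed: client = boto3.client('glue')):
# client.update_column_statistics_for_table(
#     DatabaseName='sales_db', TableName='orders', ColumnStatisticsList=[entry])
```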

rtype:

dict

returns:

Response Syntax

{
    'Errors': [
        {
            'ColumnStatistics': {
                'ColumnName': 'string',
                'ColumnType': 'string',
                'AnalyzedTime': datetime(2015, 1, 1),
                'StatisticsData': {
                    'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                    'BooleanColumnStatisticsData': {
                        'NumberOfTrues': 123,
                        'NumberOfFalses': 123,
                        'NumberOfNulls': 123
                    },
                    'DateColumnStatisticsData': {
                        'MinimumValue': datetime(2015, 1, 1),
                        'MaximumValue': datetime(2015, 1, 1),
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DecimalColumnStatisticsData': {
                        'MinimumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'MaximumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DoubleColumnStatisticsData': {
                        'MinimumValue': 123.0,
                        'MaximumValue': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'LongColumnStatisticsData': {
                        'MinimumValue': 123,
                        'MaximumValue': 123,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'StringColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'BinaryColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123
                    }
                }
            },
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}

Response Structure

  • (dict) --

    • Errors (list) --

      List of ColumnStatisticsErrors.

      • (dict) --

        Defines a column that contains an error.

        • ColumnStatistics (dict) --

          The ColumnStatistics of the column.

          • ColumnName (string) --

            The name of the column.

          • ColumnType (string) --

            The type of the column.

          • AnalyzedTime (datetime) --

            The analyzed time of the column statistics.

          • StatisticsData (dict) --

            The statistics of the column.

            • Type (string) --

              The type of column statistics data.

            • BooleanColumnStatisticsData (dict) --

              Boolean Column Statistics Data.

              • NumberOfTrues (integer) --

                Number of true values.

              • NumberOfFalses (integer) --

                Number of false values.

              • NumberOfNulls (integer) --

                Number of nulls.

            • DateColumnStatisticsData (dict) --

              Date Column Statistics Data.

              • MinimumValue (datetime) --

                Minimum value of the column.

              • MaximumValue (datetime) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • DecimalColumnStatisticsData (dict) --

              Decimal Column Statistics Data.

              • MinimumValue (dict) --

                Minimum value of the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • MaximumValue (dict) --

                Maximum value of the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • DoubleColumnStatisticsData (dict) --

              Double Column Statistics Data.

              • MinimumValue (float) --

                Minimum value of the column.

              • MaximumValue (float) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • LongColumnStatisticsData (dict) --

              Long Column Statistics Data.

              • MinimumValue (integer) --

                Minimum value of the column.

              • MaximumValue (integer) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • StringColumnStatisticsData (dict) --

              String Column Statistics Data.

              • MaximumLength (integer) --

                Maximum length of the column.

              • AverageLength (float) --

                Average length of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • BinaryColumnStatisticsData (dict) --

              Binary Column Statistics Data.

              • MaximumLength (integer) --

                Maximum length of the column.

              • AverageLength (float) --

                Average length of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

        • Error (dict) --

          The error that occurred during the operation.

          • ErrorCode (string) --

            The code associated with this error.

          • ErrorMessage (string) --

            A message describing the error.
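
The Errors list above is returned as part of the response, so a caller may want to inspect it after each update. A minimal sketch follows; the helper name is hypothetical and the error code and message in the sample are placeholders, not real Glue values.

```python
def failed_columns(response):
    """Map each failed column name to its (ErrorCode, ErrorMessage) pair."""
    failures = {}
    for item in response.get('Errors', []):
        name = item.get('ColumnStatistics', {}).get('ColumnName')
        error = item.get('Error', {})
        failures[name] = (error.get('ErrorCode'), error.get('ErrorMessage'))
    return failures


# Hypothetical response shaped like the Response Syntax above.
sample = {
    'Errors': [
        {
            'ColumnStatistics': {'ColumnName': 'price', 'ColumnType': 'double'},
            'Error': {'ErrorCode': 'ExampleErrorCode',
                      'ErrorMessage': 'example message'},
        },
    ]
}
failures = failed_columns(sample)
```

An empty result means every entry in ColumnStatisticsList was accepted.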

GetColumnStatisticsForPartition (new)

Retrieves partition statistics of columns.

See also: AWS API Documentation

Request Syntax

client.get_column_statistics_for_partition(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    ColumnNames=[
        'string',
    ]
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the partitions' table.

type PartitionValues:

list

param PartitionValues:

[REQUIRED]

A list of partition values identifying the partition.

  • (string) --

type ColumnNames:

list

param ColumnNames:

[REQUIRED]

A list of the column names.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'ColumnStatisticsList': [
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ],
    'Errors': [
        {
            'ColumnName': 'string',
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}

Response Structure

  • (dict) --

    • ColumnStatisticsList (list) --

      List of ColumnStatistics that were retrieved.

      • (dict) --

        Defines column statistics.

        • ColumnName (string) --

          The name of the column.

        • ColumnType (string) --

          The type of the column.

        • AnalyzedTime (datetime) --

          The analyzed time of the column statistics.

        • StatisticsData (dict) --

          The statistics of the column.

          • Type (string) --

            The type of column statistics data.

          • BooleanColumnStatisticsData (dict) --

            Boolean Column Statistics Data.

            • NumberOfTrues (integer) --

              Number of true values.

            • NumberOfFalses (integer) --

              Number of false values.

            • NumberOfNulls (integer) --

              Number of nulls.

          • DateColumnStatisticsData (dict) --

            Date Column Statistics Data.

            • MinimumValue (datetime) --

              Minimum value of the column.

            • MaximumValue (datetime) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • DecimalColumnStatisticsData (dict) --

            Decimal Column Statistics Data.

            • MinimumValue (dict) --

              Minimum value of the column.

              • UnscaledValue (bytes) --

                The unscaled numeric value.

              • Scale (integer) --

                The scale that determines where the decimal point falls in the unscaled value.

            • MaximumValue (dict) --

              Maximum value of the column.

              • UnscaledValue (bytes) --

                The unscaled numeric value.

              • Scale (integer) --

                The scale that determines where the decimal point falls in the unscaled value.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • DoubleColumnStatisticsData (dict) --

            Double Column Statistics Data.

            • MinimumValue (float) --

              Minimum value of the column.

            • MaximumValue (float) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • LongColumnStatisticsData (dict) --

            Long Column Statistics Data.

            • MinimumValue (integer) --

              Minimum value of the column.

            • MaximumValue (integer) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • StringColumnStatisticsData (dict) --

            String Column Statistics Data.

            • MaximumLength (integer) --

              Maximum length of the column.

            • AverageLength (float) --

              Average length of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • BinaryColumnStatisticsData (dict) --

            Binary Column Statistics Data.

            • MaximumLength (integer) --

              Maximum length of the column.

            • AverageLength (float) --

              Average length of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

    • Errors (list) --

      Errors that occurred while retrieving column statistics data.

      • (dict) --

        Defines a column that contains an error.

        • ColumnName (string) --

          The name of the column.

        • Error (dict) --

          The error that occurred during the operation.

          • ErrorCode (string) --

            The code associated with this error.

          • ErrorMessage (string) --

            A message describing the error.

GetColumnStatisticsForTable (new)

Retrieves table statistics of columns.

See also: AWS API Documentation

Request Syntax

client.get_column_statistics_for_table(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    ColumnNames=[
        'string',
    ]
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the partitions' table.

type ColumnNames:

list

param ColumnNames:

[REQUIRED]

A list of the column names.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'ColumnStatisticsList': [
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ],
    'Errors': [
        {
            'ColumnName': 'string',
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}

Response Structure

  • (dict) --

    • ColumnStatisticsList (list) --

      List of ColumnStatistics that were retrieved.

      • (dict) --

        Defines column statistics.

        • ColumnName (string) --

          The name of the column.

        • ColumnType (string) --

          The type of the column.

        • AnalyzedTime (datetime) --

          The analyzed time of the column statistics.

        • StatisticsData (dict) --

          The statistics of the column.

          • Type (string) --

            The type of column statistics data.

          • BooleanColumnStatisticsData (dict) --

            Boolean Column Statistics Data.

            • NumberOfTrues (integer) --

              Number of true values.

            • NumberOfFalses (integer) --

              Number of false values.

            • NumberOfNulls (integer) --

              Number of nulls.

          • DateColumnStatisticsData (dict) --

            Date Column Statistics Data.

            • MinimumValue (datetime) --

              Minimum value of the column.

            • MaximumValue (datetime) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • DecimalColumnStatisticsData (dict) --

            Decimal Column Statistics Data.

            • MinimumValue (dict) --

              Minimum value of the column.

              • UnscaledValue (bytes) --

                The unscaled numeric value.

              • Scale (integer) --

                The scale that determines where the decimal point falls in the unscaled value.

            • MaximumValue (dict) --

              Maximum value of the column.

              • UnscaledValue (bytes) --

                The unscaled numeric value.

              • Scale (integer) --

                The scale that determines where the decimal point falls in the unscaled value.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • DoubleColumnStatisticsData (dict) --

            Double Column Statistics Data.

            • MinimumValue (float) --

              Minimum value of the column.

            • MaximumValue (float) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • LongColumnStatisticsData (dict) --

            Long Column Statistics Data.

            • MinimumValue (integer) --

              Minimum value of the column.

            • MaximumValue (integer) --

              Maximum value of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • StringColumnStatisticsData (dict) --

            String Column Statistics Data.

            • MaximumLength (integer) --

              Maximum length of the column.

            • AverageLength (float) --

              Average length of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

            • NumberOfDistinctValues (integer) --

              Number of distinct values.

          • BinaryColumnStatisticsData (dict) --

            Binary Column Statistics Data.

            • MaximumLength (integer) --

              Maximum length of the column.

            • AverageLength (float) --

              Average length of the column.

            • NumberOfNulls (integer) --

              Number of nulls.

    • Errors (list) --

      List of ColumnStatistics that failed to be retrieved.

      • (dict) --

        Defines a column that contains an error.

        • ColumnName (string) --

          The name of the column.

        • Error (dict) --

          The error that occurred during the operation.

          • ErrorCode (string) --

            The code associated with this error.

          • ErrorMessage (string) --

            A message describing the error.

DeleteColumnStatisticsForPartition (new)

Deletes the partition column statistics of a column.

See also: AWS API Documentation

Request Syntax

client.delete_column_statistics_for_partition(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    ColumnName='string'
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the partitions' table.

type PartitionValues:

list

param PartitionValues:

[REQUIRED]

A list of partition values identifying the partition.

  • (string) --

type ColumnName:

string

param ColumnName:

[REQUIRED]

Name of the column.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
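
Since ColumnName takes a single string, clearing statistics for several columns of one partition takes one call per column. A minimal sketch follows. The helper and the `RecordingClient` stand-in are hypothetical and exist only so the example runs without AWS access; a real Glue client (assumed: `boto3.client('glue')`) would be passed in its place.

```python
def delete_partition_column_stats(client, database, table, partition_values, columns):
    """Call delete_column_statistics_for_partition once per column name."""
    deleted = []
    for column in columns:
        client.delete_column_statistics_for_partition(
            DatabaseName=database,
            TableName=table,
            PartitionValues=list(partition_values),  # identifies the partition
            ColumnName=column,
        )
        deleted.append(column)
    return deleted


class RecordingClient:
    """Stand-in for a Glue client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def delete_column_statistics_for_partition(self, **kwargs):
        self.calls.append(kwargs)
        return {}  # the documented response is an empty dict


# Hypothetical names and partition values, for illustration only.
client = RecordingClient()
deleted = delete_partition_column_stats(
    client, 'sales_db', 'orders', ['2020', '06'], ['price', 'qty'])
```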

UpdateColumnStatisticsForPartition (new)

Creates or updates partition statistics of columns.

See also: AWS API Documentation

Request Syntax

client.update_column_statistics_for_partition(
    CatalogId='string',
    DatabaseName='string',
    TableName='string',
    PartitionValues=[
        'string',
    ],
    ColumnStatisticsList=[
        {
            'ColumnName': 'string',
            'ColumnType': 'string',
            'AnalyzedTime': datetime(2015, 1, 1),
            'StatisticsData': {
                'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                'BooleanColumnStatisticsData': {
                    'NumberOfTrues': 123,
                    'NumberOfFalses': 123,
                    'NumberOfNulls': 123
                },
                'DateColumnStatisticsData': {
                    'MinimumValue': datetime(2015, 1, 1),
                    'MaximumValue': datetime(2015, 1, 1),
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DecimalColumnStatisticsData': {
                    'MinimumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'MaximumValue': {
                        'UnscaledValue': b'bytes',
                        'Scale': 123
                    },
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'DoubleColumnStatisticsData': {
                    'MinimumValue': 123.0,
                    'MaximumValue': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'LongColumnStatisticsData': {
                    'MinimumValue': 123,
                    'MaximumValue': 123,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'StringColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123,
                    'NumberOfDistinctValues': 123
                },
                'BinaryColumnStatisticsData': {
                    'MaximumLength': 123,
                    'AverageLength': 123.0,
                    'NumberOfNulls': 123
                }
            }
        },
    ]
)
type CatalogId:

string

param CatalogId:

The ID of the Data Catalog where the partitions in question reside. If none is supplied, the AWS account ID is used by default.

type DatabaseName:

string

param DatabaseName:

[REQUIRED]

The name of the catalog database where the partitions reside.

type TableName:

string

param TableName:

[REQUIRED]

The name of the table that contains the partitions.

type PartitionValues:

list

param PartitionValues:

[REQUIRED]

A list of partition values identifying the partition.

  • (string) --

type ColumnStatisticsList:

list

param ColumnStatisticsList:

[REQUIRED]

A list of the column statistics.

  • (dict) --

    Defines the statistics for a column.

    • ColumnName (string) -- [REQUIRED]

      The name of the column.

    • ColumnType (string) -- [REQUIRED]

      The type of the column.

    • AnalyzedTime (datetime) -- [REQUIRED]

      The time at which the column statistics were generated.

    • StatisticsData (dict) -- [REQUIRED]

      The statistics of the column.

      • Type (string) -- [REQUIRED]

        The type of column statistics data.

      • BooleanColumnStatisticsData (dict) --

        Boolean Column Statistics Data.

        • NumberOfTrues (integer) -- [REQUIRED]

          Number of true values.

        • NumberOfFalses (integer) -- [REQUIRED]

          Number of false values.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

      • DateColumnStatisticsData (dict) --

        Date Column Statistics Data.

        • MinimumValue (datetime) --

          Minimum value of the column.

        • MaximumValue (datetime) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • DecimalColumnStatisticsData (dict) --

        Decimal Column Statistics Data.

        • MinimumValue (dict) --

          Minimum value of the column.

          • UnscaledValue (bytes) -- [REQUIRED]

            The unscaled numeric value.

          • Scale (integer) -- [REQUIRED]

            The scale that determines where the decimal point falls in the unscaled value.

        • MaximumValue (dict) --

          Maximum value of the column.

          • UnscaledValue (bytes) -- [REQUIRED]

            The unscaled numeric value.

          • Scale (integer) -- [REQUIRED]

            The scale that determines where the decimal point falls in the unscaled value.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • DoubleColumnStatisticsData (dict) --

        Double Column Statistics Data.

        • MinimumValue (float) --

          Minimum value of the column.

        • MaximumValue (float) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • LongColumnStatisticsData (dict) --

        Long Column Statistics Data.

        • MinimumValue (integer) --

          Minimum value of the column.

        • MaximumValue (integer) --

          Maximum value of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • StringColumnStatisticsData (dict) --

        String Column Statistics Data.

        • MaximumLength (integer) -- [REQUIRED]

          Maximum length of the column.

        • AverageLength (float) -- [REQUIRED]

          Average length of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.

        • NumberOfDistinctValues (integer) -- [REQUIRED]

          Number of distinct values.

      • BinaryColumnStatisticsData (dict) --

        Binary Column Statistics Data.

        • MaximumLength (integer) -- [REQUIRED]

          Maximum length of the column.

        • AverageLength (float) -- [REQUIRED]

          Average length of the column.

        • NumberOfNulls (integer) -- [REQUIRED]

          Number of nulls.
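As a rough illustration of the structure above, the sketch below builds one `ColumnStatisticsList` entry and encodes a decimal value. The helper names `column_statistics` and `decimal_number` are hypothetical. Two assumptions are made that the reference text does not state: that only the `...ColumnStatisticsData` key matching `Type` should be populated, and that `UnscaledValue` is a big-endian two's-complement integer (the convention used by Java's `BigInteger.toByteArray()`).

```python
from datetime import datetime, timezone
from decimal import Decimal

# Maps each documented Type value to the StatisticsData key carrying its payload.
DATA_KEY_BY_TYPE = {
    "BOOLEAN": "BooleanColumnStatisticsData",
    "DATE": "DateColumnStatisticsData",
    "DECIMAL": "DecimalColumnStatisticsData",
    "DOUBLE": "DoubleColumnStatisticsData",
    "LONG": "LongColumnStatisticsData",
    "STRING": "StringColumnStatisticsData",
    "BINARY": "BinaryColumnStatisticsData",
}


def column_statistics(name, column_type, stats_type, data, analyzed_time=None):
    """Build one ColumnStatisticsList entry (hypothetical helper)."""
    if stats_type not in DATA_KEY_BY_TYPE:
        raise ValueError(f"unknown statistics type: {stats_type!r}")
    return {
        "ColumnName": name,
        "ColumnType": column_type,
        "AnalyzedTime": analyzed_time or datetime.now(timezone.utc),
        # ASSUMPTION: populate only the data key that matches Type.
        "StatisticsData": {"Type": stats_type, DATA_KEY_BY_TYPE[stats_type]: data},
    }


def decimal_number(value):
    """Encode a Decimal as {'UnscaledValue': bytes, 'Scale': int}.

    ASSUMPTION: the bytes are a big-endian two's-complement integer.
    """
    if not value.is_finite():
        raise ValueError("NaN and infinities have no unscaled/scale form")
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(str(d) for d in digits))
    if exponent > 0:
        # Fold a positive exponent into the integer so the scale stays >= 0.
        unscaled *= 10 ** exponent
        exponent = 0
    if sign:
        unscaled = -unscaled
    # One extra byte at most, so the sign bit always fits.
    length = unscaled.bit_length() // 8 + 1
    return {
        "UnscaledValue": unscaled.to_bytes(length, "big", signed=True),
        "Scale": -exponent,
    }


# Illustrative column names and numbers only.
long_entry = column_statistics(
    "order_id",
    "bigint",
    "LONG",
    {"MinimumValue": 1, "MaximumValue": 9000,
     "NumberOfNulls": 0, "NumberOfDistinctValues": 9000},
    analyzed_time=datetime(2020, 6, 25, tzinfo=timezone.utc),
)
price = decimal_number(Decimal("123.45"))
print(long_entry["StatisticsData"]["Type"], price["Scale"])
```

A list of such entries would be passed as `ColumnStatisticsList`, alongside `DatabaseName`, `TableName`, and `PartitionValues`, to `client.update_column_statistics_for_partition`.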

rtype:

dict

returns:

Response Syntax

{
    'Errors': [
        {
            'ColumnStatistics': {
                'ColumnName': 'string',
                'ColumnType': 'string',
                'AnalyzedTime': datetime(2015, 1, 1),
                'StatisticsData': {
                    'Type': 'BOOLEAN'|'DATE'|'DECIMAL'|'DOUBLE'|'LONG'|'STRING'|'BINARY',
                    'BooleanColumnStatisticsData': {
                        'NumberOfTrues': 123,
                        'NumberOfFalses': 123,
                        'NumberOfNulls': 123
                    },
                    'DateColumnStatisticsData': {
                        'MinimumValue': datetime(2015, 1, 1),
                        'MaximumValue': datetime(2015, 1, 1),
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DecimalColumnStatisticsData': {
                        'MinimumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'MaximumValue': {
                            'UnscaledValue': b'bytes',
                            'Scale': 123
                        },
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'DoubleColumnStatisticsData': {
                        'MinimumValue': 123.0,
                        'MaximumValue': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'LongColumnStatisticsData': {
                        'MinimumValue': 123,
                        'MaximumValue': 123,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'StringColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123,
                        'NumberOfDistinctValues': 123
                    },
                    'BinaryColumnStatisticsData': {
                        'MaximumLength': 123,
                        'AverageLength': 123.0,
                        'NumberOfNulls': 123
                    }
                }
            },
            'Error': {
                'ErrorCode': 'string',
                'ErrorMessage': 'string'
            }
        },
    ]
}

Response Structure

  • (dict) --

    • Errors (list) --

      Errors that occurred while updating column statistics data.

      • (dict) --

        Describes a column whose statistics update failed, along with the error.

        • ColumnStatistics (dict) --

          The ColumnStatistics of the column.

          • ColumnName (string) --

            The name of the column.

          • ColumnType (string) --

            The type of the column.

          • AnalyzedTime (datetime) --

            The time at which the column statistics were generated.

          • StatisticsData (dict) --

            The statistics of the column.

            • Type (string) --

              The type of column statistics data.

            • BooleanColumnStatisticsData (dict) --

              Boolean Column Statistics Data.

              • NumberOfTrues (integer) --

                Number of true values.

              • NumberOfFalses (integer) --

                Number of false values.

              • NumberOfNulls (integer) --

                Number of nulls.

            • DateColumnStatisticsData (dict) --

              Date Column Statistics Data.

              • MinimumValue (datetime) --

                Minimum value of the column.

              • MaximumValue (datetime) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • DecimalColumnStatisticsData (dict) --

              Decimal Column Statistics Data.

              • MinimumValue (dict) --

                Minimum value of the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • MaximumValue (dict) --

                Maximum value of the column.

                • UnscaledValue (bytes) --

                  The unscaled numeric value.

                • Scale (integer) --

                  The scale that determines where the decimal point falls in the unscaled value.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • DoubleColumnStatisticsData (dict) --

              Double Column Statistics Data.

              • MinimumValue (float) --

                Minimum value of the column.

              • MaximumValue (float) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • LongColumnStatisticsData (dict) --

              Long Column Statistics Data.

              • MinimumValue (integer) --

                Minimum value of the column.

              • MaximumValue (integer) --

                Maximum value of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • StringColumnStatisticsData (dict) --

              String Column Statistics Data.

              • MaximumLength (integer) --

                Maximum length of the column.

              • AverageLength (float) --

                Average length of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

              • NumberOfDistinctValues (integer) --

                Number of distinct values.

            • BinaryColumnStatisticsData (dict) --

              Binary Column Statistics Data.

              • MaximumLength (integer) --

                Maximum length of the column.

              • AverageLength (float) --

                Average length of the column.

              • NumberOfNulls (integer) --

                Number of nulls.

        • Error (dict) --

          The error that occurred during the operation.

          • ErrorCode (string) --

            The code associated with this error.

          • ErrorMessage (string) --

            A message describing the error.
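Because failures are reported per column in `Errors`, a caller will probably want to inspect that list after the call returns. The sketch below shows one way to do so. The helper name `summarize_errors` is hypothetical, and `sample_response` is hand-written to match the documented shape; its error code and message are placeholders, not real service output.

```python
def summarize_errors(response):
    """Return (column_name, error_code, error_message) for each failed column."""
    failures = []
    for item in response.get("Errors", []):
        stats = item.get("ColumnStatistics", {})
        error = item.get("Error", {})
        failures.append(
            (stats.get("ColumnName"), error.get("ErrorCode"), error.get("ErrorMessage"))
        )
    return failures


# Hand-written sample in the documented response shape (placeholder values).
sample_response = {
    "Errors": [
        {
            "ColumnStatistics": {"ColumnName": "order_total", "ColumnType": "double"},
            "Error": {"ErrorCode": "ExampleErrorCode", "ErrorMessage": "example message"},
        }
    ]
}

for column, code, message in summarize_errors(sample_response):
    print(f"{column}: {code}: {message}")
```

An empty result means no per-column errors were reported in the response.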