AWS IoT Analytics

2021/06/14 - AWS IoT Analytics - 3 updated api methods

Changes: Adds support for data store partitions.

CreateDatastore (updated)
Changes (request)
{'datastorePartitions': {'partitions': [{'attributePartition': {'attributeName': 'string'},
                                         'timestampPartition': {'attributeName': 'string',
                                                                'timestampFormat': 'string'}}]}}

Creates a data store, which is a repository for messages. Only data stores that are used to save pipeline data can be configured with ParquetConfiguration.

See also: AWS API Documentation

Request Syntax

client.create_datastore(
    datastoreName='string',
    datastoreStorage={
        'serviceManagedS3': {},
        'customerManagedS3': {
            'bucket': 'string',
            'keyPrefix': 'string',
            'roleArn': 'string'
        }
    },
    retentionPeriod={
        'unlimited': True|False,
        'numberOfDays': 123
    },
    tags=[
        {
            'key': 'string',
            'value': 'string'
        },
    ],
    fileFormatConfiguration={
        'jsonConfiguration': {},
        'parquetConfiguration': {
            'schemaDefinition': {
                'columns': [
                    {
                        'name': 'string',
                        'type': 'string'
                    },
                ]
            }
        }
    },
    datastorePartitions={
        'partitions': [
            {
                'attributePartition': {
                    'attributeName': 'string'
                },
                'timestampPartition': {
                    'attributeName': 'string',
                    'timestampFormat': 'string'
                }
            },
        ]
    }
)
type datastoreName:

string

param datastoreName:

[REQUIRED]

The name of the data store.

type datastoreStorage:

dict

param datastoreStorage:

Where data store data is stored. You can choose one of serviceManagedS3 or customerManagedS3 storage. If not specified, the default is serviceManagedS3. You cannot change this storage option after the data store is created.

  • serviceManagedS3 (dict) --

    Use this to store data store data in an S3 bucket managed by AWS IoT Analytics. You cannot change the choice of service-managed or customer-managed S3 storage after the data store is created.

  • customerManagedS3 (dict) --

    Use this to store data store data in an S3 bucket that you manage. When customer-managed storage is selected, the retentionPeriod parameter is ignored. The choice of service-managed or customer-managed S3 storage cannot be changed after creation of the data store.

    • bucket (string) -- [REQUIRED]

      The name of the S3 bucket in which data store data is stored.

    • keyPrefix (string) --

      Optional. The prefix used to create the keys of the data store data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

    • roleArn (string) -- [REQUIRED]

      The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.
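
The customer-managed option can be sketched as a small helper that builds the datastoreStorage argument. The helper name customer_managed_storage and the sample bucket, prefix, and role ARN are illustrative assumptions, not part of the API; the trailing-slash check mirrors the rule stated for keyPrefix above.

```python
from typing import Optional


def customer_managed_storage(bucket: str, role_arn: str,
                             key_prefix: Optional[str] = None) -> dict:
    """Build a datastoreStorage dict for customer-managed S3 (hypothetical helper)."""
    # keyPrefix is optional, but when present it must end with "/".
    if key_prefix is not None and not key_prefix.endswith("/"):
        raise ValueError("keyPrefix must end with a forward slash (/)")
    storage = {"bucket": bucket, "roleArn": role_arn}
    if key_prefix is not None:
        storage["keyPrefix"] = key_prefix
    return {"customerManagedS3": storage}


# Placeholder values, not real resources.
storage = customer_managed_storage(
    bucket="example-bucket",
    role_arn="arn:aws:iam::123456789012:role/example-role",
    key_prefix="datastores/",
)
```

The resulting dict can be passed as datastoreStorage to create_datastore.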

type retentionPeriod:

dict

param retentionPeriod:

How long, in days, message data is kept for the data store. When customerManagedS3 storage is selected, this parameter is ignored.

  • unlimited (boolean) --

    If true, message data is kept indefinitely.

  • numberOfDays (integer) --

    The number of days that message data is kept. The unlimited parameter must be false.
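
Because numberOfDays is only meaningful when unlimited is false, the two fields can be derived from a single optional value. The following is a minimal sketch; the helper name retention_period and the positive-integer check are assumptions added for illustration.

```python
from typing import Optional


def retention_period(number_of_days: Optional[int] = None) -> dict:
    """Build a retentionPeriod dict; None means keep message data indefinitely."""
    if number_of_days is None:
        return {"unlimited": True}
    # Assumed sanity check: a retention of zero or fewer days makes no sense.
    if number_of_days < 1:
        raise ValueError("numberOfDays must be a positive integer")
    # numberOfDays requires unlimited to be false.
    return {"unlimited": False, "numberOfDays": number_of_days}


keep_forever = retention_period()
keep_90_days = retention_period(90)
```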

type tags:

list

param tags:

Metadata which can be used to manage the data store.

  • (dict) --

    A set of key-value pairs that are used to manage the resource.

    • key (string) -- [REQUIRED]

      The tag's key.

    • value (string) -- [REQUIRED]

      The tag's value.

type fileFormatConfiguration:

dict

param fileFormatConfiguration:

Contains the configuration information of file formats. AWS IoT Analytics data stores support JSON and Parquet.

The default file format is JSON. You can specify only one format.

You can't change the file format after you create the data store.

  • jsonConfiguration (dict) --

    Contains the configuration information of the JSON format.

  • parquetConfiguration (dict) --

    Contains the configuration information of the Parquet format.

    • schemaDefinition (dict) --

      Information needed to define a schema.

      • columns (list) --

        Specifies one or more columns that store your data.

        Each schema can have up to 100 columns. Each column can have up to 100 nested types.

        • (dict) --

          Contains information about a column that stores your data.

          • name (string) -- [REQUIRED]

            The name of the column.

          • type (string) -- [REQUIRED]

            The type of data. For more information about the supported data types, see Common data types in the AWS Glue Developer Guide.
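
A Parquet fileFormatConfiguration can be assembled from a plain mapping of column names to types. This is a sketch under stated assumptions: the helper name parquet_configuration and the sample column names and type strings are illustrative, and only the one-to-100 column limit is taken from the description above.

```python
def parquet_configuration(columns: dict) -> dict:
    """Build a fileFormatConfiguration dict for Parquet from {name: type}."""
    # The schema needs at least one column and allows at most 100.
    if not columns:
        raise ValueError("at least one column is required")
    if len(columns) > 100:
        raise ValueError("a schema can have up to 100 columns")
    return {
        "parquetConfiguration": {
            "schemaDefinition": {
                "columns": [
                    {"name": name, "type": type_}
                    for name, type_ in columns.items()
                ]
            }
        }
    }


# Column names and type strings here are placeholders.
file_format = parquet_configuration({"deviceId": "string", "temperature": "double"})
```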

type datastorePartitions:

dict

param datastorePartitions:

Contains information about the partitions in a data store.

  • partitions (list) --

    A list of partitions in a data store.

    • (dict) --

      A single partition in a data store.

      • attributePartition (dict) --

        A partition defined by an attributeName.

        • attributeName (string) -- [REQUIRED]

          The attribute name of the partition.

      • timestampPartition (dict) --

        A partition defined by an attributeName and a timestamp format.

        • attributeName (string) -- [REQUIRED]

          The attribute name of the partition defined by a timestamp.

        • timestampFormat (string) --

          The timestamp format of a partition defined by a timestamp.
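
The new datastorePartitions argument can be built from two small helpers, one per partition kind. This sketch assumes each list entry carries exactly one of attributePartition or timestampPartition; the helper names, the attribute names deviceId and eventTime, and the sample timestamp format string are illustrative assumptions.

```python
from typing import Optional


def attribute_partition(attribute_name: str) -> dict:
    """One partition entry keyed on a message attribute."""
    return {"attributePartition": {"attributeName": attribute_name}}


def timestamp_partition(attribute_name: str,
                        timestamp_format: Optional[str] = None) -> dict:
    """One partition entry keyed on a timestamp attribute."""
    partition = {"attributeName": attribute_name}
    # timestampFormat is not marked [REQUIRED], so it is only sent when given.
    if timestamp_format is not None:
        partition["timestampFormat"] = timestamp_format
    return {"timestampPartition": partition}


# Attribute names and format string are placeholders.
datastore_partitions = {
    "partitions": [
        attribute_partition("deviceId"),
        timestamp_partition("eventTime", "yyyy-MM-dd HH:mm:ss"),
    ]
}
```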

rtype:

dict

returns:

Response Syntax

{
    'datastoreName': 'string',
    'datastoreArn': 'string',
    'retentionPeriod': {
        'unlimited': True|False,
        'numberOfDays': 123
    }
}

Response Structure

  • (dict) --

    • datastoreName (string) --

      The name of the data store.

    • datastoreArn (string) --

      The ARN of the data store.

    • retentionPeriod (dict) --

      How long, in days, message data is kept for the data store.

      • unlimited (boolean) --

        If true, message data is kept indefinitely.

      • numberOfDays (integer) --

        The number of days that message data is kept. The unlimited parameter must be false.
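
Putting the request and response together, a call that creates a partitioned data store and reads back its ARN might look like the sketch below. The function name, the 30-day retention, and the deviceId attribute are assumptions chosen for illustration; the client is passed in so the function can be exercised without AWS access.

```python
def create_partitioned_datastore(client, name: str) -> str:
    """Create a data store partitioned by one attribute and return its ARN."""
    response = client.create_datastore(
        datastoreName=name,
        datastoreStorage={"serviceManagedS3": {}},
        retentionPeriod={"unlimited": False, "numberOfDays": 30},
        datastorePartitions={
            "partitions": [
                {"attributePartition": {"attributeName": "deviceId"}}
            ]
        },
    )
    # The response carries datastoreName, datastoreArn and retentionPeriod.
    return response["datastoreArn"]


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   arn = create_partitioned_datastore(boto3.client("iotanalytics"), "example_datastore")
```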

DescribeDatastore (updated)
Changes (response)
{'datastore': {'datastorePartitions': {'partitions': [{'attributePartition': {'attributeName': 'string'},
                                                       'timestampPartition': {'attributeName': 'string',
                                                                              'timestampFormat': 'string'}}]}}}

Retrieves information about a data store.

See also: AWS API Documentation

Request Syntax

client.describe_datastore(
    datastoreName='string',
    includeStatistics=True|False
)
type datastoreName:

string

param datastoreName:

[REQUIRED]

The name of the data store.

type includeStatistics:

boolean

param includeStatistics:

If true, additional statistical information about the data store is included in the response. This feature cannot be used with a data store whose S3 storage is customer-managed.
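
Since includeStatistics cannot be used with customer-managed storage, one cautious approach is to check the storage type first and request statistics only for service-managed data stores. This is a sketch of that approach, not the only way to do it; the function name is hypothetical and it costs a second API call.

```python
def describe_with_statistics(client, name: str) -> dict:
    """Describe a data store, adding statistics only for service-managed storage."""
    response = client.describe_datastore(datastoreName=name)
    storage = response["datastore"].get("storage", {})
    if "customerManagedS3" in storage:
        # Statistics are not available for customer-managed storage.
        return response
    return client.describe_datastore(datastoreName=name, includeStatistics=True)
```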

rtype:

dict

returns:

Response Syntax

{
    'datastore': {
        'name': 'string',
        'storage': {
            'serviceManagedS3': {},
            'customerManagedS3': {
                'bucket': 'string',
                'keyPrefix': 'string',
                'roleArn': 'string'
            }
        },
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'DELETING',
        'retentionPeriod': {
            'unlimited': True|False,
            'numberOfDays': 123
        },
        'creationTime': datetime(2015, 1, 1),
        'lastUpdateTime': datetime(2015, 1, 1),
        'lastMessageArrivalTime': datetime(2015, 1, 1),
        'fileFormatConfiguration': {
            'jsonConfiguration': {},
            'parquetConfiguration': {
                'schemaDefinition': {
                    'columns': [
                        {
                            'name': 'string',
                            'type': 'string'
                        },
                    ]
                }
            }
        },
        'datastorePartitions': {
            'partitions': [
                {
                    'attributePartition': {
                        'attributeName': 'string'
                    },
                    'timestampPartition': {
                        'attributeName': 'string',
                        'timestampFormat': 'string'
                    }
                },
            ]
        }
    },
    'statistics': {
        'size': {
            'estimatedSizeInBytes': 123.0,
            'estimatedOn': datetime(2015, 1, 1)
        }
    }
}

Response Structure

  • (dict) --

    • datastore (dict) --

      Information about the data store.

      • name (string) --

        The name of the data store.

      • storage (dict) --

        Where data store data is stored. You can choose one of serviceManagedS3 or customerManagedS3 storage. If not specified, the default is serviceManagedS3. You cannot change this storage option after the data store is created.

        • serviceManagedS3 (dict) --

          Use this to store data store data in an S3 bucket managed by AWS IoT Analytics. You cannot change the choice of service-managed or customer-managed S3 storage after the data store is created.

        • customerManagedS3 (dict) --

          Use this to store data store data in an S3 bucket that you manage. When customer-managed storage is selected, the retentionPeriod parameter is ignored. The choice of service-managed or customer-managed S3 storage cannot be changed after creation of the data store.

          • bucket (string) --

            The name of the S3 bucket in which data store data is stored.

          • keyPrefix (string) --

            Optional. The prefix used to create the keys of the data store data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

          • roleArn (string) --

            The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

      • arn (string) --

        The ARN of the data store.

      • status (string) --

        The status of a data store:

        CREATING

        The data store is being created.

        ACTIVE

        The data store has been created and can be used.

        DELETING

        The data store is being deleted.

      • retentionPeriod (dict) --

        How long, in days, message data is kept for the data store. When customerManagedS3 storage is selected, this parameter is ignored.

        • unlimited (boolean) --

          If true, message data is kept indefinitely.

        • numberOfDays (integer) --

          The number of days that message data is kept. The unlimited parameter must be false.

      • creationTime (datetime) --

        When the data store was created.

      • lastUpdateTime (datetime) --

        The last time the data store was updated.

      • lastMessageArrivalTime (datetime) --

        The last time when a new message arrived in the data store.

        AWS IoT Analytics updates this value at most once per minute for one data store. Hence, the lastMessageArrivalTime value is an approximation.

        This feature only applies to messages that arrived in the data store after October 23, 2020.

      • fileFormatConfiguration (dict) --

        Contains the configuration information of file formats. AWS IoT Analytics data stores support JSON and Parquet.

        The default file format is JSON. You can specify only one format.

        You can't change the file format after you create the data store.

        • jsonConfiguration (dict) --

          Contains the configuration information of the JSON format.

        • parquetConfiguration (dict) --

          Contains the configuration information of the Parquet format.

          • schemaDefinition (dict) --

            Information needed to define a schema.

            • columns (list) --

              Specifies one or more columns that store your data.

              Each schema can have up to 100 columns. Each column can have up to 100 nested types.

              • (dict) --

                Contains information about a column that stores your data.

                • name (string) --

                  The name of the column.

                • type (string) --

                  The type of data. For more information about the supported data types, see Common data types in the AWS Glue Developer Guide.

      • datastorePartitions (dict) --

        Contains information about the partitions in a data store.

        • partitions (list) --

          A list of partitions in a data store.

          • (dict) --

            A single partition in a data store.

            • attributePartition (dict) --

              A partition defined by an attributeName.

              • attributeName (string) --

                The attribute name of the partition.

            • timestampPartition (dict) --

              A partition defined by an attributeName and a timestamp format.

              • attributeName (string) --

                The attribute name of the partition defined by a timestamp.

              • timestampFormat (string) --

                The timestamp format of a partition defined by a timestamp.

    • statistics (dict) --

      Additional statistical information about the data store. Included if the includeStatistics parameter is set to true in the request.

      • size (dict) --

        The estimated size of the data store.

        • estimatedSizeInBytes (float) --

          The estimated size of the resource, in bytes.

        • estimatedOn (datetime) --

          The time when the estimate of the size of the resource was made.
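
To read the new datastorePartitions field out of a describe_datastore response, the nested structure can be flattened into (kind, attributeName) pairs. The helper name partition_names and the sample response below are illustrative assumptions; the sketch tolerates a response with no partitions configured.

```python
def partition_names(describe_response: dict) -> list:
    """Return (kind, attributeName) pairs for each partition in the response."""
    partitions = (
        describe_response["datastore"]
        .get("datastorePartitions", {})
        .get("partitions", [])
    )
    result = []
    for partition in partitions:
        for kind in ("attributePartition", "timestampPartition"):
            if kind in partition:
                result.append((kind, partition[kind]["attributeName"]))
    return result


# Hand-written sample shaped like the Response Syntax above.
sample_response = {
    "datastore": {
        "name": "example_datastore",
        "status": "ACTIVE",
        "datastorePartitions": {
            "partitions": [
                {"attributePartition": {"attributeName": "deviceId"}},
                {"timestampPartition": {"attributeName": "eventTime"}},
            ]
        },
    }
}
names = partition_names(sample_response)
```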

ListDatastores (updated)
Changes (response)
{'datastoreSummaries': {'datastorePartitions': {'partitions': [{'attributePartition': {'attributeName': 'string'},
                                                                'timestampPartition': {'attributeName': 'string',
                                                                                       'timestampFormat': 'string'}}]}}}

Retrieves a list of data stores.

See also: AWS API Documentation

Request Syntax

client.list_datastores(
    nextToken='string',
    maxResults=123
)
type nextToken:

string

param nextToken:

The token for the next set of results.

type maxResults:

integer

param maxResults:

The maximum number of results to return in this request.

The default value is 100.

rtype:

dict

returns:

Response Syntax

{
    'datastoreSummaries': [
        {
            'datastoreName': 'string',
            'datastoreStorage': {
                'serviceManagedS3': {},
                'customerManagedS3': {
                    'bucket': 'string',
                    'keyPrefix': 'string',
                    'roleArn': 'string'
                }
            },
            'status': 'CREATING'|'ACTIVE'|'DELETING',
            'creationTime': datetime(2015, 1, 1),
            'lastUpdateTime': datetime(2015, 1, 1),
            'lastMessageArrivalTime': datetime(2015, 1, 1),
            'fileFormatType': 'JSON'|'PARQUET',
            'datastorePartitions': {
                'partitions': [
                    {
                        'attributePartition': {
                            'attributeName': 'string'
                        },
                        'timestampPartition': {
                            'attributeName': 'string',
                            'timestampFormat': 'string'
                        }
                    },
                ]
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • datastoreSummaries (list) --

      A list of DatastoreSummary objects.

      • (dict) --

        A summary of information about a data store.

        • datastoreName (string) --

          The name of the data store.

        • datastoreStorage (dict) --

          Where data store data is stored.

          • serviceManagedS3 (dict) --

            Used to store data store data in an S3 bucket managed by AWS IoT Analytics.

          • customerManagedS3 (dict) --

            Used to store data store data in an S3 bucket that you manage.

            • bucket (string) --

              The name of the S3 bucket in which data store data is stored.

            • keyPrefix (string) --

              Optional. The prefix used to create the keys of the data store data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

            • roleArn (string) --

              The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

        • status (string) --

          The status of the data store.

        • creationTime (datetime) --

          When the data store was created.

        • lastUpdateTime (datetime) --

          The last time the data store was updated.

        • lastMessageArrivalTime (datetime) --

          The last time when a new message arrived in the data store.

          AWS IoT Analytics updates this value at most once per minute for one data store. Hence, the lastMessageArrivalTime value is an approximation.

          This feature only applies to messages that arrived in the data store after October 23, 2020.

        • fileFormatType (string) --

          The file format of the data in the data store.

        • datastorePartitions (dict) --

          Contains information about the partitions in a data store.

          • partitions (list) --

            A list of partitions in a data store.

            • (dict) --

              A single partition in a data store.

              • attributePartition (dict) --

                A partition defined by an attributeName.

                • attributeName (string) --

                  The attribute name of the partition.

              • timestampPartition (dict) --

                A partition defined by an attributeName and a timestamp format.

                • attributeName (string) --

                  The attribute name of the partition defined by a timestamp.

                • timestampFormat (string) --

                  The timestamp format of a partition defined by a timestamp.

    • nextToken (string) --

      The token to retrieve the next set of results, or null if there are no more results.
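
Because results are paged through nextToken, listing every data store means repeating the call until no token comes back. A minimal sketch of that loop follows; the generator name iter_datastore_summaries is hypothetical, and the client is passed in so the loop can be tested with a stand-in.

```python
def iter_datastore_summaries(client, page_size: int = 100):
    """Yield every DatastoreSummary, following nextToken across pages."""
    kwargs = {"maxResults": page_size}
    while True:
        page = client.list_datastores(**kwargs)
        yield from page.get("datastoreSummaries", [])
        token = page.get("nextToken")
        if not token:
            # A missing or null token means there are no more results.
            break
        kwargs["nextToken"] = token


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   for summary in iter_datastore_summaries(boto3.client("iotanalytics")):
#       print(summary["datastoreName"], summary.get("datastorePartitions"))
```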