AWS IoT Analytics

2020/11/09 - AWS IoT Analytics - 8 updated api methods

Changes  AWS IoT Analytics now supports Late Data Notifications for datasets, dataset content creation using previous version IDs, and includes the LastMessageArrivalTime attribute for channels and datastores.

CreateDataset (updated)
Changes (request)
{'lateDataRules': [{'ruleConfiguration': {'deltaTimeSessionWindowConfiguration': {'timeoutInMinutes': 'integer'}},
                    'ruleName': 'string'}]}

Creates a dataset. A dataset stores data retrieved from a data store by applying a queryAction (a SQL query) or a containerAction (executing a containerized application). This operation creates the skeleton of a dataset. The dataset can be populated manually by calling CreateDatasetContent or automatically according to a trigger you specify.

See also: AWS API Documentation

Request Syntax

client.create_dataset(
    datasetName='string',
    actions=[
        {
            'actionName': 'string',
            'queryAction': {
                'sqlQuery': 'string',
                'filters': [
                    {
                        'deltaTime': {
                            'offsetSeconds': 123,
                            'timeExpression': 'string'
                        }
                    },
                ]
            },
            'containerAction': {
                'image': 'string',
                'executionRoleArn': 'string',
                'resourceConfiguration': {
                    'computeType': 'ACU_1'|'ACU_2',
                    'volumeSizeInGB': 123
                },
                'variables': [
                    {
                        'name': 'string',
                        'stringValue': 'string',
                        'doubleValue': 123.0,
                        'datasetContentVersionValue': {
                            'datasetName': 'string'
                        },
                        'outputFileUriValue': {
                            'fileName': 'string'
                        }
                    },
                ]
            }
        },
    ],
    triggers=[
        {
            'schedule': {
                'expression': 'string'
            },
            'dataset': {
                'name': 'string'
            }
        },
    ],
    contentDeliveryRules=[
        {
            'entryName': 'string',
            'destination': {
                'iotEventsDestinationConfiguration': {
                    'inputName': 'string',
                    'roleArn': 'string'
                },
                's3DestinationConfiguration': {
                    'bucket': 'string',
                    'key': 'string',
                    'glueConfiguration': {
                        'tableName': 'string',
                        'databaseName': 'string'
                    },
                    'roleArn': 'string'
                }
            }
        },
    ],
    retentionPeriod={
        'unlimited': True|False,
        'numberOfDays': 123
    },
    versioningConfiguration={
        'unlimited': True|False,
        'maxVersions': 123
    },
    tags=[
        {
            'key': 'string',
            'value': 'string'
        },
    ],
    lateDataRules=[
        {
            'ruleName': 'string',
            'ruleConfiguration': {
                'deltaTimeSessionWindowConfiguration': {
                    'timeoutInMinutes': 123
                }
            }
        },
    ]
)
type datasetName

string

param datasetName

[REQUIRED]

The name of the data set.

type actions

list

param actions

[REQUIRED]

A list of actions that create the data set contents.

  • (dict) --

    A DatasetAction object that specifies how data set contents are automatically created.

    • actionName (string) --

      The name of the data set action by which data set contents are automatically created.

    • queryAction (dict) --

      An SqlQueryDatasetAction object that uses an SQL query to automatically create data set contents.

      • sqlQuery (string) -- [REQUIRED]

        A SQL query string.

      • filters (list) --

        Prefilters applied to message data.

        • (dict) --

          Information that is used to filter message data, to segregate it according to the timeframe in which it arrives.

          • deltaTime (dict) --

            Used to limit data to that which has arrived since the last execution of the action.

            • offsetSeconds (integer) -- [REQUIRED]

              The number of seconds of estimated in-flight lag time of message data. When you create dataset contents using message data from a specified timeframe, some message data might still be in flight when processing begins, and so does not arrive in time to be processed. Use this field to make allowances for the in-flight time of your message data, so that data not processed from a previous timeframe is included with the next timeframe. Otherwise, missed message data would be excluded from processing during the next timeframe too, because its timestamp places it within the previous timeframe.

            • timeExpression (string) -- [REQUIRED]

              An expression by which the time of the message data might be determined. This can be the name of a timestamp field or a SQL expression that is used to derive the time the message data was generated.

    • containerAction (dict) --

      Information that allows the system to run a containerized application to create the dataset contents. The application must be in a Docker container along with any required support libraries.

      • image (string) -- [REQUIRED]

        The ARN of the Docker container stored in your account. The Docker container contains an application and required support libraries and is used to generate dataset contents.

      • executionRoleArn (string) -- [REQUIRED]

        The ARN of the role that gives permission to the system to access required resources to run the containerAction. This includes, at minimum, permission to retrieve the dataset contents that are the input to the containerized application.

      • resourceConfiguration (dict) -- [REQUIRED]

        Configuration of the resource that executes the containerAction.

        • computeType (string) -- [REQUIRED]

          The type of the compute resource used to execute the containerAction. Possible values are: ACU_1 (vCPU=4, memory=16 GiB) or ACU_2 (vCPU=8, memory=32 GiB).

        • volumeSizeInGB (integer) -- [REQUIRED]

          The size, in GB, of the persistent storage available to the resource instance used to execute the containerAction (min: 1, max: 50).

      • variables (list) --

        The values of variables used in the context of the execution of the containerized application (in effect, parameters passed to the application). Each variable must have a name and a value given by one of stringValue, doubleValue, datasetContentVersionValue, or outputFileUriValue.

        • (dict) --

          An instance of a variable to be passed to the containerAction execution. Each variable must have a name and a value given by one of stringValue, doubleValue, datasetContentVersionValue, or outputFileUriValue.

          • name (string) -- [REQUIRED]

            The name of the variable.

          • stringValue (string) --

            The value of the variable as a string.

          • doubleValue (float) --

            The value of the variable as a double (numeric).

          • datasetContentVersionValue (dict) --

            The value of the variable as a structure that specifies a dataset content version.

            • datasetName (string) -- [REQUIRED]

              The name of the dataset whose latest contents are used as input to the notebook or application.

          • outputFileUriValue (dict) --

            The value of the variable as a structure that specifies an output file URI.

            • fileName (string) -- [REQUIRED]

              The URI of the location where dataset contents are stored, usually the URI of a file in an S3 bucket.
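A minimal sketch of how the actions structure above might be assembled in Python. The helper names, the data store name my_datastore, the field ts, and the offset value are illustrative assumptions, and the "exactly one value field" check is a local reading of the variables description, not something the service is confirmed to enforce this way.

```python
# Value fields a container variable may carry, per the request syntax above.
VALUE_KEYS = (
    "stringValue",
    "doubleValue",
    "datasetContentVersionValue",
    "outputFileUriValue",
)


def build_query_action(action_name, sql_query, offset_seconds, time_expression):
    """Return a DatasetAction dict holding a queryAction with one deltaTime filter."""
    return {
        "actionName": action_name,
        "queryAction": {
            "sqlQuery": sql_query,
            "filters": [
                {
                    "deltaTime": {
                        "offsetSeconds": offset_seconds,
                        "timeExpression": time_expression,
                    }
                }
            ],
        },
    }


def check_variable(variable):
    """Return the value field in use, or raise ValueError if the variable is malformed.

    Local sanity check only (an assumption about intended usage).
    """
    if "name" not in variable:
        raise ValueError("variable needs a name")
    present = [key for key in VALUE_KEYS if key in variable]
    if len(present) != 1:
        raise ValueError("variable needs exactly one value field, found %d" % len(present))
    return present[0]


# Hypothetical names and values, for illustration only.
action = build_query_action(
    "recent_readings", "SELECT * FROM my_datastore", 60, "from_unixtime(ts)"
)
```

The resulting dict can be passed as one element of the actions list.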

type triggers

list

param triggers

A list of triggers. A trigger causes data set contents to be populated at a specified time interval or when another data set's contents are created. The list of triggers can be empty or contain up to five DatasetTrigger objects.

  • (dict) --

    The DatasetTrigger that specifies when the data set is automatically updated.

    • schedule (dict) --

      The Schedule when the trigger is initiated.

      • expression (string) --

        The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch Events User Guide.

    • dataset (dict) --

      The data set whose content creation triggers the creation of this data set's contents.

      • name (string) -- [REQUIRED]

        The name of the dataset whose content generation triggers the new dataset content generation.
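The two trigger shapes described above could be built along these lines. This is a sketch: the helper names and the dataset name upstream_dataset are hypothetical, and the five-trigger limit is checked locally based on the description of the triggers parameter.

```python
MAX_TRIGGERS = 5  # the triggers list can be empty or hold up to five entries


def schedule_trigger(expression):
    """Trigger that fires on a CloudWatch Events schedule expression."""
    return {"schedule": {"expression": expression}}


def dataset_trigger(name):
    """Trigger that fires when the named dataset's contents are created."""
    return {"dataset": {"name": name}}


def build_triggers(*triggers):
    """Return the triggers as a list, rejecting more than MAX_TRIGGERS locally."""
    if len(triggers) > MAX_TRIGGERS:
        raise ValueError("at most %d triggers are allowed" % MAX_TRIGGERS)
    return list(triggers)


# cron(0 12 * * ? *) is a schedule expression for 12:00 UTC every day.
triggers = build_triggers(
    schedule_trigger("cron(0 12 * * ? *)"),
    dataset_trigger("upstream_dataset"),  # hypothetical dataset name
)
```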

type contentDeliveryRules

list

param contentDeliveryRules

When dataset contents are created, they are delivered to destinations specified here.

  • (dict) --

    When dataset contents are created, they are delivered to the destination specified here.

    • entryName (string) --

      The name of the dataset content delivery rules entry.

    • destination (dict) -- [REQUIRED]

      The destination to which dataset contents are delivered.

      • iotEventsDestinationConfiguration (dict) --

        Configuration information for delivery of dataset contents to AWS IoT Events.

        • inputName (string) -- [REQUIRED]

          The name of the AWS IoT Events input to which dataset contents are delivered.

        • roleArn (string) -- [REQUIRED]

          The ARN of the role that grants AWS IoT Analytics permission to deliver dataset contents to an AWS IoT Events input.

      • s3DestinationConfiguration (dict) --

        Configuration information for delivery of dataset contents to Amazon S3.

        • bucket (string) -- [REQUIRED]

          The name of the S3 bucket to which dataset contents are delivered.

        • key (string) -- [REQUIRED]

          The key of the dataset contents object in an S3 bucket. Each object has a key that is a unique identifier. Each object has exactly one key.

          You can create a unique key with the following options:

          • Use !{iotanalytics:scheduleTime} to insert the time of a scheduled SQL query run.

          • Use !{iotanalytics:versionId} to insert a unique hash that identifies a dataset content.

          • Use !{iotanalytics:creationTime} to insert the creation time of a dataset content.

          The following example creates a unique key for a CSV file: dataset/mydataset/!{iotanalytics:scheduleTime}/!{iotanalytics:versionId}.csv

          Note

          If you don't use !{iotanalytics:versionId} to specify the key, you might get duplicate keys. For example, you might have two dataset contents with the same scheduleTime but different versionId values. This means that one dataset content overwrites the other.

        • glueConfiguration (dict) --

          Configuration information for coordination with AWS Glue, a fully managed extract, transform and load (ETL) service.

          • tableName (string) -- [REQUIRED]

            The name of the table in your AWS Glue Data Catalog that is used to perform the ETL operations. An AWS Glue Data Catalog table contains partitioned data and descriptions of data sources and targets.

          • databaseName (string) -- [REQUIRED]

            The name of the database in your AWS Glue Data Catalog in which the table is located. An AWS Glue Data Catalog database contains metadata tables.

        • roleArn (string) -- [REQUIRED]

          The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 and AWS Glue resources.
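To illustrate the note about duplicate keys, the sketch below substitutes the placeholders locally. The real substitution is performed by the service; render_key, the timestamp, and the version IDs are made-up stand-ins used only to show why a key template without !{iotanalytics:versionId} can collide.

```python
def render_key(template, values):
    """Replace each !{iotanalytics:NAME} placeholder with values[NAME].

    Local illustration only; the service performs the real substitution.
    """
    for name, value in values.items():
        template = template.replace("!{iotanalytics:%s}" % name, value)
    return template


with_version = "dataset/mydataset/!{iotanalytics:scheduleTime}/!{iotanalytics:versionId}.csv"
without_version = "dataset/mydataset/!{iotanalytics:scheduleTime}.csv"

# Two hypothetical dataset contents: same scheduleTime, different versionId.
first = {"scheduleTime": "2020-11-09T12:00:00", "versionId": "aaaa"}
second = {"scheduleTime": "2020-11-09T12:00:00", "versionId": "bbbb"}

distinct_keys = {render_key(with_version, first), render_key(with_version, second)}
colliding_keys = {render_key(without_version, first), render_key(without_version, second)}

print(len(distinct_keys))   # 2: each content gets its own object key
print(len(colliding_keys))  # 1: the second content would overwrite the first
```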

type retentionPeriod

dict

param retentionPeriod

Optional. How long, in days, versions of dataset contents are kept for the dataset. If not specified or set to null, versions of dataset contents are retained for at most 90 days. The number of versions of dataset contents retained is determined by the versioningConfiguration parameter. For more information, see Keeping Multiple Versions of AWS IoT Analytics Data Sets in the AWS IoT Analytics User Guide.

  • unlimited (boolean) --

    If true, message data is kept indefinitely.

  • numberOfDays (integer) --

    The number of days that message data is kept. The unlimited parameter must be false.

type versioningConfiguration

dict

param versioningConfiguration

Optional. How many versions of dataset contents are kept. If not specified or set to null, only the latest version plus the latest succeeded version (if they are different) are kept for the time period specified by the retentionPeriod parameter. For more information, see Keeping Multiple Versions of AWS IoT Analytics Data Sets in the AWS IoT Analytics User Guide.

  • unlimited (boolean) --

    If true, unlimited versions of dataset contents are kept.

  • maxVersions (integer) --

    How many versions of dataset contents are kept. The unlimited parameter must be false.
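Both retentionPeriod and versioningConfiguration pair an unlimited flag with a numeric limit that is only meaningful when unlimited is false. One way to keep the two fields consistent is a pair of small helpers; the helper names and the example values below are assumptions for illustration.

```python
def retention_period(number_of_days=None):
    """Build a retentionPeriod dict; None means keep contents indefinitely."""
    if number_of_days is None:
        return {"unlimited": True}
    return {"unlimited": False, "numberOfDays": number_of_days}


def versioning_configuration(max_versions=None):
    """Build a versioningConfiguration dict; None means keep unlimited versions."""
    if max_versions is None:
        return {"unlimited": True}
    return {"unlimited": False, "maxVersions": max_versions}


# Hypothetical settings: keep contents for 30 days, at most 10 versions.
settings = {
    "retentionPeriod": retention_period(30),
    "versioningConfiguration": versioning_configuration(10),
}
```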

type tags

list

param tags

Metadata which can be used to manage the data set.

  • (dict) --

    A set of key-value pairs that are used to manage the resource.

    • key (string) -- [REQUIRED]

      The tag's key.

    • value (string) -- [REQUIRED]

      The tag's value.

type lateDataRules

list

param lateDataRules

A list of data rules that send notifications to Amazon CloudWatch when data arrives late. To specify lateDataRules, the dataset must use a DeltaTimer filter.

  • (dict) --

    A structure that contains the name and configuration information of a late data rule.

    • ruleName (string) --

      The name of the late data rule.

    • ruleConfiguration (dict) -- [REQUIRED]

      The information needed to configure the late data rule.

      • deltaTimeSessionWindowConfiguration (dict) --

        The information needed to configure a delta time session window.

        • timeoutInMinutes (integer) -- [REQUIRED]

          A time interval. You can use timeoutInMinutes so that AWS IoT Analytics can batch up late data notifications that have been generated since the last execution. AWS IoT Analytics sends one batch of notifications to Amazon CloudWatch Events at one time.

          For more information about how to write a timestamp expression, see Date and Time Functions and Operators, in the Presto 0.172 Documentation.
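Putting the new lateDataRules parameter together with a deltaTime filter might look like the sketch below. All resource names, the field ts, and the numeric values are hypothetical. The precondition check is a local convenience based on the lateDataRules description, and create_dataset is defined but not called here because it needs boto3 and AWS credentials.

```python
def build_late_data_rule(rule_name, timeout_in_minutes):
    """Return one lateDataRules entry with a delta time session window."""
    return {
        "ruleName": rule_name,
        "ruleConfiguration": {
            "deltaTimeSessionWindowConfiguration": {
                "timeoutInMinutes": timeout_in_minutes
            }
        },
    }


def uses_delta_time_filter(actions):
    """Return True if any queryAction in actions carries a deltaTime filter."""
    for action in actions:
        filters = action.get("queryAction", {}).get("filters", [])
        if any("deltaTime" in entry for entry in filters):
            return True
    return False


def build_create_dataset_request(dataset_name, actions, late_data_rules=None):
    """Assemble keyword arguments for create_dataset, checking the filter precondition."""
    if late_data_rules and not uses_delta_time_filter(actions):
        raise ValueError("lateDataRules requires an action with a deltaTime filter")
    request = {"datasetName": dataset_name, "actions": actions}
    if late_data_rules:
        request["lateDataRules"] = late_data_rules
    return request


def create_dataset(request):
    """Send the request to AWS IoT Analytics (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builders above work without the SDK

    return boto3.client("iotanalytics").create_dataset(**request)


# Hypothetical names and values, for illustration only.
actions = [
    {
        "actionName": "recent_readings",
        "queryAction": {
            "sqlQuery": "SELECT * FROM my_datastore",
            "filters": [
                {"deltaTime": {"offsetSeconds": 60, "timeExpression": "from_unixtime(ts)"}}
            ],
        },
    }
]
request = build_create_dataset_request(
    "my_dataset", actions, [build_late_data_rule("late_readings", 30)]
)
```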

rtype

dict

returns

Response Syntax

{
    'datasetName': 'string',
    'datasetArn': 'string',
    'retentionPeriod': {
        'unlimited': True|False,
        'numberOfDays': 123
    }
}

Response Structure

  • (dict) --

    • datasetName (string) --

      The name of the dataset.

    • datasetArn (string) --

      The ARN of the dataset.

    • retentionPeriod (dict) --

      How long, in days, dataset contents are kept for the dataset.

      • unlimited (boolean) --

        If true, message data is kept indefinitely.

      • numberOfDays (integer) --

        The number of days that message data is kept. The unlimited parameter must be false.

CreateDatasetContent (updated)
Changes (request)
{'versionId': 'string'}

Creates the content of a data set by applying a queryAction (a SQL query) or a containerAction (executing a containerized application).

See also: AWS API Documentation

Request Syntax

client.create_dataset_content(
    datasetName='string',
    versionId='string'
)
type datasetName

string

param datasetName

[REQUIRED]

The name of the dataset.

type versionId

string

param versionId

The version ID of the dataset content. To specify versionId for a dataset content, the dataset must use a DeltaTimer filter.

rtype

dict

returns

Response Syntax

{
    'versionId': 'string'
}

Response Structure

  • (dict) --

    • versionId (string) --

      The version ID of the dataset contents that are being created.
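Because versionId is optional, a caller may want to leave it out of the request entirely when no previous version is being referenced. The following sketch shows one way to do that; the helper names are assumptions, and client stands for an object exposing create_dataset_content, such as the client returned by boto3.client("iotanalytics").

```python
def create_dataset_content_kwargs(dataset_name, version_id=None):
    """Build keyword arguments, including versionId only when one is given."""
    kwargs = {"datasetName": dataset_name}
    if version_id is not None:
        kwargs["versionId"] = version_id
    return kwargs


def start_content_creation(client, dataset_name, version_id=None):
    """Call create_dataset_content and return the versionId from the response."""
    response = client.create_dataset_content(
        **create_dataset_content_kwargs(dataset_name, version_id)
    )
    return response["versionId"]
```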

DescribeChannel (updated)
Changes (response)
{'channel': {'lastMessageArrivalTime': 'timestamp'}}

Retrieves information about a channel.

See also: AWS API Documentation

Request Syntax

client.describe_channel(
    channelName='string',
    includeStatistics=True|False
)
type channelName

string

param channelName

[REQUIRED]

The name of the channel whose information is retrieved.

type includeStatistics

boolean

param includeStatistics

If true, additional statistical information about the channel is included in the response. This feature cannot be used with a channel whose S3 storage is customer-managed.

rtype

dict

returns

Response Syntax

{
    'channel': {
        'name': 'string',
        'storage': {
            'serviceManagedS3': {},
            'customerManagedS3': {
                'bucket': 'string',
                'keyPrefix': 'string',
                'roleArn': 'string'
            }
        },
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'DELETING',
        'retentionPeriod': {
            'unlimited': True|False,
            'numberOfDays': 123
        },
        'creationTime': datetime(2015, 1, 1),
        'lastUpdateTime': datetime(2015, 1, 1),
        'lastMessageArrivalTime': datetime(2015, 1, 1)
    },
    'statistics': {
        'size': {
            'estimatedSizeInBytes': 123.0,
            'estimatedOn': datetime(2015, 1, 1)
        }
    }
}

Response Structure

  • (dict) --

    • channel (dict) --

      An object that contains information about the channel.

      • name (string) --

        The name of the channel.

      • storage (dict) --

        Where channel data is stored. You can choose one of serviceManagedS3 or customerManagedS3 storage. If not specified, the default is serviceManagedS3. You cannot change this storage option after the channel is created.

        • serviceManagedS3 (dict) --

          Use this to store channel data in an S3 bucket managed by AWS IoT Analytics. You cannot change the choice of service-managed or customer-managed S3 storage after the channel is created.

        • customerManagedS3 (dict) --

          Use this to store channel data in an S3 bucket that you manage. If customer managed storage is selected, the retentionPeriod parameter is ignored. You cannot change the choice of service-managed or customer-managed S3 storage after the channel is created.

          • bucket (string) --

            The name of the S3 bucket in which channel data is stored.

          • keyPrefix (string) --

            Optional. The prefix used to create the keys of the channel data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

          • roleArn (string) --

            The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

      • arn (string) --

        The ARN of the channel.

      • status (string) --

        The status of the channel.

      • retentionPeriod (dict) --

        How long, in days, message data is kept for the channel.

        • unlimited (boolean) --

          If true, message data is kept indefinitely.

        • numberOfDays (integer) --

          The number of days that message data is kept. The unlimited parameter must be false.

      • creationTime (datetime) --

        When the channel was created.

      • lastUpdateTime (datetime) --

        When the channel was last updated.

      • lastMessageArrivalTime (datetime) --

        The last time when a new message arrived in the channel.

        AWS IoT Analytics updates this value at most once per minute for one channel. Hence, the lastMessageArrivalTime value is an approximation.

        This feature only applies to messages that arrived in the data store after October 23, 2020.

    • statistics (dict) --

      Statistics about the channel. Included if the includeStatistics parameter is set to true in the request.

      • size (dict) --

        The estimated size of the channel.

        • estimatedSizeInBytes (float) --

          The estimated size of the resource, in bytes.

        • estimatedOn (datetime) --

          The time when the estimate of the size of the resource was made.
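As a sketch of how the new lastMessageArrivalTime field could be consumed, the helper below computes how long a channel has been quiet. It assumes the response carries timezone-aware datetime values and treats the field as possibly absent; both are assumptions, and the sample response is invented for illustration.

```python
from datetime import datetime, timezone


def seconds_since_last_message(response, now=None):
    """Return seconds elapsed since lastMessageArrivalTime, or None if it is absent.

    The value is approximate, since the timestamp is updated at most once per minute.
    """
    last_arrival = response.get("channel", {}).get("lastMessageArrivalTime")
    if last_arrival is None:
        return None
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - last_arrival).total_seconds()


# Invented response fragment in the shape of the response syntax above.
sample_response = {
    "channel": {
        "name": "my_channel",
        "status": "ACTIVE",
        "lastMessageArrivalTime": datetime(2020, 11, 9, 12, 0, tzinfo=timezone.utc),
    }
}
quiet_for = seconds_since_last_message(
    sample_response, now=datetime(2020, 11, 9, 12, 5, tzinfo=timezone.utc)
)
```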

DescribeDataset (updated)
Changes (response)
{'dataset': {'lateDataRules': [{'ruleConfiguration': {'deltaTimeSessionWindowConfiguration': {'timeoutInMinutes': 'integer'}},
                                'ruleName': 'string'}]}}

Retrieves information about a dataset.

See also: AWS API Documentation

Request Syntax

client.describe_dataset(
    datasetName='string'
)
type datasetName

string

param datasetName

[REQUIRED]

The name of the data set whose information is retrieved.

rtype

dict

returns

Response Syntax

{
    'dataset': {
        'name': 'string',
        'arn': 'string',
        'actions': [
            {
                'actionName': 'string',
                'queryAction': {
                    'sqlQuery': 'string',
                    'filters': [
                        {
                            'deltaTime': {
                                'offsetSeconds': 123,
                                'timeExpression': 'string'
                            }
                        },
                    ]
                },
                'containerAction': {
                    'image': 'string',
                    'executionRoleArn': 'string',
                    'resourceConfiguration': {
                        'computeType': 'ACU_1'|'ACU_2',
                        'volumeSizeInGB': 123
                    },
                    'variables': [
                        {
                            'name': 'string',
                            'stringValue': 'string',
                            'doubleValue': 123.0,
                            'datasetContentVersionValue': {
                                'datasetName': 'string'
                            },
                            'outputFileUriValue': {
                                'fileName': 'string'
                            }
                        },
                    ]
                }
            },
        ],
        'triggers': [
            {
                'schedule': {
                    'expression': 'string'
                },
                'dataset': {
                    'name': 'string'
                }
            },
        ],
        'contentDeliveryRules': [
            {
                'entryName': 'string',
                'destination': {
                    'iotEventsDestinationConfiguration': {
                        'inputName': 'string',
                        'roleArn': 'string'
                    },
                    's3DestinationConfiguration': {
                        'bucket': 'string',
                        'key': 'string',
                        'glueConfiguration': {
                            'tableName': 'string',
                            'databaseName': 'string'
                        },
                        'roleArn': 'string'
                    }
                }
            },
        ],
        'status': 'CREATING'|'ACTIVE'|'DELETING',
        'creationTime': datetime(2015, 1, 1),
        'lastUpdateTime': datetime(2015, 1, 1),
        'retentionPeriod': {
            'unlimited': True|False,
            'numberOfDays': 123
        },
        'versioningConfiguration': {
            'unlimited': True|False,
            'maxVersions': 123
        },
        'lateDataRules': [
            {
                'ruleName': 'string',
                'ruleConfiguration': {
                    'deltaTimeSessionWindowConfiguration': {
                        'timeoutInMinutes': 123
                    }
                }
            },
        ]
    }
}
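Reading the late data rules back out of a DescribeDataset response could be done roughly as follows. The helper name and the sample response are assumptions; missing keys are tolerated because the request-side description marks deltaTimeSessionWindowConfiguration and ruleName as optional.

```python
def late_data_timeouts(response):
    """Return (ruleName, timeoutInMinutes) pairs found in a describe_dataset response."""
    pairs = []
    for rule in response.get("dataset", {}).get("lateDataRules", []):
        window = rule.get("ruleConfiguration", {}).get(
            "deltaTimeSessionWindowConfiguration", {}
        )
        pairs.append((rule.get("ruleName"), window.get("timeoutInMinutes")))
    return pairs


# Invented response fragment in the shape of the response syntax above.
sample_response = {
    "dataset": {
        "name": "my_dataset",
        "lateDataRules": [
            {
                "ruleName": "late_readings",
                "ruleConfiguration": {
                    "deltaTimeSessionWindowConfiguration": {"timeoutInMinutes": 30}
                },
            }
        ],
    }
}
timeouts = late_data_timeouts(sample_response)
```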

Response Structure

  • (dict) --

    • dataset (dict) --

      An object that contains information about the data set.

      • name (string) --

        The name of the data set.

      • arn (string) --

        The ARN of the data set.

      • actions (list) --

        The DatasetAction objects that automatically create the data set contents.

        • (dict) --

          A DatasetAction object that specifies how data set contents are automatically created.

          • actionName (string) --

            The name of the data set action by which data set contents are automatically created.

          • queryAction (dict) --

            An SqlQueryDatasetAction object that uses an SQL query to automatically create data set contents.

            • sqlQuery (string) --

              A SQL query string.

            • filters (list) --

              Prefilters applied to message data.

              • (dict) --

                Information that is used to filter message data, to segregate it according to the timeframe in which it arrives.

                • deltaTime (dict) --

                  Used to limit data to that which has arrived since the last execution of the action.

                  • offsetSeconds (integer) --

                    The number of seconds of estimated in-flight lag time of message data. When you create dataset contents using message data from a specified timeframe, some message data might still be in flight when processing begins, and so does not arrive in time to be processed. Use this field to make allowances for the in-flight time of your message data, so that data not processed from a previous timeframe is included with the next timeframe. Otherwise, missed message data would be excluded from processing during the next timeframe too, because its timestamp places it within the previous timeframe.

                  • timeExpression (string) --

                    An expression by which the time of the message data might be determined. This can be the name of a timestamp field or a SQL expression that is used to derive the time the message data was generated.

          • containerAction (dict) --

            Information that allows the system to run a containerized application to create the dataset contents. The application must be in a Docker container along with any required support libraries.

            • image (string) --

              The ARN of the Docker container stored in your account. The Docker container contains an application and required support libraries and is used to generate dataset contents.

            • executionRoleArn (string) --

              The ARN of the role that gives permission to the system to access required resources to run the containerAction. This includes, at minimum, permission to retrieve the dataset contents that are the input to the containerized application.

            • resourceConfiguration (dict) --

              Configuration of the resource that executes the containerAction.

              • computeType (string) --

                The type of the compute resource used to execute the containerAction. Possible values are: ACU_1 (vCPU=4, memory=16 GiB) or ACU_2 (vCPU=8, memory=32 GiB).

              • volumeSizeInGB (integer) --

                The size, in GB, of the persistent storage available to the resource instance used to execute the containerAction (min: 1, max: 50).

            • variables (list) --

              The values of variables used in the context of the execution of the containerized application (in effect, parameters passed to the application). Each variable must have a name and a value given by one of stringValue, doubleValue, datasetContentVersionValue, or outputFileUriValue.

              • (dict) --

                An instance of a variable to be passed to the containerAction execution. Each variable must have a name and a value given by one of stringValue, doubleValue, datasetContentVersionValue, or outputFileUriValue.

                • name (string) --

                  The name of the variable.

                • stringValue (string) --

                  The value of the variable as a string.

                • doubleValue (float) --

                  The value of the variable as a double (numeric).

                • datasetContentVersionValue (dict) --

                  The value of the variable as a structure that specifies a dataset content version.

                  • datasetName (string) --

                    The name of the dataset whose latest contents are used as input to the notebook or application.

                • outputFileUriValue (dict) --

                  The value of the variable as a structure that specifies an output file URI.

                  • fileName (string) --

                    The URI of the location where dataset contents are stored, usually the URI of a file in an S3 bucket.

      • triggers (list) --

        The DatasetTrigger objects that specify when the data set is automatically updated.

        • (dict) --

          The DatasetTrigger that specifies when the data set is automatically updated.

          • schedule (dict) --

            The Schedule when the trigger is initiated.

            • expression (string) --

              The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch Events User Guide.

          • dataset (dict) --

            The data set whose content creation triggers the creation of this data set's contents.

            • name (string) --

              The name of the dataset whose content generation triggers the new dataset content generation.

      • contentDeliveryRules (list) --

        When dataset contents are created, they are delivered to destinations specified here.

        • (dict) --

          When dataset contents are created, they are delivered to the destination specified here.

          • entryName (string) --

            The name of the dataset content delivery rules entry.

          • destination (dict) --

            The destination to which dataset contents are delivered.

            • iotEventsDestinationConfiguration (dict) --

              Configuration information for delivery of dataset contents to AWS IoT Events.

              • inputName (string) --

                The name of the AWS IoT Events input to which dataset contents are delivered.

              • roleArn (string) --

                The ARN of the role that grants AWS IoT Analytics permission to deliver dataset contents to an AWS IoT Events input.

            • s3DestinationConfiguration (dict) --

              Configuration information for delivery of dataset contents to Amazon S3.

              • bucket (string) --

                The name of the S3 bucket to which dataset contents are delivered.

              • key (string) --

                The key of the dataset contents object in an S3 bucket. Each object has a key that is a unique identifier. Each object has exactly one key.

                You can create a unique key with the following options:

                • Use !{iotanalytics:scheduleTime} to insert the time of a scheduled SQL query run.

                • Use !{iotanalytics:versionId} to insert a unique hash that identifies a dataset content.

                • Use !{iotanalytics:creationTime} to insert the creation time of a dataset content.

                The following example creates a unique key for a CSV file: dataset/mydataset/!{iotanalytics:scheduleTime}/!{iotanalytics:versionId}.csv

                Note

                If you don't use !{iotanalytics:versionId} to specify the key, you might get duplicate keys. For example, you might have two dataset contents with the same scheduleTime but different versionId values. This means that one dataset content overwrites the other.

              • glueConfiguration (dict) --

                Configuration information for coordination with AWS Glue, a fully managed extract, transform and load (ETL) service.

                • tableName (string) --

                  The name of the table in your AWS Glue Data Catalog that is used to perform the ETL operations. An AWS Glue Data Catalog table contains partitioned data and descriptions of data sources and targets.

                • databaseName (string) --

                  The name of the database in your AWS Glue Data Catalog in which the table is located. An AWS Glue Data Catalog database contains metadata tables.

              • roleArn (string) --

                The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 and AWS Glue resources.

      • status (string) --

        The status of the data set.

      • creationTime (datetime) --

        When the data set was created.

      • lastUpdateTime (datetime) --

        The last time the data set was updated.

      • retentionPeriod (dict) --

        Optional. How long, in days, message data is kept for the data set.

        • unlimited (boolean) --

          If true, message data is kept indefinitely.

        • numberOfDays (integer) --

          The number of days that message data is kept. The unlimited parameter must be false.

      • versioningConfiguration (dict) --

        Optional. How many versions of dataset contents are kept. If not specified or set to null, only the latest version plus the latest succeeded version (if they are different) are kept for the time period specified by the retentionPeriod parameter. For more information, see Keeping Multiple Versions of AWS IoT Analytics Data Sets in the AWS IoT Analytics User Guide.

        • unlimited (boolean) --

          If true, unlimited versions of dataset contents are kept.

        • maxVersions (integer) --

          How many versions of dataset contents are kept. The unlimited parameter must be false.

      • lateDataRules (list) --

        A list of data rules that send notifications to Amazon CloudWatch when data arrives late. To specify lateDataRules, the dataset must use a deltaTime filter.

        • (dict) --

          A structure that contains the name and configuration information of a late data rule.

          • ruleName (string) --

            The name of the late data rule.

          • ruleConfiguration (dict) --

            The information needed to configure the late data rule.

            • deltaTimeSessionWindowConfiguration (dict) --

              The information needed to configure a delta time session window.

              • timeoutInMinutes (integer) --

                A time interval. You can use timeoutInMinutes so that AWS IoT Analytics can batch up late data notifications that have been generated since the last execution. AWS IoT Analytics sends one batch of notifications to Amazon CloudWatch Events at one time.

                For more information about how to write a timestamp expression, see Date and Time Functions and Operators in the Presto 0.172 Documentation.
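
As a minimal sketch of reading the lateDataRules structure above, the helper below maps each rule name to its timeoutInMinutes. It assumes `dataset` is a dict shaped like the dataset described in this response structure. The function name and the '<unnamed>' placeholder are hypothetical, and rules without a deltaTimeSessionWindowConfiguration are skipped.

```python
def late_data_timeouts(dataset):
    """Map each late data rule name to its timeoutInMinutes."""
    timeouts = {}
    for rule in dataset.get('lateDataRules', []):
        window = rule.get('ruleConfiguration', {}).get(
            'deltaTimeSessionWindowConfiguration')
        if window is None:
            # Nothing to report for a rule without a session window.
            continue
        # Fall back to a placeholder (an assumption of this sketch) if a
        # rule has no ruleName.
        timeouts[rule.get('ruleName', '<unnamed>')] = window['timeoutInMinutes']
    return timeouts
```

A dataset with no lateDataRules key simply yields an empty dict, so the helper can be applied to datasets created before this update.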

DescribeDatastore (updated) Link ¶
Changes (response)
{'datastore': {'lastMessageArrivalTime': 'timestamp'}}

Retrieves information about a data store.

See also: AWS API Documentation

Request Syntax

client.describe_datastore(
    datastoreName='string',
    includeStatistics=True|False
)
type datastoreName

string

param datastoreName

[REQUIRED]

The name of the data store.

type includeStatistics

boolean

param includeStatistics

If true, additional statistical information about the data store is included in the response. This feature cannot be used with a data store whose S3 storage is customer-managed.

rtype

dict

returns

Response Syntax

{
    'datastore': {
        'name': 'string',
        'storage': {
            'serviceManagedS3': {},
            'customerManagedS3': {
                'bucket': 'string',
                'keyPrefix': 'string',
                'roleArn': 'string'
            }
        },
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'DELETING',
        'retentionPeriod': {
            'unlimited': True|False,
            'numberOfDays': 123
        },
        'creationTime': datetime(2015, 1, 1),
        'lastUpdateTime': datetime(2015, 1, 1),
        'lastMessageArrivalTime': datetime(2015, 1, 1)
    },
    'statistics': {
        'size': {
            'estimatedSizeInBytes': 123.0,
            'estimatedOn': datetime(2015, 1, 1)
        }
    }
}

Response Structure

  • (dict) --

    • datastore (dict) --

      Information about the data store.

      • name (string) --

        The name of the data store.

      • storage (dict) --

        Where data store data is stored. You can choose one of serviceManagedS3 or customerManagedS3 storage. If not specified, the default is serviceManagedS3. You cannot change this storage option after the data store is created.

        • serviceManagedS3 (dict) --

          Use this to store data store data in an S3 bucket managed by AWS IoT Analytics. You cannot change the choice of service-managed or customer-managed S3 storage after the data store is created.

        • customerManagedS3 (dict) --

          Use this to store data store data in an S3 bucket that you manage. When customer managed storage is selected, the retentionPeriod parameter is ignored. The choice of service-managed or customer-managed S3 storage cannot be changed after creation of the data store.

          • bucket (string) --

            The name of the S3 bucket in which data store data is stored.

          • keyPrefix (string) --

            Optional. The prefix used to create the keys of the data store data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

          • roleArn (string) --

            The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

      • arn (string) --

        The ARN of the data store.

      • status (string) --

        The status of a data store:

        CREATING

        The data store is being created.

        ACTIVE

        The data store has been created and can be used.

        DELETING

        The data store is being deleted.

      • retentionPeriod (dict) --

        How long, in days, message data is kept for the data store. When customerManagedS3 storage is selected, this parameter is ignored.

        • unlimited (boolean) --

          If true, message data is kept indefinitely.

        • numberOfDays (integer) --

          The number of days that message data is kept. The unlimited parameter must be false.

      • creationTime (datetime) --

        When the data store was created.

      • lastUpdateTime (datetime) --

        The last time the data store was updated.

      • lastMessageArrivalTime (datetime) --

        The last time when a new message arrived in the data store.

        AWS IoT Analytics updates this value at most once per minute for one data store. Hence, the lastMessageArrivalTime value is an approximation.

        This feature only applies to messages that arrived in the data store after October 23, 2020.

    • statistics (dict) --

      Additional statistical information about the data store. Included if the includeStatistics parameter is set to true in the request.

      • size (dict) --

        The estimated size of the data store.

        • estimatedSizeInBytes (float) --

          The estimated size of the resource, in bytes.

        • estimatedOn (datetime) --

          The time when the estimate of the size of the resource was made.
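
The interaction between storage and retentionPeriod described above can be sketched as a small helper. It assumes `datastore` is the 'datastore' dict from a describe_datastore response. The function name and the returned strings are hypothetical, chosen only for illustration.

```python
def effective_retention(datastore):
    """Describe how long message data is kept for a data store description."""
    storage = datastore.get('storage', {})
    if 'customerManagedS3' in storage:
        # The reference states retentionPeriod is ignored in this case.
        return 'ignored (customer-managed storage)'
    retention = datastore.get('retentionPeriod', {})
    if retention.get('unlimited'):
        return 'unlimited'
    days = retention.get('numberOfDays')
    return f'{days} days' if days is not None else 'not specified'


# Possible usage, assuming a boto3 IoT Analytics client and a hypothetical
# data store name:
#   response = client.describe_datastore(datastoreName='my_datastore')
#   print(effective_retention(response['datastore']))
```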

ListChannels (updated) Link ¶
Changes (response)
{'channelSummaries': {'lastMessageArrivalTime': 'timestamp'}}

Retrieves a list of channels.

See also: AWS API Documentation

Request Syntax

client.list_channels(
    nextToken='string',
    maxResults=123
)
type nextToken

string

param nextToken

The token for the next set of results.

type maxResults

integer

param maxResults

The maximum number of results to return in this request.

The default value is 100.

rtype

dict

returns

Response Syntax

{
    'channelSummaries': [
        {
            'channelName': 'string',
            'channelStorage': {
                'serviceManagedS3': {},
                'customerManagedS3': {
                    'bucket': 'string',
                    'keyPrefix': 'string',
                    'roleArn': 'string'
                }
            },
            'status': 'CREATING'|'ACTIVE'|'DELETING',
            'creationTime': datetime(2015, 1, 1),
            'lastUpdateTime': datetime(2015, 1, 1),
            'lastMessageArrivalTime': datetime(2015, 1, 1)
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • channelSummaries (list) --

      A list of ChannelSummary objects.

      • (dict) --

        A summary of information about a channel.

        • channelName (string) --

          The name of the channel.

        • channelStorage (dict) --

          Where channel data is stored.

          • serviceManagedS3 (dict) --

            Used to store channel data in an S3 bucket managed by AWS IoT Analytics.

          • customerManagedS3 (dict) --

            Used to store channel data in an S3 bucket that you manage.

            • bucket (string) --

              The name of the S3 bucket in which channel data is stored.

            • keyPrefix (string) --

              Optional. The prefix used to create the keys of the channel data objects. Each object in an S3 bucket has a key that is its unique identifier within the bucket (each object in a bucket has exactly one key). The prefix must end with a forward slash (/).

            • roleArn (string) --

              The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

        • status (string) --

          The status of the channel.

        • creationTime (datetime) --

          When the channel was created.

        • lastUpdateTime (datetime) --

          The last time the channel was updated.

        • lastMessageArrivalTime (datetime) --

          The last time when a new message arrived in the channel.

          AWS IoT Analytics updates this value at most once per minute for one channel. Hence, the lastMessageArrivalTime value is an approximation.

          This feature only applies to messages that arrived in the channel after October 23, 2020.

    • nextToken (string) --

      The token to retrieve the next set of results, or null if there are no more results.
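
The nextToken handling described above can be sketched as a generator that keeps requesting pages until no token is returned. It assumes `client` is an object exposing list_channels with the request and response shapes shown in this section (for example, a boto3 IoT Analytics client). The function name is hypothetical.

```python
def iter_channel_summaries(client, page_size=100):
    """Yield every channel summary, following nextToken across pages."""
    kwargs = {'maxResults': page_size}
    while True:
        page = client.list_channels(**kwargs)
        yield from page.get('channelSummaries', [])
        token = page.get('nextToken')
        if not token:
            # No token means there are no more results.
            break
        kwargs['nextToken'] = token
```

Passing the client in as an argument keeps the loop independent of how the client is created, which also makes it easy to exercise with a stub.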

ListDatastores (updated) Link ¶
Changes (response)
{'datastoreSummaries': {'lastMessageArrivalTime': 'timestamp'}}

Retrieves a list of data stores.

See also: AWS API Documentation

Request Syntax

client.list_datastores(
    nextToken='string',
    maxResults=123
)
type nextToken

string

param nextToken

The token for the next set of results.

type maxResults

integer

param maxResults

The maximum number of results to return in this request.

The default value is 100.

rtype

dict

returns

Response Syntax

{
    'datastoreSummaries': [
        {
            'datastoreName': 'string',
            'datastoreStorage': {
                'serviceManagedS3': {},
                'customerManagedS3': {
                    'bucket': 'string',
                    'keyPrefix': 'string',
                    'roleArn': 'string'
                }
            },
            'status': 'CREATING'|'ACTIVE'|'DELETING',
            'creationTime': datetime(2015, 1, 1),
            'lastUpdateTime': datetime(2015, 1, 1),
            'lastMessageArrivalTime': datetime(2015, 1, 1)
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • datastoreSummaries (list) --

      A list of DatastoreSummary objects.

      • (dict) --

        A summary of information about a data store.

        • datastoreName (string) --

          The name of the data store.

        • datastoreStorage (dict) --

          Where data store data is stored.

          • serviceManagedS3 (dict) --

            Used to store data store data in an S3 bucket managed by AWS IoT Analytics.

          • customerManagedS3 (dict) --

            Used to store data store data in an S3 bucket that you manage.

            • bucket (string) --

              The name of the S3 bucket in which data store data is stored.

            • keyPrefix (string) --

              Optional. The prefix used to create the keys of the data store data objects. Each object in an S3 bucket has a key that is its unique identifier in the bucket. Each object in a bucket has exactly one key. The prefix must end with a forward slash (/).

            • roleArn (string) --

              The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 resources.

        • status (string) --

          The status of the data store.

        • creationTime (datetime) --

          When the data store was created.

        • lastUpdateTime (datetime) --

          The last time the data store was updated.

        • lastMessageArrivalTime (datetime) --

          The last time when a new message arrived in the data store.

          AWS IoT Analytics updates this value at most once per minute for one data store. Hence, the lastMessageArrivalTime value is an approximation.

          This feature only applies to messages that arrived in the data store after October 23, 2020.

    • nextToken (string) --

      The token to retrieve the next set of results, or null if there are no more results.
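
One possible use of lastMessageArrivalTime is to flag data stores that have gone quiet. The sketch below assumes `summaries` is a list shaped like datastoreSummaries above and that the timestamps are timezone-aware datetimes. The function name and the one-hour default are hypothetical. Because the value is updated at most once per minute, thresholds much shorter than that are unlikely to be meaningful.

```python
from datetime import datetime, timedelta, timezone


def idle_datastores(summaries, max_idle=timedelta(hours=1), now=None):
    """Return names of data stores with no message within max_idle."""
    now = now or datetime.now(timezone.utc)
    idle = []
    for summary in summaries:
        last = summary.get('lastMessageArrivalTime')
        # Treating a missing timestamp as idle is a choice of this sketch.
        if last is None or now - last > max_idle:
            idle.append(summary['datastoreName'])
    return idle
```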

UpdateDataset (updated) Link ¶
Changes (request)
{'lateDataRules': [{'ruleConfiguration': {'deltaTimeSessionWindowConfiguration': {'timeoutInMinutes': 'integer'}},
                    'ruleName': 'string'}]}

Updates the settings of a data set.

See also: AWS API Documentation

Request Syntax

client.update_dataset(
    datasetName='string',
    actions=[
        {
            'actionName': 'string',
            'queryAction': {
                'sqlQuery': 'string',
                'filters': [
                    {
                        'deltaTime': {
                            'offsetSeconds': 123,
                            'timeExpression': 'string'
                        }
                    },
                ]
            },
            'containerAction': {
                'image': 'string',
                'executionRoleArn': 'string',
                'resourceConfiguration': {
                    'computeType': 'ACU_1'|'ACU_2',
                    'volumeSizeInGB': 123
                },
                'variables': [
                    {
                        'name': 'string',
                        'stringValue': 'string',
                        'doubleValue': 123.0,
                        'datasetContentVersionValue': {
                            'datasetName': 'string'
                        },
                        'outputFileUriValue': {
                            'fileName': 'string'
                        }
                    },
                ]
            }
        },
    ],
    triggers=[
        {
            'schedule': {
                'expression': 'string'
            },
            'dataset': {
                'name': 'string'
            }
        },
    ],
    contentDeliveryRules=[
        {
            'entryName': 'string',
            'destination': {
                'iotEventsDestinationConfiguration': {
                    'inputName': 'string',
                    'roleArn': 'string'
                },
                's3DestinationConfiguration': {
                    'bucket': 'string',
                    'key': 'string',
                    'glueConfiguration': {
                        'tableName': 'string',
                        'databaseName': 'string'
                    },
                    'roleArn': 'string'
                }
            }
        },
    ],
    retentionPeriod={
        'unlimited': True|False,
        'numberOfDays': 123
    },
    versioningConfiguration={
        'unlimited': True|False,
        'maxVersions': 123
    },
    lateDataRules=[
        {
            'ruleName': 'string',
            'ruleConfiguration': {
                'deltaTimeSessionWindowConfiguration': {
                    'timeoutInMinutes': 123
                }
            }
        },
    ]
)
type datasetName

string

param datasetName

[REQUIRED]

The name of the data set to update.

type actions

list

param actions

[REQUIRED]

A list of DatasetAction objects.

  • (dict) --

    A DatasetAction object that specifies how data set contents are automatically created.

    • actionName (string) --

      The name of the data set action by which data set contents are automatically created.

    • queryAction (dict) --

      An SqlQueryDatasetAction object that uses an SQL query to automatically create data set contents.

      • sqlQuery (string) -- [REQUIRED]

        A SQL query string.

      • filters (list) --

        Prefilters applied to message data.

        • (dict) --

          Information that is used to filter message data, to segregate it according to the timeframe in which it arrives.

          • deltaTime (dict) --

            Used to limit data to that which has arrived since the last execution of the action.

            • offsetSeconds (integer) -- [REQUIRED]

              The number of seconds of estimated in-flight lag time of message data. When you create dataset contents using message data from a specified timeframe, some message data might still be in flight when processing begins, and so does not arrive in time to be processed. Use this field to make allowances for the in-flight time of your message data, so that data not processed from a previous timeframe is included with the next timeframe. Otherwise, missed message data would be excluded from processing during the next timeframe too, because its timestamp places it within the previous timeframe.

            • timeExpression (string) -- [REQUIRED]

              An expression by which the time of the message data might be determined. This can be the name of a timestamp field or a SQL expression that is used to derive the time the message data was generated.

    • containerAction (dict) --

      Information that allows the system to run a containerized application to create the dataset contents. The application must be in a Docker container along with any required support libraries.

      • image (string) -- [REQUIRED]

        The ARN of the Docker container stored in your account. The Docker container contains an application and required support libraries and is used to generate dataset contents.

      • executionRoleArn (string) -- [REQUIRED]

        The ARN of the role that gives permission to the system to access required resources to run the containerAction. This includes, at minimum, permission to retrieve the dataset contents that are the input to the containerized application.

      • resourceConfiguration (dict) -- [REQUIRED]

        Configuration of the resource that executes the containerAction.

        • computeType (string) -- [REQUIRED]

          The type of the compute resource used to execute the containerAction. Possible values are: ACU_1 (vCPU=4, memory=16 GiB) or ACU_2 (vCPU=8, memory=32 GiB).

        • volumeSizeInGB (integer) -- [REQUIRED]

          The size, in GB, of the persistent storage available to the resource instance used to execute the containerAction (min: 1, max: 50).

      • variables (list) --

        The values of variables used in the context of the execution of the containerized application (basically, parameters passed to the application). Each variable must have a name and a value given by one of stringValue, datasetContentVersionValue, or outputFileUriValue.

        • (dict) --

          An instance of a variable to be passed to the containerAction execution. Each variable must have a name and a value given by one of stringValue, datasetContentVersionValue, or outputFileUriValue.

          • name (string) -- [REQUIRED]

            The name of the variable.

          • stringValue (string) --

            The value of the variable as a string.

          • doubleValue (float) --

            The value of the variable as a double (numeric).

          • datasetContentVersionValue (dict) --

            The value of the variable as a structure that specifies a dataset content version.

            • datasetName (string) -- [REQUIRED]

              The name of the dataset whose latest contents are used as input to the notebook or application.

          • outputFileUriValue (dict) --

            The value of the variable as a structure that specifies an output file URI.

            • fileName (string) -- [REQUIRED]

              The URI of the location where dataset contents are stored, usually the URI of a file in an S3 bucket.
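
The rule above that each containerAction variable carries a name and one value field could be checked before a request is sent. This is a sketch under stated assumptions, not the service's own validation. The function name is hypothetical, "exactly one value" is this sketch's reading of the rule, and doubleValue is accepted only because it appears in the request syntax.

```python
# Value fields a variable may carry. Accepting doubleValue is an assumption
# based on the request syntax rather than on the prose description.
VALUE_KEYS = (
    'stringValue',
    'doubleValue',
    'datasetContentVersionValue',
    'outputFileUriValue',
)


def variable_value_key(variable):
    """Return the single value key set on a containerAction variable."""
    if not variable.get('name'):
        raise ValueError('variable needs a name')
    present = [key for key in VALUE_KEYS if key in variable]
    if len(present) != 1:
        raise ValueError(f'expected exactly one value key, found {present}')
    return present[0]
```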

type triggers

list

param triggers

A list of DatasetTrigger objects. The list can be empty or can contain up to five DatasetTrigger objects.

  • (dict) --

    The DatasetTrigger that specifies when the data set is automatically updated.

    • schedule (dict) --

      The Schedule when the trigger is initiated.

      • expression (string) --

        The expression that defines when to trigger an update. For more information, see Schedule Expressions for Rules in the Amazon CloudWatch Events User Guide.

    • dataset (dict) --

      The data set whose content creation triggers the creation of this data set's contents.

      • name (string) -- [REQUIRED]

        The name of the dataset whose content generation triggers the new dataset content generation.

type contentDeliveryRules

list

param contentDeliveryRules

When dataset contents are created, they are delivered to destinations specified here.

  • (dict) --

    When dataset contents are created, they are delivered to the destination specified here.

    • entryName (string) --

      The name of the dataset content delivery rules entry.

    • destination (dict) -- [REQUIRED]

      The destination to which dataset contents are delivered.

      • iotEventsDestinationConfiguration (dict) --

        Configuration information for delivery of dataset contents to AWS IoT Events.

        • inputName (string) -- [REQUIRED]

          The name of the AWS IoT Events input to which dataset contents are delivered.

        • roleArn (string) -- [REQUIRED]

          The ARN of the role that grants AWS IoT Analytics permission to deliver dataset contents to an AWS IoT Events input.

      • s3DestinationConfiguration (dict) --

        Configuration information for delivery of dataset contents to Amazon S3.

        • bucket (string) -- [REQUIRED]

          The name of the S3 bucket to which dataset contents are delivered.

        • key (string) -- [REQUIRED]

          The key of the dataset contents object in an S3 bucket. Each object has a key that is a unique identifier. Each object has exactly one key.

          You can create a unique key with the following options:

          • Use !{iotanalytics:scheduleTime} to insert the time of a scheduled SQL query run.

          • Use !{iotanalytics:versionId} to insert a unique hash that identifies a dataset content.

          • Use !{iotanalytics:creationTime} to insert the creation time of a dataset content.

          The following example creates a unique key for a CSV file: dataset/mydataset/!{iotanalytics:scheduleTime}/!{iotanalytics:versionId}.csv

          Note

          If you don't use !{iotanalytics:versionId} to specify the key, you might get duplicate keys. For example, you might have two dataset contents with the same scheduleTime but different versionId values. This means that one dataset content overwrites the other.

        • glueConfiguration (dict) --

          Configuration information for coordination with AWS Glue, a fully managed extract, transform and load (ETL) service.

          • tableName (string) -- [REQUIRED]

            The name of the table in your AWS Glue Data Catalog that is used to perform the ETL operations. An AWS Glue Data Catalog table contains partitioned data and descriptions of data sources and targets.

          • databaseName (string) -- [REQUIRED]

            The name of the database in your AWS Glue Data Catalog in which the table is located. An AWS Glue Data Catalog database contains metadata tables.

        • roleArn (string) -- [REQUIRED]

          The ARN of the role that grants AWS IoT Analytics permission to interact with your Amazon S3 and AWS Glue resources.
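
To illustrate the duplicate-key note in the s3DestinationConfiguration key description, the sketch below mimics the placeholder substitution locally. AWS IoT Analytics performs the real substitution. The function name is hypothetical, and the sample values ('T0', 'v1', and so on) are made up and do not reflect the actual format of the inserted values.

```python
def render_key(template, schedule_time, version_id, creation_time):
    """Locally mimic substitution of the !{iotanalytics:...} placeholders."""
    values = {
        '!{iotanalytics:scheduleTime}': schedule_time,
        '!{iotanalytics:versionId}': version_id,
        '!{iotanalytics:creationTime}': creation_time,
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


# Two dataset contents with the same schedule time but different versions.
without_version = 'dataset/mydataset/!{iotanalytics:scheduleTime}.csv'
with_version = ('dataset/mydataset/!{iotanalytics:scheduleTime}/'
                '!{iotanalytics:versionId}.csv')

# Without versionId both contents map to one key, so one overwrites the other.
collides = (render_key(without_version, 'T0', 'v1', 'C1')
            == render_key(without_version, 'T0', 'v2', 'C2'))
# With versionId the keys differ.
distinct = (render_key(with_version, 'T0', 'v1', 'C1')
            != render_key(with_version, 'T0', 'v2', 'C2'))
```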

type retentionPeriod

dict

param retentionPeriod

How long, in days, dataset contents are kept for the dataset.

  • unlimited (boolean) --

    If true, message data is kept indefinitely.

  • numberOfDays (integer) --

    The number of days that message data is kept. The unlimited parameter must be false.

type versioningConfiguration

dict

param versioningConfiguration

Optional. How many versions of dataset contents are kept. If not specified or set to null, only the latest version plus the latest succeeded version (if they are different) are kept for the time period specified by the retentionPeriod parameter. For more information, see Keeping Multiple Versions of AWS IoT Analytics Data Sets in the AWS IoT Analytics User Guide.

  • unlimited (boolean) --

    If true, unlimited versions of dataset contents are kept.

  • maxVersions (integer) --

    How many versions of dataset contents are kept. The unlimited parameter must be false.

type lateDataRules

list

param lateDataRules

A list of data rules that send notifications to Amazon CloudWatch when data arrives late. To specify lateDataRules, the dataset must use a deltaTime filter.

  • (dict) --

    A structure that contains the name and configuration information of a late data rule.

    • ruleName (string) --

      The name of the late data rule.

    • ruleConfiguration (dict) -- [REQUIRED]

      The information needed to configure the late data rule.

      • deltaTimeSessionWindowConfiguration (dict) --

        The information needed to configure a delta time session window.

        • timeoutInMinutes (integer) -- [REQUIRED]

          A time interval. You can use timeoutInMinutes so that AWS IoT Analytics can batch up late data notifications that have been generated since the last execution. AWS IoT Analytics sends one batch of notifications to Amazon CloudWatch Events at one time.

          For more information about how to write a timestamp expression, see Date and Time Functions and Operators in the Presto 0.172 Documentation.

returns

None
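
As a sketch of how the new lateDataRules parameter might be assembled, the helper below builds keyword arguments for update_dataset and checks locally that at least one queryAction has a deltaTime filter, as the parameter description requires. The function name and the local check are assumptions of this sketch. The service performs its own validation.

```python
def with_late_data_rule(actions, rule_name, timeout_minutes):
    """Build update_dataset keyword arguments that add one late data rule."""
    has_delta_time = any(
        'deltaTime' in query_filter
        for action in actions
        for query_filter in action.get('queryAction', {}).get('filters', [])
    )
    if not has_delta_time:
        raise ValueError('lateDataRules needs a deltaTime filter on a queryAction')
    return {
        'actions': actions,
        'lateDataRules': [{
            'ruleName': rule_name,
            'ruleConfiguration': {
                'deltaTimeSessionWindowConfiguration': {
                    'timeoutInMinutes': timeout_minutes,
                },
            },
        }],
    }


# Possible usage, assuming a boto3 IoT Analytics client and a hypothetical
# dataset name:
#   client.update_dataset(datasetName='my_dataset',
#                         **with_late_data_rule(actions, 'late_rule', 30))
```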