Amazon Kinesis

2015/05/21 - Amazon Kinesis - 1 updated API method

GetRecords (updated)
Changes (response)
{'MillisBehindLatest': 'long'}

Gets data records from a shard.

Specify a shard iterator using the ShardIterator parameter. The shard iterator specifies the position in the shard from which you want to start reading data records sequentially. If there are no records available in the portion of the shard that the iterator points to, GetRecords returns an empty list. Note that it might take multiple calls to get to a portion of the shard that contains records.

You can scale by provisioning multiple shards. Your application should have one thread per shard, each reading continuously from its shard. To read from a stream continually, call GetRecords in a loop. Use GetShardIterator to get the shard iterator to specify in the first GetRecords call. GetRecords returns a new shard iterator in NextShardIterator. Specify the shard iterator returned in NextShardIterator in subsequent calls to GetRecords. Note that if the shard has been closed, the shard iterator can't return more data and GetRecords returns null in NextShardIterator. You can terminate the loop when the shard is closed, or when the shard iterator reaches the record with the sequence number or other attribute that marks it as the last record to process.
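
The read loop described above can be sketched roughly as follows. This is a minimal illustration, not an official implementation: `read_shard`, `handle_record`, `pause_seconds`, and the injectable `sleep` are hypothetical names introduced here, and `client` is assumed to be any object exposing `get_records` with the request syntax shown on this page.

```python
import time


def read_shard(client, shard_iterator, handle_record,
               limit=100, pause_seconds=1.0, sleep=time.sleep):
    """Read one shard until it is closed, following NextShardIterator.

    `shard_iterator` is the initial iterator, obtained beforehand from
    GetShardIterator. `handle_record` is a hypothetical callback that
    receives each record dict.
    """
    while shard_iterator is not None:
        response = client.get_records(ShardIterator=shard_iterator, Limit=limit)
        for record in response["Records"]:
            handle_record(record)
        # An empty Records list is normal; keep following the iterator.
        # A closed shard reports no next iterator (None in Python).
        shard_iterator = response.get("NextShardIterator")
        if shard_iterator is not None:
            # Pause between calls to stay within shard read throughput.
            sleep(pause_seconds)
```

To process several shards, one option is to run one such loop per shard in its own thread, matching the one-thread-per-shard guidance above.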

Each data record can be up to 50 KB in size, and each shard supports reads of up to 2 MB per second. You can ensure that your calls don't exceed the maximum supported size or throughput by using the Limit parameter to specify the maximum number of records that GetRecords can return. Consider your average record size when determining this limit. For example, if your average record size is 40 KB, you can limit the data returned to about 1 MB per call by specifying 25 as the limit.
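
The arithmetic behind that example can be written out as a small helper. This is only a sketch: `records_limit` and `MAX_LIMIT` are hypothetical names, and the 10,000 cap is taken from the Limit parameter description on this page.

```python
MAX_LIMIT = 10_000  # largest Limit value accepted, per this reference


def records_limit(target_bytes, average_record_bytes):
    """Estimate a Limit that keeps one response near `target_bytes`."""
    if average_record_bytes <= 0:
        raise ValueError("average_record_bytes must be positive")
    estimate = target_bytes // average_record_bytes
    # Clamp to the valid range: at least 1 record, at most MAX_LIMIT.
    return max(1, min(MAX_LIMIT, estimate))


# About 1 MB per call with 40 KB records (decimal units, as in the text).
print(records_limit(1_000_000, 40_000))  # 25
```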

The size of the data returned by GetRecords will vary depending on the utilization of the shard. The maximum size of data that GetRecords can return is 10 MB. If a call returns this amount of data, subsequent calls made within the next 5 seconds throw ProvisionedThroughputExceededException. If there is insufficient provisioned throughput on the shard, subsequent calls made within the next 1 second throw ProvisionedThroughputExceededException. Note that GetRecords won't return any data when it throws an exception. For this reason, we recommend that you wait one second between calls to GetRecords; however, it's possible that the application will get exceptions for longer than 1 second.
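
One way to handle these throttling errors is to retry with an increasing delay. The sketch below is an assumption-laden illustration: `get_records_with_retry`, `error_code`, `max_attempts`, and `base_delay` are hypothetical names, and the error is recognized by reading `exc.response["Error"]["Code"]`, which is how botocore's `ClientError` exposes service error codes. Adapt the check if your client surfaces errors differently.

```python
import time

THROTTLE_CODE = "ProvisionedThroughputExceededException"


def error_code(exc):
    """Return the service error code carried by an exception, if any."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def get_records_with_retry(client, shard_iterator, limit=100,
                           max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call GetRecords, backing off when the shard is throttled."""
    for attempt in range(max_attempts):
        try:
            return client.get_records(ShardIterator=shard_iterator, Limit=limit)
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            # Re-raise anything that is not throttling, or the final failure.
            if error_code(exc) != THROTTLE_CODE or last_attempt:
                raise
            # Exceptions may persist longer than one second, so the
            # delay doubles on each retry: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
```

Because no data is returned when the exception is thrown, retrying with the same shard iterator does not skip records.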

To detect whether the application is falling behind in processing, you can use the MillisBehindLatest response attribute. You can also monitor the amount of data in a stream using the CloudWatch metrics. For more information, see Monitoring Amazon Kinesis with Amazon CloudWatch in the Amazon Kinesis Developer Guide.
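
A consumer might check MillisBehindLatest after each call along these lines. This is a sketch only: `lag_seconds`, `is_falling_behind`, and the 60-second threshold are hypothetical choices, not values from the service documentation.

```python
def lag_seconds(response):
    """Convert MillisBehindLatest from a GetRecords response to seconds."""
    return response.get("MillisBehindLatest", 0) / 1000.0


def is_falling_behind(response, max_lag_seconds=60.0):
    """Return True when the consumer lags the stream tip by more than
    `max_lag_seconds` (an arbitrary threshold chosen for illustration)."""
    return lag_seconds(response) > max_lag_seconds
```

A value of zero means the consumer is caught up, so a lag that keeps growing across successive calls is usually the more useful signal than any single reading.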

Request Syntax

client.get_records(
    ShardIterator='string',
    Limit=123
)
type ShardIterator

string

param ShardIterator

[REQUIRED]

The position in the shard from which you want to start sequentially reading data records. A shard iterator specifies this position using the sequence number of a data record in the shard.

type Limit

integer

param Limit

The maximum number of records to return. Specify a value of up to 10,000. If you specify a value that is greater than 10,000, GetRecords throws InvalidArgumentException.

rtype

dict

returns

Response Syntax

{
    'Records': [
        {
            'SequenceNumber': 'string',
            'Data': b'bytes',
            'PartitionKey': 'string'
        },
    ],
    'NextShardIterator': 'string',
    'MillisBehindLatest': 123
}

Response Structure

  • (dict) --

    Represents the output for GetRecords.

    • Records (list) --

      The data records retrieved from the shard.

      • (dict) --

        The unit of data of the Amazon Kinesis stream, which is composed of a sequence number, a partition key, and a data blob.

        • SequenceNumber (string) --

          The unique identifier for the record in the Amazon Kinesis stream.

        • Data (bytes) --

          The data blob. The data in the blob is both opaque and immutable to the Amazon Kinesis service, which does not inspect, interpret, or change the data in the blob in any way. The maximum size of the data blob (the payload before base64-encoding) is 50 kilobytes (KB).

        • PartitionKey (string) --

          Identifies which shard in the stream the data record is assigned to.

    • NextShardIterator (string) --

      The next position in the shard from which to start sequentially reading data records. If set to null, the shard has been closed and the requested iterator will not return any more data.

    • MillisBehindLatest (integer) --

      The number of milliseconds the GetRecords response is from the tip of the stream, indicating how far behind current time the consumer is. A value of zero indicates record processing is caught up, and there are no new records to process at this moment.
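
As an illustration of working with this response structure, the sketch below unpacks each record into a tuple. `decode_records` is a hypothetical helper, and it assumes the producer wrote UTF-8 encoded JSON into each data blob. The service treats the blob as opaque, so substitute whatever decoding matches your own payloads.

```python
import json


def decode_records(response):
    """Return (sequence_number, partition_key, payload) per record.

    Assumes each Data blob holds UTF-8 encoded JSON; this is a
    convention of the hypothetical producer, not of the service.
    """
    decoded = []
    for record in response["Records"]:
        payload = json.loads(record["Data"].decode("utf-8"))
        decoded.append(
            (record["SequenceNumber"], record["PartitionKey"], payload)
        )
    return decoded
```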