Amazon Bedrock

2024/08/19 - Amazon Bedrock - 4 new API methods

Changes: Amazon Bedrock Batch Inference (Model Invocation) is a feature that lets customers asynchronously run inference on a large set of records or files stored in S3.

CreateModelInvocationJob (new)

Creates a job to invoke a model on multiple prompts (batch inference). Format your data as described in Format your inference data and upload it to an Amazon S3 bucket. For more information, see Create a batch inference job.

The response returns a jobArn that you can use to stop or get details about the job. You can check the status of the job by sending a GetModelInvocationJob request.
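
For reference, each line of the JSONL input file is expected to be a single JSON object with a recordId and a modelInput body, per the Format your inference data guide. The sketch below assumes an Anthropic Claude model in the Messages format; the modelInput body must match the inference parameters of whichever model you choose:

{"recordId": "RECORD0000001", "modelInput": {"anthropic_version": "bedrock-2023-05-31", "max_tokens": 256, "messages": [{"role": "user", "content": "Summarize the quarterly report."}]}}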

See also: AWS API Documentation

Request Syntax

client.create_model_invocation_job(
    jobName='string',
    roleArn='string',
    clientRequestToken='string',
    modelId='string',
    inputDataConfig={
        's3InputDataConfig': {
            's3InputFormat': 'JSONL',
            's3Uri': 'string'
        }
    },
    outputDataConfig={
        's3OutputDataConfig': {
            's3Uri': 'string',
            's3EncryptionKeyId': 'string'
        }
    },
    timeoutDurationInHours=123,
    tags=[
        {
            'key': 'string',
            'value': 'string'
        },
    ]
)
type jobName:

string

param jobName:

[REQUIRED]

A name to give the batch inference job.

type roleArn:

string

param roleArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the service role with permissions to carry out and manage batch inference. You can use the console to create a default service role or follow the steps at Create a service role for batch inference.

type clientRequestToken:

string

param clientRequestToken:

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

This field is autopopulated if not provided.

type modelId:

string

param modelId:

[REQUIRED]

The unique identifier of the foundation model to use for the batch inference job.

type inputDataConfig:

dict

param inputDataConfig:

[REQUIRED]

Details about the location of the input to the batch inference job.

  • s3InputDataConfig (dict) --

    Contains the configuration of the S3 location of the input data.

    • s3InputFormat (string) --

      The format of the input data.

    • s3Uri (string) -- [REQUIRED]

      The S3 location of the input data.

type outputDataConfig:

dict

param outputDataConfig:

[REQUIRED]

Details about the location of the output of the batch inference job.

  • s3OutputDataConfig (dict) --

    Contains the configuration of the S3 location of the output data.

    • s3Uri (string) -- [REQUIRED]

      The S3 location of the output data.

    • s3EncryptionKeyId (string) --

      The unique identifier of the key that encrypts the S3 location of the output data.

type timeoutDurationInHours:

integer

param timeoutDurationInHours:

The number of hours after which to force the batch inference job to time out.

type tags:

list

param tags:

Any tags to associate with the batch inference job. For more information, see Tagging Amazon Bedrock resources.

  • (dict) --

    Definition of the key/value pair for a tag.

    • key (string) -- [REQUIRED]

      Key for the tag.

    • value (string) -- [REQUIRED]

      Value for the tag.

rtype:

dict

returns:

Response Syntax

{
    'jobArn': 'string'
}

Response Structure

  • (dict) --

    • jobArn (string) --

      The Amazon Resource Name (ARN) of the batch inference job.
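
Putting the request and response together, a minimal sketch of starting a batch inference job with boto3 follows. The bucket names, role ARN, and model ID are hypothetical placeholders:

import boto3

# The batch inference operations live on the 'bedrock' control-plane
# client, not on 'bedrock-runtime'.
bedrock = boto3.client('bedrock')

response = bedrock.create_model_invocation_job(
    jobName='my-batch-job',                                # placeholder
    roleArn='arn:aws:iam::123456789012:role/MyBatchRole',  # placeholder
    modelId='anthropic.claude-3-haiku-20240307-v1:0',      # placeholder
    inputDataConfig={
        's3InputDataConfig': {
            's3InputFormat': 'JSONL',
            's3Uri': 's3://my-input-bucket/input.jsonl'    # placeholder
        }
    },
    outputDataConfig={
        's3OutputDataConfig': {
            's3Uri': 's3://my-output-bucket/output/'       # placeholder
        }
    },
    timeoutDurationInHours=24
)

job_arn = response['jobArn']
print(job_arn)

The returned jobArn is the jobIdentifier expected by the stop and get operations described below.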

StopModelInvocationJob (new)

Stops a batch inference job. You're only charged for tokens that were already processed. For more information, see Stop a batch inference job.

See also: AWS API Documentation

Request Syntax

client.stop_model_invocation_job(
    jobIdentifier='string'
)
type jobIdentifier:

string

param jobIdentifier:

[REQUIRED]

The Amazon Resource Name (ARN) of the batch inference job to stop.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
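
For illustration, stopping a running job is a single call; the ARN below is a hypothetical placeholder:

import boto3

bedrock = boto3.client('bedrock')

# jobIdentifier is the jobArn returned by CreateModelInvocationJob.
bedrock.stop_model_invocation_job(
    jobIdentifier='arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/abc123def456'
)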

ListModelInvocationJobs (new)

Lists all batch inference jobs in the account. For more information, see View details about a batch inference job.

See also: AWS API Documentation

Request Syntax

client.list_model_invocation_jobs(
    submitTimeAfter=datetime(2015, 1, 1),
    submitTimeBefore=datetime(2015, 1, 1),
    statusEquals='Submitted'|'InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'PartiallyCompleted'|'Expired'|'Validating'|'Scheduled',
    nameContains='string',
    maxResults=123,
    nextToken='string',
    sortBy='CreationTime',
    sortOrder='Ascending'|'Descending'
)
type submitTimeAfter:

datetime

param submitTimeAfter:

Specify a time to filter for batch inference jobs that were submitted after the time you specify.

type submitTimeBefore:

datetime

param submitTimeBefore:

Specify a time to filter for batch inference jobs that were submitted before the time you specify.

type statusEquals:

string

param statusEquals:

Specify a status to filter for batch inference jobs whose statuses match the string you specify.

type nameContains:

string

param nameContains:

Specify a string to filter for batch inference jobs whose names contain the string.

type maxResults:

integer

param maxResults:

The maximum number of results to return. If there are more results than the number that you specify, a nextToken value is returned. Use the nextToken in a request to return the next batch of results.

type nextToken:

string

param nextToken:

If there were more results than the value you specified in the maxResults field in a previous ListModelInvocationJobs request, the response would have returned a nextToken value. To see the next batch of results, send the nextToken value in another request.

type sortBy:

string

param sortBy:

An attribute by which to sort the results.

type sortOrder:

string

param sortOrder:

Specifies whether to sort the results by ascending or descending order.

rtype:

dict

returns:

Response Syntax

{
    'nextToken': 'string',
    'invocationJobSummaries': [
        {
            'jobArn': 'string',
            'jobName': 'string',
            'modelId': 'string',
            'clientRequestToken': 'string',
            'roleArn': 'string',
            'status': 'Submitted'|'InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'PartiallyCompleted'|'Expired'|'Validating'|'Scheduled',
            'message': 'string',
            'submitTime': datetime(2015, 1, 1),
            'lastModifiedTime': datetime(2015, 1, 1),
            'endTime': datetime(2015, 1, 1),
            'inputDataConfig': {
                's3InputDataConfig': {
                    's3InputFormat': 'JSONL',
                    's3Uri': 'string'
                }
            },
            'outputDataConfig': {
                's3OutputDataConfig': {
                    's3Uri': 'string',
                    's3EncryptionKeyId': 'string'
                }
            },
            'timeoutDurationInHours': 123,
            'jobExpirationTime': datetime(2015, 1, 1)
        },
    ]
}

Response Structure

  • (dict) --

    • nextToken (string) --

      If there are more results than can fit in the response, a nextToken is returned. Use the nextToken in a request to return the next batch of results.

    • invocationJobSummaries (list) --

      A list of items, each of which contains a summary about a batch inference job.

      • (dict) --

        A summary of a batch inference job.

        • jobArn (string) --

          The Amazon Resource Name (ARN) of the batch inference job.

        • jobName (string) --

          The name of the batch inference job.

        • modelId (string) --

          The unique identifier of the foundation model used for model inference.

        • clientRequestToken (string) --

          A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

        • roleArn (string) --

          The Amazon Resource Name (ARN) of the service role with permissions to carry out and manage batch inference. You can use the console to create a default service role or follow the steps at Create a service role for batch inference.

        • status (string) --

          The status of the batch inference job.

        • message (string) --

          If the batch inference job failed, this field contains a message describing why the job failed.

        • submitTime (datetime) --

          The time at which the batch inference job was submitted.

        • lastModifiedTime (datetime) --

          The time at which the batch inference job was last modified.

        • endTime (datetime) --

          The time at which the batch inference job ended.

        • inputDataConfig (dict) --

          Details about the location of the input to the batch inference job.

          • s3InputDataConfig (dict) --

            Contains the configuration of the S3 location of the input data.

            • s3InputFormat (string) --

              The format of the input data.

            • s3Uri (string) --

              The S3 location of the input data.

        • outputDataConfig (dict) --

          Details about the location of the output of the batch inference job.

          • s3OutputDataConfig (dict) --

            Contains the configuration of the S3 location of the output data.

            • s3Uri (string) --

              The S3 location of the output data.

            • s3EncryptionKeyId (string) --

              The unique identifier of the key that encrypts the S3 location of the output data.

        • timeoutDurationInHours (integer) --

          The number of hours after which the batch inference job was set to time out.

        • jobExpirationTime (datetime) --

          The time at which the batch inference job timed out or is set to time out.
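
A sketch of walking the full result set with the documented nextToken mechanism follows; a manual loop is shown so that only the calls documented here are used:

import boto3

bedrock = boto3.client('bedrock')

# Collect summaries of all failed jobs, following nextToken until exhausted.
kwargs = {'statusEquals': 'Failed', 'maxResults': 50}
summaries = []
while True:
    page = bedrock.list_model_invocation_jobs(**kwargs)
    summaries.extend(page.get('invocationJobSummaries', []))
    token = page.get('nextToken')
    if not token:
        break
    kwargs['nextToken'] = token

for job in summaries:
    print(job['jobName'], job['status'])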

GetModelInvocationJob (new)

Gets details about a batch inference job. For more information, see View details about a batch inference job.

See also: AWS API Documentation

Request Syntax

client.get_model_invocation_job(
    jobIdentifier='string'
)
type jobIdentifier:

string

param jobIdentifier:

[REQUIRED]

The Amazon Resource Name (ARN) of the batch inference job.

rtype:

dict

returns:

Response Syntax

{
    'jobArn': 'string',
    'jobName': 'string',
    'modelId': 'string',
    'clientRequestToken': 'string',
    'roleArn': 'string',
    'status': 'Submitted'|'InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'PartiallyCompleted'|'Expired'|'Validating'|'Scheduled',
    'message': 'string',
    'submitTime': datetime(2015, 1, 1),
    'lastModifiedTime': datetime(2015, 1, 1),
    'endTime': datetime(2015, 1, 1),
    'inputDataConfig': {
        's3InputDataConfig': {
            's3InputFormat': 'JSONL',
            's3Uri': 'string'
        }
    },
    'outputDataConfig': {
        's3OutputDataConfig': {
            's3Uri': 'string',
            's3EncryptionKeyId': 'string'
        }
    },
    'timeoutDurationInHours': 123,
    'jobExpirationTime': datetime(2015, 1, 1)
}

Response Structure

  • (dict) --

    • jobArn (string) --

      The Amazon Resource Name (ARN) of the batch inference job.

    • jobName (string) --

      The name of the batch inference job.

    • modelId (string) --

      The unique identifier of the foundation model used for model inference.

    • clientRequestToken (string) --

      A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

    • roleArn (string) --

      The Amazon Resource Name (ARN) of the service role with permissions to carry out and manage batch inference. You can use the console to create a default service role or follow the steps at Create a service role for batch inference.

    • status (string) --

      The status of the batch inference job.

    • message (string) --

      If the batch inference job failed, this field contains a message describing why the job failed.

    • submitTime (datetime) --

      The time at which the batch inference job was submitted.

    • lastModifiedTime (datetime) --

      The time at which the batch inference job was last modified.

    • endTime (datetime) --

      The time at which the batch inference job ended.

    • inputDataConfig (dict) --

      Details about the location of the input to the batch inference job.

      • s3InputDataConfig (dict) --

        Contains the configuration of the S3 location of the input data.

        • s3InputFormat (string) --

          The format of the input data.

        • s3Uri (string) --

          The S3 location of the input data.

    • outputDataConfig (dict) --

      Details about the location of the output of the batch inference job.

      • s3OutputDataConfig (dict) --

        Contains the configuration of the S3 location of the output data.

        • s3Uri (string) --

          The S3 location of the output data.

        • s3EncryptionKeyId (string) --

          The unique identifier of the key that encrypts the S3 location of the output data.

    • timeoutDurationInHours (integer) --

      The number of hours after which the batch inference job was set to time out.

    • jobExpirationTime (datetime) --

      The time at which the batch inference job timed out or is set to time out.
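
Finally, a minimal polling sketch that waits for a job to reach a terminal status. The ARN is a placeholder, the polling interval is arbitrary, and the set of terminal statuses is an assumption drawn from the status enum above:

import time

import boto3

bedrock = boto3.client('bedrock')

# Placeholder ARN; use the jobArn returned by CreateModelInvocationJob.
job_arn = 'arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/abc123def456'

# Assumed terminal statuses, taken from the status enum documented above.
TERMINAL_STATUSES = {'Completed', 'Failed', 'Stopped', 'PartiallyCompleted', 'Expired'}

status = None
while status not in TERMINAL_STATUSES:
    job = bedrock.get_model_invocation_job(jobIdentifier=job_arn)
    status = job['status']
    if status not in TERMINAL_STATUSES:
        time.sleep(60)  # arbitrary polling interval

print(status, job.get('message', ''))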