AWS API Changes

2025/07/16 - Amazon Bedrock - 4 new api methods

Changes This release adds support for on-demand custom model inference through CustomModelDeployment APIs for Amazon Bedrock.

DeleteCustomModelDeployment (new)

Link ¶

Deletes a custom model deployment. This operation stops the deployment and removes it from your account. After deletion, the deployment ARN can no longer be used for inference requests.

The following actions are related to the DeleteCustomModelDeployment operation:

See also: AWS API Documentation

Request Syntax

client.delete_custom_model_deployment(
    customModelDeploymentIdentifier='string'
)

type customModelDeploymentIdentifier:

string

param customModelDeploymentIdentifier:

[REQUIRED]

The Amazon Resource Name (ARN) or name of the custom model deployment to delete.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

(dict) --

GetCustomModelDeployment (new)

Link ¶

Retrieves information about a custom model deployment, including its status, configuration, and metadata. Use this operation to monitor the deployment status and retrieve details needed for inference requests.

The following actions are related to the GetCustomModelDeployment operation:

See also: AWS API Documentation

Request Syntax

client.get_custom_model_deployment(
    customModelDeploymentIdentifier='string'
)

type customModelDeploymentIdentifier:

string

param customModelDeploymentIdentifier:

[REQUIRED]

The Amazon Resource Name (ARN) or name of the custom model deployment to retrieve information about.

rtype:

dict

returns:

Response Syntax

{
    'customModelDeploymentArn': 'string',
    'modelDeploymentName': 'string',
    'modelArn': 'string',
    'createdAt': datetime(2015, 1, 1),
    'status': 'Creating'|'Active'|'Failed',
    'description': 'string',
    'failureMessage': 'string',
    'lastUpdatedAt': datetime(2015, 1, 1)
}

Response Structure

(dict) --
- customModelDeploymentArn (string) --
  
  The Amazon Resource Name (ARN) of the custom model deployment.
- modelDeploymentName (string) --
  
  The name of the custom model deployment.
- modelArn (string) --
  
  The Amazon Resource Name (ARN) of the custom model associated with this deployment.
- createdAt (datetime) --
  
  The date and time when the custom model deployment was created.
- status (string) --
  
  The status of the custom model deployment. Possible values are:
  - CREATING - The deployment is being set up and prepared for inference.
  - ACTIVE - The deployment is ready and available for inference requests.
  - FAILED - The deployment failed to be created or became unavailable.
- description (string) --
  
  The description of the custom model deployment.
- failureMessage (string) --
  
  If the deployment status is FAILED, this field contains a message describing the failure reason.
- lastUpdatedAt (datetime) --
  
  The date and time when the custom model deployment was last updated.

ListCustomModelDeployments (new)

Link ¶

Lists custom model deployments in your account. You can filter the results by creation time, name, status, and associated model. Use this operation to manage and monitor your custom model deployments.

We recommend using pagination to ensure that the operation returns quickly and successfully.

The following actions are related to the ListCustomModelDeployments operation:

See also: AWS API Documentation

Request Syntax

client.list_custom_model_deployments(
    createdBefore=datetime(2015, 1, 1),
    createdAfter=datetime(2015, 1, 1),
    nameContains='string',
    maxResults=123,
    nextToken='string',
    sortBy='CreationTime',
    sortOrder='Ascending'|'Descending',
    statusEquals='Creating'|'Active'|'Failed',
    modelArnEquals='string'
)

type createdBefore:

datetime

param createdBefore:

Filters deployments created before the specified date and time.

type createdAfter:

datetime

param createdAfter:

Filters deployments created after the specified date and time.

type nameContains:

string

param nameContains:

Filters deployments whose names contain the specified string.

type maxResults:

integer

param maxResults:

The maximum number of results to return in a single call.

type nextToken:

string

param nextToken:

The token for the next set of results. Use this token to retrieve additional results when the response is truncated.

type sortBy:

string

param sortBy:

The field to sort the results by. The only supported value is CreationTime.

type sortOrder:

string

param sortOrder:

The sort order for the results. Valid values are Ascending and Descending. Default is Descending.

type statusEquals:

string

param statusEquals:

Filters deployments by status. Valid values are CREATING, ACTIVE, and FAILED.

type modelArnEquals:

string

param modelArnEquals:

Filters deployments by the Amazon Resource Name (ARN) of the associated custom model.

rtype:

dict

returns:

Response Syntax

{
    'nextToken': 'string',
    'modelDeploymentSummaries': [
        {
            'customModelDeploymentArn': 'string',
            'customModelDeploymentName': 'string',
            'modelArn': 'string',
            'createdAt': datetime(2015, 1, 1),
            'status': 'Creating'|'Active'|'Failed',
            'lastUpdatedAt': datetime(2015, 1, 1),
            'failureMessage': 'string'
        },
    ]
}

Response Structure

(dict) --
- nextToken (string) --
  
  The token for the next set of results. This value is null when there are no more results to return.
- modelDeploymentSummaries (list) --
  
  A list of custom model deployment summaries.
  - (dict) --
    
    Contains summary information about a custom model deployment, including its ARN, name, status, and associated custom model.
    - customModelDeploymentArn (string) --
      
      The Amazon Resource Name (ARN) of the custom model deployment.
    - customModelDeploymentName (string) --
      
      The name of the custom model deployment.
    - modelArn (string) --
      
      The Amazon Resource Name (ARN) of the custom model associated with this deployment.
    - createdAt (datetime) --
      
      The date and time when the custom model deployment was created.
    - status (string) --
      
      The status of the custom model deployment. Possible values are CREATING, ACTIVE, and FAILED.
    - lastUpdatedAt (datetime) --
      
      The date and time when the custom model deployment was last modified.
    - failureMessage (string) --
      
      If the deployment status is FAILED, this field contains a message describing the failure reason.

CreateCustomModelDeployment (new)

Link ¶

Deploys a custom model for on-demand inference in Amazon Bedrock. After you deploy your custom model, you use the deployment's Amazon Resource Name (ARN) as the modelId parameter when you submit prompts and generate responses with model inference.

For more information about setting up on-demand inference for custom models, see Set up inference for a custom model.

The following actions are related to the CreateCustomModelDeployment operation:

See also: AWS API Documentation

Request Syntax

client.create_custom_model_deployment(
    modelDeploymentName='string',
    modelArn='string',
    description='string',
    tags=[
        {
            'key': 'string',
            'value': 'string'
        },
    ],
    clientRequestToken='string'
)

type modelDeploymentName:

string

param modelDeploymentName:

[REQUIRED]

The name for the custom model deployment. The name must be unique within your Amazon Web Services account and Region.

type modelArn:

string

param modelArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the custom model to deploy for on-demand inference. The custom model must be in the Active state.

type description:

string

param description:

A description for the custom model deployment to help you identify its purpose.

type tags:

list

param tags:

Tags to assign to the custom model deployment. You can use tags to organize and track your Amazon Web Services resources for cost allocation and management purposes.

(dict) --

Definition of the key/value pair for a tag.
- key (string) -- [REQUIRED]
  
  Key for the tag.
- value (string) -- [REQUIRED]
  
  Value for the tag.

type clientRequestToken:

string

param clientRequestToken:

A unique, case-sensitive identifier to ensure that the operation completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

This field is autopopulated if not provided.

rtype:

dict

returns:

Response Syntax

{
    'customModelDeploymentArn': 'string'
}

Response Structure

(dict) --
- customModelDeploymentArn (string) --
  
  The Amazon Resource Name (ARN) of the custom model deployment. Use this ARN as the modelId parameter when invoking the model with the InvokeModel or Converse operations.