Amazon Bedrock

2024/10/29 - Amazon Bedrock - 2 new, 2 updated API methods

Changes: Update Application Inference Profile

CreateInferenceProfile (new)

Creates an application inference profile to track metrics and costs when invoking a model. To create an application inference profile for a foundation model in one region, specify the ARN of the model in that region. To create an application inference profile for a foundation model across multiple regions, specify the ARN of the system-defined inference profile that contains the regions that you want to route requests to. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.

See also: AWS API Documentation

Request Syntax

client.create_inference_profile(
    inferenceProfileName='string',
    description='string',
    clientRequestToken='string',
    modelSource={
        'copyFrom': 'string'
    },
    tags=[
        {
            'key': 'string',
            'value': 'string'
        },
    ]
)
type inferenceProfileName:

string

param inferenceProfileName:

[REQUIRED]

A name for the inference profile.

type description:

string

param description:

A description for the inference profile.

type clientRequestToken:

string

param clientRequestToken:

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

This field is autopopulated if not provided.

type modelSource:

dict

param modelSource:

[REQUIRED]

The foundation model or system-defined inference profile that the inference profile will track metrics and costs for.

  • copyFrom (string) --

    The ARN of the model or system-defined inference profile that is the source for the inference profile.

type tags:

list

param tags:

An array of objects, each of which contains a tag and its value. For more information, see Tagging resources in the Amazon Bedrock User Guide.

  • (dict) --

    Definition of the key/value pair for a tag.

    • key (string) -- [REQUIRED]

      Key for the tag.

    • value (string) -- [REQUIRED]

      Value for the tag.

rtype:

dict

returns:

Response Syntax

{
    'inferenceProfileArn': 'string',
    'status': 'ACTIVE'
}

Response Structure

  • (dict) --

    • inferenceProfileArn (string) --

      The ARN of the inference profile that you created.

    • status (string) --

      The status of the inference profile. ACTIVE means that the inference profile is ready to be used.
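As an illustration of the request shape above, the following is a minimal sketch rather than an official example. The helper names `build_create_profile_request` and `create_application_profile` are hypothetical, and `client` is assumed to be a Bedrock control-plane client such as `boto3.client("bedrock")`. Optional fields are left out of the request when they are not supplied.

```python
def build_create_profile_request(name, source_arn, description=None,
                                 tags=None, client_request_token=None):
    """Assemble keyword arguments for create_inference_profile.

    Hypothetical helper: only the two required fields are always sent.
    """
    request = {
        "inferenceProfileName": name,
        # ARN of a foundation model (single region) or of a
        # system-defined inference profile (multiple regions).
        "modelSource": {"copyFrom": source_arn},
    }
    if description is not None:
        request["description"] = description
    if tags:
        # The API expects a list of {"key": ..., "value": ...} dicts.
        request["tags"] = [{"key": k, "value": v} for k, v in tags.items()]
    if client_request_token is not None:
        # Reuse the same token when retrying the same logical request.
        request["clientRequestToken"] = client_request_token
    return request


def create_application_profile(client, name, source_arn, **optional):
    """Create the profile and return (inferenceProfileArn, status)."""
    request = build_create_profile_request(name, source_arn, **optional)
    response = client.create_inference_profile(**request)
    return response["inferenceProfileArn"], response["status"]
```

A call might look like `create_application_profile(client, "my-profile", source_arn, tags={"team": "ml"})`, where `source_arn` is an ARN you supply; the name and tag values here are placeholders.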

DeleteInferenceProfile (new)

Deletes an application inference profile. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.

See also: AWS API Documentation

Request Syntax

client.delete_inference_profile(
    inferenceProfileIdentifier='string'
)
type inferenceProfileIdentifier:

string

param inferenceProfileIdentifier:

[REQUIRED]

The Amazon Resource Name (ARN) or ID of the application inference profile to delete.

rtype:

dict

returns:

Response Syntax

{}

Response Structure

  • (dict) --
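A minimal sketch of calling this operation, assuming `client` is a Bedrock control-plane client such as `boto3.client("bedrock")`. The function name `delete_application_profile` is hypothetical. boto3 adds a `ResponseMetadata` entry to every response, so the sketch strips it to show that the documented body is empty.

```python
def delete_application_profile(client, identifier):
    """Delete a profile by ARN or ID and return the documented body."""
    response = client.delete_inference_profile(
        inferenceProfileIdentifier=identifier
    )
    # Drop the ResponseMetadata key that boto3 attaches to responses;
    # what remains should be the empty dict shown in Response Syntax.
    return {k: v for k, v in response.items() if k != "ResponseMetadata"}
```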

GetInferenceProfile (updated)
Changes (response)
{'type': {'APPLICATION'}}

Gets information about an inference profile. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.

See also: AWS API Documentation

Request Syntax

client.get_inference_profile(
    inferenceProfileIdentifier='string'
)
type inferenceProfileIdentifier:

string

param inferenceProfileIdentifier:

[REQUIRED]

The ID or Amazon Resource Name (ARN) of the inference profile.

rtype:

dict

returns:

Response Syntax

{
    'inferenceProfileName': 'string',
    'description': 'string',
    'createdAt': datetime(2015, 1, 1),
    'updatedAt': datetime(2015, 1, 1),
    'inferenceProfileArn': 'string',
    'models': [
        {
            'modelArn': 'string'
        },
    ],
    'inferenceProfileId': 'string',
    'status': 'ACTIVE',
    'type': 'SYSTEM_DEFINED'|'APPLICATION'
}

Response Structure

  • (dict) --

    • inferenceProfileName (string) --

      The name of the inference profile.

    • description (string) --

      The description of the inference profile.

    • createdAt (datetime) --

      The time at which the inference profile was created.

    • updatedAt (datetime) --

      The time at which the inference profile was last updated.

    • inferenceProfileArn (string) --

      The Amazon Resource Name (ARN) of the inference profile.

    • models (list) --

      A list of information about each model in the inference profile.

      • (dict) --

        Contains information about a model.

        • modelArn (string) --

          The Amazon Resource Name (ARN) of the model.

    • inferenceProfileId (string) --

      The unique identifier of the inference profile.

    • status (string) --

      The status of the inference profile. ACTIVE means that the inference profile is ready to be used.

    • type (string) --

      The type of the inference profile. The following types are possible:

      • SYSTEM_DEFINED – The inference profile is defined by Amazon Bedrock. You can route inference requests across regions with these inference profiles.

      • APPLICATION – The inference profile was created by a user. This type of inference profile can track metrics and costs when invoking the model in it. The inference profile may route requests to one or multiple regions.
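To show how the response fields above might be consumed, here is a small sketch that reduces the response to the values a caller typically checks. The function name `summarize_profile` and the keys of the returned dict are hypothetical; `client` is assumed to be a Bedrock control-plane client such as `boto3.client("bedrock")`.

```python
def summarize_profile(client, identifier):
    """Fetch a profile by ID or ARN and pick out a few response fields."""
    profile = client.get_inference_profile(
        inferenceProfileIdentifier=identifier
    )
    return {
        "name": profile["inferenceProfileName"],
        # 'type' is SYSTEM_DEFINED or APPLICATION.
        "is_application": profile["type"] == "APPLICATION",
        # ACTIVE means the profile is ready to be used.
        "ready": profile["status"] == "ACTIVE",
        "model_arns": [m["modelArn"] for m in profile.get("models", [])],
    }
```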

ListInferenceProfiles (updated)
Changes (request, response)
Request
{'typeEquals': 'SYSTEM_DEFINED | APPLICATION'}
Response
{'inferenceProfileSummaries': {'type': {'APPLICATION'}}}

Returns a list of inference profiles that you can use. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock in the Amazon Bedrock User Guide.

See also: AWS API Documentation

Request Syntax

client.list_inference_profiles(
    maxResults=123,
    nextToken='string',
    typeEquals='SYSTEM_DEFINED'|'APPLICATION'
)
type maxResults:

integer

param maxResults:

The maximum number of results to return in the response. If the total number of results is greater than this value, pass the token returned in the nextToken field of the response as the nextToken field of another request to return the next batch of results.

type nextToken:

string

param nextToken:

If the total number of results is greater than the maxResults value provided in the request, pass the token returned in the nextToken field of the response in this field to return the next batch of results.

type typeEquals:

string

param typeEquals:

Filters for inference profiles that match the type you specify.

  • SYSTEM_DEFINED – The inference profile is defined by Amazon Bedrock. You can route inference requests across regions with these inference profiles.

  • APPLICATION – The inference profile was created by a user. This type of inference profile can track metrics and costs when invoking the model in it. The inference profile may route requests to one or multiple regions.

rtype:

dict

returns:

Response Syntax

{
    'inferenceProfileSummaries': [
        {
            'inferenceProfileName': 'string',
            'description': 'string',
            'createdAt': datetime(2015, 1, 1),
            'updatedAt': datetime(2015, 1, 1),
            'inferenceProfileArn': 'string',
            'models': [
                {
                    'modelArn': 'string'
                },
            ],
            'inferenceProfileId': 'string',
            'status': 'ACTIVE',
            'type': 'SYSTEM_DEFINED'|'APPLICATION'
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • inferenceProfileSummaries (list) --

      A list of information about each inference profile that you can use.

      • (dict) --

        Contains information about an inference profile.

        • inferenceProfileName (string) --

          The name of the inference profile.

        • description (string) --

          The description of the inference profile.

        • createdAt (datetime) --

          The time at which the inference profile was created.

        • updatedAt (datetime) --

          The time at which the inference profile was last updated.

        • inferenceProfileArn (string) --

          The Amazon Resource Name (ARN) of the inference profile.

        • models (list) --

          A list of information about each model in the inference profile.

          • (dict) --

            Contains information about a model.

            • modelArn (string) --

              The Amazon Resource Name (ARN) of the model.

        • inferenceProfileId (string) --

          The unique identifier of the inference profile.

        • status (string) --

          The status of the inference profile. ACTIVE means that the inference profile is ready to be used.

        • type (string) --

          The type of the inference profile. The following types are possible:

          • SYSTEM_DEFINED – The inference profile is defined by Amazon Bedrock. You can route inference requests across regions with these inference profiles.

          • APPLICATION – The inference profile was created by a user. This type of inference profile can track metrics and costs when invoking the model in it. The inference profile may route requests to one or multiple regions.

    • nextToken (string) --

      If the total number of results is greater than the maxResults value provided in the request, use this token in the nextToken field of another request to return the next batch of results.
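The maxResults and nextToken fields described above imply a simple paging loop, sketched below with an optional typeEquals filter. This is an illustration under assumptions, not an official example: the generator name `iter_inference_profiles` is hypothetical, and `client` is assumed to be a Bedrock control-plane client such as `boto3.client("bedrock")`.

```python
def iter_inference_profiles(client, type_equals=None, page_size=None):
    """Yield every inference profile summary, following nextToken."""
    request = {}
    if type_equals is not None:
        # 'SYSTEM_DEFINED' or 'APPLICATION'
        request["typeEquals"] = type_equals
    if page_size is not None:
        request["maxResults"] = page_size

    while True:
        page = client.list_inference_profiles(**request)
        yield from page.get("inferenceProfileSummaries", [])
        token = page.get("nextToken")
        if not token:
            # No token in the response means this was the last page.
            break
        request["nextToken"] = token
```

For example, `[p["inferenceProfileArn"] for p in iter_inference_profiles(client, type_equals="APPLICATION")]` would collect the ARNs of all application inference profiles.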