Agents for Amazon Bedrock Runtime

2023/11/28 - Agents for Amazon Bedrock Runtime - 3 new api methods

Changes  This release adds support for customization types, model life cycle status and minor versions/aliases for model identifiers.

InvokeAgent (new)

Invokes the specified Bedrock agent to run inference using the input provided in the request body.

See also: AWS API Documentation

Request Syntax

client.invoke_agent(
    sessionState={
        'sessionAttributes': {
            'string': 'string'
        },
        'promptSessionAttributes': {
            'string': 'string'
        }
    },
    agentId='string',
    agentAliasId='string',
    sessionId='string',
    endSession=True|False,
    enableTrace=True|False,
    inputText='string'
)
type sessionState:

dict

param sessionState:

Session state passed by the customer. Base64-encoded JSON string representation of SessionState.

  • sessionAttributes (dict) --

    Session Attributes

    • (string) --

      • (string) --

  • promptSessionAttributes (dict) --

    Prompt Session Attributes

    • (string) --

      • (string) --
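A minimal sketch of building the sessionState argument, assuming both attribute maps are string-to-string as the request syntax shows. The helper name build_session_state is hypothetical and not part of the SDK; it only coerces keys and values with str() and leaves out maps that are empty.

```python
def build_session_state(session_attributes=None, prompt_session_attributes=None):
    """Return a dict shaped like the sessionState parameter (hypothetical helper)."""
    state = {}
    if session_attributes:
        # Coerce to str -> str, matching the 'string': 'string' shape above.
        state["sessionAttributes"] = {
            str(key): str(value) for key, value in session_attributes.items()
        }
    if prompt_session_attributes:
        state["promptSessionAttributes"] = {
            str(key): str(value) for key, value in prompt_session_attributes.items()
        }
    return state


if __name__ == "__main__":
    print(build_session_state({"tier": 2}, {"locale": "en-US"}))
```

The result can then be passed as sessionState=... to client.invoke_agent; since sessionState is not marked [REQUIRED], the argument can simply be omitted when the helper returns an empty dict.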

type agentId:

string

param agentId:

[REQUIRED]

Identifier for Agent

type agentAliasId:

string

param agentAliasId:

[REQUIRED]

Identifier for Agent Alias

type sessionId:

string

param sessionId:

[REQUIRED]

Identifier used for the current session

type endSession:

boolean

param endSession:

End current session

type enableTrace:

boolean

param enableTrace:

Enable agent trace events for improved debugging

type inputText:

string

param inputText:

[REQUIRED]

Input data in the format specified in the Content-Type request header.

rtype:

dict

returns:

The response of this operation contains an :class:`.EventStream` member. When iterated the :class:`.EventStream` will yield events based on the structure below, where only one of the top level keys will be present for any given event.
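The iteration described above can be sketched as follows. This is an illustrative helper, not part of the SDK: the function names read_completion and ask_agent are hypothetical, the client is passed in by the caller, and decoding the chunk bytes as UTF-8 is an assumption about the payload.

```python
def read_completion(response):
    """Join the bytes of all 'chunk' events and collect all 'trace' events."""
    byte_parts = []
    traces = []
    for event in response["completion"]:
        # Only one top-level key is present per event.
        if "chunk" in event:
            byte_parts.append(event["chunk"]["bytes"])
        elif "trace" in event:
            traces.append(event["trace"])
    # Decode once at the end so a multi-byte character split across
    # two chunks does not break decoding. UTF-8 is an assumption.
    return b"".join(byte_parts).decode("utf-8"), traces


def ask_agent(client, agent_id, agent_alias_id, session_id, text, enable_trace=False):
    """Call invoke_agent with the required parameters and read the stream."""
    response = client.invoke_agent(
        agentId=agent_id,
        agentAliasId=agent_alias_id,
        sessionId=session_id,
        inputText=text,
        enableTrace=enable_trace,
    )
    return read_completion(response)
```

With boto3 installed, the client would typically be created with boto3.client('bedrock-agent-runtime'); the identifiers passed to ask_agent are placeholders for values from your own account.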

Response Syntax

{
    'completion': EventStream({
        'chunk': {
            'bytes': b'bytes',
            'attribution': {
                'citations': [
                    {
                        'generatedResponsePart': {
                            'textResponsePart': {
                                'text': 'string',
                                'span': {
                                    'start': 123,
                                    'end': 123
                                }
                            }
                        },
                        'retrievedReferences': [
                            {
                                'content': {
                                    'text': 'string'
                                },
                                'location': {
                                    'type': 'S3',
                                    's3Location': {
                                        'uri': 'string'
                                    }
                                }
                            },
                        ]
                    },
                ]
            }
        },
        'trace': {
            'agentId': 'string',
            'agentAliasId': 'string',
            'sessionId': 'string',
            'trace': {
                'preProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'rationale': 'string',
                            'isValid': True|False
                        }
                    }
                },
                'orchestrationTrace': {
                    'rationale': {
                        'traceId': 'string',
                        'text': 'string'
                    },
                    'invocationInput': {
                        'traceId': 'string',
                        'invocationType': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH',
                        'actionGroupInvocationInput': {
                            'actionGroupName': 'string',
                            'verb': 'string',
                            'apiPath': 'string',
                            'parameters': [
                                {
                                    'name': 'string',
                                    'type': 'string',
                                    'value': 'string'
                                },
                            ],
                            'requestBody': {
                                'content': {
                                    'string': [
                                        {
                                            'name': 'string',
                                            'type': 'string',
                                            'value': 'string'
                                        },
                                    ]
                                }
                            }
                        },
                        'knowledgeBaseLookupInput': {
                            'text': 'string',
                            'knowledgeBaseId': 'string'
                        }
                    },
                    'observation': {
                        'traceId': 'string',
                        'type': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH'|'ASK_USER'|'REPROMPT',
                        'actionGroupInvocationOutput': {
                            'text': 'string'
                        },
                        'knowledgeBaseLookupOutput': {
                            'retrievedReferences': [
                                {
                                    'content': {
                                        'text': 'string'
                                    },
                                    'location': {
                                        'type': 'S3',
                                        's3Location': {
                                            'uri': 'string'
                                        }
                                    }
                                },
                            ]
                        },
                        'finalResponse': {
                            'text': 'string'
                        },
                        'repromptResponse': {
                            'text': 'string',
                            'source': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'PARSER'
                        }
                    },
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    }
                },
                'postProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'text': 'string'
                        }
                    }
                },
                'failureTrace': {
                    'traceId': 'string',
                    'failureReason': 'string'
                }
            }
        },
        'internalServerException': {
            'message': 'string'
        },
        'validationException': {
            'message': 'string'
        },
        'resourceNotFoundException': {
            'message': 'string'
        },
        'serviceQuotaExceededException': {
            'message': 'string'
        },
        'throttlingException': {
            'message': 'string'
        },
        'accessDeniedException': {
            'message': 'string'
        },
        'conflictException': {
            'message': 'string'
        },
        'dependencyFailedException': {
            'message': 'string',
            'resourceName': 'string'
        },
        'badGatewayException': {
            'message': 'string',
            'resourceName': 'string'
        }
    }),
    'contentType': 'string',
    'sessionId': 'string'
}
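A small sketch of how a 'trace' event from the syntax above might be classified when enableTrace is set. It assumes each event carries at most one of the four step keys under its inner 'trace' dict, which the syntax suggests but does not state; describe_trace is a hypothetical helper name.

```python
TRACE_STEPS = (
    "preProcessingTrace",
    "orchestrationTrace",
    "postProcessingTrace",
    "failureTrace",
)


def describe_trace(trace_event):
    """Return (step_name, sorted member names) for one 'trace' event payload."""
    inner = trace_event.get("trace", {})
    for step in TRACE_STEPS:
        if step in inner:
            # The member names say which part of the step this event carries,
            # for example 'rationale' or 'observation' during orchestration.
            return step, sorted(inner[step])
    return None, []


if __name__ == "__main__":
    sample = {
        "agentId": "example-agent",
        "trace": {"orchestrationTrace": {"rationale": {"traceId": "t1", "text": "..."}}},
    }
    print(describe_trace(sample))
```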

Response Structure

  • (dict) --

    InvokeAgent Response

    • completion (:class:`.EventStream`) --

      Inference response from the model in the format specified in the Content-Type response header.

      • chunk (dict) --

        Base64-encoded byte response

        • bytes (bytes) --

          PartBody of the payload in bytes

        • attribution (dict) --

          Citations associated with final agent response

          • citations (list) --

            List of citations

            • (dict) --

              Citation associated with the agent response

              • generatedResponsePart (dict) --

                Generate response part

                • textResponsePart (dict) --

                  Text response part

                  • text (string) --

                    Response part in text

                  • span (dict) --

                    Span of text

                    • start (integer) --

                      Start of span

                    • end (integer) --

                      End of span

              • retrievedReferences (list) --

                list of retrieved references

                • (dict) --

                  Retrieved reference

                  • content (dict) --

                    Content of a retrieval result.

                    • text (string) --

                      Content of a retrieval result in text

                  • location (dict) --

                    The source location of a retrieval result.

                    • type (string) --

                      The location type of a retrieval result.

                    • s3Location (dict) --

                      The S3 location of a retrieval result.

                      • uri (string) --

                        URI of S3 location

      • trace (dict) --

        Trace Part which contains intermediate response for customer

        • agentId (string) --

          Identifier of the agent.

        • agentAliasId (string) --

          Identifier of the agent alias.

        • sessionId (string) --

          Identifier of the session.

        • trace (dict) --

          Trace contains intermediate response for customer

          • preProcessingTrace (dict) --

            Trace Part which contains information related to preprocessing step

            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates whether the agent uses the default prompt or an overridden prompt

              • parserMode (string) --

                Indicates whether the agent uses the default parser or an overridden parser

            • modelInvocationOutput (dict) --

              Trace Part which contains information related to preprocessing

              • traceId (string) --

                Identifier for trace

              • parsedResponse (dict) --

                Trace Part which contains information if preprocessing was successful

                • rationale (string) --

                  Agent Trace Rationale String

                • isValid (boolean) --

                  Boolean value

          • orchestrationTrace (dict) --

            Trace contains intermediate response during orchestration

            • rationale (dict) --

              Trace Part which contains information related to reasoning

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Agent Trace Rationale String

            • invocationInput (dict) --

              Trace Part which contains input details for action group or knowledge base

              • traceId (string) --

                Identifier for trace

              • invocationType (string) --

                types of invocations

              • actionGroupInvocationInput (dict) --

                input to lambda used in action group

                • actionGroupName (string) --

                  Agent Trace Action Group Name

                • verb (string) --

                  Agent Trace Action Group Action verb

                • apiPath (string) --

                  Agent Trace Action Group API path

                • parameters (list) --

                  list of parameters included in action group invocation

                  • (dict) --

                    parameters included in action group invocation

                    • name (string) --

                      Name of parameter

                    • type (string) --

                      Type of parameter

                    • value (string) --

                      Value of parameter

                • requestBody (dict) --

                  Request Body Content Map

                  • content (dict) --

                    Content type parameter map

                    • (string) --

                      • (list) --

                        list of parameters included in action group invocation

                        • (dict) --

                          parameters included in action group invocation

                          • name (string) --

                            Name of parameter

                          • type (string) --

                            Type of parameter

                          • value (string) --

                            Value of parameter

              • knowledgeBaseLookupInput (dict) --

                Input to the knowledge base lookup

                • text (string) --

                  Agent Trace Action Group Lambda Invocation Output String

                • knowledgeBaseId (string) --

                  Agent Trace Action Group Knowledge Base Id

            • observation (dict) --

              Trace Part which contains output details for action group or knowledge base or final response

              • traceId (string) --

                Identifier for trace

              • type (string) --

                types of observations

              • actionGroupInvocationOutput (dict) --

                output from lambda used in action group

                • text (string) --

                  Agent Trace Action Group Lambda Invocation Output String

              • knowledgeBaseLookupOutput (dict) --

                Output from the knowledge base lookup

                • retrievedReferences (list) --

                  list of retrieved references

                  • (dict) --

                    Retrieved reference

                    • content (dict) --

                      Content of a retrieval result.

                      • text (string) --

                        Content of a retrieval result in text

                    • location (dict) --

                      The source location of a retrieval result.

                      • type (string) --

                        The location type of a retrieval result.

                      • s3Location (dict) --

                        The S3 location of a retrieval result.

                        • uri (string) --

                          URI of S3 location

              • finalResponse (dict) --

                Agent finish output

                • text (string) --

                  Agent Trace Action Group Lambda Invocation Output String

              • repromptResponse (dict) --

                Observation information if there were reprompts

                • text (string) --

                  Reprompt response text

                • source (string) --

                  Parsing error source

            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates whether the agent uses the default prompt or an overridden prompt

              • parserMode (string) --

                Indicates whether the agent uses the default parser or an overridden parser

          • postProcessingTrace (dict) --

            Trace Part which contains information related to post processing step

            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates whether the agent uses the default prompt or an overridden prompt

              • parserMode (string) --

                Indicates whether the agent uses the default parser or an overridden parser

            • modelInvocationOutput (dict) --

              Trace Part which contains information related to postprocessing

              • traceId (string) --

                Identifier for trace

              • parsedResponse (dict) --

                Trace Part which contains information if postprocessing was successful

                • text (string) --

                  Agent Trace Output String

          • failureTrace (dict) --

            Trace Part which is emitted when agent trace could not be generated

            • traceId (string) --

              Identifier for trace

            • failureReason (string) --

              Agent Trace Failed Reason String

      • internalServerException (dict) --

        This exception is thrown if there was an unexpected error during processing of request

        • message (string) --

          Non Blank String

      • validationException (dict) --

        This exception is thrown when the request's input validation fails

        • message (string) --

          Non Blank String

      • resourceNotFoundException (dict) --

        This exception is thrown when a resource referenced by the operation does not exist

        • message (string) --

          Non Blank String

      • serviceQuotaExceededException (dict) --

        This exception is thrown when a request is made beyond the service quota

        • message (string) --

          Non Blank String

      • throttlingException (dict) --

        This exception is thrown when the number of requests exceeds the limit

        • message (string) --

          Non Blank String

      • accessDeniedException (dict) --

        This exception is thrown when a request is denied per access permissions

        • message (string) --

          Non Blank String

      • conflictException (dict) --

        This exception is thrown when there is a conflict performing an operation

        • message (string) --

          Non Blank String

      • dependencyFailedException (dict) --

        This exception is thrown when a request fails because of a customer fault (for example, bad configuration) in a dependency such as a Lambda, Bedrock, or STS resource

        • message (string) --

          Non Blank String

        • resourceName (string) --

          Non Blank String

      • badGatewayException (dict) --

        This exception is thrown when a request fails due to a dependency such as a Lambda, Bedrock, or STS resource

        • message (string) --

          Non Blank String

        • resourceName (string) --

          Non Blank String

    • contentType (string) --

      streaming response mimetype of the model

    • sessionId (string) --

      Identifier of the session.
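To make the attribution structure above concrete, the sketch below pulls the cited text and the S3 URIs of its references out of a single 'chunk' event payload. The helper name citation_sources is hypothetical, and every member is treated as optional because the structure does not mark any of them as always present.

```python
def citation_sources(chunk):
    """Return a list of (cited_text, [s3_uris]) pairs for one 'chunk' payload."""
    pairs = []
    for citation in chunk.get("attribution", {}).get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        uris = []
        for reference in citation.get("retrievedReferences", []):
            # 'S3' is the only location type listed in the response syntax.
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri is not None:
                uris.append(uri)
        pairs.append((part.get("text", ""), uris))
    return pairs


if __name__ == "__main__":
    sample_chunk = {
        "bytes": b"Example answer",
        "attribution": {
            "citations": [
                {
                    "generatedResponsePart": {
                        "textResponsePart": {"text": "Example answer", "span": {"start": 0, "end": 13}}
                    },
                    "retrievedReferences": [
                        {
                            "content": {"text": "source passage"},
                            "location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/doc.txt"}},
                        }
                    ],
                }
            ]
        },
    }
    print(citation_sources(sample_chunk))
```

The bucket name and span offsets in the sample are made up for illustration.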

Retrieve (new)

Retrieve from knowledge base.

See also: AWS API Documentation

Request Syntax

client.retrieve(
    knowledgeBaseId='string',
    retrievalQuery={
        'text': 'string'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 123
        }
    },
    nextToken='string'
)
type knowledgeBaseId:

string

param knowledgeBaseId:

[REQUIRED]

Identifier of the KnowledgeBase

type retrievalQuery:

dict

param retrievalQuery:

[REQUIRED]

Knowledge base input query.

  • text (string) -- [REQUIRED]

    Knowledge base input query in text

type retrievalConfiguration:

dict

param retrievalConfiguration:

Search parameters for retrieving from knowledge base.

  • vectorSearchConfiguration (dict) -- [REQUIRED]

    Knowledge base vector search configuration

    • numberOfResults (integer) -- [REQUIRED]

      Top-K results to retrieve from knowledge base.

type nextToken:

string

param nextToken:

Opaque continuation token of previous paginated response.

rtype:

dict

returns:

Response Syntax

{
    'retrievalResults': [
        {
            'content': {
                'text': 'string'
            },
            'location': {
                'type': 'S3',
                's3Location': {
                    'uri': 'string'
                }
            },
            'score': 123.0
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • retrievalResults (list) --

      List of knowledge base retrieval results

      • (dict) --

        Result item returned from a knowledge base retrieval.

        • content (dict) --

          Content of a retrieval result.

          • text (string) --

            Content of a retrieval result in text

        • location (dict) --

          The source location of a retrieval result.

          • type (string) --

            The location type of a retrieval result.

          • s3Location (dict) --

            The S3 location of a retrieval result.

            • uri (string) --

              URI of S3 location

        • score (float) --

          The relevance score of a result.

    • nextToken (string) --

      Opaque continuation token of previous paginated response.
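A minimal pagination sketch for Retrieve, assuming that a response without nextToken marks the last page. The helper name retrieve_all and the max_pages safety cap are illustrative choices, not part of the API; the client is supplied by the caller.

```python
def retrieve_all(client, knowledge_base_id, query, number_of_results=5, max_pages=10):
    """Follow nextToken until it is absent or max_pages is reached."""
    kwargs = {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    results = []
    for _ in range(max_pages):
        page = client.retrieve(**kwargs)
        results.extend(page.get("retrievalResults", []))
        token = page.get("nextToken")
        if not token:
            break
        # Pass the opaque token back unchanged to request the next page.
        kwargs["nextToken"] = token
    return results
```

Each item in the returned list has the content, location and score members described above, so callers can, for example, sort or filter on score afterwards.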

RetrieveAndGenerate (new)

RetrieveAndGenerate API

See also: AWS API Documentation

Request Syntax

client.retrieve_and_generate(
    sessionId='string',
    input={
        'text': 'string'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'string',
            'modelArn': 'string'
        }
    },
    sessionConfiguration={
        'kmsKeyArn': 'string'
    }
)
type sessionId:

string

param sessionId:

Identifier of the session.

type input:

dict

param input:

[REQUIRED]

Customer input of the turn

  • text (string) -- [REQUIRED]

    Customer input of the turn in text

type retrieveAndGenerateConfiguration:

dict

param retrieveAndGenerateConfiguration:

Configures the retrieval and generation for the session.

  • type (string) -- [REQUIRED]

    The type of RetrieveAndGenerate.

  • knowledgeBaseConfiguration (dict) --

    Configurations for retrieval and generation for knowledge base.

    • knowledgeBaseId (string) -- [REQUIRED]

      Identifier of the KnowledgeBase

    • modelArn (string) -- [REQUIRED]

      ARN of a Bedrock model.

type sessionConfiguration:

dict

param sessionConfiguration:

Configures common parameters of the session.

  • kmsKeyArn (string) -- [REQUIRED]

    The KMS key ARN to encrypt the customer data of the session.

rtype:

dict

returns:

Response Syntax

{
    'sessionId': 'string',
    'output': {
        'text': 'string'
    },
    'citations': [
        {
            'generatedResponsePart': {
                'textResponsePart': {
                    'text': 'string',
                    'span': {
                        'start': 123,
                        'end': 123
                    }
                }
            },
            'retrievedReferences': [
                {
                    'content': {
                        'text': 'string'
                    },
                    'location': {
                        'type': 'S3',
                        's3Location': {
                            'uri': 'string'
                        }
                    }
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • sessionId (string) --

      Identifier of the session.

    • output (dict) --

      Service response of the turn

      • text (string) --

        Service response of the turn in text

    • citations (list) --

      List of citations

      • (dict) --

        Citation associated with the agent response

        • generatedResponsePart (dict) --

          Generate response part

          • textResponsePart (dict) --

            Text response part

            • text (string) --

              Response part in text

            • span (dict) --

              Span of text

              • start (integer) --

                Start of span

              • end (integer) --

                End of span

        • retrievedReferences (list) --

          list of retrieved references

          • (dict) --

            Retrieved reference

            • content (dict) --

              Content of a retrieval result.

              • text (string) --

                Content of a retrieval result in text

            • location (dict) --

              The source location of a retrieval result.

              • type (string) --

                The location type of a retrieval result.

              • s3Location (dict) --

                The S3 location of a retrieval result.

                • uri (string) --

                  URI of S3 location
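A hedged sketch of a multi-turn RetrieveAndGenerate call. It assumes that the sessionId returned by one call can be passed back on the next call to continue the same session, which the request and response shapes suggest but this page does not spell out. The helper name ask_knowledge_base is hypothetical, and the client is supplied by the caller.

```python
def ask_knowledge_base(client, knowledge_base_id, model_arn, question, session_id=None):
    """Return (answer_text, session_id, citations) for one turn."""
    kwargs = {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
    # sessionId is not marked [REQUIRED], so it is only sent when known.
    if session_id is not None:
        kwargs["sessionId"] = session_id
    response = client.retrieve_and_generate(**kwargs)
    return (
        response["output"]["text"],
        response.get("sessionId"),
        response.get("citations", []),
    )
```

A follow-up turn would then pass the session identifier from the first result, for example ask_knowledge_base(client, kb_id, model_arn, "And the next step?", session_id=previous_session_id), where all argument values are placeholders.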