Agents for Amazon Bedrock Runtime

2023/11/28 - Agents for Amazon Bedrock Runtime - 3 new api methods

Changes  This release adds support for customization types, model life cycle status and minor versions/aliases for model identifiers.

Retrieve (new)

Retrieves results from a knowledge base.

See also: AWS API Documentation

Request Syntax

client.retrieve(
    knowledgeBaseId='string',
    retrievalQuery={
        'text': 'string'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 123
        }
    },
    nextToken='string'
)
type knowledgeBaseId

string

param knowledgeBaseId

[REQUIRED]

Identifier of the KnowledgeBase

type retrievalQuery

dict

param retrievalQuery

[REQUIRED]

Knowledge base input query.

  • text (string) -- [REQUIRED]

    Knowledge base input query in text

type retrievalConfiguration

dict

param retrievalConfiguration

Search parameters for retrieving from knowledge base.

  • vectorSearchConfiguration (dict) -- [REQUIRED]

    Knowledge base vector search configuration

    • numberOfResults (integer) -- [REQUIRED]

      Top-K results to retrieve from knowledge base.

type nextToken

string

param nextToken

Opaque continuation token of previous paginated response.

rtype

dict

returns

Response Syntax

{
    'retrievalResults': [
        {
            'content': {
                'text': 'string'
            },
            'location': {
                'type': 'S3',
                's3Location': {
                    'uri': 'string'
                }
            },
            'score': 123.0
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • retrievalResults (list) --

      List of knowledge base retrieval results

      • (dict) --

        Result item returned from a knowledge base retrieval.

        • content (dict) --

          Content of a retrieval result.

          • text (string) --

            Content of a retrieval result in text

        • location (dict) --

          The source location of a retrieval result.

          • type (string) --

            The location type of a retrieval result.

          • s3Location (dict) --

            The S3 location of a retrieval result.

            • uri (string) --

              URI of S3 location

        • score (float) --

          The relevance score of a result.

    • nextToken (string) --

      Opaque continuation token of previous paginated response.
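
The nextToken fields in the request and response suggest the usual pagination loop. The following is a minimal sketch, not a confirmed usage pattern: it assumes a client created with boto3.client('bedrock-agent-runtime') and assumes the last page omits nextToken. The helper name iter_retrieval_results is hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_retrieval_results(
    client: Any,
    knowledge_base_id: str,
    query: str,
    number_of_results: int = 5,
) -> Iterator[Dict[str, Any]]:
    """Yield every retrieval result, following nextToken across pages."""
    # `client` is assumed to be boto3.client("bedrock-agent-runtime").
    kwargs: Dict[str, Any] = {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    while True:
        response = client.retrieve(**kwargs)
        yield from response.get("retrievalResults", [])
        token = response.get("nextToken")
        if not token:  # assumption: a missing token marks the last page
            break
        kwargs["nextToken"] = token
```

Each yielded item is one entry of retrievalResults, so callers can read item['content']['text'] and item['score'] directly.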

RetrieveAndGenerate (new)

Retrieves from a knowledge base and generates a response for the customer input.

See also: AWS API Documentation

Request Syntax

client.retrieve_and_generate(
    sessionId='string',
    input={
        'text': 'string'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'string',
            'modelArn': 'string'
        }
    },
    sessionConfiguration={
        'kmsKeyArn': 'string'
    }
)
type sessionId

string

param sessionId

Identifier of the session.

type input

dict

param input

[REQUIRED]

Customer input of the turn

  • text (string) -- [REQUIRED]

    Customer input of the turn in text

type retrieveAndGenerateConfiguration

dict

param retrieveAndGenerateConfiguration

Configures the retrieval and generation for the session.

  • type (string) -- [REQUIRED]

    The type of RetrieveAndGenerate.

  • knowledgeBaseConfiguration (dict) --

    Configurations for retrieval and generation for knowledge base.

    • knowledgeBaseId (string) -- [REQUIRED]

      Identifier of the KnowledgeBase

    • modelArn (string) -- [REQUIRED]

      Arn of a Bedrock model.

type sessionConfiguration

dict

param sessionConfiguration

Configures common parameters of the session.

  • kmsKeyArn (string) -- [REQUIRED]

    The KMS key arn to encrypt the customer data of the session.
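
Only input is marked [REQUIRED] at the top level, so a caller will usually want to leave out optional arguments instead of passing None for them. The sketch below builds the keyword arguments that way; the helper name build_retrieve_and_generate_request is hypothetical, and the result is assumed to be passed as client.retrieve_and_generate(**request).

```python
from typing import Any, Dict, Optional


def build_retrieve_and_generate_request(
    text: str,
    knowledge_base_id: str,
    model_arn: str,
    session_id: Optional[str] = None,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build kwargs for retrieve_and_generate, omitting unset optional fields."""
    request: Dict[str, Any] = {
        "input": {"text": text},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
    # Optional members are added only when the caller supplied a value.
    if session_id is not None:
        request["sessionId"] = session_id
    if kms_key_arn is not None:
        request["sessionConfiguration"] = {"kmsKeyArn": kms_key_arn}
    return request
```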

rtype

dict

returns

Response Syntax

{
    'sessionId': 'string',
    'output': {
        'text': 'string'
    },
    'citations': [
        {
            'generatedResponsePart': {
                'textResponsePart': {
                    'text': 'string',
                    'span': {
                        'start': 123,
                        'end': 123
                    }
                }
            },
            'retrievedReferences': [
                {
                    'content': {
                        'text': 'string'
                    },
                    'location': {
                        'type': 'S3',
                        's3Location': {
                            'uri': 'string'
                        }
                    }
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • sessionId (string) --

      Identifier of the session.

    • output (dict) --

      Service response of the turn

      • text (string) --

        Service response of the turn in text

    • citations (list) --

      List of citations

      • (dict) --

        Citation associated with the agent response

        • generatedResponsePart (dict) --

          Generated response part

          • textResponsePart (dict) --

            Text response part

            • text (string) --

              Response part in text

            • span (dict) --

              Span of text

              • start (integer) --

                Start of span

              • end (integer) --

                End of span

        • retrievedReferences (list) --

          list of retrieved references

          • (dict) --

            Retrieved reference

            • content (dict) --

              Content of a retrieval result.

              • text (string) --

                Content of a retrieval result in text

            • location (dict) --

              The source location of a retrieval result.

              • type (string) --

                The location type of a retrieval result.

              • s3Location (dict) --

                The S3 location of a retrieval result.

                • uri (string) --

                  URI of S3 location
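
Each citation pairs a span of the generated text with the references it was drawn from. As an illustration of walking that structure, the sketch below reduces a response to (start, end, S3 URIs) tuples. The helper name summarize_citations is hypothetical, and the code makes no assumption about whether span offsets are inclusive.

```python
from typing import Any, Dict, List, Optional, Tuple

CitationSummary = Tuple[Optional[int], Optional[int], List[str]]


def summarize_citations(response: Dict[str, Any]) -> List[CitationSummary]:
    """Return (span start, span end, [S3 URIs]) for each citation."""
    summary: List[CitationSummary] = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        uris: List[str] = []
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri:  # skip references without an S3 URI
                uris.append(uri)
        summary.append((span.get("start"), span.get("end"), uris))
    return summary
```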

InvokeAgent (new)

Invokes the specified Bedrock agent to run inference using the input provided in the request.

See also: AWS API Documentation

Request Syntax

client.invoke_agent(
    sessionState={
        'sessionAttributes': {
            'string': 'string'
        },
        'promptSessionAttributes': {
            'string': 'string'
        }
    },
    agentId='string',
    agentAliasId='string',
    sessionId='string',
    endSession=True|False,
    enableTrace=True|False,
    inputText='string'
)
type sessionState

dict

param sessionState

Session state passed by customer. Base64-encoded JSON string representation of SessionState.

  • sessionAttributes (dict) --

    Session Attributes

    • (string) --

      • (string) --

  • promptSessionAttributes (dict) --

    Prompt Session Attributes

    • (string) --

      • (string) --

type agentId

string

param agentId

[REQUIRED]

Identifier for Agent

type agentAliasId

string

param agentAliasId

[REQUIRED]

Identifier for Agent Alias

type sessionId

string

param sessionId

[REQUIRED]

Identifier used for the current session

type endSession

boolean

param endSession

End current session

type enableTrace

boolean

param enableTrace

Enable agent trace events for improved debugging

type inputText

string

param inputText

[REQUIRED]

Input data in the format specified in the Content-Type request header.
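
Since agentId, agentAliasId, sessionId and inputText are all required, a caller starting a new conversation has to supply a session identifier. The sketch below generates one when none is given. It assumes a UUID string is an acceptable session identifier, which this page does not confirm, and the helper name build_invoke_agent_request is hypothetical.

```python
import uuid
from typing import Any, Dict, Mapping, Optional


def build_invoke_agent_request(
    agent_id: str,
    agent_alias_id: str,
    input_text: str,
    session_id: Optional[str] = None,
    session_attributes: Optional[Mapping[str, str]] = None,
    enable_trace: bool = False,
) -> Dict[str, Any]:
    """Build kwargs for invoke_agent; reuse session_id to continue a session."""
    request: Dict[str, Any] = {
        "agentId": agent_id,
        "agentAliasId": agent_alias_id,
        # Assumption: a fresh UUID is a valid identifier for a new session.
        "sessionId": session_id or str(uuid.uuid4()),
        "inputText": input_text,
        "enableTrace": enable_trace,
    }
    if session_attributes:
        request["sessionState"] = {"sessionAttributes": dict(session_attributes)}
    return request
```

Keeping the returned sessionId and passing it back on the next call is what ties successive turns to the same session.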

rtype

dict

returns

The response of this operation contains an :class:`.EventStream` member. When iterated the :class:`.EventStream` will yield events based on the structure below, where only one of the top level keys will be present for any given event.

Response Syntax

{
    'completion': EventStream({
        'chunk': {
            'bytes': b'bytes',
            'attribution': {
                'citations': [
                    {
                        'generatedResponsePart': {
                            'textResponsePart': {
                                'text': 'string',
                                'span': {
                                    'start': 123,
                                    'end': 123
                                }
                            }
                        },
                        'retrievedReferences': [
                            {
                                'content': {
                                    'text': 'string'
                                },
                                'location': {
                                    'type': 'S3',
                                    's3Location': {
                                        'uri': 'string'
                                    }
                                }
                            },
                        ]
                    },
                ]
            }
        },
        'trace': {
            'agentId': 'string',
            'agentAliasId': 'string',
            'sessionId': 'string',
            'trace': {
                'preProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'rationale': 'string',
                            'isValid': True|False
                        }
                    }
                },
                'orchestrationTrace': {
                    'rationale': {
                        'traceId': 'string',
                        'text': 'string'
                    },
                    'invocationInput': {
                        'traceId': 'string',
                        'invocationType': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH',
                        'actionGroupInvocationInput': {
                            'actionGroupName': 'string',
                            'verb': 'string',
                            'apiPath': 'string',
                            'parameters': [
                                {
                                    'name': 'string',
                                    'type': 'string',
                                    'value': 'string'
                                },
                            ],
                            'requestBody': {
                                'content': {
                                    'string': [
                                        {
                                            'name': 'string',
                                            'type': 'string',
                                            'value': 'string'
                                        },
                                    ]
                                }
                            }
                        },
                        'knowledgeBaseLookupInput': {
                            'text': 'string',
                            'knowledgeBaseId': 'string'
                        }
                    },
                    'observation': {
                        'traceId': 'string',
                        'type': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH'|'ASK_USER'|'REPROMPT',
                        'actionGroupInvocationOutput': {
                            'text': 'string'
                        },
                        'knowledgeBaseLookupOutput': {
                            'retrievedReferences': [
                                {
                                    'content': {
                                        'text': 'string'
                                    },
                                    'location': {
                                        'type': 'S3',
                                        's3Location': {
                                            'uri': 'string'
                                        }
                                    }
                                },
                            ]
                        },
                        'finalResponse': {
                            'text': 'string'
                        },
                        'repromptResponse': {
                            'text': 'string',
                            'source': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'PARSER'
                        }
                    },
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    }
                },
                'postProcessingTrace': {
                    'modelInvocationInput': {
                        'traceId': 'string',
                        'text': 'string',
                        'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
                        'inferenceConfiguration': {
                            'temperature': ...,
                            'topP': ...,
                            'topK': 123,
                            'maximumLength': 123,
                            'stopSequences': [
                                'string',
                            ]
                        },
                        'overrideLambda': 'string',
                        'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
                        'parserMode': 'DEFAULT'|'OVERRIDDEN'
                    },
                    'modelInvocationOutput': {
                        'traceId': 'string',
                        'parsedResponse': {
                            'text': 'string'
                        }
                    }
                },
                'failureTrace': {
                    'traceId': 'string',
                    'failureReason': 'string'
                }
            }
        },
        'internalServerException': {
            'message': 'string'
        },
        'validationException': {
            'message': 'string'
        },
        'resourceNotFoundException': {
            'message': 'string'
        },
        'serviceQuotaExceededException': {
            'message': 'string'
        },
        'throttlingException': {
            'message': 'string'
        },
        'accessDeniedException': {
            'message': 'string'
        },
        'conflictException': {
            'message': 'string'
        },
        'dependencyFailedException': {
            'message': 'string',
            'resourceName': 'string'
        },
        'badGatewayException': {
            'message': 'string',
            'resourceName': 'string'
        }
    }),
    'contentType': 'string',
    'sessionId': 'string'
}
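
Because each event carries exactly one top-level key, reading the agent's answer amounts to iterating response['completion'] and collecting the chunk payloads. The sketch below follows only the syntax shown above. It assumes the bytes are UTF-8 text and that error events arrive as dict keys ending in "Exception"; the SDK may instead raise them while iterating, which this page does not settle. The helper name read_completion is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def read_completion(events: Iterable[Dict[str, Any]], encoding: str = "utf-8") -> str:
    """Concatenate chunk bytes from an event stream and decode them as text."""
    parts: List[bytes] = []
    for event in events:
        if "chunk" in event:
            parts.append(event["chunk"].get("bytes", b""))
            continue
        for key, value in event.items():
            # Assumption: error events appear under keys such as
            # 'throttlingException', as listed in the response syntax.
            if key.endswith("Exception"):
                raise RuntimeError(f"{key}: {value.get('message', '')}")
    # Join before decoding so a character split across chunks still decodes.
    return b"".join(parts).decode(encoding)
```

A call would look like read_completion(response['completion']); trace events are skipped here and can be handled separately.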

Response Structure

  • (dict) --

    InvokeAgent Response

    • completion (:class:`.EventStream`) --

      Inference response from the model in the format specified in the Content-Type response header.

      • chunk (dict) --

        Base64-encoded byte response

        • bytes (bytes) --

          PartBody of the payload in bytes

        • attribution (dict) --

          Citations associated with final agent response

          • citations (list) --

            List of citations

            • (dict) --

              Citation associated with the agent response

              • generatedResponsePart (dict) --

                Generated response part

                • textResponsePart (dict) --

                  Text response part

                  • text (string) --

                    Response part in text

                  • span (dict) --

                    Span of text

                    • start (integer) --

                      Start of span

                    • end (integer) --

                      End of span

              • retrievedReferences (list) --

                list of retrieved references

                • (dict) --

                  Retrieved reference

                  • content (dict) --

                    Content of a retrieval result.

                    • text (string) --

                      Content of a retrieval result in text

                  • location (dict) --

                    The source location of a retrieval result.

                    • type (string) --

                      The location type of a retrieval result.

                    • s3Location (dict) --

                      The S3 location of a retrieval result.

                      • uri (string) --

                        URI of S3 location

      • trace (dict) --

        Trace Part which contains intermediate response for customer

        • agentId (string) --

          Identifier of the agent.

        • agentAliasId (string) --

          Identifier of the agent alias.

        • sessionId (string) --

          Identifier of the session.

        • trace (dict) --

          Trace contains intermediate response for customer

          Note

          This is a Tagged Union structure. Only one of the following top level keys will be set: preProcessingTrace, orchestrationTrace, postProcessingTrace, failureTrace. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

          'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
          • preProcessingTrace (dict) --

            Trace Part which contains information related to preprocessing step

            Note

            This is a Tagged Union structure. Only one of the following top level keys will be set: modelInvocationInput, modelInvocationOutput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

            'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates if agent uses default prompt or overridden prompt

              • parserMode (string) --

                Indicates if agent uses default parser or overridden parser

            • modelInvocationOutput (dict) --

              Trace Part which contains information related to preprocessing

              • traceId (string) --

                Identifier for trace

              • parsedResponse (dict) --

                Trace Part which contains information if preprocessing was successful

                • rationale (string) --

                  Agent Trace Rationale String

                • isValid (boolean) --

                  Boolean value

          • orchestrationTrace (dict) --

            Trace contains intermediate response during orchestration

            Note

            This is a Tagged Union structure. Only one of the following top level keys will be set: rationale, invocationInput, observation, modelInvocationInput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

            'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
            • rationale (dict) --

              Trace Part which contains information related to reasoning

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Agent Trace Rationale String

            • invocationInput (dict) --

              Trace Part which contains input details for action group or knowledge base

              • traceId (string) --

                Identifier for trace

              • invocationType (string) --

                types of invocations

              • actionGroupInvocationInput (dict) --

                input to lambda used in action group

                • actionGroupName (string) --

                  Agent Trace Action Group Name

                • verb (string) --

                  Agent Trace Action Group Action verb

                • apiPath (string) --

                  Agent Trace Action Group API path

                • parameters (list) --

                  list of parameters included in action group invocation

                  • (dict) --

                    parameters included in action group invocation

                    • name (string) --

                      Name of parameter

                    • type (string) --

                      Type of parameter

                    • value (string) --

                      Value of parameter

                • requestBody (dict) --

                  Request Body Content Map

                  • content (dict) --

                    Content type parameter map

                    • (string) --

                      • (list) --

                        list of parameters included in action group invocation

                        • (dict) --

                          parameters included in action group invocation

                          • name (string) --

                            Name of parameter

                          • type (string) --

                            Type of parameter

                          • value (string) --

                            Value of parameter

              • knowledgeBaseLookupInput (dict) --

                Input to the knowledge base lookup

                • text (string) --

                  Agent Trace Knowledge Base Lookup Input String

                • knowledgeBaseId (string) --

                  Agent Trace Action Group Knowledge Base Id

            • observation (dict) --

              Trace Part which contains output details for action group or knowledge base or final response

              • traceId (string) --

                Identifier for trace

              • type (string) --

                types of observations

              • actionGroupInvocationOutput (dict) --

                output from lambda used in action group

                • text (string) --

                  Agent Trace Action Group Lambda Invocation Output String

              • knowledgeBaseLookupOutput (dict) --

                Output from the knowledge base lookup

                • retrievedReferences (list) --

                  list of retrieved references

                  • (dict) --

                    Retrieved reference

                    • content (dict) --

                      Content of a retrieval result.

                      • text (string) --

                        Content of a retrieval result in text

                    • location (dict) --

                      The source location of a retrieval result.

                      • type (string) --

                        The location type of a retrieval result.

                      • s3Location (dict) --

                        The S3 location of a retrieval result.

                        • uri (string) --

                          URI of S3 location

              • finalResponse (dict) --

                Agent finish output

                • text (string) --

                  Agent Trace Action Group Lambda Invocation Output String

              • repromptResponse (dict) --

                Observation information if there were reprompts

                • text (string) --

                  Reprompt response text

                • source (string) --

                  Parsing error source

            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates if agent uses default prompt or overridden prompt

              • parserMode (string) --

                Indicates if agent uses default parser or overridden parser

          • postProcessingTrace (dict) --

            Trace Part which contains information related to post processing step

            Note

            This is a Tagged Union structure. Only one of the following top level keys will be set: modelInvocationInput, modelInvocationOutput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

            'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
            • modelInvocationInput (dict) --

              Trace Part which contains information used to call Invoke Model

              • traceId (string) --

                Identifier for trace

              • text (string) --

                Prompt Message

              • type (string) --

                types of prompts

              • inferenceConfiguration (dict) --

                Configurations for controlling the inference response of an InvokeAgent API call

                • temperature (float) --

                  Controls randomness, higher values increase diversity

                • topP (float) --

                  Cumulative probability cutoff for token selection

                • topK (integer) --

                  Sample from the k most likely next tokens

                • maximumLength (integer) --

                  Maximum length of output

                • stopSequences (list) --

                  List of stop sequences

                  • (string) --

              • overrideLambda (string) --

                ARN of a Lambda.

              • promptCreationMode (string) --

                Indicates if agent uses default prompt or overridden prompt

              • parserMode (string) --

                Indicates if agent uses default parser or overridden parser

            • modelInvocationOutput (dict) --

              Trace Part which contains information related to postprocessing

              • traceId (string) --

                Identifier for trace

              • parsedResponse (dict) --

                Trace Part which contains the parsed postprocessing response

                • text (string) --

                  Agent Trace Output String

          • failureTrace (dict) --

            Trace Part which is emitted when agent trace could not be generated

            • traceId (string) --

              Identifier for trace

            • failureReason (string) --

              Agent Trace Failed Reason String

      • internalServerException (dict) --

        This exception is thrown if there was an unexpected error during processing of request

        • message (string) --

          Non Blank String

      • validationException (dict) --

        This exception is thrown when the request's input validation fails

        • message (string) --

          Non Blank String

      • resourceNotFoundException (dict) --

        This exception is thrown when a resource referenced by the operation does not exist

        • message (string) --

          Non Blank String

      • serviceQuotaExceededException (dict) --

        This exception is thrown when a request is made beyond the service quota

        • message (string) --

          Non Blank String

      • throttlingException (dict) --

        This exception is thrown when the number of requests exceeds the limit

        • message (string) --

          Non Blank String

      • accessDeniedException (dict) --

        This exception is thrown when a request is denied per access permissions

        • message (string) --

          Non Blank String

      • conflictException (dict) --

        This exception is thrown when there is a conflict performing an operation

        • message (string) --

          Non Blank String

      • dependencyFailedException (dict) --

        This exception is thrown when a request fails due to a dependency such as a Lambda, Bedrock, or STS resource because of a customer fault (e.g. bad configuration)

        • message (string) --

          Non Blank String

        • resourceName (string) --

          Non Blank String

      • badGatewayException (dict) --

        This exception is thrown when a request fails due to a dependency such as a Lambda, Bedrock, or STS resource

        • message (string) --

          Non Blank String

        • resourceName (string) --

          Non Blank String

    • contentType (string) --

      streaming response mimetype of the model

    • sessionId (string) --

      Identifier of the session.
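
The inner trace member is a tagged union, so code that inspects traces has to check which single key is present and tolerate SDK_UNKNOWN_MEMBER. The sketch below shows one way to dispatch on it. It takes the value of a stream event's 'trace' key (the dict holding agentId, sessionId and the inner trace); the helper name classify_trace and the "unknown:" and "empty" labels are hypothetical.

```python
from typing import Any, Dict, Tuple

KNOWN_TRACE_KEYS = (
    "preProcessingTrace",
    "orchestrationTrace",
    "postProcessingTrace",
    "failureTrace",
)


def classify_trace(trace_event: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (member name, member payload) for the inner trace union."""
    union = trace_event.get("trace", {})
    for key in KNOWN_TRACE_KEYS:
        if key in union:
            return key, union[key]
    if "SDK_UNKNOWN_MEMBER" in union:
        # The SDK reports only the name of a member it does not recognize.
        return "unknown:" + union["SDK_UNKNOWN_MEMBER"].get("name", ""), {}
    return "empty", {}
```

The same check-one-key pattern applies one level down, where preProcessingTrace, orchestrationTrace and postProcessingTrace are themselves tagged unions.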