Agents for Amazon Bedrock Runtime

2024/10/21 - Agents for Amazon Bedrock Runtime - 1 updated api methods

Changes  Knowledge Bases for Amazon Bedrock now supports custom prompts and model parameters in the orchestrationConfiguration of the RetrieveAndGenerate API. The modelArn field accepts Custom Models and Imported Models ARNs.

RetrieveAndGenerate (updated)
Changes (request)
{'retrieveAndGenerateConfiguration': {'knowledgeBaseConfiguration': {'orchestrationConfiguration': {'additionalModelRequestFields': {'string': {}},
                                                                                                    'inferenceConfig': {'textInferenceConfig': {'maxTokens': 'integer',
                                                                                                                                                'stopSequences': ['string'],
                                                                                                                                                'temperature': 'float',
                                                                                                                                                'topP': 'float'}},
                                                                                                    'promptTemplate': {'textPromptTemplate': 'string'}}}}}

Queries a knowledge base and generates responses based on the retrieved results and using the specified foundation model or inference profile. The response only cites sources that are relevant to the query.
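As a minimal sketch of the smallest request this operation accepts for a knowledge base, the helper below assembles only the fields marked [REQUIRED] in the parameter list. The function name `build_kb_request`, the knowledge base ID, and the model ARN are placeholders chosen for illustration, not real resources.

```python
def build_kb_request(query, knowledge_base_id, model_arn):
    """Return keyword arguments for retrieve_and_generate (knowledge base mode)."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


# Placeholder identifiers (assumptions, not real resources).
request = build_kb_request(
    "What is our refund policy?",
    knowledge_base_id="KB12345678",
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/example-model-id",
)
```

Assuming a client created with `boto3.client("bedrock-agent-runtime")`, the dictionary could be passed as `client.retrieve_and_generate(**request)`.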

See also: AWS API Documentation

Request Syntax

client.retrieve_and_generate(
    input={
        'text': 'string'
    },
    retrieveAndGenerateConfiguration={
        'externalSourcesConfiguration': {
            'generationConfiguration': {
                'additionalModelRequestFields': {
                    'string': {...}|[...]|123|123.4|'string'|True|None
                },
                'guardrailConfiguration': {
                    'guardrailId': 'string',
                    'guardrailVersion': 'string'
                },
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'maxTokens': 123,
                        'stopSequences': [
                            'string',
                        ],
                        'temperature': ...,
                        'topP': ...
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': 'string'
                }
            },
            'modelArn': 'string',
            'sources': [
                {
                    'byteContent': {
                        'contentType': 'string',
                        'data': b'bytes',
                        'identifier': 'string'
                    },
                    's3Location': {
                        'uri': 'string'
                    },
                    'sourceType': 'S3'|'BYTE_CONTENT'
                },
            ]
        },
        'knowledgeBaseConfiguration': {
            'generationConfiguration': {
                'additionalModelRequestFields': {
                    'string': {...}|[...]|123|123.4|'string'|True|None
                },
                'guardrailConfiguration': {
                    'guardrailId': 'string',
                    'guardrailVersion': 'string'
                },
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'maxTokens': 123,
                        'stopSequences': [
                            'string',
                        ],
                        'temperature': ...,
                        'topP': ...
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': 'string'
                }
            },
            'knowledgeBaseId': 'string',
            'modelArn': 'string',
            'orchestrationConfiguration': {
                'additionalModelRequestFields': {
                    'string': {...}|[...]|123|123.4|'string'|True|None
                },
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'maxTokens': 123,
                        'stopSequences': [
                            'string',
                        ],
                        'temperature': ...,
                        'topP': ...
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': 'string'
                },
                'queryTransformationConfiguration': {
                    'type': 'QUERY_DECOMPOSITION'
                }
            },
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'filter': {
                        'andAll': [
                            {'... recursive ...'},
                        ],
                        'equals': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'greaterThan': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'greaterThanOrEquals': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'in': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'lessThan': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'lessThanOrEquals': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'listContains': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'notEquals': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'notIn': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'orAll': [
                            {'... recursive ...'},
                        ],
                        'startsWith': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        },
                        'stringContains': {
                            'key': 'string',
                            'value': {...}|[...]|123|123.4|'string'|True|None
                        }
                    },
                    'numberOfResults': 123,
                    'overrideSearchType': 'HYBRID'|'SEMANTIC'
                }
            }
        },
        'type': 'KNOWLEDGE_BASE'|'EXTERNAL_SOURCES'
    },
    sessionConfiguration={
        'kmsKeyArn': 'string'
    },
    sessionId='string'
)
type input:

dict

param input:

[REQUIRED]

Contains the query to be made to the knowledge base.

  • text (string) -- [REQUIRED]

    The query made to the knowledge base.

type retrieveAndGenerateConfiguration:

dict

param retrieveAndGenerateConfiguration:

Contains configurations for the knowledge base query and retrieval process. For more information, see Query configurations.

  • externalSourcesConfiguration (dict) --

    The configuration for the external source wrapper object in the retrieveAndGenerate function.

    • generationConfiguration (dict) --

      The prompt used with the external source wrapper object with the retrieveAndGenerate function.

      • additionalModelRequestFields (dict) --

        Additional model parameters and their corresponding values not included in the textInferenceConfig structure for an external source. Takes in custom model parameters specific to the language model being used.

        • (string) --

          • (:ref:`document<document>`) --

      • guardrailConfiguration (dict) --

        The configuration details for the guardrail.

        • guardrailId (string) -- [REQUIRED]

          The unique identifier for the guardrail.

        • guardrailVersion (string) -- [REQUIRED]

          The version of the guardrail.

      • inferenceConfig (dict) --

        Configuration settings for inference when using RetrieveAndGenerate to generate responses while using an external source.

        • textInferenceConfig (dict) --

          Configuration settings specific to text generation while generating responses using RetrieveAndGenerate.

          • maxTokens (integer) --

            The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

          • stopSequences (list) --

            A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

            • (string) --

          • temperature (float) --

            Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.

          • topP (float) --

            A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.

      • promptTemplate (dict) --

        Contains the textPromptTemplate string for the external source wrapper object.

        • textPromptTemplate (string) --

          The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which are replaced before the prompt is sent to the model to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.

    • modelArn (string) -- [REQUIRED]

      The model Amazon Resource Name (ARN) for the external source wrapper object in the retrieveAndGenerate function.

    • sources (list) -- [REQUIRED]

      The document for the external source wrapper object in the retrieveAndGenerate function.

      • (dict) --

        The unique external source of the content contained in the wrapper object.

        • byteContent (dict) --

          The identifier, contentType, and data of the external source wrapper object.

          • contentType (string) -- [REQUIRED]

            The MIME type of the document contained in the wrapper object.

          • data (bytes) -- [REQUIRED]

            The byte value of the file to upload, encoded as a Base-64 string.

          • identifier (string) -- [REQUIRED]

            The file name of the document contained in the wrapper object.

        • s3Location (dict) --

          The S3 location of the external source wrapper object.

          • uri (string) -- [REQUIRED]

            The file location of the S3 wrapper object.

        • sourceType (string) -- [REQUIRED]

          The source type of the external source wrapper object.
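The external source fields above can be sketched as a few small builders. The helper names, the bucket URI, and the model ARN are hypothetical. The sketch passes raw `bytes` for `data`, following the `b'bytes'` shape shown in the request syntax.

```python
def s3_source(uri):
    """Describe a document stored in S3."""
    return {"sourceType": "S3", "s3Location": {"uri": uri}}


def byte_source(identifier, content_type, data):
    """Describe a document supplied inline as bytes."""
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("data must be bytes")
    return {
        "sourceType": "BYTE_CONTENT",
        "byteContent": {
            "identifier": identifier,    # file name of the document
            "contentType": content_type,  # MIME type of the document
            "data": bytes(data),
        },
    }


def build_external_sources_config(model_arn, sources):
    """Wrap the sources in a retrieveAndGenerateConfiguration dict."""
    sources = list(sources)
    if not sources:
        raise ValueError("at least one source is required")
    return {
        "type": "EXTERNAL_SOURCES",
        "externalSourcesConfiguration": {
            "modelArn": model_arn,
            "sources": sources,
        },
    }


# Placeholder values (assumptions, not real resources).
external_config = build_external_sources_config(
    "arn:aws:bedrock:us-east-1::foundation-model/example-model-id",
    [
        s3_source("s3://example-bucket/handbook.pdf"),
        byte_source("notes.txt", "text/plain", b"Refunds take 5 days."),
    ],
)
```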

  • knowledgeBaseConfiguration (dict) --

    Contains details about the knowledge base for retrieving information and generating responses.

    • generationConfiguration (dict) --

      Contains configurations for response generation based on the knowledge base query results.

      • additionalModelRequestFields (dict) --

        Additional model parameters and corresponding values not included in the textInferenceConfig structure for a knowledge base. This allows users to provide custom model parameters specific to the language model being used.

        • (string) --

          • (:ref:`document<document>`) --

      • guardrailConfiguration (dict) --

        The configuration details for the guardrail.

        • guardrailId (string) -- [REQUIRED]

          The unique identifier for the guardrail.

        • guardrailVersion (string) -- [REQUIRED]

          The version of the guardrail.

      • inferenceConfig (dict) --

        Configuration settings for inference when using RetrieveAndGenerate to generate responses while using a knowledge base as a source.

        • textInferenceConfig (dict) --

          Configuration settings specific to text generation while generating responses using RetrieveAndGenerate.

          • maxTokens (integer) --

            The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

          • stopSequences (list) --

            A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

            • (string) --

          • temperature (float) --

            Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.

          • topP (float) --

            A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.

      • promptTemplate (dict) --

        Contains the template for the prompt that's sent to the model for response generation.

        • textPromptTemplate (string) --

          The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which are replaced before the prompt is sent to the model to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.

    • knowledgeBaseId (string) -- [REQUIRED]

      The unique identifier of the knowledge base that is queried.

    • modelArn (string) -- [REQUIRED]

      The ARN of the foundation model or inference profile used to generate a response.

    • orchestrationConfiguration (dict) --

      Settings for how the model processes the prompt prior to retrieval and generation.

      • additionalModelRequestFields (dict) --

        Additional model parameters and corresponding values not included in the textInferenceConfig structure for a knowledge base. This allows users to provide custom model parameters specific to the language model being used.

        • (string) --

          • (:ref:`document<document>`) --

      • inferenceConfig (dict) --

        Configuration settings for inference when using RetrieveAndGenerate to generate responses while using a knowledge base as a source.

        • textInferenceConfig (dict) --

          Configuration settings specific to text generation while generating responses using RetrieveAndGenerate.

          • maxTokens (integer) --

            The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

          • stopSequences (list) --

            A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.

            • (string) --

          • temperature (float) --

            Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.

          • topP (float) --

            A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.

      • promptTemplate (dict) --

        Contains the template for the prompt that's sent to the model for response generation.

        • textPromptTemplate (string) --

          The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which are replaced before the prompt is sent to the model to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.

      • queryTransformationConfiguration (dict) --

        To split up the prompt and retrieve multiple sources, set the transformation type to QUERY_DECOMPOSITION.

        • type (string) -- [REQUIRED]

          The type of transformation to apply to the prompt.
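The orchestrationConfiguration fields described above, which are the ones added in this update, might be assembled as in the following sketch. The function name and all argument values are illustrative assumptions. The template string is deliberately plain and omits any placeholders the service may expect, and `top_k` stands in for whatever model-specific parameter your model accepts.

```python
def build_orchestration_config(
    prompt_template=None,
    max_tokens=None,
    temperature=None,
    top_p=None,
    stop_sequences=None,
    additional_fields=None,
    decompose_query=False,
):
    """Build an orchestrationConfiguration dict, omitting unset fields."""
    text_config = {}
    if max_tokens is not None:
        text_config["maxTokens"] = max_tokens
    if temperature is not None:
        text_config["temperature"] = temperature
    if top_p is not None:
        text_config["topP"] = top_p
    if stop_sequences:
        text_config["stopSequences"] = list(stop_sequences)

    config = {}
    if text_config:
        config["inferenceConfig"] = {"textInferenceConfig": text_config}
    if prompt_template is not None:
        config["promptTemplate"] = {"textPromptTemplate": prompt_template}
    if additional_fields:
        config["additionalModelRequestFields"] = dict(additional_fields)
    if decompose_query:
        config["queryTransformationConfiguration"] = {"type": "QUERY_DECOMPOSITION"}
    return config


# Illustrative values only (assumptions).
orchestration = build_orchestration_config(
    prompt_template="Rewrite the user question as a concise search query.",
    max_tokens=512,
    temperature=0.2,
    additional_fields={"top_k": 50},  # hypothetical model-specific parameter
    decompose_query=True,
)
```

The resulting dictionary would sit under `knowledgeBaseConfiguration` with the key `orchestrationConfiguration`.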

    • retrievalConfiguration (dict) --

      Contains configurations for how to retrieve and return the knowledge base query.

      • vectorSearchConfiguration (dict) -- [REQUIRED]

        Contains details about how the results from the vector search should be returned. For more information, see Query configurations.

        • filter (dict) --

          Specifies the filters to use on the metadata in the knowledge base data sources before returning results. For more information, see Query configurations.

          • andAll (list) --

            Knowledge base data sources are returned if their metadata attributes fulfill all the filter conditions inside this list.

            • (dict) --

              Specifies the filters to use on the metadata attributes in the knowledge base data sources before returning results. For more information, see Query configurations. See the examples below to see how to use these filters.

          • equals (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value matches the value in this object.

            The following example would return data sources with an animal attribute whose value is cat:

            "equals": { "key": "animal", "value": "cat" }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • greaterThan (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than the value in this object.

            The following example would return data sources with a year attribute whose value is greater than 1989:

            "greaterThan": { "key": "year", "value": 1989 }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • greaterThanOrEquals (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than or equal to the value in this object.

            The following example would return data sources with a year attribute whose value is greater than or equal to 1989:

            "greaterThanOrEquals": { "key": "year", "value": 1989 }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • in (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is in the list specified in the value in this object.

            The following example would return data sources with an animal attribute that is either cat or dog:

            "in": { "key": "animal", "value": ["cat", "dog"] }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • lessThan (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than the value in this object.

            The following example would return data sources with a year attribute whose value is less than 1989.

            "lessThan": { "key": "year", "value": 1989 }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • lessThanOrEquals (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than or equal to the value in this object.

            The following example would return data sources with a year attribute whose value is less than or equal to 1989.

            "lessThanOrEquals": { "key": "year", "value": 1989 }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • listContains (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is a list that contains the value as one of its members.

            The following example would return data sources with an animals attribute that is a list containing a cat member (for example ["dog", "cat"]).

            "listContains": { "key": "animals", "value": "cat" }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • notEquals (dict) --

            Knowledge base data sources that contain a metadata attribute whose name matches the key and whose value doesn't match the value in this object are returned.

            The following example would return data sources that don't contain an animal attribute whose value is cat.

            "notEquals": { "key": "animal", "value": "cat" }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • notIn (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value isn't in the list specified in the value in this object.

            The following example would return data sources whose animal attribute is neither cat nor dog.

            "notIn": { "key": "animal", "value": ["cat", "dog"] }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • orAll (list) --

            Knowledge base data sources are returned if their metadata attributes fulfill at least one of the filter conditions inside this list.

            • (dict) --

              Specifies the filters to use on the metadata attributes in the knowledge base data sources before returning results. For more information, see Query configurations. See the examples below to see how to use these filters.

          • startsWith (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value starts with the value in this object. This filter is currently only supported for Amazon OpenSearch Serverless vector stores.

            The following example would return data sources whose animal attribute starts with ca (for example, cat or camel).

            "startsWith": { "key": "animal", "value": "ca" }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.

          • stringContains (dict) --

            Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is one of the following:

            • A string that contains the value as a substring. The following example would return data sources with an animal attribute that contains the substring at (for example cat). "stringContains": { "key": "animal", "value": "at" }

            • A list with a member that contains the value as a substring. The following example would return data sources with an animals attribute that is a list containing a member that contains the substring at (for example ["dog", "cat"]). "stringContains": { "key": "animals", "value": "at" }

            • key (string) -- [REQUIRED]

              The name that the metadata attribute must match.

            • value (:ref:`document<document>`) -- [REQUIRED]

              The value to which to compare the value of the metadata attribute.
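To make the filter semantics above concrete, here is a small local evaluator that applies a filter dictionary to one metadata dictionary. It is only an illustration of the descriptions in this section, not how Amazon Bedrock evaluates filters. It assumes each filter dictionary carries exactly one operator, and it leaves out `notEquals` and `notIn`. The function name `matches` and the sample metadata are hypothetical.

```python
import operator

# Comparison operators that map directly onto Python comparisons.
_COMPARISONS = {
    "equals": operator.eq,
    "greaterThan": operator.gt,
    "greaterThanOrEquals": operator.ge,
    "lessThan": operator.lt,
    "lessThanOrEquals": operator.le,
}


def matches(filter_, metadata):
    """Return True if the metadata satisfies the filter (local illustration)."""
    [(op, arg)] = filter_.items()  # exactly one operator per filter dict
    if op == "andAll":
        return all(matches(f, metadata) for f in arg)
    if op == "orAll":
        return any(matches(f, metadata) for f in arg)

    key, value = arg["key"], arg["value"]
    if key not in metadata:
        return False
    actual = metadata[key]

    if op in _COMPARISONS:
        return _COMPARISONS[op](actual, value)
    if op == "in":
        return actual in value
    if op == "listContains":
        return isinstance(actual, list) and value in actual
    if op == "startsWith":
        return isinstance(actual, str) and actual.startswith(value)
    if op == "stringContains":
        if isinstance(actual, str):
            return value in actual
        if isinstance(actual, list):
            return any(isinstance(m, str) and value in m for m in actual)
        return False
    raise ValueError(f"unsupported operator: {op}")


# Sample metadata and a combined filter (assumptions for illustration).
doc_metadata = {"animal": "cat", "year": 1995, "animals": ["dog", "cat"]}
combined = {
    "andAll": [
        {"equals": {"key": "animal", "value": "cat"}},
        {"greaterThan": {"key": "year", "value": 1989}},
    ]
}
is_match = matches(combined, doc_metadata)
```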

        • numberOfResults (integer) --

          The number of source chunks to retrieve.

        • overrideSearchType (string) --

          By default, Amazon Bedrock decides a search strategy for you. If you're using an Amazon OpenSearch Serverless vector store that contains a filterable text field, you can specify whether to query the knowledge base with a HYBRID search using both vector embeddings and raw text, or SEMANTIC search using only vector embeddings. For other vector store configurations, only SEMANTIC search is available. For more information, see Test a knowledge base.
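A retrievalConfiguration combining the three vector search settings above could be built as sketched here. The helper name and the sample filter values are assumptions. The check on the search type only mirrors the two values listed in the request syntax.

```python
VALID_SEARCH_TYPES = {"HYBRID", "SEMANTIC"}


def build_retrieval_config(number_of_results=None, search_type=None, metadata_filter=None):
    """Build a retrievalConfiguration dict, omitting unset fields."""
    vector_search = {}
    if number_of_results is not None:
        vector_search["numberOfResults"] = number_of_results
    if search_type is not None:
        if search_type not in VALID_SEARCH_TYPES:
            raise ValueError(f"search_type must be one of {sorted(VALID_SEARCH_TYPES)}")
        vector_search["overrideSearchType"] = search_type
    if metadata_filter is not None:
        vector_search["filter"] = metadata_filter
    return {"vectorSearchConfiguration": vector_search}


# Illustrative values only (assumptions).
retrieval = build_retrieval_config(
    number_of_results=5,
    search_type="HYBRID",
    metadata_filter={"equals": {"key": "animal", "value": "cat"}},
)
```

Leaving `search_type` unset keeps the default behaviour, in which Amazon Bedrock chooses the search strategy.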

  • type (string) -- [REQUIRED]

    The type of resource that contains your data for retrieving information and generating responses.

    If you choose to use EXTERNAL_SOURCES, then currently only Claude 3 Sonnet models for knowledge bases are supported.

type sessionConfiguration:

dict

param sessionConfiguration:

Contains details about the session with the knowledge base.

  • kmsKeyArn (string) -- [REQUIRED]

    The ARN of the KMS key encrypting the session.

type sessionId:

string

param sessionId:

The unique identifier of the session. When you first make a RetrieveAndGenerate request, Amazon Bedrock automatically generates this value. You must reuse this value for all subsequent requests in the same conversational session. This value allows Amazon Bedrock to maintain context and knowledge from previous interactions. You can't explicitly set the sessionId yourself.
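The session handling described above, where the first request omits `sessionId` and later requests reuse the value the service returned, can be sketched with a small helper. The function name and the fake response below are assumptions for illustration; no service call is made.

```python
def with_session(request, previous_response=None):
    """Return a copy of the request that reuses the prior sessionId, if any."""
    follow_up = dict(request)
    session_id = (previous_response or {}).get("sessionId")
    if session_id:
        follow_up["sessionId"] = session_id
    return follow_up


# First turn: no sessionId is sent.
first_request = with_session({"input": {"text": "What is our refund policy?"}})

# Stand-in for a service response (made-up sessionId).
fake_response = {"sessionId": "example-session-id", "output": {"text": "..."}}

# Second turn: reuse the sessionId from the previous response.
second_request = with_session(
    {"input": {"text": "Does it apply to sale items?"}}, fake_response
)
```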

rtype:

dict

returns:

Response Syntax

{
    'citations': [
        {
            'generatedResponsePart': {
                'textResponsePart': {
                    'span': {
                        'end': 123,
                        'start': 123
                    },
                    'text': 'string'
                }
            },
            'retrievedReferences': [
                {
                    'content': {
                        'text': 'string'
                    },
                    'location': {
                        'confluenceLocation': {
                            'url': 'string'
                        },
                        's3Location': {
                            'uri': 'string'
                        },
                        'salesforceLocation': {
                            'url': 'string'
                        },
                        'sharePointLocation': {
                            'url': 'string'
                        },
                        'type': 'S3'|'WEB'|'CONFLUENCE'|'SALESFORCE'|'SHAREPOINT',
                        'webLocation': {
                            'url': 'string'
                        }
                    },
                    'metadata': {
                        'string': {...}|[...]|123|123.4|'string'|True|None
                    }
                },
            ]
        },
    ],
    'guardrailAction': 'INTERVENED'|'NONE',
    'output': {
        'text': 'string'
    },
    'sessionId': 'string'
}

Response Structure

  • (dict) --

    • citations (list) --

      A list of segments of the generated response that are based on sources in the knowledge base, alongside information about the sources.

      • (dict) --

        An object containing a segment of the generated response that is based on a source in the knowledge base, alongside information about the source.

        • generatedResponsePart (dict) --

          Contains the generated response and metadata.

          • textResponsePart (dict) --

            Contains metadata about a textual part of the generated response that is accompanied by a citation.

            • span (dict) --

              Contains information about where the text with a citation begins and ends in the generated output.

              • end (integer) --

                Where the text with a citation ends in the generated output.

              • start (integer) --

                Where the text with a citation starts in the generated output.

            • text (string) --

              The part of the generated text that contains a citation.

        • retrievedReferences (list) --

          Contains metadata about the sources cited for the generated response.

          • (dict) --

            Contains metadata about a source cited for the generated response.

            • content (dict) --

              Contains the cited text from the data source.

              • text (string) --

                The cited text from the data source.

            • location (dict) --

              Contains information about the location of the data source.

              • confluenceLocation (dict) --

                The Confluence data source location.

                • url (string) --

                  The Confluence host URL for the data source location.

              • s3Location (dict) --

                The S3 data source location.

                • uri (string) --

                  The S3 URI for the data source location.

              • salesforceLocation (dict) --

                The Salesforce data source location.

                • url (string) --

                  The Salesforce host URL for the data source location.

              • sharePointLocation (dict) --

                The SharePoint data source location.

                • url (string) --

                  The SharePoint site URL for the data source location.

              • type (string) --

                The type of data source location.

              • webLocation (dict) --

                The web URL/URLs data source location.

                • url (string) --

                  The web URL/URLs for the data source location.

            • metadata (dict) --

              Contains metadata attributes and their values for the file in the data source. For more information, see Metadata and filtering.

              • (string) --

                • (:ref:`document<document>`) --

    • guardrailAction (string) --

      Specifies if there is a guardrail intervention in the response.

    • output (dict) --

      Contains the response generated from querying the knowledge base.

      • text (string) --

        The response generated from querying the knowledge base.

    • sessionId (string) --

      The unique identifier of the session. When you first make a RetrieveAndGenerate request, Amazon Bedrock automatically generates this value. You must reuse this value for all subsequent requests in the same conversational session. This value allows Amazon Bedrock to maintain context and knowledge from previous interactions. You can't explicitly set the sessionId yourself.
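As a sketch of how the response structure above might be consumed, the helper below flattens each citation into its cited text, span offsets, and source locations. The function names and the sample response are hypothetical. The mapping from location `type` to its field follows the response syntax shown earlier.

```python
# Which nested field and attribute hold the address for each location type.
_LOCATION_FIELDS = {
    "S3": ("s3Location", "uri"),
    "WEB": ("webLocation", "url"),
    "CONFLUENCE": ("confluenceLocation", "url"),
    "SALESFORCE": ("salesforceLocation", "url"),
    "SHAREPOINT": ("sharePointLocation", "url"),
}


def location_to_string(location):
    """Return the URI or URL for a location dict, or None if unknown."""
    field, attr = _LOCATION_FIELDS.get(location.get("type"), (None, None))
    if field is None:
        return None
    return location.get(field, {}).get(attr)


def summarize_citations(response):
    """Flatten the citations list into simple dicts."""
    summary = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        sources = [
            location_to_string(ref.get("location", {}))
            for ref in citation.get("retrievedReferences", [])
        ]
        summary.append(
            {
                "text": part.get("text"),
                "start": span.get("start"),
                "end": span.get("end"),
                "sources": sources,
            }
        )
    return summary


# Hand-written sample response (an assumption, not real service output).
sample_response = {
    "output": {"text": "Refunds take 5 days."},
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "text": "Refunds take 5 days.",
                    "span": {"start": 0, "end": 19},
                }
            },
            "retrievedReferences": [
                {
                    "content": {"text": "Refunds are processed within 5 days."},
                    "location": {
                        "type": "S3",
                        "s3Location": {"uri": "s3://example-bucket/policy.pdf"},
                    },
                }
            ],
        }
    ],
}
citation_summary = summarize_citations(sample_response)
```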