2023/11/28 - Agents for Amazon Bedrock Runtime - 3 new API methods
Changes This release adds support for customization types, model life cycle status and minor versions/aliases for model identifiers.
RetrieveAndGenerate API
See also: AWS API Documentation
Request Syntax
client.retrieve_and_generate(
sessionId='string',
input={
'text': 'string'
},
retrieveAndGenerateConfiguration={
'type': 'KNOWLEDGE_BASE',
'knowledgeBaseConfiguration': {
'knowledgeBaseId': 'string',
'modelArn': 'string'
}
},
sessionConfiguration={
'kmsKeyArn': 'string'
}
)
sessionId (string) --
Identifier of the session.
input (dict) -- [REQUIRED]
Customer input of the turn
text (string) -- [REQUIRED]
Customer input of the turn in text
retrieveAndGenerateConfiguration (dict) --
Configures the retrieval and generation for the session.
type (string) -- [REQUIRED]
The type of RetrieveAndGenerate.
knowledgeBaseConfiguration (dict) --
Configurations for retrieval and generation for knowledge base.
knowledgeBaseId (string) -- [REQUIRED]
Identifier of the KnowledgeBase
modelArn (string) -- [REQUIRED]
Arn of a Bedrock model.
sessionConfiguration (dict) --
Configures common parameters of the session.
kmsKeyArn (string) -- [REQUIRED]
The KMS key arn to encrypt the customer data of the session.
Return type: dict
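The request shape above can be assembled with a small helper. This is a minimal sketch, not part of the SDK: the function name `build_retrieve_and_generate_request` and its argument names are hypothetical, and it only builds the keyword arguments, leaving out the optional `sessionId` and `sessionConfiguration` keys when no value is supplied.

```python
from typing import Any, Dict, Optional


def build_retrieve_and_generate_request(
    text: str,
    knowledge_base_id: str,
    model_arn: str,
    session_id: Optional[str] = None,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments matching the Request Syntax above.

    Hypothetical helper: it performs no AWS call and no validation.
    """
    request: Dict[str, Any] = {
        # 'input' and its 'text' member are marked [REQUIRED].
        "input": {"text": text},
        "retrieveAndGenerateConfiguration": {
            # 'KNOWLEDGE_BASE' is the only type shown in the syntax above.
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }
    # Optional top-level parameters are added only when provided.
    if session_id is not None:
        request["sessionId"] = session_id
    if kms_key_arn is not None:
        request["sessionConfiguration"] = {"kmsKeyArn": kms_key_arn}
    return request
```

With a client created as `boto3.client("bedrock-agent-runtime")`, the result would be passed as `client.retrieve_and_generate(**request)`. Any knowledge base identifier or model ARN used with it is a placeholder you supply yourself.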
Response Syntax
{
'sessionId': 'string',
'output': {
'text': 'string'
},
'citations': [
{
'generatedResponsePart': {
'textResponsePart': {
'text': 'string',
'span': {
'start': 123,
'end': 123
}
}
},
'retrievedReferences': [
{
'content': {
'text': 'string'
},
'location': {
'type': 'S3',
's3Location': {
'uri': 'string'
}
}
},
]
},
]
}
Response Structure
(dict) --
sessionId (string) --
Identifier of the session.
output (dict) --
Service response of the turn
text (string) --
Service response of the turn in text
citations (list) --
List of citations
(dict) --
Citation associated with the agent response
generatedResponsePart (dict) --
Generate response part
textResponsePart (dict) --
Text response part
text (string) --
Response part in text
span (dict) --
Span of text
start (integer) --
Start of span
end (integer) --
End of span
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
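The nested citation structure above is easier to work with once flattened. The following is a sketch under the assumption that the response is the plain dict shown in the Response Syntax. `flatten_citations` is a hypothetical name, and missing keys are tolerated rather than treated as errors.

```python
from typing import Any, Dict, List


def flatten_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn each citation into one flat row: cited text, span, and S3 URIs."""
    rows: List[Dict[str, Any]] = []
    for citation in response.get("citations", []):
        part = citation.get("generatedResponsePart", {}).get("textResponsePart", {})
        span = part.get("span", {})
        # Collect the S3 URI of every retrieved reference that has one.
        uris = []
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri is not None:
                uris.append(uri)
        rows.append(
            {
                "text": part.get("text"),
                "start": span.get("start"),
                "end": span.get("end"),
                "uris": uris,
            }
        )
    return rows
```

Each row pairs one generated text part with the sources that back it, which is usually the form needed to render footnotes next to `output['text']`.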
Retrieve from knowledge base.
See also: AWS API Documentation
Request Syntax
client.retrieve(
knowledgeBaseId='string',
retrievalQuery={
'text': 'string'
},
retrievalConfiguration={
'vectorSearchConfiguration': {
'numberOfResults': 123
}
},
nextToken='string'
)
knowledgeBaseId (string) -- [REQUIRED]
Identifier of the KnowledgeBase
retrievalQuery (dict) -- [REQUIRED]
Knowledge base input query.
text (string) -- [REQUIRED]
Knowledge base input query in text
retrievalConfiguration (dict) --
Search parameters for retrieving from knowledge base.
vectorSearchConfiguration (dict) -- [REQUIRED]
Knowledge base vector search configuration
numberOfResults (integer) -- [REQUIRED]
Top-K results to retrieve from knowledge base.
nextToken (string) --
Opaque continuation token of previous paginated response.
Return type: dict
Response Syntax
{
'retrievalResults': [
{
'content': {
'text': 'string'
},
'location': {
'type': 'S3',
's3Location': {
'uri': 'string'
}
},
'score': 123.0
},
],
'nextToken': 'string'
}
Response Structure
(dict) --
retrievalResults (list) --
List of knowledge base retrieval results
(dict) --
Result item returned from a knowledge base retrieval.
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
score (float) --
The relevance score of a result.
nextToken (string) --
Opaque continuation token of previous paginated response.
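Because `nextToken` is a continuation token, results may span several pages. The sketch below shows one way to walk them, assuming the token from each response is passed back unchanged on the next call. `iter_retrieval_results` is a hypothetical helper, and `client` is any object exposing a `retrieve(**kwargs)` method with the signature above.

```python
from typing import Any, Dict, Iterator


def iter_retrieval_results(
    client: Any,
    knowledge_base_id: str,
    query: str,
    number_of_results: int = 5,
) -> Iterator[Dict[str, Any]]:
    """Yield every retrieval result, following nextToken until it is absent."""
    kwargs: Dict[str, Any] = {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }
    while True:
        page = client.retrieve(**kwargs)
        yield from page.get("retrievalResults", [])
        token = page.get("nextToken")
        if not token:
            # No continuation token means this was the last page.
            break
        kwargs["nextToken"] = token
```

Since this is a generator, a caller can stop early (for example, once `score` drops below a threshold of its own choosing) without requesting further pages.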
Invokes the specified Bedrock agent to run inference using the input provided in the request body.
See also: AWS API Documentation
Request Syntax
client.invoke_agent(
sessionState={
'sessionAttributes': {
'string': 'string'
},
'promptSessionAttributes': {
'string': 'string'
}
},
agentId='string',
agentAliasId='string',
sessionId='string',
endSession=True|False,
enableTrace=True|False,
inputText='string'
)
sessionState (dict) --
Session state passed by the customer. Base64-encoded JSON string representation of SessionState.
sessionAttributes (dict) --
Session Attributes
(string) --
(string) --
promptSessionAttributes (dict) --
Prompt Session Attributes
(string) --
(string) --
agentId (string) -- [REQUIRED]
Identifier for Agent
agentAliasId (string) -- [REQUIRED]
Identifier for Agent Alias
sessionId (string) -- [REQUIRED]
Identifier used for the current session
endSession (boolean) --
End current session
enableTrace (boolean) --
Enable agent trace events for improved debugging
inputText (string) -- [REQUIRED]
Input data in the format specified in the Content-Type request header.
Return type: dict
The response of this operation contains an :class:`.EventStream` member. When iterated the :class:`.EventStream` will yield events based on the structure below, where only one of the top level keys will be present for any given event.
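Iterating the event stream might look like the sketch below. It assumes each event is a dict with a single top-level key, that `chunk` payloads are UTF-8 text (an assumption, since the encoding is not stated here), and that any key other than `chunk` or `trace` denotes an error event. `read_completion` is a hypothetical name. How the SDK surfaces exception events at runtime is not specified in this reference, so it may raise them itself before this code ever sees them.

```python
from typing import Any, Dict, Iterable, List, Tuple


def read_completion(
    events: Iterable[Dict[str, Any]], encoding: str = "utf-8"
) -> Tuple[str, List[Dict[str, Any]]]:
    """Join chunk bytes into text and collect trace events in arrival order."""
    parts: List[bytes] = []
    traces: List[Dict[str, Any]] = []
    for event in events:
        if "chunk" in event:
            parts.append(event["chunk"].get("bytes", b""))
        elif "trace" in event:
            traces.append(event["trace"])
        else:
            # Assumption: anything else is one of the *Exception members.
            name, detail = next(iter(event.items()), ("unknown", {}))
            raise RuntimeError(f"{name}: {detail.get('message', '')}")
    # Decode once at the end so a multi-byte character split across
    # two chunks is still decoded correctly.
    return b"".join(parts).decode(encoding), traces
```

A typical call would be `text, traces = read_completion(response["completion"])`, where `response` is the dict returned by `client.invoke_agent(...)`. Trace events only appear when `enableTrace` is set.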
Response Syntax
{
'completion': EventStream({
'chunk': {
'bytes': b'bytes',
'attribution': {
'citations': [
{
'generatedResponsePart': {
'textResponsePart': {
'text': 'string',
'span': {
'start': 123,
'end': 123
}
}
},
'retrievedReferences': [
{
'content': {
'text': 'string'
},
'location': {
'type': 'S3',
's3Location': {
'uri': 'string'
}
}
},
]
},
]
}
},
'trace': {
'agentId': 'string',
'agentAliasId': 'string',
'sessionId': 'string',
'trace': {
'preProcessingTrace': {
'modelInvocationInput': {
'traceId': 'string',
'text': 'string',
'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
'inferenceConfiguration': {
'temperature': ...,
'topP': ...,
'topK': 123,
'maximumLength': 123,
'stopSequences': [
'string',
]
},
'overrideLambda': 'string',
'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
'parserMode': 'DEFAULT'|'OVERRIDDEN'
},
'modelInvocationOutput': {
'traceId': 'string',
'parsedResponse': {
'rationale': 'string',
'isValid': True|False
}
}
},
'orchestrationTrace': {
'rationale': {
'traceId': 'string',
'text': 'string'
},
'invocationInput': {
'traceId': 'string',
'invocationType': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH',
'actionGroupInvocationInput': {
'actionGroupName': 'string',
'verb': 'string',
'apiPath': 'string',
'parameters': [
{
'name': 'string',
'type': 'string',
'value': 'string'
},
],
'requestBody': {
'content': {
'string': [
{
'name': 'string',
'type': 'string',
'value': 'string'
},
]
}
}
},
'knowledgeBaseLookupInput': {
'text': 'string',
'knowledgeBaseId': 'string'
}
},
'observation': {
'traceId': 'string',
'type': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'FINISH'|'ASK_USER'|'REPROMPT',
'actionGroupInvocationOutput': {
'text': 'string'
},
'knowledgeBaseLookupOutput': {
'retrievedReferences': [
{
'content': {
'text': 'string'
},
'location': {
'type': 'S3',
's3Location': {
'uri': 'string'
}
}
},
]
},
'finalResponse': {
'text': 'string'
},
'repromptResponse': {
'text': 'string',
'source': 'ACTION_GROUP'|'KNOWLEDGE_BASE'|'PARSER'
}
},
'modelInvocationInput': {
'traceId': 'string',
'text': 'string',
'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
'inferenceConfiguration': {
'temperature': ...,
'topP': ...,
'topK': 123,
'maximumLength': 123,
'stopSequences': [
'string',
]
},
'overrideLambda': 'string',
'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
'parserMode': 'DEFAULT'|'OVERRIDDEN'
}
},
'postProcessingTrace': {
'modelInvocationInput': {
'traceId': 'string',
'text': 'string',
'type': 'PRE_PROCESSING'|'ORCHESTRATION'|'KNOWLEDGE_BASE_RESPONSE_GENERATION'|'POST_PROCESSING',
'inferenceConfiguration': {
'temperature': ...,
'topP': ...,
'topK': 123,
'maximumLength': 123,
'stopSequences': [
'string',
]
},
'overrideLambda': 'string',
'promptCreationMode': 'DEFAULT'|'OVERRIDDEN',
'parserMode': 'DEFAULT'|'OVERRIDDEN'
},
'modelInvocationOutput': {
'traceId': 'string',
'parsedResponse': {
'text': 'string'
}
}
},
'failureTrace': {
'traceId': 'string',
'failureReason': 'string'
}
}
},
'internalServerException': {
'message': 'string'
},
'validationException': {
'message': 'string'
},
'resourceNotFoundException': {
'message': 'string'
},
'serviceQuotaExceededException': {
'message': 'string'
},
'throttlingException': {
'message': 'string'
},
'accessDeniedException': {
'message': 'string'
},
'conflictException': {
'message': 'string'
},
'dependencyFailedException': {
'message': 'string',
'resourceName': 'string'
},
'badGatewayException': {
'message': 'string',
'resourceName': 'string'
}
}),
'contentType': 'string',
'sessionId': 'string'
}
Response Structure
(dict) --
InvokeAgent Response
completion (:class:`.EventStream`) --
Inference response from the model in the format specified in the Content-Type response header.
chunk (dict) --
Base64-encoded byte response
bytes (bytes) --
PartBody of the payload in bytes
attribution (dict) --
Citations associated with final agent response
citations (list) --
List of citations
(dict) --
Citation associated with the agent response
generatedResponsePart (dict) --
Generate response part
textResponsePart (dict) --
Text response part
text (string) --
Response part in text
span (dict) --
Span of text
start (integer) --
Start of span
end (integer) --
End of span
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
trace (dict) --
Trace part that contains the intermediate response for the customer
agentId (string) --
Identifier of the agent.
agentAliasId (string) --
Identifier of the agent alias.
sessionId (string) --
Identifier of the session.
trace (dict) --
Trace that contains the intermediate response for the customer
Note
This is a Tagged Union structure. Only one of the following top level keys will be set: preProcessingTrace, orchestrationTrace, postProcessingTrace, failureTrace. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
preProcessingTrace (dict) --
Trace Part which contains information related to preprocessing step
Note
This is a Tagged Union structure. Only one of the following top level keys will be set: modelInvocationInput, modelInvocationOutput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
modelInvocationOutput (dict) --
Trace Part which contains information related to preprocessing
traceId (string) --
Identifier for trace
parsedResponse (dict) --
Trace Part which contains information if preprocessing was successful
rationale (string) --
Agent Trace Rationale String
isValid (boolean) --
Boolean value
orchestrationTrace (dict) --
Trace that contains the intermediate response during orchestration
Note
This is a Tagged Union structure. Only one of the following top level keys will be set: rationale, invocationInput, observation, modelInvocationInput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
rationale (dict) --
Trace Part which contains information related to reasoning
traceId (string) --
Identifier for trace
text (string) --
Agent Trace Rationale String
invocationInput (dict) --
Trace Part which contains input details for action group or knowledge base
traceId (string) --
Identifier for trace
invocationType (string) --
types of invocations
actionGroupInvocationInput (dict) --
input to lambda used in action group
actionGroupName (string) --
Agent Trace Action Group Name
verb (string) --
Agent Trace Action Group Action verb
apiPath (string) --
Agent Trace Action Group API path
parameters (list) --
list of parameters included in action group invocation
(dict) --
parameters included in action group invocation
name (string) --
Name of parameter
type (string) --
Type of parameter
value (string) --
Value of parameter
requestBody (dict) --
Request Body Content Map
content (dict) --
Content type parameter map
(string) --
(list) --
list of parameters included in action group invocation
(dict) --
parameters included in action group invocation
name (string) --
Name of parameter
type (string) --
Type of parameter
value (string) --
Value of parameter
knowledgeBaseLookupInput (dict) --
Input to the knowledge base lookup
text (string) --
Agent Trace Action Group Lambda Invocation Output String
knowledgeBaseId (string) --
Agent Trace Action Group Knowledge Base Id
observation (dict) --
Trace Part which contains output details for action group or knowledge base or final response
traceId (string) --
Identifier for trace
type (string) --
types of observations
actionGroupInvocationOutput (dict) --
output from lambda used in action group
text (string) --
Agent Trace Action Group Lambda Invocation Output String
knowledgeBaseLookupOutput (dict) --
Output from the knowledge base lookup
retrievedReferences (list) --
list of retrieved references
(dict) --
Retrieved reference
content (dict) --
Content of a retrieval result.
text (string) --
Content of a retrieval result in text
location (dict) --
The source location of a retrieval result.
type (string) --
The location type of a retrieval result.
s3Location (dict) --
The S3 location of a retrieval result.
uri (string) --
URI of S3 location
finalResponse (dict) --
Agent finish output
text (string) --
Agent Trace Action Group Lambda Invocation Output String
repromptResponse (dict) --
Observation information if there were reprompts
text (string) --
Reprompt response text
source (string) --
Parsing error source
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
postProcessingTrace (dict) --
Trace Part which contains information related to post processing step
Note
This is a Tagged Union structure. Only one of the following top level keys will be set: modelInvocationInput, modelInvocationOutput. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
modelInvocationInput (dict) --
Trace Part which contains information used to call Invoke Model
traceId (string) --
Identifier for trace
text (string) --
Prompt Message
type (string) --
types of prompts
inferenceConfiguration (dict) --
Configurations for controlling the inference response of an InvokeAgent API call
temperature (float) --
Controls randomness, higher values increase diversity
topP (float) --
Cumulative probability cutoff for token selection
topK (integer) --
Sample from the k most likely next tokens
maximumLength (integer) --
Maximum length of output
stopSequences (list) --
List of stop sequences
(string) --
overrideLambda (string) --
ARN of a Lambda.
promptCreationMode (string) --
Indicates whether the agent uses the default prompt or an overridden prompt
parserMode (string) --
Indicates whether the agent uses the default parser or an overridden parser
modelInvocationOutput (dict) --
Trace Part which contains information related to postprocessing
traceId (string) --
Identifier for trace
parsedResponse (dict) --
Trace Part which contains the parsed postprocessing response
text (string) --
Agent Trace Output String
failureTrace (dict) --
Trace Part which is emitted when agent trace could not be generated
traceId (string) --
Identifier for trace
failureReason (string) --
Agent Trace Failed Reason String
internalServerException (dict) --
This exception is thrown if there was an unexpected error during processing of request
message (string) --
Non Blank String
validationException (dict) --
This exception is thrown when the request's input validation fails
message (string) --
Non Blank String
resourceNotFoundException (dict) --
This exception is thrown when a resource referenced by the operation does not exist
message (string) --
Non Blank String
serviceQuotaExceededException (dict) --
This exception is thrown when a request is made beyond the service quota
message (string) --
Non Blank String
throttlingException (dict) --
This exception is thrown when the number of requests exceeds the limit
message (string) --
Non Blank String
accessDeniedException (dict) --
This exception is thrown when a request is denied per access permissions
message (string) --
Non Blank String
conflictException (dict) --
This exception is thrown when there is a conflict performing an operation
message (string) --
Non Blank String
dependencyFailedException (dict) --
This exception is thrown when a request fails because of a dependency (such as a Lambda, Bedrock, or STS resource) as the result of a customer fault, for example a bad configuration
message (string) --
Non Blank String
resourceName (string) --
Non Blank String
badGatewayException (dict) --
This exception is thrown when a request fails because of a dependency such as a Lambda, Bedrock, or STS resource
message (string) --
Non Blank String
resourceName (string) --
Non Blank String
contentType (string) --
streaming response mimetype of the model
sessionId (string) --
Identifier of the session.
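The tagged-union notes in the trace structure above imply that a consumer has to check which key is present before reading it. A minimal sketch of that dispatch follows. `describe_trace` and `TRACE_STAGES` are hypothetical names, and the input is assumed to be one `trace` event dict as laid out in the Response Structure.

```python
from typing import Any, Dict, Tuple

# The four members of the inner 'trace' union, as listed in the note above.
TRACE_STAGES = (
    "preProcessingTrace",
    "orchestrationTrace",
    "postProcessingTrace",
    "failureTrace",
)


def describe_trace(trace_event: Dict[str, Any]) -> Tuple[str, str]:
    """Return (stage, detail) for one trace event.

    For failureTrace the detail is the failure reason. For the other stages,
    which are themselves tagged unions, it is the name of the member that is set.
    """
    union = trace_event.get("trace", {})
    for stage in TRACE_STAGES:
        if stage in union:
            body = union[stage]
            if stage == "failureTrace":
                return stage, body.get("failureReason", "")
            # Only one key is expected, so the first key names the member.
            return stage, next(iter(body), "")
    if "SDK_UNKNOWN_MEMBER" in union:
        # The client did not recognise the member; report its name.
        return "SDK_UNKNOWN_MEMBER", union["SDK_UNKNOWN_MEMBER"].get("name", "")
    return "", ""
```

Handling `SDK_UNKNOWN_MEMBER` explicitly keeps the consumer from failing if a later service release adds a trace member that an older client does not recognise.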