2026/06/12 - Amazon Bedrock AgentCore - 6 updated api methods
Changes: Added tagging and customer managed key (CMK) support across optimization, an explanation field in recommendation output, and an insights feature that identifies failure patterns, extracts user intents, and summarizes execution behavior.
{'tags': {'string': 'string'}}
Creates an A/B test for comparing agent configurations. A/B tests split traffic between a control variant and a treatment variant through a gateway, then evaluate performance using online evaluation configurations to determine which variant performs better.
See also: AWS API Documentation
Request Syntax
client.create_ab_test(
name='string',
description='string',
gatewayArn='string',
variants=[
{
'name': 'string',
'weight': 123,
'variantConfiguration': {
'configurationBundle': {
'bundleArn': 'string',
'bundleVersion': 'string'
},
'target': {
'name': 'string'
}
}
},
],
gatewayFilter={
'targetPaths': [
'string',
]
},
evaluationConfig={
'onlineEvaluationConfigArn': 'string',
'perVariantOnlineEvaluationConfig': [
{
'name': 'string',
'onlineEvaluationConfigArn': 'string'
},
]
},
roleArn='string',
enableOnCreate=True|False,
clientToken='string',
tags={
'string': 'string'
}
)
name (string) -- [REQUIRED]
The name of the A/B test. Must be unique within your account.
description (string) --
The description of the A/B test.
gatewayArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the gateway to use for traffic splitting.
variants (list) -- [REQUIRED]
The list of variants for the A/B test. Must contain exactly two variants: a control (C) and a treatment (T1), each with a configuration bundle or target reference and a traffic weight.
(dict) --
A variant in an A/B test, representing either the control (C) or treatment (T1) configuration.
name (string) -- [REQUIRED]
The name of the variant. Must be C for control or T1 for treatment.
weight (integer) -- [REQUIRED]
The percentage of traffic to route to this variant. Weights across all variants must sum to 100.
variantConfiguration (dict) -- [REQUIRED]
The configuration for this variant, including the configuration bundle or target reference.
configurationBundle (dict) --
A reference to a configuration bundle version to use for this variant.
bundleArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the configuration bundle.
bundleVersion (string) -- [REQUIRED]
The version of the configuration bundle.
target (dict) --
A reference to a gateway target to route traffic to for this variant.
name (string) -- [REQUIRED]
The name of the gateway target.
gatewayFilter (dict) --
Optional filter to restrict which gateway target paths are included in the A/B test.
targetPaths (list) --
A list of target path patterns to include in the A/B test.
(string) --
evaluationConfig (dict) -- [REQUIRED]
The evaluation configuration specifying which online evaluation configurations to use for measuring variant performance.
onlineEvaluationConfigArn (string) --
The Amazon Resource Name (ARN) of a single online evaluation configuration to use for both variants.
perVariantOnlineEvaluationConfig (list) --
Per-variant online evaluation configurations, allowing different evaluation settings for each variant.
(dict) --
An online evaluation configuration associated with a specific A/B test variant.
name (string) -- [REQUIRED]
The name of the variant this evaluation configuration applies to.
onlineEvaluationConfigArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the online evaluation configuration for this variant.
roleArn (string) -- [REQUIRED]
The IAM role ARN that grants permissions for the A/B test to access gateway and evaluation resources.
enableOnCreate (boolean) --
Whether to enable the A/B test immediately upon creation. If true, traffic splitting begins automatically.
clientToken (string) --
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, the service ignores the request, but does not return an error.
This field is autopopulated if not provided.
tags (dict) --
A map of tag keys and values to associate with the A/B test.
(string) --
(string) --
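The variant rules described above (exactly two variants named C and T1, weights summing to 100, and a configuration bundle or target reference for each) can be checked on the client side before the request is sent. The following is a minimal sketch, not part of the SDK: the helper name build_create_ab_test_request and its argument names are hypothetical, and it only assembles keyword arguments that match the request syntax shown above.

```python
from typing import Any, Dict, List, Optional


def build_create_ab_test_request(
    name: str,
    gateway_arn: str,
    role_arn: str,
    variants: List[Dict[str, Any]],
    online_evaluation_config_arn: str,
    tags: Optional[Dict[str, str]] = None,
    enable_on_create: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the documented variant rules and
    return keyword arguments shaped like the create_ab_test request."""
    # Exactly two variants are expected: control "C" and treatment "T1".
    if sorted(v["name"] for v in variants) != ["C", "T1"]:
        raise ValueError("variants must be exactly one 'C' and one 'T1'")
    # Weights are percentages and must add up to 100.
    if sum(v["weight"] for v in variants) != 100:
        raise ValueError("variant weights must sum to 100")
    # Each variant needs a configuration bundle or a target reference.
    for variant in variants:
        config = variant["variantConfiguration"]
        if "configurationBundle" not in config and "target" not in config:
            raise ValueError(
                f"variant {variant['name']} needs a configurationBundle or target"
            )

    request: Dict[str, Any] = {
        "name": name,
        "gatewayArn": gateway_arn,
        "roleArn": role_arn,
        "variants": variants,
        # A single online evaluation configuration shared by both variants.
        "evaluationConfig": {
            "onlineEvaluationConfigArn": online_evaluation_config_arn
        },
    }
    # Optional fields are only sent when the caller sets them.
    if enable_on_create is not None:
        request["enableOnCreate"] = enable_on_create
    if tags:
        request["tags"] = tags
    return request
```

The returned dictionary could then be passed as `client.create_ab_test(**request)`, where `client` is a boto3 client for this service. The clientToken field is left out on purpose, since the reference states that it is autopopulated if not provided.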
Return type: dict
Response Syntax
{
'abTestId': 'string',
'abTestArn': 'string',
'name': 'string',
'status': 'CREATING'|'ACTIVE'|'CREATE_FAILED'|'UPDATING'|'UPDATE_FAILED'|'DELETING'|'DELETE_FAILED'|'FAILED',
'executionStatus': 'PAUSED'|'RUNNING'|'STOPPED'|'NOT_STARTED',
'createdAt': datetime(2015, 1, 1)
}
Response Structure
(dict) --
abTestId (string) --
The unique identifier of the created A/B test.
abTestArn (string) --
The Amazon Resource Name (ARN) of the created A/B test.
name (string) --
The name of the A/B test.
status (string) --
The status of the A/B test.
executionStatus (string) --
The execution status indicating whether the A/B test is currently running.
createdAt (datetime) --
The timestamp when the A/B test was created.
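The status and executionStatus fields in the response can be combined into a single coarse state for logging or dashboards. The sketch below is illustrative only: the grouping of status values into "failed" and "in-transition" sets, and the reading of ACTIVE plus RUNNING as "running", are assumptions based on the value names rather than documented semantics, and the function name is hypothetical.

```python
from typing import Any, Dict

# Assumed groupings, inferred from the status value names in the response syntax.
FAILED_STATUSES = frozenset(
    {"CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED", "FAILED"}
)
TRANSITIONAL_STATUSES = frozenset({"CREATING", "UPDATING", "DELETING"})


def summarize_ab_test_state(response: Dict[str, Any]) -> str:
    """Hypothetical helper: map a create_ab_test response to a coarse state."""
    status = response["status"]
    if status in FAILED_STATUSES:
        return "failed"
    if status in TRANSITIONAL_STATUSES:
        return "in-transition"
    # Remaining case is ACTIVE; executionStatus says whether the test is running.
    if response.get("executionStatus") == "RUNNING":
        return "running"
    return "not-running"
```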
{'dataSourceConfig': {'onlineEvaluationConfigSource': {'onlineEvaluationConfigArn': 'string',
'sessionFilterConfig': {'endTime': 'timestamp',
'startTime': 'timestamp'}}},
'executionSummaryResult': {'executionSummaries': [{'affectedSessionCount': 'integer',
'affectedSessions': [{'approachTaken': 'string',
'finalOutcome': 'string',
'sessionId': 'string'}],
'clusterId': 'integer',
'description': 'string',
'name': 'string'}]},
'failureAnalysisResult': {'failures': [{'affectedSessionCount': 'integer',
'clusterId': 'integer',
'description': 'string',
'name': 'string',
'subCategories': [{'affectedSessionCount': 'integer',
'clusterId': 'integer',
'description': 'string',
'name': 'string',
'rootCauses': [{'affectedSessionCount': 'integer',
'affectedSessions': [{'explanation': 'string',
'failureSpans': [{'signals': [{'category': 'execution-error-category-authentication '
'| '
'execution-error-category-resource-not-found '
'| '
'execution-error-category-service-errors '
'| '
'execution-error-category-rate-limiting '
'| '
'execution-error-category-formatting '
'| '
'execution-error-category-timeout '
'| '
'execution-error-category-resource-exhaustion '
'| '
'execution-error-category-environment '
'| '
'execution-error-category-tool-schema '
'| '
'task-instruction-category-non-compliance '
'| '
'task-instruction-category-problem-id '
'| '
'incorrect-actions-category-tool-selection '
'| '
'incorrect-actions-category-poor-information-retrieval '
'| '
'incorrect-actions-category-clarification '
'| '
'incorrect-actions-category-inappropriate-info-request '
'| '
'context-handling-error-category-context-handling-failures '
'| '
'hallucination-category-hall-capabilities '
'| '
'hallucination-category-hall-misunderstand '
'| '
'hallucination-category-hall-usage '
'| '
'hallucination-category-hall-history '
'| '
'hallucination-category-hall-params '
'| '
'hallucination-category-fabricate-tool-outputs '
'| '
'repetitive-behavior-category-repetition-tool '
'| '
'repetitive-behavior-category-repetition-info '
'| '
'repetitive-behavior-category-step-repetition '
'| '
'orchestration-related-errors-category-reasoning-mismatch '
'| '
'orchestration-related-errors-category-goal-deviation '
'| '
'orchestration-related-errors-category-premature-termination '
'| '
'orchestration-related-errors-category-unaware-termination '
'| '
'llm-output-category-nonsensical '
'| '
'configuration-mismatch-category-tool-definition '
'| '
'coding-use-case-specific-failure-types-category-edge-case-oversights '
'| '
'coding-use-case-specific-failure-types-category-dependency-issues',
'confidence': 'double',
'evidence': 'string'}],
'spanId': 'string',
'traceId': 'string'}],
'fixType': 'string',
'recommendation': 'string',
'sessionId': 'string'}],
'clusterId': 'integer',
'name': 'string',
'recommendation': 'string',
'rootCause': 'string'}]}]}]},
'insights': [{'insightId': 'string'}],
'kmsKeyArn': 'string',
'userIntentResult': {'userIntents': [{'affectedSessionCount': 'integer',
'affectedSessions': [{'sessionId': 'string',
'userMessages': ['string']}],
'clusterId': 'integer',
'description': 'string',
'name': 'string'}]}}
Retrieves detailed information about a batch evaluation, including its status, configuration, results, and any error details.
See also: AWS API Documentation
Request Syntax
client.get_batch_evaluation(
batchEvaluationId='string'
)
batchEvaluationId (string) -- [REQUIRED]
The unique identifier of the batch evaluation to retrieve.
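Because a batch evaluation moves through several statuses, callers will often poll this operation until the evaluation stops changing. The following is a minimal polling sketch under stated assumptions: the set of statuses treated as terminal is a guess based on the value names in the response syntax, the function name and its parameters are hypothetical, and the operation is passed in as a callable so the loop does not depend on any particular client object.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these statuses are final. DELETING is not treated as terminal here.
TERMINAL_STATUSES = frozenset(
    {"COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED", "STOPPED"}
)


def wait_for_batch_evaluation(
    get_batch_evaluation: Callable[..., Dict[str, Any]],
    batch_evaluation_id: str,
    poll_seconds: float = 15.0,
    max_attempts: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Hypothetical helper: poll until the evaluation reaches a terminal status."""
    for _ in range(max_attempts):
        response = get_batch_evaluation(batchEvaluationId=batch_evaluation_id)
        if response["status"] in TERMINAL_STATUSES:
            return response
        # Wait before asking again; interval and attempt cap are arbitrary defaults.
        sleep(poll_seconds)
    raise TimeoutError(
        f"batch evaluation {batch_evaluation_id} did not finish "
        f"after {max_attempts} attempts"
    )
```

With a boto3 client, the bound method can be passed directly, for example `wait_for_batch_evaluation(client.get_batch_evaluation, "example-id")`.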
Return type: dict
Response Syntax
{
'batchEvaluationId': 'string',
'batchEvaluationArn': 'string',
'batchEvaluationName': 'string',
'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERRORS'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING',
'createdAt': datetime(2015, 1, 1),
'evaluators': [
{
'evaluatorId': 'string'
},
],
'insights': [
{
'insightId': 'string'
},
],
'dataSourceConfig': {
'cloudWatchLogs': {
'serviceNames': [
'string',
],
'logGroupNames': [
'string',
],
'filterConfig': {
'sessionIds': [
'string',
],
'timeRange': {
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1)
}
}
},
'onlineEvaluationConfigSource': {
'onlineEvaluationConfigArn': 'string',
'sessionFilterConfig': {
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1)
}
}
},
'outputConfig': {
'cloudWatchConfig': {
'logGroupName': 'string',
'logStreamName': 'string'
}
},
'evaluationResults': {
'numberOfSessionsCompleted': 123,
'numberOfSessionsInProgress': 123,
'numberOfSessionsFailed': 123,
'totalNumberOfSessions': 123,
'numberOfSessionsIgnored': 123,
'evaluatorSummaries': [
{
'evaluatorId': 'string',
'statistics': {
'averageScore': 123.0
},
'totalEvaluated': 123,
'totalFailed': 123
},
]
},
'failureAnalysisResult': {
'failures': [
{
'clusterId': 123,
'name': 'string',
'description': 'string',
'affectedSessionCount': 123,
'subCategories': [
{
'clusterId': 123,
'name': 'string',
'description': 'string',
'affectedSessionCount': 123,
'rootCauses': [
{
'clusterId': 123,
'name': 'string',
'rootCause': 'string',
'recommendation': 'string',
'affectedSessionCount': 123,
'affectedSessions': [
{
'sessionId': 'string',
'explanation': 'string',
'fixType': 'string',
'recommendation': 'string',
'failureSpans': [
{
'spanId': 'string',
'traceId': 'string',
'signals': [
{
'category': 'execution-error-category-authentication'|'execution-error-category-resource-not-found'|'execution-error-category-service-errors'|'execution-error-category-rate-limiting'|'execution-error-category-formatting'|'execution-error-category-timeout'|'execution-error-category-resource-exhaustion'|'execution-error-category-environment'|'execution-error-category-tool-schema'|'task-instruction-category-non-compliance'|'task-instruction-category-problem-id'|'incorrect-actions-category-tool-selection'|'incorrect-actions-category-poor-information-retrieval'|'incorrect-actions-category-clarification'|'incorrect-actions-category-inappropriate-info-request'|'context-handling-error-category-context-handling-failures'|'hallucination-category-hall-capabilities'|'hallucination-category-hall-misunderstand'|'hallucination-category-hall-usage'|'hallucination-category-hall-history'|'hallucination-category-hall-params'|'hallucination-category-fabricate-tool-outputs'|'repetitive-behavior-category-repetition-tool'|'repetitive-behavior-category-repetition-info'|'repetitive-behavior-category-step-repetition'|'orchestration-related-errors-category-reasoning-mismatch'|'orchestration-related-errors-category-goal-deviation'|'orchestration-related-errors-category-premature-termination'|'orchestration-related-errors-category-unaware-termination'|'llm-output-category-nonsensical'|'configuration-mismatch-category-tool-definition'|'coding-use-case-specific-failure-types-category-edge-case-oversights'|'coding-use-case-specific-failure-types-category-dependency-issues',
'evidence': 'string',
'confidence': 123.0
},
]
},
]
},
]
},
]
},
]
},
]
},
'userIntentResult': {
'userIntents': [
{
'clusterId': 123,
'name': 'string',
'description': 'string',
'affectedSessionCount': 123,
'affectedSessions': [
{
'sessionId': 'string',
'userMessages': [
'string',
]
},
]
},
]
},
'executionSummaryResult': {
'executionSummaries': [
{
'clusterId': 123,
'name': 'string',
'description': 'string',
'affectedSessionCount': 123,
'affectedSessions': [
{
'sessionId': 'string',
'approachTaken': 'string',
'finalOutcome': 'string'
},
]
},
]
},
'errorDetails': [
'string',
],
'description': 'string',
'updatedAt': datetime(2015, 1, 1),
'kmsKeyArn': 'string'
}
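The session counters and evaluator summaries inside evaluationResults can be reduced to a short progress report. This is a sketch only: it assumes the completed, failed, and ignored counters are disjoint and together count the sessions that are no longer in progress, which the reference does not state explicitly. The function name is hypothetical.

```python
from typing import Any, Dict


def evaluation_progress(evaluation_results: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: summarize the evaluationResults block of a response."""
    total = evaluation_results.get("totalNumberOfSessions", 0)
    # Assumption: these three counters do not overlap.
    finished = (
        evaluation_results.get("numberOfSessionsCompleted", 0)
        + evaluation_results.get("numberOfSessionsFailed", 0)
        + evaluation_results.get("numberOfSessionsIgnored", 0)
    )
    # Average score per evaluator, keyed by evaluator ID.
    average_scores = {
        summary["evaluatorId"]: summary.get("statistics", {}).get("averageScore")
        for summary in evaluation_results.get("evaluatorSummaries", [])
    }
    return {
        "finished": finished,
        "total": total,
        "fraction": finished / total if total else 0.0,
        "averageScores": average_scores,
    }
```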
Response Structure
(dict) --
batchEvaluationId (string) --
The unique identifier of the batch evaluation.
batchEvaluationArn (string) --
The Amazon Resource Name (ARN) of the batch evaluation.
batchEvaluationName (string) --
The name of the batch evaluation.
status (string) --
The current status of the batch evaluation.
createdAt (datetime) --
The timestamp when the batch evaluation was created.
evaluators (list) --
The list of evaluators applied during the batch evaluation.
(dict) --
An evaluator to run against sessions.
evaluatorId (string) --
The unique identifier of the evaluator. Can reference built-in evaluators (e.g., Builtin.Helpfulness) or custom evaluators.
insights (list) --
The list of insight analyses applied during the batch evaluation.
(dict) --
A reference to an insight analysis to run against sessions.
insightId (string) --
The canonical identifier of the insight, following the Builtin.Insight.* naming convention. Used by the BatchEvaluate, InternalEvaluate, and ServiceEngineEvaluate flows.
dataSourceConfig (dict) --
The data source configuration specifying where agent traces are pulled from.
cloudWatchLogs (dict) --
Configuration for pulling agent session traces from CloudWatch Logs.
serviceNames (list) --
The list of agent service names to filter traces within the specified log groups.
(string) --
logGroupNames (list) --
The list of CloudWatch log group names to read agent traces from. Maximum of 5 log groups.
(string) --
filterConfig (dict) --
Optional filter configuration to narrow down which sessions to evaluate.
sessionIds (list) --
A list of specific session IDs to evaluate. If specified, only these sessions are included in the evaluation.
(string) --
timeRange (dict) --
The time range filter for selecting sessions to evaluate.
startTime (datetime) --
The start time of the time range. Only sessions with activity at or after this timestamp are included.
endTime (datetime) --
The end time of the time range. Only sessions with activity before this timestamp are included.
onlineEvaluationConfigSource (dict) --
A reference to an existing online evaluation configuration to use as the session source.
onlineEvaluationConfigArn (string) --
The Amazon Resource Name (ARN) of the online evaluation configuration to use as the session source.
sessionFilterConfig (dict) --
Optional session filter configuration to narrow down which sessions from the online evaluation configuration to include.
startTime (datetime) --
The start time of the time range. Only sessions with activity at or after this timestamp are included.
endTime (datetime) --
The end time of the time range. Only sessions with activity before this timestamp are included.
outputConfig (dict) --
The output configuration specifying where evaluation results are written.
cloudWatchConfig (dict) --
The CloudWatch Logs configuration for writing evaluation results.
logGroupName (string) --
The name of the CloudWatch log group where evaluation results will be written.
logStreamName (string) --
The name of the CloudWatch log stream where evaluation results will be written.
evaluationResults (dict) --
The aggregated evaluation results, including session completion counts and evaluator score summaries.
numberOfSessionsCompleted (integer) --
The number of sessions that have been successfully evaluated.
numberOfSessionsInProgress (integer) --
The number of sessions currently being evaluated.
numberOfSessionsFailed (integer) --
The number of sessions that failed evaluation.
totalNumberOfSessions (integer) --
The total number of sessions included in the batch evaluation.
numberOfSessionsIgnored (integer) --
The number of sessions that were ignored during evaluation.
evaluatorSummaries (list) --
A list of per-evaluator summary statistics.
(dict) --
Summary statistics for a single evaluator within a batch evaluation.
evaluatorId (string) --
The unique identifier of the evaluator.
statistics (dict) --
The aggregated statistics for this evaluator.
averageScore (float) --
The average score across all evaluated sessions for this evaluator.
totalEvaluated (integer) --
The total number of sessions evaluated by this evaluator.
totalFailed (integer) --
The total number of sessions that failed evaluation by this evaluator.
failureAnalysisResult (dict) --
The unified failure analysis clustering result, written to S3.
failures (list) --
The list of failure category clusters identified across analyzed sessions.
(dict) --
A top-level failure category identified by clustering similar failure patterns across sessions.
clusterId (integer) --
The unique identifier of the failure category cluster.
name (string) --
The name of the failure category.
description (string) --
A description of the failure category pattern.
affectedSessionCount (integer) --
The number of sessions affected by this failure category.
subCategories (list) --
The list of failure subcategories within this category.
(dict) --
A subcategory of failures within a top-level failure category.
clusterId (integer) --
The unique identifier of the failure subcategory cluster.
name (string) --
The name of the failure subcategory.
description (string) --
A description of the failure subcategory pattern.
affectedSessionCount (integer) --
The number of sessions affected by this failure subcategory.
rootCauses (list) --
The list of root cause clusters identified within this subcategory.
(dict) --
A cluster of similar root causes identified within a failure subcategory.
clusterId (integer) --
The unique identifier of the root cause cluster.
name (string) --
The name of the root cause cluster.
rootCause (string) --
The root cause explanation for this cluster of failures.
recommendation (string) --
The recommended fix for this root cause.
affectedSessionCount (integer) --
The number of sessions affected by this root cause.
affectedSessions (list) --
The list of sessions affected by this root cause.
(dict) --
A session affected by a detected failure pattern, including root cause details.
sessionId (string) --
The unique identifier of the affected session.
explanation (string) --
An explanation of how the failure manifested in this session.
fixType (string) --
The type of fix recommended for this failure.
recommendation (string) --
The specific fix recommendation for this session.
failureSpans (list) --
The list of spans where failures were detected in this session.
(dict) --
Details about a specific span where a failure was detected.
spanId (string) --
The unique identifier of the span where the failure occurred.
traceId (string) --
The trace identifier associated with the failure span.
signals (list) --
The failure signals detected in this span.
(dict) --
A signal indicating a detected failure within a span.
category (string) --
The failure category of the signal, taken from the failure category taxonomy for agent session insights. Values must stay in sync with the category registry in AgentCoreLens (amzn_agentcore_lens.config.failure_detection.FAILURE_CATEGORIES).
evidence (string) --
The evidence supporting the failure detection.
confidence (float) --
The confidence score of the failure detection.
userIntentResult (dict) --
The user intent clustering result, written to S3.
userIntents (list) --
The list of user intent clusters identified across analyzed sessions.
(dict) --
A cluster of similar user intents identified across sessions.
clusterId (integer) --
The unique identifier of the user intent cluster.
name (string) --
The name of the user intent cluster.
description (string) --
A description of the user intent pattern.
affectedSessionCount (integer) --
The number of sessions with this user intent.
affectedSessions (list) --
The list of sessions with this user intent.
(dict) --
A session associated with a user intent cluster.
sessionId (string) --
The unique identifier of the session.
userMessages (list) --
The user messages from this session that contributed to the intent cluster.
(string) --
executionSummaryResult (dict) --
The execution summary clustering result, written to S3.
executionSummaries (list) --
The list of execution summary clusters identified across analyzed sessions.
(dict) --
A cluster of similar execution patterns identified across sessions.
clusterId (integer) --
The unique identifier of the execution summary cluster.
name (string) --
The name of the execution pattern cluster.
description (string) --
A description of the execution pattern.
affectedSessionCount (integer) --
The number of sessions with this execution pattern.
affectedSessions (list) --
The list of sessions with this execution pattern.
(dict) --
A session associated with an execution summary cluster.
sessionId (string) --
The unique identifier of the session.
approachTaken (string) --
The approach taken by the agent during this session.
finalOutcome (string) --
The final outcome of the session.
errorDetails (list) --
The error details if the batch evaluation encountered failures.
(string) --
description (string) --
The description of the batch evaluation.
updatedAt (datetime) --
The timestamp when the batch evaluation was last updated.
kmsKeyArn (string) --
The ARN of the KMS key used to encrypt evaluation data.
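The failureAnalysisResult structure is nested six levels deep (failures, subCategories, rootCauses, affectedSessions, failureSpans, signals), which makes it awkward to query directly. One way to work with it is to flatten it into one record per signal. The sketch below assumes only the field names shown in the response structure above; the function name and the keys of the flattened records are illustrative choices.

```python
from typing import Any, Dict, Iterator


def iter_failure_signals(batch_evaluation: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: yield one flat record per failure signal."""
    analysis = batch_evaluation.get("failureAnalysisResult") or {}
    for failure in analysis.get("failures", []):
        for sub in failure.get("subCategories", []):
            for root in sub.get("rootCauses", []):
                for session in root.get("affectedSessions", []):
                    for span in session.get("failureSpans", []):
                        for signal in span.get("signals", []):
                            # Carry the enclosing context down to each signal.
                            yield {
                                "failure": failure.get("name"),
                                "subCategory": sub.get("name"),
                                "rootCause": root.get("rootCause"),
                                "sessionId": session.get("sessionId"),
                                "spanId": span.get("spanId"),
                                "traceId": span.get("traceId"),
                                "category": signal.get("category"),
                                "confidence": signal.get("confidence"),
                            }
```

The flat records can then be counted by category, for example with `collections.Counter(r["category"] for r in iter_failure_signals(response))`, or filtered by a confidence threshold. A response without failureAnalysisResult simply yields nothing.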
{'kmsKeyArn': 'string',
'recommendationConfig': {'systemPromptRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}},
'toolDescriptionRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}}},
'recommendationResult': {'systemPromptRecommendationResult': {'explanation': 'string'},
'toolDescriptionRecommendationResult': {'tools': {'explanation': 'string'}}}}
Retrieves detailed information about a recommendation, including its configuration, status, and results.
See also: AWS API Documentation
Request Syntax
client.get_recommendation(
recommendationId='string'
)
recommendationId (string) -- [REQUIRED]
The unique identifier of the recommendation to retrieve.
Return type: dict
Response Syntax
{
'recommendationId': 'string',
'recommendationArn': 'string',
'name': 'string',
'description': 'string',
'type': 'SYSTEM_PROMPT_RECOMMENDATION'|'TOOL_DESCRIPTION_RECOMMENDATION',
'recommendationConfig': {
'systemPromptRecommendationConfig': {
'systemPrompt': {
'text': 'string',
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'systemPromptJsonPath': 'string'
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
},
'evaluationConfig': {
'evaluators': [
{
'evaluatorArn': 'string'
},
]
}
},
'toolDescriptionRecommendationConfig': {
'toolDescription': {
'toolDescriptionText': {
'tools': [
{
'toolName': 'string',
'toolDescription': {
'text': 'string'
}
},
]
},
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'tools': [
{
'toolName': 'string',
'toolDescriptionJsonPath': 'string'
},
]
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
}
}
},
'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'DELETING',
'createdAt': datetime(2015, 1, 1),
'updatedAt': datetime(2015, 1, 1),
'recommendationResult': {
'systemPromptRecommendationResult': {
'recommendedSystemPrompt': 'string',
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string'
},
'explanation': 'string',
'errorCode': 'string',
'errorMessage': 'string'
},
'toolDescriptionRecommendationResult': {
'tools': [
{
'toolName': 'string',
'recommendedToolDescription': 'string',
'explanation': 'string'
},
],
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string'
},
'errorCode': 'string',
'errorMessage': 'string'
}
},
'kmsKeyArn': 'string'
}
Response Structure
(dict) --
recommendationId (string) --
The unique identifier of the recommendation.
recommendationArn (string) --
The Amazon Resource Name (ARN) of the recommendation.
name (string) --
The name of the recommendation.
description (string) --
The description of the recommendation.
type (string) --
The type of recommendation.
recommendationConfig (dict) --
The configuration for the recommendation.
systemPromptRecommendationConfig (dict) --
The configuration for a system prompt recommendation.
systemPrompt (dict) --
The current system prompt to optimize.
text (string) --
The system prompt text provided inline.
configurationBundle (dict) --
The system prompt sourced from a configuration bundle version.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle.
systemPromptJsonPath (string) --
The JSON path within the configuration bundle that contains the system prompt.
agentTraces (dict) --
The agent traces to analyze for generating recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) --
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) --
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) --
The start time of the time range to read traces from.
endTime (datetime) --
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) --
The key or field name to filter on within the agent trace data.
operator (string) --
The comparison operator to use for filtering.
value (dict) --
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) --
The ARN of the completed batch evaluation to use as the trace source.
evaluationConfig (dict) --
The evaluation configuration specifying which evaluator to use for assessing recommendation quality.
evaluators (list) --
The list of evaluators to use for assessing recommendation quality.
(dict) --
A reference to an evaluator used for recommendation assessment.
evaluatorArn (string) --
The Amazon Resource Name (ARN) of the evaluator.
toolDescriptionRecommendationConfig (dict) --
The configuration for a tool description recommendation.
toolDescription (dict) --
The current tool descriptions to optimize.
toolDescriptionText (dict) --
Tool descriptions provided as inline text.
tools (list) --
The list of tool descriptions to optimize.
(dict) --
A tool description input containing the tool name and its current description.
toolName (string) --
The name of the tool.
toolDescription (dict) --
The current description of the tool to optimize.
text (string) --
The tool description as inline text.
configurationBundle (dict) --
Tool descriptions sourced from a configuration bundle version.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle.
tools (list) --
The list of tool entries mapping tool names to their JSON paths within the bundle.
(dict) --
Maps a tool name to its JSON path within a configuration bundle.
toolName (string) --
The name of the tool.
toolDescriptionJsonPath (string) --
The JSON path within the configuration bundle's components that contains the tool description.
agentTraces (dict) --
The agent traces to analyze for generating tool description recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) --
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) --
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) --
The start time of the time range to read traces from.
endTime (datetime) --
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) --
The key or field name to filter on within the agent trace data.
operator (string) --
The comparison operator to use for filtering.
value (dict) --
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) --
The ARN of the completed batch evaluation to use as the trace source.
status (string) --
The current status of the recommendation.
createdAt (datetime) --
The timestamp when the recommendation was created.
updatedAt (datetime) --
The timestamp when the recommendation was last updated.
recommendationResult (dict) --
The result of the recommendation, containing the optimized system prompt or tool descriptions. Only present when the recommendation status is COMPLETED.
systemPromptRecommendationResult (dict) --
The result of a system prompt recommendation.
recommendedSystemPrompt (string) --
The optimized system prompt text generated by the recommendation.
configurationBundle (dict) --
The configuration bundle containing the recommended system prompt, if the input was sourced from a configuration bundle.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle containing the recommendation.
explanation (string) --
An explanation of why the recommendation was generated and what patterns were identified in the agent traces.
errorCode (string) --
The error code if the recommendation failed.
errorMessage (string) --
The error message if the recommendation failed.
toolDescriptionRecommendationResult (dict) --
The result of a tool description recommendation.
tools (list) --
The list of tools with their recommended descriptions.
(dict) --
The output for a single tool description recommendation.
toolName (string) --
The name of the tool.
recommendedToolDescription (string) --
The optimized tool description text generated by the recommendation.
explanation (string) --
An explanation of why the recommendation was generated for this tool and what patterns were identified in the agent traces.
configurationBundle (dict) --
The configuration bundle containing the recommended tool descriptions, if the input was sourced from a configuration bundle.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle containing the recommendation.
errorCode (string) --
The error code if the recommendation failed.
errorMessage (string) --
The error message if the recommendation failed.
kmsKeyArn (string) --
The ARN of the KMS key used to encrypt recommendation data.
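As a minimal sketch of how the result fields above might be consumed, the helper below collects the recommended text and its explanation from a response dict, returning nothing unless the status is COMPLETED. The function name and the output shape are illustrative assumptions, and the nesting of explanation under each result is inferred from the field descriptions rather than confirmed.

```python
from typing import Any, Dict, List, Optional


def extract_recommendations(response: Dict[str, Any]) -> Optional[List[Dict[str, str]]]:
    """Return recommended texts with explanations, or None unless COMPLETED.

    Hypothetical helper: 'response' is a dict shaped like the response
    structure documented above.
    """
    # recommendationResult is only present when the status is COMPLETED.
    if response.get("status") != "COMPLETED":
        return None

    result = response.get("recommendationResult", {})
    items: List[Dict[str, str]] = []

    # System prompt recommendation (assumed nesting: explanation is a sibling
    # of recommendedSystemPrompt).
    prompt = result.get("systemPromptRecommendationResult")
    if prompt and "recommendedSystemPrompt" in prompt:
        items.append({
            "target": "systemPrompt",
            "text": prompt["recommendedSystemPrompt"],
            "explanation": prompt.get("explanation", ""),
        })

    # Tool description recommendations, one entry per tool.
    tools = result.get("toolDescriptionRecommendationResult", {}).get("tools", [])
    for tool in tools:
        items.append({
            "target": tool["toolName"],
            "text": tool["recommendedToolDescription"],
            "explanation": tool.get("explanation", ""),
        })
    return items
```

When the recommendation failed, errorCode and errorMessage on the same result object are the fields to inspect instead.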
{'batchEvaluations': {'insights': [{'insightId': 'string'}],
'kmsKeyArn': 'string'}}
Lists all batch evaluations in the account, providing summary information about each evaluation's status and configuration.
See also: AWS API Documentation
Request Syntax
client.list_batch_evaluations(
maxResults=123,
nextToken='string'
)
integer
The maximum number of results to return in the response. If the total number of results is greater than this value, use the token returned in the response in the nextToken field when making another request to return the next batch of results.
string
If the total number of results is greater than the maxResults value provided in the request, pass the nextToken value returned in the response in this field to return the next batch of results.
dict
Response Syntax
{
'batchEvaluations': [
{
'batchEvaluationId': 'string',
'batchEvaluationArn': 'string',
'batchEvaluationName': 'string',
'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERRORS'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING',
'createdAt': datetime(2015, 1, 1),
'description': 'string',
'evaluators': [
{
'evaluatorId': 'string'
},
],
'insights': [
{
'insightId': 'string'
},
],
'evaluationResults': {
'numberOfSessionsCompleted': 123,
'numberOfSessionsInProgress': 123,
'numberOfSessionsFailed': 123,
'totalNumberOfSessions': 123,
'numberOfSessionsIgnored': 123,
'evaluatorSummaries': [
{
'evaluatorId': 'string',
'statistics': {
'averageScore': 123.0
},
'totalEvaluated': 123,
'totalFailed': 123
},
]
},
'errorDetails': [
'string',
],
'kmsKeyArn': 'string',
'updatedAt': datetime(2015, 1, 1)
},
],
'nextToken': 'string'
}
Response Structure
(dict) --
batchEvaluations (list) --
The list of batch evaluation summaries.
(dict) --
A summary of a batch evaluation, as returned in list responses.
batchEvaluationId (string) --
The unique identifier of the batch evaluation.
batchEvaluationArn (string) --
The Amazon Resource Name (ARN) of the batch evaluation.
batchEvaluationName (string) --
The name of the batch evaluation.
status (string) --
The current status of the batch evaluation.
createdAt (datetime) --
The timestamp when the batch evaluation was created.
description (string) --
The description of the batch evaluation.
evaluators (list) --
The list of evaluators applied during the batch evaluation.
(dict) --
An evaluator to run against sessions.
evaluatorId (string) --
The unique identifier of the evaluator. Can reference built-in evaluators (e.g., Builtin.Helpfulness) or custom evaluators.
insights (list) --
The list of insight analyses applied during the batch evaluation.
(dict) --
A reference to an insight analysis to run against sessions.
insightId (string) --
The canonical identifier of the insight, following the Builtin.Insight.* naming convention. Used by the BatchEvaluate, InternalEvaluate, and ServiceEngineEvaluate flows.
evaluationResults (dict) --
The aggregated evaluation results.
numberOfSessionsCompleted (integer) --
The number of sessions that have been successfully evaluated.
numberOfSessionsInProgress (integer) --
The number of sessions currently being evaluated.
numberOfSessionsFailed (integer) --
The number of sessions that failed evaluation.
totalNumberOfSessions (integer) --
The total number of sessions included in the batch evaluation.
numberOfSessionsIgnored (integer) --
The number of sessions that were ignored during evaluation.
evaluatorSummaries (list) --
A list of per-evaluator summary statistics.
(dict) --
Summary statistics for a single evaluator within a batch evaluation.
evaluatorId (string) --
The unique identifier of the evaluator.
statistics (dict) --
The aggregated statistics for this evaluator.
averageScore (float) --
The average score across all evaluated sessions for this evaluator.
totalEvaluated (integer) --
The total number of sessions evaluated by this evaluator.
totalFailed (integer) --
The total number of sessions that failed evaluation by this evaluator.
errorDetails (list) --
The error details if the batch evaluation encountered failures.
(string) --
kmsKeyArn (string) --
The ARN of the KMS key used to encrypt evaluation data.
updatedAt (datetime) --
The timestamp when the batch evaluation was last updated.
nextToken (string) --
If the total number of results is greater than the maxResults value provided in the request, pass this token in the nextToken field of another request to return the next batch of results.
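The maxResults and nextToken handshake described above can be sketched as a small generator. This is an illustration, not part of the API: client is assumed to be a boto3 client for this service created elsewhere, the helper names are hypothetical, and the default page_size of 50 is arbitrary because the allowed range of maxResults is not stated here.

```python
from typing import Any, Dict, Iterator, Optional


def iter_batch_evaluations(client: Any, page_size: int = 50) -> Iterator[Dict[str, Any]]:
    """Yield every batch evaluation summary, following nextToken across pages."""
    next_token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {"maxResults": page_size}
        if next_token:
            # Only send nextToken once a previous response has supplied one.
            kwargs["nextToken"] = next_token
        page = client.list_batch_evaluations(**kwargs)
        yield from page.get("batchEvaluations", [])
        next_token = page.get("nextToken")
        if not next_token:
            break  # no token in the response means this was the last page


def completion_ratio(summary: Dict[str, Any]) -> float:
    """Fraction of sessions completed, from a summary's evaluationResults."""
    results = summary.get("evaluationResults", {})
    total = results.get("totalNumberOfSessions", 0)
    done = results.get("numberOfSessionsCompleted", 0)
    return done / total if total else 0.0
```

Taking the client as a parameter keeps the loop independent of how the client is constructed and makes it easy to exercise with a stub.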
{'dataSourceConfig': {'onlineEvaluationConfigSource': {'onlineEvaluationConfigArn': 'string',
'sessionFilterConfig': {'endTime': 'timestamp',
'startTime': 'timestamp'}}},
'insights': [{'insightId': 'string'}],
'kmsKeyArn': 'string',
'tags': {'string': 'string'}}
Response {'insights': [{'insightId': 'string'}],
'kmsKeyArn': 'string',
'tags': {'string': 'string'}}
Starts a batch evaluation job that evaluates agent performance across multiple sessions. Batch evaluations pull agent traces from CloudWatch Logs or an existing online evaluation configuration and run specified evaluators and insights against them.
See also: AWS API Documentation
Request Syntax
client.start_batch_evaluation(
batchEvaluationName='string',
evaluators=[
{
'evaluatorId': 'string'
},
],
insights=[
{
'insightId': 'string'
},
],
dataSourceConfig={
'cloudWatchLogs': {
'serviceNames': [
'string',
],
'logGroupNames': [
'string',
],
'filterConfig': {
'sessionIds': [
'string',
],
'timeRange': {
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1)
}
}
},
'onlineEvaluationConfigSource': {
'onlineEvaluationConfigArn': 'string',
'sessionFilterConfig': {
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1)
}
}
},
clientToken='string',
evaluationMetadata={
'sessionMetadata': [
{
'sessionId': 'string',
'testScenarioId': 'string',
'groundTruth': {
'inline': {
'assertions': [
{
'text': 'string'
},
],
'expectedTrajectory': {
'toolNames': [
'string',
]
},
'turns': [
{
'input': {
'prompt': 'string'
},
'expectedResponse': {
'text': 'string'
}
},
]
}
},
'metadata': {
'string': 'string'
}
},
]
},
tags={
'string': 'string'
},
kmsKeyArn='string',
description='string'
)
string
[REQUIRED]
The name of the batch evaluation. Must be unique within your account.
list
The list of evaluators to apply during the batch evaluation. Can include both built-in evaluators and custom evaluators. Maximum of 10 evaluators.
(dict) --
An evaluator to run against sessions.
evaluatorId (string) -- [REQUIRED]
The unique identifier of the evaluator. Can reference built-in evaluators (e.g., Builtin.Helpfulness) or custom evaluators.
list
The list of insight analyses to run against sessions during the batch evaluation. Maximum of 10 insights.
(dict) --
A reference to an insight analysis to run against sessions.
insightId (string) -- [REQUIRED]
The canonical identifier of the insight, following the Builtin.Insight.* naming convention. Used by the BatchEvaluate, InternalEvaluate, and ServiceEngineEvaluate flows.
dict
[REQUIRED]
The data source configuration that specifies where to pull agent session traces from for evaluation.
cloudWatchLogs (dict) --
Configuration for pulling agent session traces from CloudWatch Logs.
serviceNames (list) -- [REQUIRED]
The list of agent service names to filter traces within the specified log groups.
(string) --
logGroupNames (list) -- [REQUIRED]
The list of CloudWatch log group names to read agent traces from. Maximum of 5 log groups.
(string) --
filterConfig (dict) --
Optional filter configuration to narrow down which sessions to evaluate.
sessionIds (list) --
A list of specific session IDs to evaluate. If specified, only these sessions are included in the evaluation.
(string) --
timeRange (dict) --
The time range filter for selecting sessions to evaluate.
startTime (datetime) --
The start time of the time range. Only sessions with activity at or after this timestamp are included.
endTime (datetime) --
The end time of the time range. Only sessions with activity before this timestamp are included.
onlineEvaluationConfigSource (dict) --
References an existing online evaluation configuration as the session source.
onlineEvaluationConfigArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the online evaluation configuration to use as the session source.
sessionFilterConfig (dict) --
Optional session filter configuration to narrow down which sessions from the online evaluation configuration to include.
startTime (datetime) --
The start time of the time range. Only sessions with activity at or after this timestamp are included.
endTime (datetime) --
The end time of the time range. Only sessions with activity before this timestamp are included.
string
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, the service ignores the request, but does not return an error.
This field is autopopulated if not provided.
dict
Optional metadata for the evaluation, including session-specific ground truth data and test scenario identifiers.
sessionMetadata (list) --
A list of session metadata entries containing ground truth data and test scenario identifiers for specific sessions.
(dict) --
Metadata for a specific session in a batch evaluation, including ground truth data and test scenario identifiers.
sessionId (string) -- [REQUIRED]
The unique identifier of the session this metadata applies to.
testScenarioId (string) --
An optional test scenario identifier for categorizing and tracking evaluation results.
groundTruth (dict) --
The ground truth data for this session, including expected responses and assertions.
inline (dict) --
Inline ground truth data provided directly in the request.
assertions (list) --
The assertions for evaluation. Reuses the common EvaluationContentList model.
(dict) --
A content block for ground truth data in evaluation reference inputs. Supports text content for expected responses and assertions.
text (string) --
The text content of the ground truth data. Used for expected response text and assertion statements.
expectedTrajectory (dict) --
The expected tool call sequence for trajectory evaluation.
toolNames (list) --
The list of tool names representing the expected tool call sequence.
(string) --
turns (list) --
A list of per-turn ground truth data, each containing an input prompt and expected response.
(dict) --
Ground truth data for a single conversation turn.
input (dict) --
The input for this conversation turn.
prompt (string) --
The text prompt for this conversation turn.
expectedResponse (dict) --
The expected response for this conversation turn.
text (string) --
The text content of the ground truth data. Used for expected response text and assertion statements.
metadata (dict) --
Additional key-value metadata associated with this session.
(string) --
(string) --
dict
A map of tag keys and values to associate with the batch evaluation.
(string) --
(string) --
string
The ARN of the KMS key used to encrypt evaluation data. If provided, customer data is encrypted at rest with the specified key.
string
The description of the batch evaluation.
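The parameter rules above (at most 10 evaluators, at most 10 insights, at most 5 log groups, an optional time range) can be checked before the call is made. The sketch below builds keyword arguments for start_batch_evaluation using the cloudWatchLogs data source. The function name, the constants, and the start-before-end check are assumptions added for illustration, and the service may enforce further constraints that are not listed here.

```python
from datetime import datetime
from typing import Any, Dict, Optional, Sequence

MAX_EVALUATORS = 10  # "Maximum of 10 evaluators"
MAX_INSIGHTS = 10    # "Maximum of 10 insights"
MAX_LOG_GROUPS = 5   # "Maximum of 5 log groups"


def build_batch_evaluation_request(
    name: str,
    service_names: Sequence[str],
    log_group_names: Sequence[str],
    evaluator_ids: Sequence[str] = (),
    insight_ids: Sequence[str] = (),
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build kwargs for start_batch_evaluation (hypothetical helper)."""
    if len(evaluator_ids) > MAX_EVALUATORS:
        raise ValueError(f"at most {MAX_EVALUATORS} evaluators are allowed")
    if len(insight_ids) > MAX_INSIGHTS:
        raise ValueError(f"at most {MAX_INSIGHTS} insights are allowed")
    if len(log_group_names) > MAX_LOG_GROUPS:
        raise ValueError(f"at most {MAX_LOG_GROUPS} log groups are allowed")
    # Assumed sanity check: sessions are included at or after startTime and
    # before endTime, so an empty or inverted range would select nothing.
    if start_time is not None and end_time is not None and start_time >= end_time:
        raise ValueError("start_time must be earlier than end_time")

    cloudwatch: Dict[str, Any] = {
        "serviceNames": list(service_names),
        "logGroupNames": list(log_group_names),
    }
    time_range: Dict[str, datetime] = {}
    if start_time is not None:
        time_range["startTime"] = start_time
    if end_time is not None:
        time_range["endTime"] = end_time
    if time_range:
        cloudwatch["filterConfig"] = {"timeRange": time_range}

    request: Dict[str, Any] = {
        "batchEvaluationName": name,
        "dataSourceConfig": {"cloudWatchLogs": cloudwatch},
    }
    # Optional lists are omitted entirely when empty.
    if evaluator_ids:
        request["evaluators"] = [{"evaluatorId": e} for e in evaluator_ids]
    if insight_ids:
        request["insights"] = [{"insightId": i} for i in insight_ids]
    return request
```

The returned dict is meant to be unpacked into the call, for example client.start_batch_evaluation(**request), with tags, kmsKeyArn, or description added to the dict as needed.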
dict
Response Syntax
{
'batchEvaluationId': 'string',
'batchEvaluationArn': 'string',
'batchEvaluationName': 'string',
'evaluators': [
{
'evaluatorId': 'string'
},
],
'insights': [
{
'insightId': 'string'
},
],
'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERRORS'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING',
'createdAt': datetime(2015, 1, 1),
'outputConfig': {
'cloudWatchConfig': {
'logGroupName': 'string',
'logStreamName': 'string'
}
},
'tags': {
'string': 'string'
},
'kmsKeyArn': 'string',
'description': 'string'
}
Response Structure
(dict) --
batchEvaluationId (string) --
The unique identifier of the created batch evaluation.
batchEvaluationArn (string) --
The Amazon Resource Name (ARN) of the created batch evaluation.
batchEvaluationName (string) --
The name of the batch evaluation.
evaluators (list) --
The list of evaluators applied during the batch evaluation.
(dict) --
An evaluator to run against sessions.
evaluatorId (string) --
The unique identifier of the evaluator. Can reference built-in evaluators (e.g., Builtin.Helpfulness) or custom evaluators.
insights (list) --
The list of insight analyses applied during the batch evaluation.
(dict) --
A reference to an insight analysis to run against sessions.
insightId (string) --
The canonical identifier of the insight, following the Builtin.Insight.* naming convention. Used by the BatchEvaluate, InternalEvaluate, and ServiceEngineEvaluate flows.
status (string) --
The status of the batch evaluation.
createdAt (datetime) --
The timestamp when the batch evaluation was created.
outputConfig (dict) --
The output configuration specifying where evaluation results are written.
cloudWatchConfig (dict) --
The CloudWatch Logs configuration for writing evaluation results.
logGroupName (string) --
The name of the CloudWatch log group where evaluation results will be written.
logStreamName (string) --
The name of the CloudWatch log stream where evaluation results will be written.
tags (dict) --
The tags associated with the batch evaluation.
(string) --
(string) --
kmsKeyArn (string) --
The ARN of the KMS key used to encrypt evaluation data.
description (string) --
The description of the batch evaluation.
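Because the call returns while the job is still PENDING or IN_PROGRESS, callers typically poll until the status settles. The following is a rough sketch of such a loop. Which statuses are terminal is an assumption (the set below is a guess based on the status names and is not stated in this reference), and fetch_status is a hypothetical callable that the caller supplies to return the current status string.

```python
import time
from typing import Callable

# ASSUMPTION: these statuses are treated as terminal; the reference above
# lists the possible values but does not say which ones are final.
ASSUMED_TERMINAL = frozenset(
    {"COMPLETED", "COMPLETED_WITH_ERRORS", "FAILED", "STOPPED"}
)


def wait_for_batch_evaluation(
    fetch_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status until an assumed-terminal status is seen.

    Raises TimeoutError if max_polls is reached first.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in ASSUMED_TERMINAL:
            return status
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"status not terminal after {max_polls} polls")
```

The sleep function is injectable so the loop can be tested without waiting. The polling interval and limit are arbitrary defaults.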
{'kmsKeyArn': 'string',
'recommendationConfig': {'systemPromptRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}},
'toolDescriptionRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}}},
'tags': {'string': 'string'}}
Response {'recommendationConfig': {'systemPromptRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}},
'toolDescriptionRecommendationConfig': {'agentTraces': {'batchEvaluation': {'batchEvaluationArn': 'string'}}}}}
Starts a recommendation job that analyzes agent traces and generates optimization suggestions for system prompts or tool descriptions to improve agent performance.
See also: AWS API Documentation
Request Syntax
client.start_recommendation(
name='string',
description='string',
type='SYSTEM_PROMPT_RECOMMENDATION'|'TOOL_DESCRIPTION_RECOMMENDATION',
recommendationConfig={
'systemPromptRecommendationConfig': {
'systemPrompt': {
'text': 'string',
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'systemPromptJsonPath': 'string'
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
},
'evaluationConfig': {
'evaluators': [
{
'evaluatorArn': 'string'
},
]
}
},
'toolDescriptionRecommendationConfig': {
'toolDescription': {
'toolDescriptionText': {
'tools': [
{
'toolName': 'string',
'toolDescription': {
'text': 'string'
}
},
]
},
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'tools': [
{
'toolName': 'string',
'toolDescriptionJsonPath': 'string'
},
]
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
}
}
},
kmsKeyArn='string',
clientToken='string',
tags={
'string': 'string'
}
)
string
[REQUIRED]
The name of the recommendation. Must be unique within your account.
string
The description of the recommendation.
string
[REQUIRED]
The type of recommendation to generate. Valid values are SYSTEM_PROMPT_RECOMMENDATION for system prompt optimization or TOOL_DESCRIPTION_RECOMMENDATION for tool description optimization.
dict
[REQUIRED]
The configuration for the recommendation, including the input to optimize, agent traces to analyze, and evaluation settings.
systemPromptRecommendationConfig (dict) --
The configuration for a system prompt recommendation.
systemPrompt (dict) -- [REQUIRED]
The current system prompt to optimize.
text (string) --
The system prompt text provided inline.
configurationBundle (dict) --
The system prompt sourced from a configuration bundle version.
bundleArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) -- [REQUIRED]
The version identifier of the configuration bundle.
systemPromptJsonPath (string) -- [REQUIRED]
The JSON path within the configuration bundle that contains the system prompt.
agentTraces (dict) -- [REQUIRED]
The agent traces to analyze for generating recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) -- [REQUIRED]
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) -- [REQUIRED]
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) -- [REQUIRED]
The start time of the time range to read traces from.
endTime (datetime) -- [REQUIRED]
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) -- [REQUIRED]
The key or field name to filter on within the agent trace data.
operator (string) -- [REQUIRED]
The comparison operator to use for filtering.
value (dict) -- [REQUIRED]
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) -- [REQUIRED]
The ARN of the completed batch evaluation to use as the trace source.
evaluationConfig (dict) --
The evaluation configuration specifying which evaluators to use for assessing recommendation quality.
evaluators (list) -- [REQUIRED]
The list of evaluators to use for assessing recommendation quality.
(dict) --
A reference to an evaluator used for recommendation assessment.
evaluatorArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the evaluator.
toolDescriptionRecommendationConfig (dict) --
The configuration for a tool description recommendation.
toolDescription (dict) -- [REQUIRED]
The current tool descriptions to optimize.
toolDescriptionText (dict) --
Tool descriptions provided as inline text.
tools (list) -- [REQUIRED]
The list of tool descriptions to optimize.
(dict) --
A tool description input containing the tool name and its current description.
toolName (string) -- [REQUIRED]
The name of the tool.
toolDescription (dict) -- [REQUIRED]
The current description of the tool to optimize.
text (string) --
The tool description as inline text.
configurationBundle (dict) --
Tool descriptions sourced from a configuration bundle version.
bundleArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) -- [REQUIRED]
The version identifier of the configuration bundle.
tools (list) -- [REQUIRED]
The list of tool entries mapping tool names to their JSON paths within the bundle.
(dict) --
Maps a tool name to its JSON path within a configuration bundle.
toolName (string) -- [REQUIRED]
The name of the tool.
toolDescriptionJsonPath (string) -- [REQUIRED]
The JSON path within the configuration bundle's components that contains the tool description.
agentTraces (dict) -- [REQUIRED]
The agent traces to analyze for generating tool description recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) -- [REQUIRED]
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) -- [REQUIRED]
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) -- [REQUIRED]
The start time of the time range to read traces from.
endTime (datetime) -- [REQUIRED]
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) -- [REQUIRED]
The key or field name to filter on within the agent trace data.
operator (string) -- [REQUIRED]
The comparison operator to use for filtering.
value (dict) -- [REQUIRED]
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) -- [REQUIRED]
The ARN of the completed batch evaluation to use as the trace source.
string
The ARN of the KMS key used to encrypt recommendation data. If provided, customer data is encrypted at rest with the specified key.
string
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, the service ignores the request, but does not return an error.
This field is autopopulated if not provided.
dict
A map of tag keys and values to associate with the recommendation.
(string) --
(string) --
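To make the nested request shape more concrete, here is a hedged sketch that assembles keyword arguments for a system prompt recommendation sourced from a completed batch evaluation, plus a helper that picks stringValue, doubleValue, or booleanValue for a CloudWatch trace filter. All function names are hypothetical, and the filter key used in any call is up to the caller because valid keys are not listed in this reference.

```python
from typing import Any, Dict, Union

OPERATORS = frozenset({
    "Equals", "NotEquals", "GreaterThan", "LessThan",
    "GreaterThanOrEqual", "LessThanOrEqual", "Contains", "NotContains",
})


def filter_value(value: Union[str, int, float, bool]) -> Dict[str, Any]:
    """Wrap a Python value in the matching filter value member."""
    # bool must be tested first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"booleanValue": value}
    if isinstance(value, (int, float)):
        return {"doubleValue": float(value)}
    if isinstance(value, str):
        return {"stringValue": value}
    raise TypeError(f"unsupported filter value type: {type(value).__name__}")


def trace_filter(key: str, operator: str, value: Union[str, int, float, bool]) -> Dict[str, Any]:
    """Build one entry for cloudwatchLogs.rule.filters."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    return {"key": key, "operator": operator, "value": filter_value(value)}


def system_prompt_recommendation_request(
    name: str, prompt_text: str, batch_evaluation_arn: str
) -> Dict[str, Any]:
    """Build kwargs for start_recommendation with an inline system prompt
    and a completed batch evaluation as the trace source."""
    return {
        "name": name,
        "type": "SYSTEM_PROMPT_RECOMMENDATION",
        "recommendationConfig": {
            "systemPromptRecommendationConfig": {
                "systemPrompt": {"text": prompt_text},
                "agentTraces": {
                    "batchEvaluation": {"batchEvaluationArn": batch_evaluation_arn}
                },
            }
        },
    }
```

The result can be passed as client.start_recommendation(**request). Optional members such as kmsKeyArn, clientToken, and tags are left for the caller to add.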
dict
Response Syntax
{
'recommendationId': 'string',
'recommendationArn': 'string',
'name': 'string',
'description': 'string',
'type': 'SYSTEM_PROMPT_RECOMMENDATION'|'TOOL_DESCRIPTION_RECOMMENDATION',
'recommendationConfig': {
'systemPromptRecommendationConfig': {
'systemPrompt': {
'text': 'string',
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'systemPromptJsonPath': 'string'
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
},
'evaluationConfig': {
'evaluators': [
{
'evaluatorArn': 'string'
},
]
}
},
'toolDescriptionRecommendationConfig': {
'toolDescription': {
'toolDescriptionText': {
'tools': [
{
'toolName': 'string',
'toolDescription': {
'text': 'string'
}
},
]
},
'configurationBundle': {
'bundleArn': 'string',
'versionId': 'string',
'tools': [
{
'toolName': 'string',
'toolDescriptionJsonPath': 'string'
},
]
}
},
'agentTraces': {
'sessionSpans': [
{...}|[...]|123|123.4|'string'|True|None,
],
'cloudwatchLogs': {
'logGroupArns': [
'string',
],
'serviceNames': [
'string',
],
'startTime': datetime(2015, 1, 1),
'endTime': datetime(2015, 1, 1),
'rule': {
'filters': [
{
'key': 'string',
'operator': 'Equals'|'NotEquals'|'GreaterThan'|'LessThan'|'GreaterThanOrEqual'|'LessThanOrEqual'|'Contains'|'NotContains',
'value': {
'stringValue': 'string',
'doubleValue': 123.0,
'booleanValue': True|False
}
},
]
}
},
'batchEvaluation': {
'batchEvaluationArn': 'string'
}
}
}
},
'status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'DELETING',
'createdAt': datetime(2015, 1, 1),
'updatedAt': datetime(2015, 1, 1)
}
Response Structure
(dict) --
recommendationId (string) --
The unique identifier of the created recommendation.
recommendationArn (string) --
The Amazon Resource Name (ARN) of the created recommendation.
name (string) --
The name of the recommendation.
description (string) --
The description of the recommendation.
type (string) --
The type of recommendation.
recommendationConfig (dict) --
The configuration for the recommendation.
systemPromptRecommendationConfig (dict) --
The configuration for a system prompt recommendation.
systemPrompt (dict) --
The current system prompt to optimize.
text (string) --
The system prompt text provided inline.
configurationBundle (dict) --
The system prompt sourced from a configuration bundle version.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle.
systemPromptJsonPath (string) --
The JSON path within the configuration bundle that contains the system prompt.
agentTraces (dict) --
The agent traces to analyze for generating recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) --
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) --
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) --
The start time of the time range to read traces from.
endTime (datetime) --
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) --
The key or field name to filter on within the agent trace data.
operator (string) --
The comparison operator to use for filtering.
value (dict) --
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) --
The ARN of the completed batch evaluation to use as the trace source.
evaluationConfig (dict) --
The evaluation configuration specifying which evaluators to use for assessing recommendation quality.
evaluators (list) --
The list of evaluators to use for assessing recommendation quality.
(dict) --
A reference to an evaluator used for recommendation assessment.
evaluatorArn (string) --
The Amazon Resource Name (ARN) of the evaluator.
toolDescriptionRecommendationConfig (dict) --
The configuration for a tool description recommendation.
toolDescription (dict) --
The current tool descriptions to optimize.
toolDescriptionText (dict) --
Tool descriptions provided as inline text.
tools (list) --
The list of tool descriptions to optimize.
(dict) --
A tool description input containing the tool name and its current description.
toolName (string) --
The name of the tool.
toolDescription (dict) --
The current description of the tool to optimize.
text (string) --
The tool description as inline text.
configurationBundle (dict) --
Tool descriptions sourced from a configuration bundle version.
bundleArn (string) --
The Amazon Resource Name (ARN) of the configuration bundle.
versionId (string) --
The version identifier of the configuration bundle.
tools (list) --
The list of tool entries mapping tool names to their JSON paths within the bundle.
(dict) --
Maps a tool name to its JSON path within a configuration bundle.
toolName (string) --
The name of the tool.
toolDescriptionJsonPath (string) --
The JSON path within the configuration bundle's components that contains the tool description.
agentTraces (dict) --
The agent traces to analyze for generating tool description recommendations.
sessionSpans (list) --
Agent traces provided as inline session spans in OpenTelemetry format.
(:ref:`document<document>`) --
cloudwatchLogs (dict) --
Agent traces read from CloudWatch Logs.
logGroupArns (list) --
The list of CloudWatch log group ARNs to read agent traces from.
(string) --
serviceNames (list) --
The list of service names to filter traces within the specified log groups.
(string) --
startTime (datetime) --
The start time of the time range to read traces from.
endTime (datetime) --
The end time of the time range to read traces from.
rule (dict) --
Optional rule configuration for filtering traces.
filters (list) --
The list of filters to apply when reading agent traces.
(dict) --
A filter for narrowing down agent traces from CloudWatch Logs based on key-value comparisons.
key (string) --
The key or field name to filter on within the agent trace data.
operator (string) --
The comparison operator to use for filtering.
value (dict) --
The value to compare against using the specified operator.
stringValue (string) --
A string value for text-based filtering.
doubleValue (float) --
A numeric value for numerical filtering and comparisons.
booleanValue (boolean) --
A boolean value for true/false filtering conditions.
batchEvaluation (dict) --
Use a completed batch evaluation as the source of agent traces.
batchEvaluationArn (string) --
The ARN of the completed batch evaluation to use as the trace source.
status (string) --
The status of the recommendation.
createdAt (datetime) --
The timestamp when the recommendation was created.
updatedAt (datetime) --
The timestamp when the recommendation was last updated.