Amazon Bedrock AgentCore

2025/12/02 - Amazon Bedrock AgentCore - 1 new, 3 updated API methods

Changes: Support for AgentCore Evaluations and Episodic memory strategy for AgentCore Memory.

Evaluate (new)

Performs on-demand evaluation of agent traces using a specified evaluator. This synchronous API accepts traces in OpenTelemetry format and returns immediate scoring results with detailed explanations.

See also: AWS API Documentation

Request Syntax

client.evaluate(
    evaluatorId='string',
    evaluationInput={
        'sessionSpans': [
            {...}|[...]|123|123.4|'string'|True|None,
        ]
    },
    evaluationTarget={
        'spanIds': [
            'string',
        ],
        'traceIds': [
            'string',
        ]
    }
)
type evaluatorId:

string

param evaluatorId:

[REQUIRED]

The unique identifier of the evaluator to use for scoring. Can be a built-in evaluator (e.g., Builtin.Helpfulness, Builtin.Correctness) or a custom evaluator ARN created through the control plane API.

type evaluationInput:

dict

param evaluationInput:

[REQUIRED]

The input data containing agent session spans to be evaluated. Includes a list of spans in OpenTelemetry format from supported frameworks like Strands (AgentCore Runtime) or LangGraph with OpenInference instrumentation.

  • sessionSpans (list) --

    The collection of spans representing agent execution traces within a session. Each span contains detailed information about tool calls, model interactions, and other agent activities that can be evaluated for quality and performance.

    • (document) --

type evaluationTarget:

dict

param evaluationTarget:

The specific trace or span IDs to evaluate within the provided input. Allows targeting evaluation at different levels: individual tool calls, single request-response interactions (traces), or entire conversation sessions.

  • spanIds (list) --

    The list of specific span IDs to evaluate within the provided traces. Used to target evaluation at individual tool calls or specific operations within the agent's execution flow.

    • (string) --

  • traceIds (list) --

    The list of trace IDs to evaluate, representing complete request-response interactions. Used to evaluate entire conversation turns or specific agent interactions within a session.

    • (string) --
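The request parameters above can be assembled with a small helper. This is a minimal sketch, not an official client wrapper: build_evaluate_request is a hypothetical name, the span payload is invented for illustration, and the text above does not say whether spanIds and traceIds may be sent together, so the helper simply passes through whichever lists it receives.

```python
from typing import Any, Dict, List, Optional


def build_evaluate_request(
    evaluator_id: str,
    session_spans: List[Any],
    span_ids: Optional[List[str]] = None,
    trace_ids: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for client.evaluate() (hypothetical helper)."""
    request: Dict[str, Any] = {
        "evaluatorId": evaluator_id,
        "evaluationInput": {"sessionSpans": session_spans},
    }
    # evaluationTarget is optional; include only the ID lists that were given.
    target: Dict[str, List[str]] = {}
    if span_ids:
        target["spanIds"] = list(span_ids)
    if trace_ids:
        target["traceIds"] = list(trace_ids)
    if target:
        request["evaluationTarget"] = target
    return request


# Invented span payload; real spans come from your OpenTelemetry instrumentation.
spans = [{"traceId": "trace-1", "spanId": "span-1", "name": "tool_call"}]
kwargs = build_evaluate_request("Builtin.Helpfulness", spans, trace_ids=["trace-1"])

# With a real client (assumed here to be created for the AgentCore data plane):
# response = client.evaluate(**kwargs)
```

Leaving evaluationTarget out entirely when no IDs are supplied mirrors the fact that only evaluatorId and evaluationInput are marked [REQUIRED].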

rtype:

dict

returns:

Response Syntax

{
    'evaluationResults': [
        {
            'evaluatorArn': 'string',
            'evaluatorId': 'string',
            'evaluatorName': 'string',
            'explanation': 'string',
            'context': {
                'spanContext': {
                    'sessionId': 'string',
                    'traceId': 'string',
                    'spanId': 'string'
                }
            },
            'value': 123.0,
            'label': 'string',
            'tokenUsage': {
                'inputTokens': 123,
                'outputTokens': 123,
                'totalTokens': 123
            },
            'errorMessage': 'string',
            'errorCode': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • evaluationResults (list) --

      The detailed evaluation results containing scores, explanations, and metadata. Includes the evaluator information, numerical or categorical ratings based on the evaluator's rating scale, and token usage statistics for the evaluation process.

      • (dict) --

        The comprehensive result of an evaluation containing the score, explanation, evaluator metadata, and execution details. Provides both quantitative ratings and qualitative insights about agent performance.

        • evaluatorArn (string) --

          The Amazon Resource Name (ARN) of the evaluator used to generate this result. For custom evaluators, this is the full ARN; for built-in evaluators, this follows the pattern Builtin.{EvaluatorName}.

        • evaluatorId (string) --

          The unique identifier of the evaluator that produced this result. This matches the evaluatorId provided in the evaluation request and can be used to identify which evaluator generated specific results.

        • evaluatorName (string) --

          The human-readable name of the evaluator used for this evaluation. For built-in evaluators, this is the descriptive name (e.g., "Helpfulness", "Correctness"); for custom evaluators, this is the user-defined name.

        • explanation (string) --

          The detailed explanation provided by the evaluator describing the reasoning behind the assigned score. This qualitative feedback helps understand why specific ratings were given and provides actionable insights for improvement.

        • context (dict) --

          The contextual information associated with this evaluation result, including span context details that identify the specific traces and sessions that were evaluated.

          • spanContext (dict) --

            The span context information that uniquely identifies the trace and span being evaluated, including session ID, trace ID, and span ID for precise targeting within the agent's execution flow.

            • sessionId (string) --

              The unique identifier of the session containing this span. Sessions represent complete conversation flows and are detected using configurable SessionTimeoutMinutes (default 15 minutes).

            • traceId (string) --

              The unique identifier of the trace containing this span. Traces represent individual request-response interactions within a session and group related spans together.

            • spanId (string) --

              The unique identifier of the specific span being referenced. Spans represent individual operations like tool calls, model invocations, or other discrete actions within the agent's execution.

        • value (float) --

          The numerical score assigned by the evaluator according to its configured rating scale. For numerical scales, this is a decimal value within the defined range. This field is not allowed for categorical scales.

        • label (string) --

          The categorical label assigned by the evaluator when using a categorical rating scale. This provides a human-readable description of the evaluation result (e.g., "Excellent", "Good", "Poor") corresponding to the numerical value. For numerical scales, this field is optional and provides a natural language explanation of what the value means (e.g., value 0.5 = "Somewhat Helpful").

        • tokenUsage (dict) --

          The token consumption statistics for this evaluation, including input tokens, output tokens, and total tokens used by the underlying language model during the evaluation process.

          • inputTokens (integer) --

            The number of tokens consumed for input processing during the evaluation. Includes tokens from the evaluation prompt, agent traces, and any additional context provided to the evaluator model.

          • outputTokens (integer) --

            The number of tokens generated by the evaluator model in its response. Includes tokens for the score, explanation, and any additional output produced during the evaluation process.

          • totalTokens (integer) --

            The total number of tokens consumed during the evaluation, calculated as the sum of input and output tokens. Used for cost calculation and rate limiting within the service limits.

        • errorMessage (string) --

          The error message describing what went wrong if the evaluation failed. Provides detailed information about evaluation failures to help diagnose and resolve issues with evaluator configuration or input data.

        • errorCode (string) --

          The error code indicating the type of failure that occurred during evaluation. Used to programmatically identify and handle different categories of evaluation errors.
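Because each entry in evaluationResults can carry either a score or an error, callers will usually want to separate the two before reporting. The sketch below works on a plain response dict; the sample values are invented, and treating the presence of errorCode or errorMessage as the failure signal is an assumption rather than documented behavior.

```python
from typing import Any, Dict, List, Tuple


def split_results(
    response: Dict[str, Any],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate scored results from failed ones (assumed failure signal: error fields)."""
    scored: List[Dict[str, Any]] = []
    failed: List[Dict[str, Any]] = []
    for result in response.get("evaluationResults", []):
        if result.get("errorCode") or result.get("errorMessage"):
            failed.append(result)
        else:
            scored.append(result)
    return scored, failed


def describe(result: Dict[str, Any]) -> str:
    """One-line summary: numeric value when present, otherwise the categorical label."""
    name = result.get("evaluatorName") or result.get("evaluatorId", "unknown")
    if "value" in result:
        label = result.get("label")
        suffix = f" ({label})" if label else ""
        return f"{name}: {result['value']:.2f}{suffix}"
    return f"{name}: {result.get('label', 'n/a')}"


# Invented response shaped like the Response Syntax above.
sample_response = {
    "evaluationResults": [
        {"evaluatorName": "Helpfulness", "value": 0.5, "label": "Somewhat Helpful",
         "tokenUsage": {"inputTokens": 900, "outputTokens": 100, "totalTokens": 1000}},
        {"evaluatorName": "Tone", "label": "Good"},
        {"evaluatorName": "Correctness", "errorCode": "ExampleError",
         "errorMessage": "illustrative failure"},
    ]
}

scored, failed = split_results(sample_response)
summaries = [describe(r) for r in scored]
total_tokens = sum(r.get("tokenUsage", {}).get("totalTokens", 0) for r in scored)
```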

GetMemoryRecord (updated)
Changes (response)
{'memoryRecord': {'metadata': {'string': {'stringValue': 'string'}}}}

Retrieves a specific memory record from an AgentCore Memory resource.

To use this operation, you must have the bedrock-agentcore:GetMemoryRecord permission.

See also: AWS API Documentation

Request Syntax

client.get_memory_record(
    memoryId='string',
    memoryRecordId='string'
)
type memoryId:

string

param memoryId:

[REQUIRED]

The identifier of the AgentCore Memory resource containing the memory record.

type memoryRecordId:

string

param memoryRecordId:

[REQUIRED]

The identifier of the memory record to retrieve.

rtype:

dict

returns:

Response Syntax

{
    'memoryRecord': {
        'memoryRecordId': 'string',
        'content': {
            'text': 'string'
        },
        'memoryStrategyId': 'string',
        'namespaces': [
            'string',
        ],
        'createdAt': datetime(2015, 1, 1),
        'metadata': {
            'string': {
                'stringValue': 'string'
            }
        }
    }
}

Response Structure

  • (dict) --

    • memoryRecord (dict) --

      The requested memory record.

      • memoryRecordId (string) --

        The unique identifier of the memory record.

      • content (dict) --

        The content of the memory record.

        • text (string) --

          The text content of the memory record.

      • memoryStrategyId (string) --

        The identifier of the memory strategy associated with this record.

      • namespaces (list) --

        The namespaces associated with this memory record. Namespaces help organize and categorize memory records.

        • (string) --

      • createdAt (datetime) --

        The timestamp when the memory record was created.

      • metadata (dict) --

        A map of metadata key-value pairs associated with a memory record.

        • (string) --

          • (dict) --

            Value associated with the eventMetadata key.

            • stringValue (string) --

              Value associated with the eventMetadata key.
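The new metadata field is a map whose values are wrapped in a {'stringValue': ...} structure, so reading it usually means unwrapping one level. A minimal sketch, assuming a response dict shaped like the Response Syntax above; flatten_metadata and the sample record are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def flatten_metadata(memory_record: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Turn {'key': {'stringValue': 'v'}} into {'key': 'v'}; missing metadata gives {}."""
    metadata = memory_record.get("metadata") or {}
    return {key: wrapped.get("stringValue") for key, wrapped in metadata.items()}


# Invented response, as might be returned by client.get_memory_record(...).
sample_response = {
    "memoryRecord": {
        "memoryRecordId": "rec-1",
        "content": {"text": "User prefers window seats."},
        "memoryStrategyId": "strategy-1",
        "namespaces": ["/users/alice"],
        "createdAt": datetime(2015, 1, 1),
        "metadata": {"topic": {"stringValue": "travel"}},
    }
}

flat = flatten_metadata(sample_response["memoryRecord"])
```

Records created before this change may have no metadata key at all, which is why the helper falls back to an empty map.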

ListMemoryRecords (updated)
Changes (response)
{'memoryRecordSummaries': {'metadata': {'string': {'stringValue': 'string'}}}}

Lists memory records in an AgentCore Memory resource based on specified criteria. We recommend using pagination to ensure that the operation returns quickly and successfully.

To use this operation, you must have the bedrock-agentcore:ListMemoryRecords permission.

See also: AWS API Documentation

Request Syntax

client.list_memory_records(
    memoryId='string',
    namespace='string',
    memoryStrategyId='string',
    maxResults=123,
    nextToken='string'
)
type memoryId:

string

param memoryId:

[REQUIRED]

The identifier of the AgentCore Memory resource for which to list memory records.

type namespace:

string

param namespace:

[REQUIRED]

The namespace to filter memory records by. Only memory records in this namespace are returned.

type memoryStrategyId:

string

param memoryStrategyId:

The memory strategy identifier to filter memory records by. If specified, only memory records with this strategy ID are returned.

type maxResults:

integer

param maxResults:

The maximum number of results to return in a single call. The default value is 20.

type nextToken:

string

param nextToken:

The token for the next set of results. Use the value returned in the previous response in the next request to retrieve the next set of results.

rtype:

dict

returns:

Response Syntax

{
    'memoryRecordSummaries': [
        {
            'memoryRecordId': 'string',
            'content': {
                'text': 'string'
            },
            'memoryStrategyId': 'string',
            'namespaces': [
                'string',
            ],
            'createdAt': datetime(2015, 1, 1),
            'score': 123.0,
            'metadata': {
                'string': {
                    'stringValue': 'string'
                }
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • memoryRecordSummaries (list) --

      The list of memory record summaries that match the specified criteria.

      • (dict) --

        Contains summary information about a memory record.

        • memoryRecordId (string) --

          The unique identifier of the memory record.

        • content (dict) --

          The content of the memory record.

          • text (string) --

            The text content of the memory record.

        • memoryStrategyId (string) --

          The identifier of the memory strategy associated with this record.

        • namespaces (list) --

          The namespaces associated with this memory record.

          • (string) --

        • createdAt (datetime) --

          The timestamp when the memory record was created.

        • score (float) --

          The relevance score of the memory record when returned as part of a search result. Higher values indicate greater relevance to the search query.

        • metadata (dict) --

          A map of metadata key-value pairs associated with a memory record.

          • (string) --

            • (dict) --

              Value associated with the eventMetadata key.

              • stringValue (string) --

                Value associated with the eventMetadata key.

    • nextToken (string) --

      The token to use in a subsequent request to get the next set of results. This value is null when there are no more results to return.
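The nextToken contract described above lends itself to a simple generator that keeps requesting pages until no token comes back. This is a sketch under stated assumptions: iter_memory_records is a hypothetical helper, and StubClient with its two pages exists only so the loop can run offline; with a real client you would pass that object instead.

```python
from typing import Any, Dict, Iterator, Optional


def iter_memory_records(client: Any, memory_id: str, namespace: str,
                        page_size: int = 20) -> Iterator[Dict[str, Any]]:
    """Yield every memory record summary, following nextToken across pages."""
    token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {
            "memoryId": memory_id,
            "namespace": namespace,
            "maxResults": page_size,
        }
        if token:
            kwargs["nextToken"] = token
        page = client.list_memory_records(**kwargs)
        yield from page.get("memoryRecordSummaries", [])
        token = page.get("nextToken")
        if not token:  # absent or empty token: no more pages
            break


class StubClient:
    """Stand-in for a real client, returning two invented pages."""

    def __init__(self) -> None:
        self._pages = {
            None: {"memoryRecordSummaries": [{"memoryRecordId": "rec-1"}],
                   "nextToken": "t1"},
            "t1": {"memoryRecordSummaries": [{"memoryRecordId": "rec-2"}]},
        }

    def list_memory_records(self, **kwargs: Any) -> Dict[str, Any]:
        return self._pages[kwargs.get("nextToken")]


record_ids = [r["memoryRecordId"]
              for r in iter_memory_records(StubClient(), "mem-1", "/users/alice")]
```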

RetrieveMemoryRecords (updated)
Changes (request, response)
Request
{'searchCriteria': {'metadataFilters': [{'left': {'metadataKey': 'string'},
                                         'operator': 'EQUALS_TO | EXISTS | NOT_EXISTS',
                                         'right': {'metadataValue': {'stringValue': 'string'}}}]}}
Response
{'memoryRecordSummaries': {'metadata': {'string': {'stringValue': 'string'}}}}

Searches for and retrieves memory records from an AgentCore Memory resource based on specified search criteria. We recommend using pagination to ensure that the operation returns quickly and successfully.

To use this operation, you must have the bedrock-agentcore:RetrieveMemoryRecords permission.

See also: AWS API Documentation

Request Syntax

client.retrieve_memory_records(
    memoryId='string',
    namespace='string',
    searchCriteria={
        'searchQuery': 'string',
        'memoryStrategyId': 'string',
        'topK': 123,
        'metadataFilters': [
            {
                'left': {
                    'metadataKey': 'string'
                },
                'operator': 'EQUALS_TO'|'EXISTS'|'NOT_EXISTS',
                'right': {
                    'metadataValue': {
                        'stringValue': 'string'
                    }
                }
            },
        ]
    },
    nextToken='string',
    maxResults=123
)
type memoryId:

string

param memoryId:

[REQUIRED]

The identifier of the AgentCore Memory resource from which to retrieve memory records.

type namespace:

string

param namespace:

[REQUIRED]

The namespace to filter memory records by.

type searchCriteria:

dict

param searchCriteria:

[REQUIRED]

The search criteria to use for finding relevant memory records. This includes the search query, memory strategy ID, and other search parameters.

  • searchQuery (string) -- [REQUIRED]

    The search query to use for finding relevant memory records.

  • memoryStrategyId (string) --

    The memory strategy identifier to filter memory records by.

  • topK (integer) --

    The maximum number of top-scoring memory records to return. This value is used for semantic search ranking.

  • metadataFilters (list) --

    Filters to apply to metadata associated with a memory.

    • (dict) --

      Filters to apply to metadata associated with a memory. Specify the metadata key and value in the left and right fields and use the operator field to define the relationship to match.

      • left (dict) -- [REQUIRED]

        Left expression of the event metadata filter.

        • metadataKey (string) --

          Key associated with the metadata in an event.

      • operator (string) -- [REQUIRED]

        The relationship between the metadata key and value to match when applying the metadata filter.

      • right (dict) --

        Right expression of the eventMetadata filter.

        • metadataValue (dict) --

          Value associated with the key in eventMetadata.

          • stringValue (string) --

            Value associated with the eventMetadata key.
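Each entry in metadataFilters pairs a key (left) with an operator and, optionally, a value (right). The helper below is a hypothetical sketch of building such entries. Two points are assumptions, since the text above only marks right as not required: that EQUALS_TO needs a value, and that EXISTS and NOT_EXISTS are sent without one.

```python
from typing import Any, Dict, Optional

ALLOWED_OPERATORS = {"EQUALS_TO", "EXISTS", "NOT_EXISTS"}


def metadata_filter(key: str, operator: str = "EQUALS_TO",
                    value: Optional[str] = None) -> Dict[str, Any]:
    """Build one metadataFilters entry (hypothetical helper)."""
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    if operator == "EQUALS_TO" and value is None:
        # Assumption: an equality test is meaningless without a right-hand value.
        raise ValueError("EQUALS_TO requires a value")
    entry: Dict[str, Any] = {"left": {"metadataKey": key}, "operator": operator}
    if operator == "EQUALS_TO":
        entry["right"] = {"metadataValue": {"stringValue": value}}
    return entry


# Invented keys and values for illustration.
filters = [
    metadata_filter("topic", "EQUALS_TO", "travel"),
    metadata_filter("reviewed", "EXISTS"),
]
```

The resulting list can be placed under searchCriteria['metadataFilters'] in a retrieve_memory_records call.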

type nextToken:

string

param nextToken:

The token for the next set of results. Use the value returned in the previous response in the next request to retrieve the next set of results.

type maxResults:

integer

param maxResults:

The maximum number of results to return in a single call. The default value is 20.

rtype:

dict

returns:

Response Syntax

{
    'memoryRecordSummaries': [
        {
            'memoryRecordId': 'string',
            'content': {
                'text': 'string'
            },
            'memoryStrategyId': 'string',
            'namespaces': [
                'string',
            ],
            'createdAt': datetime(2015, 1, 1),
            'score': 123.0,
            'metadata': {
                'string': {
                    'stringValue': 'string'
                }
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • memoryRecordSummaries (list) --

      The list of memory record summaries that match the search criteria, ordered by relevance.

      • (dict) --

        Contains summary information about a memory record.

        • memoryRecordId (string) --

          The unique identifier of the memory record.

        • content (dict) --

          The content of the memory record.

          • text (string) --

            The text content of the memory record.

        • memoryStrategyId (string) --

          The identifier of the memory strategy associated with this record.

        • namespaces (list) --

          The namespaces associated with this memory record.

          • (string) --

        • createdAt (datetime) --

          The timestamp when the memory record was created.

        • score (float) --

          The relevance score of the memory record when returned as part of a search result. Higher values indicate greater relevance to the search query.

        • metadata (dict) --

          A map of metadata key-value pairs associated with a memory record.

          • (string) --

            • (dict) --

              Value associated with the eventMetadata key.

              • stringValue (string) --

                Value associated with the eventMetadata key.

    • nextToken (string) --

      The token to use in a subsequent request to get the next set of results. This value is null when there are no more results to return.
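Putting the request and response shapes together, a search call can be wrapped so that callers get back compact records with unwrapped metadata. This is a sketch only: search_memory and StubClient are hypothetical, the sample data is invented, and the min_score cutoff is an illustrative client-side choice because the score range is not stated above.

```python
from typing import Any, Dict, List, Optional


def search_memory(client: Any, memory_id: str, namespace: str, query: str,
                  top_k: int = 5,
                  metadata_filters: Optional[List[Dict[str, Any]]] = None,
                  min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Run one retrieve_memory_records call and return compact hits."""
    criteria: Dict[str, Any] = {"searchQuery": query, "topK": top_k}
    if metadata_filters:
        criteria["metadataFilters"] = metadata_filters
    response = client.retrieve_memory_records(
        memoryId=memory_id, namespace=namespace, searchCriteria=criteria)
    hits = []
    for summary in response.get("memoryRecordSummaries", []):
        score = summary.get("score", 0.0)
        if score < min_score:  # client-side cutoff, not an API feature
            continue
        hits.append({
            "id": summary["memoryRecordId"],
            "score": score,
            "text": summary.get("content", {}).get("text", ""),
            "metadata": {k: v.get("stringValue")
                         for k, v in (summary.get("metadata") or {}).items()},
        })
    return hits


class StubClient:
    """Stand-in for a real client; records the request and returns invented data."""

    def retrieve_memory_records(self, **kwargs: Any) -> Dict[str, Any]:
        self.last_request = kwargs
        return {"memoryRecordSummaries": [
            {"memoryRecordId": "rec-1", "score": 0.9,
             "content": {"text": "Prefers window seats."},
             "metadata": {"topic": {"stringValue": "travel"}}},
            {"memoryRecordId": "rec-2", "score": 0.2,
             "content": {"text": "Asked about baggage fees."}},
        ]}


stub = StubClient()
hits = search_memory(stub, "mem-1", "/users/alice", "seat preference", min_score=0.5)
```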