2024/12/02 - Amazon Bedrock - 3 updated api methods
Changes Add support for Knowledge Base Evaluations & LLM as a judge
{'applicationType': 'ModelEvaluation | RagEvaluation', 'evaluationConfig': {'automated': {'evaluatorModelConfig': {'bedrockEvaluatorModels': [{'modelIdentifier': 'string'}]}}}, 'inferenceConfig': {'ragConfigs': [{'knowledgeBaseConfig': {'retrieveAndGenerateConfig': {'externalSourcesConfiguration': {'generationConfiguration': {'additionalModelRequestFields': {'string': {}}, 'guardrailConfiguration': {'guardrailId': 'string', 'guardrailVersion': 'string'}, 'kbInferenceConfig': {'textInferenceConfig': {'maxTokens': 'integer', 'stopSequences': ['string'], 'temperature': 'float', 'topP': 'float'}}, 'promptTemplate': {'textPromptTemplate': 'string'}}, 'modelArn': 'string', 'sources': [{'byteContent': {'contentType': 'string', 'data': 'blob', 'identifier': 'string'}, 's3Location': {'uri': 'string'}, 'sourceType': 'S3 | BYTE_CONTENT'}]}, 'knowledgeBaseConfiguration': {'generationConfiguration': {'additionalModelRequestFields': {'string': {}}, 'guardrailConfiguration': {'guardrailId': 'string', 'guardrailVersion': 'string'}, 'kbInferenceConfig': {'textInferenceConfig': {'maxTokens': 'integer', 'stopSequences': ['string'], 'temperature': 'float', 'topP': 'float'}}, 'promptTemplate': {'textPromptTemplate': 'string'}}, 'knowledgeBaseId': 'string', 'modelArn': 'string', 'orchestrationConfiguration': {'queryTransformationConfiguration': {'type': 'QUERY_DECOMPOSITION'}}, 'retrievalConfiguration': {'vectorSearchConfiguration': {'filter': {'andAll': [()], 'equals': {'key': 'string', 'value': {}}, 'greaterThan': {'key': 'string', 'value': {}}, 'greaterThanOrEquals': {'key': 'string', 'value': {}}, 'in': {'key': 'string', 'value': {}}, 'lessThan': {'key': 'string', 'value': {}}, 'lessThanOrEquals': {'key': 'string', 'value': {}}, 'listContains': {'key': 'string', 'value': {}}, 'notEquals': {'key': 'string', 'value': {}}, 'notIn': {'key': 'string', 'value': {}}, 'orAll': [()], 'startsWith': {'key': 'string', 'value': {}}, 'stringContains': {'key': 'string', 'value': {}}}, 'numberOfResults': 'integer', 'overrideSearchType': 'HYBRID | SEMANTIC'}}}, 'type': 'KNOWLEDGE_BASE | EXTERNAL_SOURCES'}, 'retrieveConfig': {'knowledgeBaseId': 'string', 'knowledgeBaseRetrievalConfiguration': {'vectorSearchConfiguration': {'filter': {'andAll': [()], 'equals': {'key': 'string', 'value': {}}, 'greaterThan': {'key': 'string', 'value': {}}, 'greaterThanOrEquals': {'key': 'string', 'value': {}}, 'in': {'key': 'string', 'value': {}}, 'lessThan': {'key': 'string', 'value': {}}, 'lessThanOrEquals': {'key': 'string', 'value': {}}, 'listContains': {'key': 'string', 'value': {}}, 'notEquals': {'key': 'string', 'value': {}}, 'notIn': {'key': 'string', 'value': {}}, 'orAll': [()], 'startsWith': {'key': 'string', 'value': {}}, 'stringContains': {'key': 'string', 'value': {}}}, 'numberOfResults': 'integer', 'overrideSearchType': 'HYBRID | SEMANTIC'}}}}}]}}
Creates an evaluation job.
See also: AWS API Documentation
Request Syntax
client.create_evaluation_job( jobName='string', jobDescription='string', clientRequestToken='string', roleArn='string', customerEncryptionKeyId='string', jobTags=[ { 'key': 'string', 'value': 'string' }, ], applicationType='ModelEvaluation'|'RagEvaluation', evaluationConfig={ 'automated': { 'datasetMetricConfigs': [ { 'taskType': 'Summarization'|'Classification'|'QuestionAndAnswer'|'Generation'|'Custom', 'dataset': { 'name': 'string', 'datasetLocation': { 's3Uri': 'string' } }, 'metricNames': [ 'string', ] }, ], 'evaluatorModelConfig': { 'bedrockEvaluatorModels': [ { 'modelIdentifier': 'string' }, ] } }, 'human': { 'humanWorkflowConfig': { 'flowDefinitionArn': 'string', 'instructions': 'string' }, 'customMetrics': [ { 'name': 'string', 'description': 'string', 'ratingMethod': 'string' }, ], 'datasetMetricConfigs': [ { 'taskType': 'Summarization'|'Classification'|'QuestionAndAnswer'|'Generation'|'Custom', 'dataset': { 'name': 'string', 'datasetLocation': { 's3Uri': 'string' } }, 'metricNames': [ 'string', ] }, ] } }, inferenceConfig={ 'models': [ { 'bedrockModel': { 'modelIdentifier': 'string', 'inferenceParams': 'string' } }, ], 'ragConfigs': [ { 'knowledgeBaseConfig': { 'retrieveConfig': { 'knowledgeBaseId': 'string', 'knowledgeBaseRetrievalConfiguration': { 'vectorSearchConfiguration': { 'numberOfResults': 123, 'overrideSearchType': 'HYBRID'|'SEMANTIC', 'filter': { 'equals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'in': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 
'notIn': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'startsWith': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'listContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'stringContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'andAll': [ {'... recursive ...'}, ], 'orAll': [ {'... recursive ...'}, ] } } } }, 'retrieveAndGenerateConfig': { 'type': 'KNOWLEDGE_BASE'|'EXTERNAL_SOURCES', 'knowledgeBaseConfiguration': { 'knowledgeBaseId': 'string', 'modelArn': 'string', 'retrievalConfiguration': { 'vectorSearchConfiguration': { 'numberOfResults': 123, 'overrideSearchType': 'HYBRID'|'SEMANTIC', 'filter': { 'equals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'in': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notIn': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'startsWith': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'listContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'stringContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'andAll': [ {'... recursive ...'}, ], 'orAll': [ {'... 
recursive ...'}, ] } } }, 'generationConfiguration': { 'promptTemplate': { 'textPromptTemplate': 'string' }, 'guardrailConfiguration': { 'guardrailId': 'string', 'guardrailVersion': 'string' }, 'kbInferenceConfig': { 'textInferenceConfig': { 'temperature': ..., 'topP': ..., 'maxTokens': 123, 'stopSequences': [ 'string', ] } }, 'additionalModelRequestFields': { 'string': {...}|[...]|123|123.4|'string'|True|None } }, 'orchestrationConfiguration': { 'queryTransformationConfiguration': { 'type': 'QUERY_DECOMPOSITION' } } }, 'externalSourcesConfiguration': { 'modelArn': 'string', 'sources': [ { 'sourceType': 'S3'|'BYTE_CONTENT', 's3Location': { 'uri': 'string' }, 'byteContent': { 'identifier': 'string', 'contentType': 'string', 'data': b'bytes' } }, ], 'generationConfiguration': { 'promptTemplate': { 'textPromptTemplate': 'string' }, 'guardrailConfiguration': { 'guardrailId': 'string', 'guardrailVersion': 'string' }, 'kbInferenceConfig': { 'textInferenceConfig': { 'temperature': ..., 'topP': ..., 'maxTokens': 123, 'stopSequences': [ 'string', ] } }, 'additionalModelRequestFields': { 'string': {...}|[...]|123|123.4|'string'|True|None } } } } } }, ] }, outputDataConfig={ 's3Uri': 'string' } )
jobName (string) -- [REQUIRED]
A name for the evaluation job. Names must be unique within your Amazon Web Services account and within your account's Amazon Web Services Region.
jobDescription (string) --
A description of the evaluation job.
clientRequestToken (string) --
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.
This field is autopopulated if not provided.
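A minimal sketch of how a caller might use this token: generate it once per logical request and reuse the same value on every retry, so a retried call is recognized as a duplicate. The helper name and the job name are hypothetical, and using a UUID string as the token is an assumption rather than something this page specifies.

```python
import uuid


def new_client_request_token() -> str:
    """Return a fresh random string to pass as clientRequestToken (hypothetical helper)."""
    return str(uuid.uuid4())


# Create the token once for the logical request...
token = new_client_request_token()
request = {"jobName": "my-eval-job", "clientRequestToken": token}  # hypothetical values

# ...and send the identical token again if the call has to be retried.
retry_request = dict(request)
```

Generating a new token for each retry would defeat the purpose, because each call would then look like a distinct request.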
roleArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an IAM service role that Amazon Bedrock can assume to perform tasks on your behalf. To learn more about the required permissions, see Required permissions for model evaluations.
customerEncryptionKeyId (string) --
The Amazon Resource Name (ARN) of the customer managed encryption key used to encrypt your evaluation job.
jobTags (list) --
Tags to attach to the model evaluation job.
(dict) --
Definition of the key/value pair for a tag.
key (string) -- [REQUIRED]
Key for the tag.
value (string) -- [REQUIRED]
Value for the tag.
applicationType (string) --
Specifies whether the evaluation job is for evaluating a model or evaluating a knowledge base (retrieval and response generation).
evaluationConfig (dict) -- [REQUIRED]
Contains the configuration details of either an automated or human-based evaluation job.
automated (dict) --
Contains the configuration details of an automated evaluation job that computes metrics.
datasetMetricConfigs (list) -- [REQUIRED]
Configuration details of the prompt datasets and metrics you want to use for your evaluation job.
(dict) --
Defines the prompt datasets, built-in metric names and custom metric names, and the task type.
taskType (string) -- [REQUIRED]
The type of task you want to evaluate for your evaluation job. This applies only to model evaluation jobs and is ignored for knowledge base evaluation jobs.
dataset (dict) -- [REQUIRED]
Specifies the prompt dataset.
name (string) -- [REQUIRED]
Used to specify supported built-in prompt datasets. Valid values are Builtin.Bold, Builtin.BoolQ, Builtin.NaturalQuestions, Builtin.Gigaword, Builtin.RealToxicityPrompts, Builtin.TriviaQA, Builtin.T-Rex, Builtin.WomensEcommerceClothingReviews and Builtin.Wikitext2.
datasetLocation (dict) --
For custom prompt datasets, you must specify the location in Amazon S3 where the prompt dataset is saved.
s3Uri (string) --
The S3 URI of the S3 bucket specified in the job.
metricNames (list) -- [REQUIRED]
The names of the metrics you want to use for your evaluation job.
For knowledge base evaluation jobs that evaluate retrieval only, valid values are "Builtin.ContextRelevance" and "Builtin.ContextCoverage".
For knowledge base evaluation jobs that evaluate retrieval with response generation, valid values are "Builtin.Correctness", "Builtin.Completeness", "Builtin.Helpfulness", "Builtin.LogicalCoherence", "Builtin.Faithfulness", "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For automated model evaluation jobs, valid values are "Builtin.Accuracy", "Builtin.Robustness", and "Builtin.Toxicity". In model evaluation jobs that use an LLM as a judge, you can specify "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness", "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance", "Builtin.FollowingInstructions", and "Builtin.ProfessionalStyleAndTone". Only in model evaluation jobs that use an LLM as a judge, you can also specify the following responsible AI metrics: "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For human-based model evaluation jobs, the list of strings must match the name parameter specified in HumanEvaluationCustomMetric.
(string) --
evaluatorModelConfig (dict) --
Contains the evaluator model configuration details. EvaluatorModelConfig is required for evaluation jobs that use a knowledge base and for model evaluation jobs that use a model as a judge. This model computes all evaluation-related metrics.
bedrockEvaluatorModels (list) --
The evaluator model used in a knowledge base evaluation job or in a model evaluation job that uses a model as a judge. This model computes all evaluation-related metrics.
(dict) --
The evaluator model used in a knowledge base evaluation job or in a model evaluation job that uses a model as a judge. This model computes all evaluation-related metrics.
modelIdentifier (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the evaluator model used in a knowledge base evaluation job or in a model evaluation job that uses a model as a judge.
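The automated configuration described above can be sketched as a plain dictionary builder. This is only an illustration for a retrieval-only knowledge base evaluation: the function name, dataset name, S3 URI, and evaluator model ARN are hypothetical placeholders, and the choice of "QuestionAndAnswer" for taskType is arbitrary because the field is described as ignored for knowledge base jobs.

```python
def automated_kb_eval_config(dataset_name, dataset_s3_uri, metric_names, evaluator_model_arn):
    """Build an `evaluationConfig` dict for an automated knowledge base evaluation job."""
    return {
        "automated": {
            "datasetMetricConfigs": [
                {
                    # Marked [REQUIRED], but described as ignored for knowledge base jobs.
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": dataset_name,
                        "datasetLocation": {"s3Uri": dataset_s3_uri},
                    },
                    "metricNames": list(metric_names),
                }
            ],
            # The evaluator (judge) model that computes the metrics.
            "evaluatorModelConfig": {
                "bedrockEvaluatorModels": [{"modelIdentifier": evaluator_model_arn}]
            },
        }
    }


evaluation_config = automated_kb_eval_config(
    dataset_name="my-custom-dataset",  # hypothetical
    dataset_s3_uri="s3://amzn-s3-demo-bucket/prompts.jsonl",  # hypothetical
    metric_names=["Builtin.ContextRelevance"],
    evaluator_model_arn="arn:aws:bedrock:us-east-1::foundation-model/EXAMPLE-EVALUATOR",  # hypothetical
)
```

The resulting dictionary would be passed as the `evaluationConfig` argument of `client.create_evaluation_job(...)`.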
human (dict) --
Contains the configuration details of an evaluation job that uses human workers.
humanWorkflowConfig (dict) --
The parameters of the human workflow.
flowDefinitionArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the flow definition.
instructions (string) --
Instructions for the flow definition.
customMetrics (list) --
A HumanEvaluationCustomMetric object. It contains the names of the metrics, how the metrics are to be evaluated, and an optional description.
(dict) --
In a model evaluation job that uses human workers, you must define the name of the metric and how you want that metric rated (ratingMethod). You can also provide an optional description of the metric.
name (string) -- [REQUIRED]
The name of the metric. Your human evaluators will see this name in the evaluation UI.
description (string) --
An optional description of the metric. Use this parameter to provide more details about the metric.
ratingMethod (string) -- [REQUIRED]
Choose how you want your human workers to evaluate your model. Valid values for rating methods are ThumbsUpDown, IndividualLikertScale, ComparisonLikertScale, ComparisonChoice, and ComparisonRank.
datasetMetricConfigs (list) -- [REQUIRED]
Specifies the metrics, task, and prompt dataset to be used in your model evaluation job.
(dict) --
Defines the prompt datasets, built-in metric names and custom metric names, and the task type.
taskType (string) -- [REQUIRED]
The type of task you want to evaluate for your evaluation job. This applies only to model evaluation jobs and is ignored for knowledge base evaluation jobs.
dataset (dict) -- [REQUIRED]
Specifies the prompt dataset.
name (string) -- [REQUIRED]
Used to specify supported built-in prompt datasets. Valid values are Builtin.Bold, Builtin.BoolQ, Builtin.NaturalQuestions, Builtin.Gigaword, Builtin.RealToxicityPrompts, Builtin.TriviaQA, Builtin.T-Rex, Builtin.WomensEcommerceClothingReviews and Builtin.Wikitext2.
datasetLocation (dict) --
For custom prompt datasets, you must specify the location in Amazon S3 where the prompt dataset is saved.
s3Uri (string) --
The S3 URI of the S3 bucket specified in the job.
metricNames (list) -- [REQUIRED]
The names of the metrics you want to use for your evaluation job.
For knowledge base evaluation jobs that evaluate retrieval only, valid values are "Builtin.ContextRelevance" and "Builtin.ContextCoverage".
For knowledge base evaluation jobs that evaluate retrieval with response generation, valid values are "Builtin.Correctness", "Builtin.Completeness", "Builtin.Helpfulness", "Builtin.LogicalCoherence", "Builtin.Faithfulness", "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For automated model evaluation jobs, valid values are "Builtin.Accuracy", "Builtin.Robustness", and "Builtin.Toxicity". In model evaluation jobs that use an LLM as a judge, you can specify "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness", "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance", "Builtin.FollowingInstructions", and "Builtin.ProfessionalStyleAndTone". Only in model evaluation jobs that use an LLM as a judge, you can also specify the following responsible AI metrics: "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For human-based model evaluation jobs, the list of strings must match the name parameter specified in HumanEvaluationCustomMetric.
(string) --
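For human-based jobs, the requirement that metricNames match the names in customMetrics can be checked locally before submitting a request. The following is a rough sketch of such a check, not part of the SDK: the function name, metric names, dataset name, and flow definition ARN are hypothetical, while the rating methods are the ones listed above.

```python
VALID_RATING_METHODS = {
    "ThumbsUpDown",
    "IndividualLikertScale",
    "ComparisonLikertScale",
    "ComparisonChoice",
    "ComparisonRank",
}


def check_human_config(human):
    """Return a list of problems found in a `human` evaluation config dict."""
    problems = []
    custom = human.get("customMetrics", [])
    defined = {metric["name"] for metric in custom}
    for metric in custom:
        if metric["ratingMethod"] not in VALID_RATING_METHODS:
            problems.append(f"unknown ratingMethod: {metric['ratingMethod']}")
    for config in human["datasetMetricConfigs"]:
        for name in config["metricNames"]:
            if name not in defined:
                problems.append(f"metricNames entry not defined in customMetrics: {name}")
    return problems


human_config = {
    "humanWorkflowConfig": {
        "flowDefinitionArn": "arn:aws:sagemaker:us-east-1:111122223333:flow-definition/example",  # hypothetical
    },
    "customMetrics": [{"name": "Helpfulness", "ratingMethod": "ThumbsUpDown"}],
    "datasetMetricConfigs": [
        {
            "taskType": "Generation",
            "dataset": {"name": "my-prompts", "datasetLocation": {"s3Uri": "s3://amzn-s3-demo-bucket/p.jsonl"}},
            "metricNames": ["Helpfulness"],
        }
    ],
}
problems = check_human_config(human_config)  # empty list when the names line up
```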
inferenceConfig (dict) -- [REQUIRED]
Contains the configuration details of the inference model for the evaluation job.
For model evaluation jobs, automated jobs support a single model or inference profile, and jobs that use human workers support two models or inference profiles.
models (list) --
Specifies the inference models.
(dict) --
Defines the models used in the model evaluation job.
bedrockModel (dict) --
Defines the Amazon Bedrock model or inference profile and the inference parameters you want to use.
modelIdentifier (string) -- [REQUIRED]
The ARN of the Amazon Bedrock model or inference profile specified.
inferenceParams (string) --
Each Amazon Bedrock model supports different inference parameters that change how the model behaves during inference.
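Because inferenceParams is typed as a string rather than a dict, one plausible way to supply it is to serialize the model-specific parameters to JSON. That encoding is an assumption here (this page only states the type), and the helper name, model ARN, and parameter names below are hypothetical; the parameters a given model accepts vary by model.

```python
import json


def bedrock_model_entry(model_identifier, inference_params=None):
    """Build one entry of inferenceConfig['models'] (hypothetical helper)."""
    entry = {"modelIdentifier": model_identifier}
    if inference_params is not None:
        # Assumption: the service expects the parameters as a JSON-encoded string.
        entry["inferenceParams"] = json.dumps(inference_params)
    return {"bedrockModel": entry}


inference_config = {
    "models": [
        bedrock_model_entry(
            "arn:aws:bedrock:us-east-1::foundation-model/EXAMPLE-MODEL",  # hypothetical
            {"temperature": 0.2, "maxTokens": 512},  # hypothetical, model-specific names
        )
    ]
}
```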
ragConfigs (list) --
Contains the configuration details of the inference for a knowledge base evaluation job, including either the retrieval only configuration or the retrieval with response generation configuration.
(dict) --
Contains configuration details for retrieval of information and response generation.
knowledgeBaseConfig (dict) --
Contains configuration details for knowledge base retrieval and response generation.
retrieveConfig (dict) --
Contains configuration details for retrieving information from a knowledge base.
knowledgeBaseId (string) -- [REQUIRED]
The unique identifier of the knowledge base.
knowledgeBaseRetrievalConfiguration (dict) -- [REQUIRED]
Contains configuration details for knowledge base retrieval.
vectorSearchConfiguration (dict) -- [REQUIRED]
Contains configuration details for returning the results from the vector search.
numberOfResults (integer) --
The number of text chunks to retrieve; the number of results to return.
overrideSearchType (string) --
By default, Amazon Bedrock decides a search strategy for you. If you're using an Amazon OpenSearch Serverless vector store that contains a filterable text field, you can specify whether to query the knowledge base with a HYBRID search using both vector embeddings and raw text, or SEMANTIC search using only vector embeddings. For other vector store configurations, only SEMANTIC search is available.
filter (dict) --
Specifies the filters to use on the metadata fields in the knowledge base data sources before returning results.
equals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value matches the value in this object.
The following example would return data sources with an animal attribute whose value is 'cat': "equals": { "key": "animal", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
notEquals (dict) --
Knowledge base data sources that contain a metadata attribute whose name matches the key and whose value doesn't match the value in this object are returned.
The following example would return data sources that don't contain an animal attribute whose value is 'cat': "notEquals": { "key": "animal", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
greaterThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than the value in this object.
The following example would return data sources with a year attribute whose value is greater than 1989: "greaterThan": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
greaterThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is greater than or equal to 1989: "greaterThanOrEquals": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
lessThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than the value in this object.
The following example would return data sources with a year attribute whose value is less than 1989: "lessThan": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
lessThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is less than or equal to 1989: "lessThanOrEquals": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
in (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is in the list specified in the value in this object.
The following example would return data sources with an animal attribute that is either 'cat' or 'dog': "in": { "key": "animal", "value": ["cat", "dog"] }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
notIn (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value isn't in the list specified in the value in this object.
The following example would return data sources whose animal attribute is neither 'cat' nor 'dog': "notIn": { "key": "animal", "value": ["cat", "dog"] }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
startsWith (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value starts with the value in this object. This filter is currently only supported for Amazon OpenSearch Serverless vector stores.
The following example would return data sources whose animal attribute starts with 'ca' (for example, 'cat' or 'camel'): "startsWith": { "key": "animal", "value": "ca" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
listContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is a list that contains the value as one of its members.
The following example would return data sources with an animals attribute that is a list containing a cat member (for example, ["dog", "cat"]): "listContains": { "key": "animals", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
stringContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is one of the following:
A string that contains the value as a substring. The following example would return data sources with an animal attribute that contains the substring 'at' (for example, 'cat'): "stringContains": { "key": "animal", "value": "at" }
A list with a member that contains the value as a substring. The following example would return data sources with an animals attribute that is a list containing a member that contains the substring 'at' (for example, ["dog", "cat"]): "stringContains": { "key": "animals", "value": "at" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
andAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill all the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
orAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill at least one of the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
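To make the filter semantics above concrete, here is a small local sketch that evaluates a filter against one document's metadata dictionary. It is an illustration of the descriptions on this page, not how the service is implemented. The function name and sample values are hypothetical, and two assumptions are baked in: each filter dict carries exactly one operator, and a missing attribute never matches.

```python
import operator

# Comparison operators that map directly onto Python's operator module.
_COMPARE = {
    "equals": operator.eq,
    "notEquals": operator.ne,
    "greaterThan": operator.gt,
    "greaterThanOrEquals": operator.ge,
    "lessThan": operator.lt,
    "lessThanOrEquals": operator.le,
}


def matches(flt, metadata):
    """Return True if `metadata` (a plain dict) satisfies the filter `flt`."""
    (op, arg), = flt.items()  # assumes exactly one operator per filter dict
    if op == "andAll":
        return all(matches(sub, metadata) for sub in arg)
    if op == "orAll":
        return any(matches(sub, metadata) for sub in arg)
    key, value = arg["key"], arg["value"]
    if key not in metadata:
        return False  # assumption: a missing attribute never matches
    actual = metadata[key]
    if op in _COMPARE:
        return _COMPARE[op](actual, value)
    if op == "in":
        return actual in value
    if op == "notIn":
        return actual not in value
    if op == "startsWith":
        return isinstance(actual, str) and actual.startswith(value)
    if op == "listContains":
        return isinstance(actual, list) and value in actual
    if op == "stringContains":
        if isinstance(actual, str):
            return value in actual
        return isinstance(actual, list) and any(
            isinstance(member, str) and value in member for member in actual
        )
    raise ValueError(f"unknown filter operator: {op}")


# A combined filter, embedded in a retrieveConfig dict as it would be sent.
recent_cats = {
    "andAll": [
        {"equals": {"key": "animal", "value": "cat"}},
        {"greaterThan": {"key": "year", "value": 1989}},
    ]
}
retrieve_config = {
    "knowledgeBaseId": "EXAMPLEKBID",  # hypothetical
    "knowledgeBaseRetrievalConfiguration": {
        "vectorSearchConfiguration": {"numberOfResults": 5, "filter": recent_cats}
    },
}
doc_metadata = {"animal": "cat", "year": 1995, "animals": ["dog", "cat"]}
is_match = matches(recent_cats, doc_metadata)
```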
retrieveAndGenerateConfig (dict) --
Contains configuration details for retrieving information from a knowledge base and generating responses.
type (string) -- [REQUIRED]
The type of resource that contains your data for retrieving information and generating responses.
If you choose to use EXTERNAL_SOURCES, then currently only Claude 3 Sonnet models for knowledge bases are supported.
knowledgeBaseConfiguration (dict) --
Contains configuration details for the knowledge base retrieval and response generation.
knowledgeBaseId (string) -- [REQUIRED]
The unique identifier of the knowledge base.
modelArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the foundation model or inference profile used to generate responses.
retrievalConfiguration (dict) --
Contains configuration details for retrieving text chunks.
vectorSearchConfiguration (dict) -- [REQUIRED]
Contains configuration details for returning the results from the vector search.
numberOfResults (integer) --
The number of text chunks to retrieve; the number of results to return.
overrideSearchType (string) --
By default, Amazon Bedrock decides a search strategy for you. If you're using an Amazon OpenSearch Serverless vector store that contains a filterable text field, you can specify whether to query the knowledge base with a HYBRID search using both vector embeddings and raw text, or SEMANTIC search using only vector embeddings. For other vector store configurations, only SEMANTIC search is available.
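As a sketch of the options described so far, the helper below assembles a retrieveAndGenerateConfig of type KNOWLEDGE_BASE and leaves overrideSearchType out unless the caller asks for one, so that Amazon Bedrock chooses the search strategy by default. The helper name, knowledge base ID, and model ARN are hypothetical.

```python
SEARCH_TYPES = ("HYBRID", "SEMANTIC")


def kb_retrieve_and_generate_config(kb_id, model_arn, number_of_results=None, search_type=None):
    """Build a retrieveAndGenerateConfig dict that targets a knowledge base."""
    if search_type is not None and search_type not in SEARCH_TYPES:
        raise ValueError(f"overrideSearchType must be one of {SEARCH_TYPES}")
    vector_search = {}
    if number_of_results is not None:
        vector_search["numberOfResults"] = number_of_results
    if search_type is not None:
        # Only set when the caller wants to override the default strategy.
        vector_search["overrideSearchType"] = search_type
    return {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": kb_id,
            "modelArn": model_arn,
            "retrievalConfiguration": {"vectorSearchConfiguration": vector_search},
        },
    }


rag_config = kb_retrieve_and_generate_config(
    kb_id="EXAMPLEKBID",  # hypothetical
    model_arn="arn:aws:bedrock:us-east-1::foundation-model/EXAMPLE-MODEL",  # hypothetical
    number_of_results=5,
    search_type="HYBRID",
)
```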
filter (dict) --
Specifies the filters to use on the metadata fields in the knowledge base data sources before returning results.
equals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value matches the value in this object.
The following example would return data sources with an animal attribute whose value is 'cat': "equals": { "key": "animal", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
notEquals (dict) --
Knowledge base data sources that contain a metadata attribute whose name matches the key and whose value doesn't match the value in this object are returned.
The following example would return data sources that don't contain an animal attribute whose value is 'cat': "notEquals": { "key": "animal", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
greaterThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than the value in this object.
The following example would return data sources with a year attribute whose value is greater than 1989: "greaterThan": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
greaterThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is greater than or equal to 1989: "greaterThanOrEquals": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
lessThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than the value in this object.
The following example would return data sources with a year attribute whose value is less than 1989: "lessThan": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
lessThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is less than or equal to 1989: "lessThanOrEquals": { "key": "year", "value": 1989 }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
in (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is in the list specified in the value in this object.
The following example would return data sources with an animal attribute that is either 'cat' or 'dog': "in": { "key": "animal", "value": ["cat", "dog"] }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
notIn (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value isn't in the list specified in the value in this object.
The following example would return data sources whose animal attribute is neither 'cat' nor 'dog': "notIn": { "key": "animal", "value": ["cat", "dog"] }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
startsWith (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value starts with the value in this object. This filter is currently only supported for Amazon OpenSearch Serverless vector stores.
The following example would return data sources whose animal attribute starts with 'ca' (for example, 'cat' or 'camel'): "startsWith": { "key": "animal", "value": "ca" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
listContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is a list that contains the value as one of its members.
The following example would return data sources with an animals attribute that is a list containing a cat member (for example, ["dog", "cat"]): "listContains": { "key": "animals", "value": "cat" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
stringContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is one of the following:
A string that contains the value as a substring. The following example would return data sources with an animal attribute that contains the substring at (for example, 'cat'): "stringContains": { "key": "animal", "value": "at" }
A list with a member that contains the value as a substring. The following example would return data sources with an animals attribute that is a list containing a member that contains the substring at (for example, ["dog", "cat"]): "stringContains": { "key": "animals", "value": "at" }
key (string) -- [REQUIRED]
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) -- [REQUIRED]
The value of the metadata attribute/field.
andAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill all the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
orAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill at least one of the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
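The way andAll and orAll combine the individual operators can be sketched locally. The following is only an illustration: the matches helper is hypothetical, it covers just a few of the operators described above, and it is not how Amazon Bedrock evaluates filters on the service side. It assumes each filter dict carries exactly one operator.

```python
# Illustrative only: a local re-implementation of a few filter operators,
# used to show how andAll / orAll compose. `matches` is a hypothetical
# helper, not part of boto3 or the Bedrock service.

def matches(filter_, metadata):
    """Return True if the metadata dict satisfies the filter dict."""
    # Assumption: each filter dict holds exactly one operator.
    op, body = next(iter(filter_.items()))
    if op == "andAll":
        return all(matches(f, metadata) for f in body)
    if op == "orAll":
        return any(matches(f, metadata) for f in body)
    key, value = body["key"], body["value"]
    if key not in metadata:
        return False
    actual = metadata[key]
    if op == "equals":
        return actual == value
    if op == "lessThanOrEquals":
        return actual <= value
    if op == "in":
        return actual in value
    if op == "startsWith":
        return isinstance(actual, str) and actual.startswith(value)
    if op == "listContains":
        return isinstance(actual, list) and value in actual
    raise ValueError(f"operator not covered by this sketch: {op}")


# Data sources from 1989 or earlier whose animal is 'cat' or 'dog'.
metadata_filter = {
    "andAll": [
        {"lessThanOrEquals": {"key": "year", "value": 1989}},
        {"in": {"key": "animal", "value": ["cat", "dog"]}},
    ]
}

print(matches(metadata_filter, {"year": 1985, "animal": "cat"}))
print(matches(metadata_filter, {"year": 1995, "animal": "cat"}))
```

The metadata_filter dict is the part that would be passed as the filter value of vectorSearchConfiguration; the matcher exists only to make the semantics concrete.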
generationConfiguration (dict) --
Contains configuration details for response generation based on retrieved text chunks.
promptTemplate (dict) --
Contains the template for the prompt that's sent to the model for response generation.
textPromptTemplate (string) --
The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which become replaced before the prompt is sent to the model to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.
For more information, see Knowledge base prompt template and Use XML tags with Anthropic Claude models.
guardrailConfiguration (dict) --
Contains configuration details for the guardrail.
guardrailId (string) -- [REQUIRED]
The unique identifier for the guardrail.
guardrailVersion (string) -- [REQUIRED]
The version of the guardrail.
kbInferenceConfig (dict) --
Contains configuration details for inference for knowledge base retrieval and response generation.
textInferenceConfig (dict) --
Contains configuration details for text generation using a language model via the RetrieveAndGenerate function.
temperature (float) --
Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.
topP (float) --
A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.
maxTokens (integer) --
The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
stopSequences (list) --
A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
(string) --
additionalModelRequestFields (dict) --
Additional model parameters and corresponding values not included in the textInferenceConfig structure for a knowledge base. This allows you to provide custom model parameters specific to the language model being used.
(string) --
(:ref:`document<document>`) --
orchestrationConfiguration (dict) --
Contains configuration details for the model to process the prompt prior to retrieval and response generation.
queryTransformationConfiguration (dict) -- [REQUIRED]
Contains configuration details for transforming the prompt.
type (string) -- [REQUIRED]
The type of transformation to apply to the prompt.
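As a rough sketch, the generation and orchestration settings described above could be assembled into a knowledgeBaseConfiguration dict as follows. The knowledge base ID and model ARN are hypothetical placeholders, the numeric values are arbitrary, and the $search_results$ placeholder is an assumption to check against the Knowledge base prompt template documentation.

```python
import json

# Hypothetical identifiers; replace with values from your own account.
knowledge_base_configuration = {
    "knowledgeBaseId": "EXAMPLEKBID",
    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/example-model-id",
    "generationConfiguration": {
        "promptTemplate": {
            # XML tags delineate sections of the template. The
            # $search_results$ placeholder name is an assumption.
            "textPromptTemplate": (
                "<instructions>Answer using only the search results."
                "</instructions>\n"
                "<context>$search_results$</context>"
            )
        },
        "kbInferenceConfig": {
            "textInferenceConfig": {
                "temperature": 0.2,   # lower values give more predictable output
                "topP": 0.9,
                "maxTokens": 512,     # check the limits of your specific model
                "stopSequences": ["\n\nHuman:"],
            }
        },
    },
    "orchestrationConfiguration": {
        "queryTransformationConfiguration": {"type": "QUERY_DECOMPOSITION"}
    },
}

print(json.dumps(knowledge_base_configuration, indent=2))
```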
externalSourcesConfiguration (dict) --
The configuration for the external source wrapper object in the retrieveAndGenerate function.
modelArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the foundation model or inference profile used to generate responses.
sources (list) -- [REQUIRED]
The document for the external source wrapper object in the retrieveAndGenerate function.
(dict) --
The unique external source of the content contained in the wrapper object.
sourceType (string) -- [REQUIRED]
The source type of the external source wrapper object.
s3Location (dict) --
The S3 location of the external source wrapper object.
uri (string) -- [REQUIRED]
The S3 URI location for the wrapper object of the document.
byteContent (dict) --
The identifier, content type, and data of the external source wrapper object.
identifier (string) -- [REQUIRED]
The file name of the document contained in the wrapper object.
contentType (string) -- [REQUIRED]
The MIME type of the document contained in the wrapper object.
data (bytes) -- [REQUIRED]
The byte value of the file to upload, encoded as a Base-64 string.
generationConfiguration (dict) --
Contains configuration details for response generation based on retrieved text chunks.
promptTemplate (dict) --
Contains the template for the prompt for the external source wrapper object.
textPromptTemplate (string) --
The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which become replaced before the prompt is sent to the model to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.
For more information, see Knowledge base prompt template and Use XML tags with Anthropic Claude models.
guardrailConfiguration (dict) --
Configuration details for the guardrail.
guardrailId (string) -- [REQUIRED]
The unique identifier for the guardrail.
guardrailVersion (string) -- [REQUIRED]
The version of the guardrail.
kbInferenceConfig (dict) --
Configuration details for inference when using RetrieveAndGenerate to generate responses while using an external source.
textInferenceConfig (dict) --
Contains configuration details for text generation using a language model via the RetrieveAndGenerate function.
temperature (float) --
Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.
topP (float) --
A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.
maxTokens (integer) --
The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
stopSequences (list) --
A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
(string) --
additionalModelRequestFields (dict) --
Additional model parameters and their corresponding values not included in the text inference configuration for an external source. Takes in custom model parameters specific to the language model being used.
(string) --
(:ref:`document<document>`) --
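The two source types of externalSourcesConfiguration might be built with small helpers like the ones below. This is a sketch: s3_source and byte_source are hypothetical helper names, and the bucket, file names, and model ARN are placeholders. The data field is given as raw bytes here on the understanding that boto3 handles the Base-64 encoding of blob fields when it serializes the request.

```python
def s3_source(uri):
    """Build a source entry that points at a document in Amazon S3."""
    return {"sourceType": "S3", "s3Location": {"uri": uri}}


def byte_source(identifier, content_type, data):
    """Build a source entry that carries the document bytes inline."""
    if not isinstance(data, (bytes, bytearray)):
        raise TypeError("data must be bytes")
    return {
        "sourceType": "BYTE_CONTENT",
        "byteContent": {
            "identifier": identifier,     # file name of the document
            "contentType": content_type,  # MIME type of the document
            "data": bytes(data),
        },
    }


# Hypothetical ARN, bucket, and file names.
external_sources_configuration = {
    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/example-model-id",
    "sources": [
        s3_source("s3://example-bucket/docs/report.pdf"),
        byte_source("notes.txt", "text/plain", b"Cats are mammals."),
    ],
}

print([s["sourceType"] for s in external_sources_configuration["sources"]])
```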
dict
[REQUIRED]
Contains the configuration details of the Amazon S3 bucket for storing the results of the evaluation job.
s3Uri (string) -- [REQUIRED]
The Amazon S3 URI where the results of the evaluation job are saved.
dict
Response Syntax
{ 'jobArn': 'string' }
Response Structure
(dict) --
jobArn (string) --
The Amazon Resource Name (ARN) of the evaluation job.
{'applicationType': 'ModelEvaluation | RagEvaluation', 'evaluationConfig': {'automated': {'evaluatorModelConfig': {'bedrockEvaluatorModels': [{'modelIdentifier': 'string'}]}}}, 'inferenceConfig': {'ragConfigs': [{'knowledgeBaseConfig': {'retrieveAndGenerateConfig': {'externalSourcesConfiguration': {'generationConfiguration': {'additionalModelRequestFields': {'string': {}}, 'guardrailConfiguration': {'guardrailId': 'string', 'guardrailVersion': 'string'}, 'kbInferenceConfig': {'textInferenceConfig': {'maxTokens': 'integer', 'stopSequences': ['string'], 'temperature': 'float', 'topP': 'float'}}, 'promptTemplate': {'textPromptTemplate': 'string'}}, 'modelArn': 'string', 'sources': [{'byteContent': {'contentType': 'string', 'data': 'blob', 'identifier': 'string'}, 's3Location': {'uri': 'string'}, 'sourceType': 'S3 ' '| ' 'BYTE_CONTENT'}]}, 'knowledgeBaseConfiguration': {'generationConfiguration': {'additionalModelRequestFields': {'string': {}}, 'guardrailConfiguration': {'guardrailId': 'string', 'guardrailVersion': 'string'}, 'kbInferenceConfig': {'textInferenceConfig': {'maxTokens': 'integer', 'stopSequences': ['string'], 'temperature': 'float', 'topP': 'float'}}, 'promptTemplate': {'textPromptTemplate': 'string'}}, 'knowledgeBaseId': 'string', 'modelArn': 'string', 'orchestrationConfiguration': {'queryTransformationConfiguration': {'type': 'QUERY_DECOMPOSITION'}}, 'retrievalConfiguration': {'vectorSearchConfiguration': {'filter': {'andAll': [()], 'equals': {'key': 'string', 'value': {}}, 'greaterThan': {'key': 'string', 'value': {}}, 'greaterThanOrEquals': {'key': 'string', 'value': {}}, 'in': {'key': 'string', 'value': {}}, 'lessThan': {'key': 'string', 'value': {}}, 'lessThanOrEquals': {'key': 'string', 'value': {}}, 'listContains': {'key': 'string', 'value': {}}, 'notEquals': {'key': 'string', 'value': {}}, 'notIn': {'key': 'string', 'value': {}}, 'orAll': [()], 'startsWith': {'key': 'string', 'value': {}}, 'stringContains': {'key': 'string', 'value': {}}}, 
'numberOfResults': 'integer', 'overrideSearchType': 'HYBRID ' '| ' 'SEMANTIC'}}}, 'type': 'KNOWLEDGE_BASE ' '| ' 'EXTERNAL_SOURCES'}, 'retrieveConfig': {'knowledgeBaseId': 'string', 'knowledgeBaseRetrievalConfiguration': {'vectorSearchConfiguration': {'filter': {'andAll': [()], 'equals': {'key': 'string', 'value': {}}, 'greaterThan': {'key': 'string', 'value': {}}, 'greaterThanOrEquals': {'key': 'string', 'value': {}}, 'in': {'key': 'string', 'value': {}}, 'lessThan': {'key': 'string', 'value': {}}, 'lessThanOrEquals': {'key': 'string', 'value': {}}, 'listContains': {'key': 'string', 'value': {}}, 'notEquals': {'key': 'string', 'value': {}}, 'notIn': {'key': 'string', 'value': {}}, 'orAll': [()], 'startsWith': {'key': 'string', 'value': {}}, 'stringContains': {'key': 'string', 'value': {}}}, 'numberOfResults': 'integer', 'overrideSearchType': 'HYBRID ' '| ' 'SEMANTIC'}}}}}]}}
Gets information about an evaluation job, such as the status of the job.
See also: AWS API Documentation
Request Syntax
client.get_evaluation_job( jobIdentifier='string' )
string
[REQUIRED]
The Amazon Resource Name (ARN) of the evaluation job you want to get information on.
dict
Response Syntax
{ 'jobName': 'string', 'status': 'InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'Deleting', 'jobArn': 'string', 'jobDescription': 'string', 'roleArn': 'string', 'customerEncryptionKeyId': 'string', 'jobType': 'Human'|'Automated', 'applicationType': 'ModelEvaluation'|'RagEvaluation', 'evaluationConfig': { 'automated': { 'datasetMetricConfigs': [ { 'taskType': 'Summarization'|'Classification'|'QuestionAndAnswer'|'Generation'|'Custom', 'dataset': { 'name': 'string', 'datasetLocation': { 's3Uri': 'string' } }, 'metricNames': [ 'string', ] }, ], 'evaluatorModelConfig': { 'bedrockEvaluatorModels': [ { 'modelIdentifier': 'string' }, ] } }, 'human': { 'humanWorkflowConfig': { 'flowDefinitionArn': 'string', 'instructions': 'string' }, 'customMetrics': [ { 'name': 'string', 'description': 'string', 'ratingMethod': 'string' }, ], 'datasetMetricConfigs': [ { 'taskType': 'Summarization'|'Classification'|'QuestionAndAnswer'|'Generation'|'Custom', 'dataset': { 'name': 'string', 'datasetLocation': { 's3Uri': 'string' } }, 'metricNames': [ 'string', ] }, ] } }, 'inferenceConfig': { 'models': [ { 'bedrockModel': { 'modelIdentifier': 'string', 'inferenceParams': 'string' } }, ], 'ragConfigs': [ { 'knowledgeBaseConfig': { 'retrieveConfig': { 'knowledgeBaseId': 'string', 'knowledgeBaseRetrievalConfiguration': { 'vectorSearchConfiguration': { 'numberOfResults': 123, 'overrideSearchType': 'HYBRID'|'SEMANTIC', 'filter': { 'equals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'in': { 'key': 'string', 'value': 
{...}|[...]|123|123.4|'string'|True|None }, 'notIn': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'startsWith': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'listContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'stringContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'andAll': [ {'... recursive ...'}, ], 'orAll': [ {'... recursive ...'}, ] } } } }, 'retrieveAndGenerateConfig': { 'type': 'KNOWLEDGE_BASE'|'EXTERNAL_SOURCES', 'knowledgeBaseConfiguration': { 'knowledgeBaseId': 'string', 'modelArn': 'string', 'retrievalConfiguration': { 'vectorSearchConfiguration': { 'numberOfResults': 123, 'overrideSearchType': 'HYBRID'|'SEMANTIC', 'filter': { 'equals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'greaterThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThan': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'lessThanOrEquals': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'in': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'notIn': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'startsWith': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'listContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'stringContains': { 'key': 'string', 'value': {...}|[...]|123|123.4|'string'|True|None }, 'andAll': [ {'... recursive ...'}, ], 'orAll': [ {'... 
recursive ...'}, ] } } }, 'generationConfiguration': { 'promptTemplate': { 'textPromptTemplate': 'string' }, 'guardrailConfiguration': { 'guardrailId': 'string', 'guardrailVersion': 'string' }, 'kbInferenceConfig': { 'textInferenceConfig': { 'temperature': ..., 'topP': ..., 'maxTokens': 123, 'stopSequences': [ 'string', ] } }, 'additionalModelRequestFields': { 'string': {...}|[...]|123|123.4|'string'|True|None } }, 'orchestrationConfiguration': { 'queryTransformationConfiguration': { 'type': 'QUERY_DECOMPOSITION' } } }, 'externalSourcesConfiguration': { 'modelArn': 'string', 'sources': [ { 'sourceType': 'S3'|'BYTE_CONTENT', 's3Location': { 'uri': 'string' }, 'byteContent': { 'identifier': 'string', 'contentType': 'string', 'data': b'bytes' } }, ], 'generationConfiguration': { 'promptTemplate': { 'textPromptTemplate': 'string' }, 'guardrailConfiguration': { 'guardrailId': 'string', 'guardrailVersion': 'string' }, 'kbInferenceConfig': { 'textInferenceConfig': { 'temperature': ..., 'topP': ..., 'maxTokens': 123, 'stopSequences': [ 'string', ] } }, 'additionalModelRequestFields': { 'string': {...}|[...]|123|123.4|'string'|True|None } } } } } }, ] }, 'outputDataConfig': { 's3Uri': 'string' }, 'creationTime': datetime(2015, 1, 1), 'lastModifiedTime': datetime(2015, 1, 1), 'failureMessages': [ 'string', ] }
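A common use of get_evaluation_job is to poll the status field until the job settles. The sketch below assumes that Completed, Failed, and Stopped are the statuses worth stopping on; that choice, the helper name wait_for_evaluation_job, and the polling interval are illustrative assumptions rather than anything prescribed by the API.

```python
import time

# Assumption: these statuses mean the job will not progress further.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_evaluation_job(client, job_arn, poll_seconds=30, max_polls=120,
                            sleep=time.sleep):
    """Poll get_evaluation_job until the job reaches a terminal status.

    `client` is expected to be a boto3 Bedrock client, for example
    boto3.client("bedrock"); any object with a compatible
    get_evaluation_job method also works.
    """
    for _ in range(max_polls):
        job = client.get_evaluation_job(jobIdentifier=job_arn)
        if job["status"] in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_arn} did not finish after {max_polls} polls")


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#   import boto3
#   client = boto3.client("bedrock")
#   job = wait_for_evaluation_job(client, "<evaluation job ARN>")
#   print(job["status"], job.get("failureMessages", []))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub object before pointing it at a real account.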
Response Structure
(dict) --
jobName (string) --
The name for the evaluation job.
status (string) --
The current status of the evaluation job.
jobArn (string) --
The Amazon Resource Name (ARN) of the evaluation job.
jobDescription (string) --
The description of the evaluation job.
roleArn (string) --
The Amazon Resource Name (ARN) of the IAM service role used in the evaluation job.
customerEncryptionKeyId (string) --
The Amazon Resource Name (ARN) of the customer managed encryption key specified when the evaluation job was created.
jobType (string) --
Specifies whether the evaluation job is automated or human-based.
applicationType (string) --
Specifies whether the evaluation job is for evaluating a model or evaluating a knowledge base (retrieval and response generation).
evaluationConfig (dict) --
Contains the configuration details of either an automated or human-based evaluation job.
automated (dict) --
Contains the configuration details of an automated evaluation job that computes metrics.
datasetMetricConfigs (list) --
Configuration details of the prompt datasets and metrics you want to use for your evaluation job.
(dict) --
Defines the prompt datasets, built-in metric names and custom metric names, and the task type.
taskType (string) --
The type of task you want to evaluate for your evaluation job. This applies only to model evaluation jobs and is ignored for knowledge base evaluation jobs.
dataset (dict) --
Specifies the prompt dataset.
name (string) --
Used to specify supported built-in prompt datasets. Valid values are Builtin.Bold, Builtin.BoolQ, Builtin.NaturalQuestions, Builtin.Gigaword, Builtin.RealToxicityPrompts, Builtin.TriviaQA, Builtin.T-Rex, Builtin.WomensEcommerceClothingReviews and Builtin.Wikitext2.
datasetLocation (dict) --
For custom prompt datasets, you must specify the location in Amazon S3 where the prompt dataset is saved.
s3Uri (string) --
The S3 URI of the S3 bucket specified in the job.
metricNames (list) --
The names of the metrics you want to use for your evaluation job.
For knowledge base evaluation jobs that evaluate retrieval only, valid values are "Builtin.ContextRelevance" and "Builtin.ContextCoverage".
For knowledge base evaluation jobs that evaluate retrieval with response generation, valid values are "Builtin.Correctness", "Builtin.Completeness", "Builtin.Helpfulness", "Builtin.LogicalCoherence", "Builtin.Faithfulness", "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For automated model evaluation jobs, valid values are "Builtin.Accuracy", "Builtin.Robustness", and "Builtin.Toxicity". In model evaluation jobs that use an LLM as judge, you can specify "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness", "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance", "Builtin.FollowingInstructions", and "Builtin.ProfessionalStyleAndTone". Only for model evaluation jobs that use an LLM as judge, you can also specify the following responsible AI metrics: "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For human-based model evaluation jobs, the list of strings must match the name parameter specified in HumanEvaluationCustomMetric.
(string) --
evaluatorModelConfig (dict) --
Contains the evaluator model configuration details. EvaluatorModelConfig is required for evaluation jobs that use a knowledge base and for model evaluation jobs that use a model as judge. This model computes all evaluation-related metrics.
bedrockEvaluatorModels (list) --
The evaluator model used in knowledge base evaluation jobs or in model evaluation jobs that use a model as judge. This model computes all evaluation-related metrics.
(dict) --
The evaluator model used in knowledge base evaluation jobs or in model evaluation jobs that use a model as judge. This model computes all evaluation-related metrics.
modelIdentifier (string) --
The Amazon Resource Name (ARN) of the evaluator model used in knowledge base evaluation jobs or in model evaluation jobs that use a model as judge.
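To make the automated structure concrete, the sketch below builds an evaluationConfig dict of the shape shown in the response syntax, pairing a custom prompt dataset and metric names with an evaluator model. The helper name, dataset name, S3 URI, and model identifier are hypothetical, and the taskType value is an arbitrary choice given that the field is described as ignored for knowledge base evaluation jobs.

```python
def automated_rag_evaluation_config(dataset_name, dataset_s3_uri,
                                    metric_names, evaluator_model_id):
    """Build an `evaluationConfig` dict for an automated evaluation job."""
    return {
        "automated": {
            "datasetMetricConfigs": [
                {
                    # Described as ignored for knowledge base evaluation jobs.
                    "taskType": "QuestionAndAnswer",
                    "dataset": {
                        "name": dataset_name,
                        "datasetLocation": {"s3Uri": dataset_s3_uri},
                    },
                    "metricNames": list(metric_names),
                }
            ],
            # The evaluator model computes the evaluation metrics.
            "evaluatorModelConfig": {
                "bedrockEvaluatorModels": [
                    {"modelIdentifier": evaluator_model_id}
                ]
            },
        }
    }


# Hypothetical dataset, bucket, and evaluator model identifier.
evaluation_config = automated_rag_evaluation_config(
    dataset_name="my-rag-prompts",
    dataset_s3_uri="s3://example-bucket/datasets/rag-prompts.jsonl",
    metric_names=["Builtin.Correctness", "Builtin.Completeness",
                  "Builtin.Helpfulness"],
    evaluator_model_id="example-evaluator-model-id",
)

print(evaluation_config["automated"]["datasetMetricConfigs"][0]["metricNames"])
```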
human (dict) --
Contains the configuration details of an evaluation job that uses human workers.
humanWorkflowConfig (dict) --
The parameters of the human workflow.
flowDefinitionArn (string) --
The Amazon Resource Name (ARN) for the flow definition.
instructions (string) --
Instructions for the flow definition.
customMetrics (list) --
A HumanEvaluationCustomMetric object. It contains the name of the metric, how the metric is to be evaluated, and an optional description.
(dict) --
In a model evaluation job that uses human workers, you must define the name of the metric and how you want that metric rated (ratingMethod). You can also provide an optional description of the metric.
name (string) --
The name of the metric. Your human evaluators will see this name in the evaluation UI.
description (string) --
An optional description of the metric. Use this parameter to provide more details about the metric.
ratingMethod (string) --
Choose how you want your human workers to evaluate your model. Valid values for rating methods are ThumbsUpDown, IndividualLikertScale, ComparisonLikertScale, ComparisonChoice, and ComparisonRank.
datasetMetricConfigs (list) --
Use this to specify the metrics, task, and prompt dataset to be used in your model evaluation job.
(dict) --
Defines the prompt datasets, built-in metric names and custom metric names, and the task type.
taskType (string) --
The type of task you want to evaluate for your evaluation job. This applies only to model evaluation jobs and is ignored for knowledge base evaluation jobs.
dataset (dict) --
Specifies the prompt dataset.
name (string) --
Used to specify supported built-in prompt datasets. Valid values are Builtin.Bold, Builtin.BoolQ, Builtin.NaturalQuestions, Builtin.Gigaword, Builtin.RealToxicityPrompts, Builtin.TriviaQA, Builtin.T-Rex, Builtin.WomensEcommerceClothingReviews and Builtin.Wikitext2.
datasetLocation (dict) --
For custom prompt datasets, you must specify the location in Amazon S3 where the prompt dataset is saved.
s3Uri (string) --
The S3 URI of the S3 bucket specified in the job.
metricNames (list) --
The names of the metrics you want to use for your evaluation job.
For knowledge base evaluation jobs that evaluate retrieval only, valid values are "Builtin.ContextRelevance" and "Builtin.ContextCoverage".
For knowledge base evaluation jobs that evaluate retrieval with response generation, valid values are "Builtin.Correctness", "Builtin.Completeness", "Builtin.Helpfulness", "Builtin.LogicalCoherence", "Builtin.Faithfulness", "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For automated model evaluation jobs, valid values are "Builtin.Accuracy", "Builtin.Robustness", and "Builtin.Toxicity". In model evaluation jobs that use an LLM as judge, you can specify "Builtin.Correctness", "Builtin.Completeness", "Builtin.Faithfulness", "Builtin.Helpfulness", "Builtin.Coherence", "Builtin.Relevance", "Builtin.FollowingInstructions", and "Builtin.ProfessionalStyleAndTone". Only for model evaluation jobs that use an LLM as judge, you can also specify the following responsible AI metrics: "Builtin.Harmfulness", "Builtin.Stereotyping", and "Builtin.Refusal".
For human-based model evaluation jobs, the list of strings must match the name parameter specified in HumanEvaluationCustomMetric.
(string) --
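For human-based jobs, the two constraints stated above (rating methods come from a fixed set, and metricNames must match the names of the custom metrics) can be checked before a request is sent. The following is a minimal sketch; check_human_config is a hypothetical helper, and the flow definition ARN, metric names, and dataset name are made up for illustration.

```python
VALID_RATING_METHODS = {
    "ThumbsUpDown",
    "IndividualLikertScale",
    "ComparisonLikertScale",
    "ComparisonChoice",
    "ComparisonRank",
}


def check_human_config(human):
    """Return a list of problems found in a `human` evaluation config."""
    problems = []
    custom_metrics = human.get("customMetrics", [])
    custom_names = {metric["name"] for metric in custom_metrics}
    for metric in custom_metrics:
        if metric["ratingMethod"] not in VALID_RATING_METHODS:
            problems.append(
                f"unknown ratingMethod {metric['ratingMethod']!r} "
                f"for metric {metric['name']!r}"
            )
    for config in human.get("datasetMetricConfigs", []):
        for name in config.get("metricNames", []):
            if name not in custom_names:
                problems.append(f"metricName {name!r} has no custom metric")
    return problems


# Hypothetical values throughout.
human_config = {
    "humanWorkflowConfig": {
        "flowDefinitionArn": "arn:aws:sagemaker:us-east-1:111122223333:flow-definition/example",
        "instructions": "Rate each response for clarity.",
    },
    "customMetrics": [
        {"name": "Clarity", "description": "Is the answer easy to follow?",
         "ratingMethod": "ThumbsUpDown"},
    ],
    "datasetMetricConfigs": [
        {"taskType": "Generation",
         "dataset": {"name": "my-prompts"},
         "metricNames": ["Clarity", "Tone"]},
    ],
}

print(check_human_config(human_config))
```

In this example the check reports that "Tone" is listed in metricNames without a matching custom metric.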
inferenceConfig (dict) --
Contains the configuration details of the inference model used for the evaluation job.
models (list) --
Specifies the inference models.
(dict) --
Defines the models used in the model evaluation job.
bedrockModel (dict) --
Defines the Amazon Bedrock model or inference profile and inference parameters you want used.
modelIdentifier (string) --
The ARN of the Amazon Bedrock model or inference profile specified.
inferenceParams (string) --
Each Amazon Bedrock model supports different inference parameters that change how the model behaves during inference.
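Because inferenceParams is typed as a string, the parameters have to be serialized before they are placed in the bedrockModel entry. The sketch below assumes the string carries a JSON-encoded object, which the text above does not state; the parameter names inside it and the model identifier are placeholders that depend on the model being evaluated.

```python
import json

# Assumption: inferenceParams holds a JSON-serialized object. The keys
# below are placeholders; the accepted names vary by model.
inference_params = json.dumps({"temperature": 0.2, "topP": 0.9})

model_entry = {
    "bedrockModel": {
        "modelIdentifier": "example-model-id",  # hypothetical identifier
        "inferenceParams": inference_params,
    }
}

# Reading the value back from a get_evaluation_job response works the
# same way in reverse.
decoded = json.loads(model_entry["bedrockModel"]["inferenceParams"])
print(decoded["temperature"])
```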
ragConfigs (list) --
Contains the configuration details of the inference for a knowledge base evaluation job, including either the retrieval only configuration or the retrieval with response generation configuration.
(dict) --
Contains configuration details for retrieval of information and response generation.
knowledgeBaseConfig (dict) --
Contains configuration details for knowledge base retrieval and response generation.
retrieveConfig (dict) --
Contains configuration details for retrieving information from a knowledge base.
knowledgeBaseId (string) --
The unique identifier of the knowledge base.
knowledgeBaseRetrievalConfiguration (dict) --
Contains configuration details for knowledge base retrieval.
vectorSearchConfiguration (dict) --
Contains configuration details for returning the results from the vector search.
numberOfResults (integer) --
The number of text chunks to retrieve; the number of results to return.
overrideSearchType (string) --
By default, Amazon Bedrock decides a search strategy for you. If you're using an Amazon OpenSearch Serverless vector store that contains a filterable text field, you can specify whether to query the knowledge base with a HYBRID search using both vector embeddings and raw text, or SEMANTIC search using only vector embeddings. For other vector store configurations, only SEMANTIC search is available.
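When reading a get_evaluation_job response, the retrieval settings sit several levels deep under inferenceConfig. The helper below is a hypothetical sketch of how they might be pulled out; it follows the key names in the response syntax and treats a missing overrideSearchType as meaning the service picked the search strategy.

```python
def summarize_rag_configs(job):
    """Summarize each ragConfigs entry of a get_evaluation_job response."""
    summaries = []
    rag_configs = job.get("inferenceConfig", {}).get("ragConfigs", [])
    for rag in rag_configs:
        kb_config = rag.get("knowledgeBaseConfig", {})
        if "retrieveConfig" in kb_config:
            retrieve = kb_config["retrieveConfig"]
            search = (
                retrieve.get("knowledgeBaseRetrievalConfiguration", {})
                .get("vectorSearchConfiguration", {})
            )
            summaries.append({
                "mode": "retrieve",
                "knowledgeBaseId": retrieve.get("knowledgeBaseId"),
                "numberOfResults": search.get("numberOfResults"),
                # Absent means Amazon Bedrock chose the search strategy.
                "searchType": search.get("overrideSearchType", "service default"),
            })
        elif "retrieveAndGenerateConfig" in kb_config:
            summaries.append({
                "mode": "retrieveAndGenerate",
                "type": kb_config["retrieveAndGenerateConfig"].get("type"),
            })
    return summaries


# A trimmed, made-up response used only to exercise the helper.
sample_job = {
    "inferenceConfig": {
        "ragConfigs": [
            {"knowledgeBaseConfig": {"retrieveConfig": {
                "knowledgeBaseId": "EXAMPLEKBID",
                "knowledgeBaseRetrievalConfiguration": {
                    "vectorSearchConfiguration": {
                        "numberOfResults": 5,
                        "overrideSearchType": "HYBRID",
                    }
                },
            }}},
        ]
    }
}

print(summarize_rag_configs(sample_job))
```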
filter (dict) --
Specifies the filters to use on the metadata fields in the knowledge base data sources before returning results.
equals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value matches the value in this object.
The following example would return data sources with an animal attribute whose value is 'cat': "equals": { "key": "animal", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
notEquals (dict) --
Knowledge base data sources that contain a metadata attribute whose name matches the key and whose value doesn't match the value in this object are returned.
The following example would return data sources that don't contain an animal attribute whose value is 'cat': "notEquals": { "key": "animal", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
greaterThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than the value in this object.
The following example would return data sources with a year attribute whose value is greater than '1989': "greaterThan": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
greaterThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is greater than or equal to '1989': "greaterThanOrEquals": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
lessThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than the value in this object.
The following example would return data sources with a year attribute whose value is less than '1989': "lessThan": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
lessThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is less than or equal to '1989': "lessThanOrEquals": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
in (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is in the list specified in the value in this object.
The following example would return data sources with an animal attribute that is either 'cat' or 'dog': "in": { "key": "animal", "value": ["cat", "dog"] }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
notIn (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value isn't in the list specified in the value in this object.
The following example would return data sources whose animal attribute is neither 'cat' nor 'dog': "notIn": { "key": "animal", "value": ["cat", "dog"] }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
startsWith (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value starts with the value in this object. This filter is currently only supported for Amazon OpenSearch Serverless vector stores.
The following example would return data sources whose animal attribute starts with 'ca' (for example, 'cat' or 'camel'): "startsWith": { "key": "animal", "value": "ca" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
listContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is a list that contains the value as one of its members.
The following example would return data sources with an animals attribute that is a list containing a cat member (for example, ["dog", "cat"]): "listContains": { "key": "animals", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
stringContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is one of the following:
A string that contains the value as a substring. The following example would return data sources with an animal attribute that contains the substring at (for example, 'cat'): "stringContains": { "key": "animal", "value": "at" }
A list with a member that contains the value as a substring. The following example would return data sources with an animals attribute that is a list containing a member that contains the substring at (for example, ["dog", "cat"]): "stringContains": { "key": "animals", "value": "at" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
andAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill all the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
orAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill at least one of the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
retrieveAndGenerateConfig (dict) --
Contains configuration details for retrieving information from a knowledge base and generating responses.
type (string) --
The type of resource that contains your data for retrieving information and generating responses.
If you choose to use EXTERNAL_SOURCES, then currently only Claude 3 Sonnet models for knowledge bases are supported.
knowledgeBaseConfiguration (dict) --
Contains configuration details for the knowledge base retrieval and response generation.
knowledgeBaseId (string) --
The unique identifier of the knowledge base.
modelArn (string) --
The Amazon Resource Name (ARN) of the foundation model or inference profile used to generate responses.
retrievalConfiguration (dict) --
Contains configuration details for retrieving text chunks.
vectorSearchConfiguration (dict) --
Contains configuration details for returning the results from the vector search.
numberOfResults (integer) --
The number of text chunks to retrieve; the number of results to return.
overrideSearchType (string) --
By default, Amazon Bedrock decides a search strategy for you. If you're using an Amazon OpenSearch Serverless vector store that contains a filterable text field, you can specify whether to query the knowledge base with a HYBRID search using both vector embeddings and raw text, or SEMANTIC search using only vector embeddings. For other vector store configurations, only SEMANTIC search is available.
filter (dict) --
Specifies the filters to use on the metadata fields in the knowledge base data sources before returning results.
equals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value matches the value in this object.
The following example would return data sources with an animal attribute whose value is 'cat': "equals": { "key": "animal", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
notEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value doesn't match the value in this object.
The following example would return data sources that don't contain an animal attribute whose value is 'cat': "notEquals": { "key": "animal", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
greaterThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than the value in this object.
The following example would return data sources with a year attribute whose value is greater than 1989: "greaterThan": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
greaterThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is greater than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is greater than or equal to 1989: "greaterThanOrEquals": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
lessThan (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than the value in this object.
The following example would return data sources with a year attribute whose value is less than 1989: "lessThan": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
lessThanOrEquals (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is less than or equal to the value in this object.
The following example would return data sources with a year attribute whose value is less than or equal to 1989: "lessThanOrEquals": { "key": "year", "value": 1989 }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
in (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is in the list specified in the value in this object.
The following example would return data sources with an animal attribute that is either 'cat' or 'dog': "in": { "key": "animal", "value": ["cat", "dog"] }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
notIn (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value isn't in the list specified in the value in this object.
The following example would return data sources whose animal attribute is neither 'cat' nor 'dog': "notIn": { "key": "animal", "value": ["cat", "dog"] }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
startsWith (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value starts with the value in this object. This filter is currently only supported for Amazon OpenSearch Serverless vector stores.
The following example would return data sources whose animal attribute starts with 'ca' (for example, 'cat' or 'camel'): "startsWith": { "key": "animal", "value": "ca" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
listContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is a list that contains the value as one of its members.
The following example would return data sources with an animals attribute that is a list containing a cat member (for example, ["dog", "cat"]): "listContains": { "key": "animals", "value": "cat" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
stringContains (dict) --
Knowledge base data sources are returned if they contain a metadata attribute whose name matches the key and whose value is one of the following:
A string that contains the value as a substring. The following example would return data sources with an animal attribute that contains the substring at (for example, 'cat'): "stringContains": { "key": "animal", "value": "at" }
A list with a member that contains the value as a substring. The following example would return data sources with an animals attribute that is a list containing a member that contains the substring at (for example, ["dog", "cat"]): "stringContains": { "key": "animals", "value": "at" }
key (string) --
The name of metadata attribute/field, which must match the name in your data source/document metadata.
value (:ref:`document<document>`) --
The value of the metadata attribute/field.
andAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill all the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
orAll (list) --
Knowledge base data sources are returned if their metadata attributes fulfill at least one of the filter conditions inside this list.
(dict) --
Specifies the filters to use on the metadata attributes/fields in the knowledge base data sources before returning results.
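The filter semantics described above can be illustrated with a small local evaluator. This is only a sketch for reasoning about filter shapes, not how Amazon Bedrock evaluates filters. It assumes each filter dict holds exactly one operator, treats a missing attribute as a non-match, and leaves out notEquals and notIn because their behavior for missing attributes is not spelled out here. The function name `matches` is hypothetical.

```python
from typing import Any


def _string_contains(actual: Any, value: str) -> bool:
    # Either a string containing the substring, or a list with such a member.
    if isinstance(actual, str):
        return value in actual
    if isinstance(actual, list):
        return any(isinstance(member, str) and value in member for member in actual)
    return False


# Operators that compare one metadata value against the filter's "value".
_COMPARATORS = {
    "equals": lambda actual, value: actual == value,
    "greaterThan": lambda actual, value: actual > value,
    "greaterThanOrEquals": lambda actual, value: actual >= value,
    "lessThan": lambda actual, value: actual < value,
    "lessThanOrEquals": lambda actual, value: actual <= value,
    "in": lambda actual, value: actual in value,
    "startsWith": lambda actual, value: isinstance(actual, str) and actual.startswith(value),
    "listContains": lambda actual, value: isinstance(actual, list) and value in actual,
    "stringContains": _string_contains,
}


def matches(filter_: dict[str, Any], metadata: dict[str, Any]) -> bool:
    """Evaluate one filter dict against one metadata dict (local illustration)."""
    # Assumption: a filter dict carries exactly one operator key.
    ((operator, argument),) = filter_.items()
    if operator == "andAll":
        return all(matches(sub_filter, metadata) for sub_filter in argument)
    if operator == "orAll":
        return any(matches(sub_filter, metadata) for sub_filter in argument)
    if operator not in _COMPARATORS:
        raise ValueError(f"operator not covered by this sketch: {operator}")
    key = argument["key"]
    if key not in metadata:
        return False  # assumption: a missing attribute never matches
    return _COMPARATORS[operator](metadata[key], argument["value"])


# Nested filter: animal is 'cat' or 'dog', and year is 1989 or later.
example_filter = {
    "andAll": [
        {"in": {"key": "animal", "value": ["cat", "dog"]}},
        {"greaterThanOrEquals": {"key": "year", "value": 1989}},
    ]
}
```

The same `example_filter` dict is the shape that would be placed under the filter key of vectorSearchConfiguration.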
generationConfiguration (dict) --
Contains configuration details for response generation based on retrieved text chunks.
promptTemplate (dict) --
Contains the template for the prompt that's sent to the model for response generation.
textPromptTemplate (string) --
The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which are replaced before the prompt is sent to the model, to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.
For more information, see Knowledge base prompt template and Use XML tags with Anthropic Claude models.
guardrailConfiguration (dict) --
Contains configuration details for the guardrail.
guardrailId (string) --
The unique identifier for the guardrail.
guardrailVersion (string) --
The version of the guardrail.
kbInferenceConfig (dict) --
Contains configuration details for inference for knowledge base retrieval and response generation.
textInferenceConfig (dict) --
Contains configuration details for text generation using a language model via the RetrieveAndGenerate function.
temperature (float) --
Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.
topP (float) --
A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.
maxTokens (integer) --
The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
stopSequences (list) --
A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
(string) --
additionalModelRequestFields (dict) --
Additional model parameters and corresponding values not included in the textInferenceConfig structure for a knowledge base. This allows you to provide custom model parameters specific to the language model being used.
(string) --
(:ref:`document<document>`) --
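The effect of temperature and topP described above can be sketched with generic sampling arithmetic. This is a textbook illustration of temperature-scaled softmax and nucleus (top-p) truncation, not a description of how any particular Bedrock model implements these parameters. The function names and the sample logits are hypothetical.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Turn raw scores into probabilities; lower temperature sharpens the peak."""
    if temperature <= 0:
        # This sketch only covers positive temperatures.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_candidates(probabilities: list[float], top_p: float) -> list[int]:
    """Indices of the most probable tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
    kept: list[int] = []
    cumulative = 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probabilities[index]
        if cumulative >= top_p:
            break
    return kept


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.2)  # concentrates mass on the top token
flat = softmax_with_temperature(logits, 0.9)   # spreads mass more evenly
nucleus = top_p_candidates([0.5, 0.3, 0.15, 0.05], 0.9)  # keeps the three most probable tokens
```

With a low temperature the top token dominates, and a smaller topP shrinks the candidate set, which is why both settings push output toward more predictable text.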
orchestrationConfiguration (dict) --
Contains configuration details for the model to process the prompt prior to retrieval and response generation.
queryTransformationConfiguration (dict) --
Contains configuration details for transforming the prompt.
type (string) --
The type of transformation to apply to the prompt.
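As a rough sketch, the knowledge base fields above nest as follows in a retrieveAndGenerateConfig dict. The helper name, the default values, and the placeholder identifiers are hypothetical. The type value 'KNOWLEDGE_BASE' is an assumption (only EXTERNAL_SOURCES is named in this section), and 'QUERY_DECOMPOSITION' is taken from the change summary for this update.

```python
from typing import Any, Optional


def build_knowledge_base_rag_config(
    knowledge_base_id: str,
    model_arn: str,
    *,
    number_of_results: int = 5,
    metadata_filter: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a retrieveAndGenerateConfig dict for a knowledge base (sketch)."""
    vector_search: dict[str, Any] = {
        "numberOfResults": number_of_results,
        "overrideSearchType": "SEMANTIC",
    }
    if metadata_filter is not None:
        # Only include the filter key when a filter was supplied.
        vector_search["filter"] = metadata_filter
    return {
        "type": "KNOWLEDGE_BASE",  # assumed enum value
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": knowledge_base_id,
            "modelArn": model_arn,
            "retrievalConfiguration": {"vectorSearchConfiguration": vector_search},
            "generationConfiguration": {
                "kbInferenceConfig": {
                    "textInferenceConfig": {
                        "temperature": 0.2,
                        "topP": 0.9,
                        "maxTokens": 512,  # illustrative; check your model's limits
                        "stopSequences": ["</answer>"],  # illustrative stop sequence
                    }
                }
            },
            "orchestrationConfiguration": {
                "queryTransformationConfiguration": {"type": "QUERY_DECOMPOSITION"}
            },
        },
    }


kb_config = build_knowledge_base_rag_config(
    "KB_ID_PLACEHOLDER",
    "MODEL_ARN_PLACEHOLDER",
    metadata_filter={"equals": {"key": "animal", "value": "cat"}},
)
```

Per the change summary for this update, a dict of this shape sits under inferenceConfig, ragConfigs, knowledgeBaseConfig, retrieveAndGenerateConfig.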
externalSourcesConfiguration (dict) --
The configuration for the external source wrapper object in the retrieveAndGenerate function.
modelArn (string) --
The Amazon Resource Name (ARN) of the foundation model or inference profile used to generate responses.
sources (list) --
The document for the external source wrapper object in the retrieveAndGenerate function.
(dict) --
The unique external source of the content contained in the wrapper object.
sourceType (string) --
The source type of the external source wrapper object.
s3Location (dict) --
The S3 location of the external source wrapper object.
uri (string) --
The S3 URI location for the wrapper object of the document.
byteContent (dict) --
The identifier, content type, and data of the external source wrapper object.
identifier (string) --
The file name of the document contained in the wrapper object.
contentType (string) --
The MIME type of the document contained in the wrapper object.
data (bytes) --
The byte value of the file to upload, encoded as a Base-64 string.
generationConfiguration (dict) --
Contains configuration details for response generation based on retrieved text chunks.
promptTemplate (dict) --
Contains the template for the prompt for the external source wrapper object.
textPromptTemplate (string) --
The template for the prompt that's sent to the model for response generation. You can include prompt placeholders, which are replaced before the prompt is sent to the model, to provide instructions and context to the model. In addition, you can include XML tags to delineate meaningful sections of the prompt template.
For more information, see Knowledge base prompt template and Use XML tags with Anthropic Claude models.
guardrailConfiguration (dict) --
Configuration details for the guardrail.
guardrailId (string) --
The unique identifier for the guardrail.
guardrailVersion (string) --
The version of the guardrail.
kbInferenceConfig (dict) --
Configuration details for inference when using RetrieveAndGenerate to generate responses while using an external source.
textInferenceConfig (dict) --
Contains configuration details for text generation using a language model via the RetrieveAndGenerate function.
temperature (float) --
Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature value (e.g. 0.2 or 0.3) makes model outputs more deterministic or predictable, while a higher temperature (e.g. 0.8 or 0.9) makes the outputs more creative or unpredictable.
topP (float) --
A probability distribution threshold which controls what the model considers for the set of possible next tokens. The model will only consider the top p% of the probability distribution when generating the next token.
maxTokens (integer) --
The maximum number of tokens to generate in the output text. Do not use the minimum of 0 or the maximum of 65536. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
stopSequences (list) --
A list of sequences of characters that, if generated, will cause the model to stop generating further tokens. Do not use a minimum length of 1 or a maximum length of 1000. The limit values described here are arbitrary; for actual values, consult the limits defined by your specific model.
(string) --
additionalModelRequestFields (dict) --
Additional model parameters and their corresponding values not included in the text inference configuration for an external source. Takes in custom model parameters specific to the language model being used.
(string) --
(:ref:`document<document>`) --
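The external source fields above can be assembled along these lines. This is a sketch with a hypothetical helper name and placeholder values. It assumes the SDK takes raw bytes for the data blob and handles Base-64 encoding on the wire, as botocore generally does for blob fields.

```python
from typing import Any, Iterable


def build_external_sources_config(
    model_arn: str,
    *,
    s3_uris: Iterable[str] = (),
    inline_documents: Iterable[tuple[str, str, bytes]] = (),
) -> dict[str, Any]:
    """Assemble a retrieveAndGenerateConfig dict for external sources (sketch).

    inline_documents holds (identifier, contentType, data) tuples.
    """
    sources: list[dict[str, Any]] = []
    for uri in s3_uris:
        sources.append({"sourceType": "S3", "s3Location": {"uri": uri}})
    for identifier, content_type, data in inline_documents:
        sources.append(
            {
                "sourceType": "BYTE_CONTENT",
                "byteContent": {
                    "identifier": identifier,     # file name of the document
                    "contentType": content_type,  # MIME type
                    "data": data,                 # raw bytes (assumed SDK-encoded)
                },
            }
        )
    if not sources:
        raise ValueError("at least one source is needed for this sketch")
    return {
        "type": "EXTERNAL_SOURCES",
        "externalSourcesConfiguration": {"modelArn": model_arn, "sources": sources},
    }


external_config = build_external_sources_config(
    "MODEL_ARN_PLACEHOLDER",
    s3_uris=["s3://example-bucket/docs/handbook.pdf"],
    inline_documents=[("notes.txt", "text/plain", b"Cats are mammals.")],
)
```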
outputDataConfig (dict) --
Contains the configuration details of the Amazon S3 bucket for storing the results of the evaluation job.
s3Uri (string) --
The Amazon S3 URI where the results of the evaluation job are saved.
creationTime (datetime) --
The time the evaluation job was created.
lastModifiedTime (datetime) --
The time the evaluation job was last modified.
failureMessages (list) --
A list of strings that specify why the evaluation job failed to create.
(string) --
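The trailing response fields above (outputDataConfig, creationTime, lastModifiedTime, failureMessages) might be summarized as follows. This is a sketch; the helper names and the sample response dict are hypothetical, and only the fields listed in this section are read.

```python
from datetime import datetime, timezone
from typing import Any
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple[str, str]:
    """Split 's3://bucket/prefix' into (bucket, prefix)."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def describe_job_outcome(job: dict[str, Any]) -> dict[str, Any]:
    """Condense the result-related fields of an evaluation job response."""
    bucket, prefix = split_s3_uri(job["outputDataConfig"]["s3Uri"])
    failures = list(job.get("failureMessages", []))
    return {
        "bucket": bucket,
        "prefix": prefix,
        "failed": bool(failures),
        "failureMessages": failures,
        # Time between creation and the most recent modification.
        "elapsed": job["lastModifiedTime"] - job["creationTime"],
    }


# Hypothetical response fragment, for illustration only.
sample_job = {
    "outputDataConfig": {"s3Uri": "s3://example-bucket/eval-results/run-1/"},
    "creationTime": datetime(2024, 12, 2, 10, 0, tzinfo=timezone.utc),
    "lastModifiedTime": datetime(2024, 12, 2, 10, 45, tzinfo=timezone.utc),
    "failureMessages": [],
}
outcome = describe_job_outcome(sample_job)
```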
{'applicationTypeEquals': 'ModelEvaluation | RagEvaluation'}
Response
{'jobSummaries': {'applicationType': 'ModelEvaluation | RagEvaluation', 'evaluatorModelIdentifiers': ['string'], 'ragIdentifiers': ['string']}}
Lists all existing evaluation jobs.
See also: AWS API Documentation
Request Syntax
client.list_evaluation_jobs( creationTimeAfter=datetime(2015, 1, 1), creationTimeBefore=datetime(2015, 1, 1), statusEquals='InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'Deleting', applicationTypeEquals='ModelEvaluation'|'RagEvaluation', nameContains='string', maxResults=123, nextToken='string', sortBy='CreationTime', sortOrder='Ascending'|'Descending' )
creationTimeAfter (datetime) --
A filter to only list evaluation jobs created after a specified time.
creationTimeBefore (datetime) --
A filter to only list evaluation jobs created before a specified time.
statusEquals (string) --
A filter to only list evaluation jobs that are of a certain status.
applicationTypeEquals (string) --
A filter to only list evaluation jobs that are either model evaluations or knowledge base evaluations.
nameContains (string) --
A filter to only list evaluation jobs that contain a specified string in the job name.
maxResults (integer) --
The maximum number of results to return.
nextToken (string) --
Continuation token from the previous response, for Amazon Bedrock to list the next set of results.
sortBy (string) --
Specifies that the list of evaluation jobs is sorted by creation time.
sortOrder (string) --
Specifies whether to sort the list of evaluation jobs in ascending or descending order.
dict
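The request parameters above are all optional filters, so one way to build the call is to collect only the filters that are set. This is a sketch; the helper name and its keyword names are hypothetical. It omits unset filters instead of passing None, on the assumption that the SDK's parameter validation rejects None values.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_list_jobs_kwargs(
    *,
    created_after: Optional[datetime] = None,
    created_before: Optional[datetime] = None,
    status: Optional[str] = None,
    application_type: Optional[str] = None,
    name_contains: Optional[str] = None,
    max_results: Optional[int] = None,
    descending: bool = True,
) -> dict[str, Any]:
    """Collect list_evaluation_jobs keyword arguments, dropping unset filters."""
    candidates = {
        "creationTimeAfter": created_after,
        "creationTimeBefore": created_before,
        "statusEquals": status,
        "applicationTypeEquals": application_type,
        "nameContains": name_contains,
        "maxResults": max_results,
        "sortBy": "CreationTime",
        "sortOrder": "Descending" if descending else "Ascending",
    }
    return {name: value for name, value in candidates.items() if value is not None}


# Completed knowledge base evaluations created since 1 December 2024, newest first.
list_kwargs = build_list_jobs_kwargs(
    created_after=datetime(2024, 12, 1, tzinfo=timezone.utc),
    status="Completed",
    application_type="RagEvaluation",
)
# A configured Bedrock client would then be called as:
#   response = client.list_evaluation_jobs(**list_kwargs)
```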
Response Syntax
{ 'nextToken': 'string', 'jobSummaries': [ { 'jobArn': 'string', 'jobName': 'string', 'status': 'InProgress'|'Completed'|'Failed'|'Stopping'|'Stopped'|'Deleting', 'creationTime': datetime(2015, 1, 1), 'jobType': 'Human'|'Automated', 'evaluationTaskTypes': [ 'Summarization'|'Classification'|'QuestionAndAnswer'|'Generation'|'Custom', ], 'modelIdentifiers': [ 'string', ], 'ragIdentifiers': [ 'string', ], 'evaluatorModelIdentifiers': [ 'string', ], 'applicationType': 'ModelEvaluation'|'RagEvaluation' }, ] }
Response Structure
(dict) --
nextToken (string) --
Continuation token from the previous response, for Amazon Bedrock to list the next set of results.
jobSummaries (list) --
A list of summaries of the evaluation jobs.
(dict) --
Summary information of an evaluation job.
jobArn (string) --
The Amazon Resource Name (ARN) of the evaluation job.
jobName (string) --
The name for the evaluation job.
status (string) --
The current status of the evaluation job.
creationTime (datetime) --
The time the evaluation job was created.
jobType (string) --
Specifies whether the evaluation job is automated or human-based.
evaluationTaskTypes (list) --
The type of task for model evaluation.
(string) --
modelIdentifiers (list) --
The Amazon Resource Names (ARNs) of the model(s) used for the evaluation job.
(string) --
ragIdentifiers (list) --
The Amazon Resource Names (ARNs) of the knowledge base resources used for a knowledge base evaluation job.
(string) --
evaluatorModelIdentifiers (list) --
The Amazon Resource Names (ARNs) of the models used to compute the metrics for a knowledge base evaluation job.
(string) --
applicationType (string) --
Specifies whether the evaluation job is for evaluating a model or evaluating a knowledge base (retrieval and response generation).
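Because results are returned a page at a time, a caller typically follows nextToken until it is absent. The loop below is a minimal sketch that takes the client as a parameter, so it makes no assumption about how the client was created. The function names are hypothetical.

```python
from collections import Counter
from typing import Any, Iterator


def iter_evaluation_jobs(client: Any, **filters: Any) -> Iterator[dict[str, Any]]:
    """Yield every job summary, following nextToken across pages."""
    kwargs = dict(filters)
    while True:
        page = client.list_evaluation_jobs(**kwargs)
        yield from page.get("jobSummaries", [])
        token = page.get("nextToken")
        if not token:
            return  # no further pages
        kwargs["nextToken"] = token


def count_by_application_type(client: Any, **filters: Any) -> Counter:
    """Count jobs per applicationType ('ModelEvaluation' or 'RagEvaluation')."""
    return Counter(
        # Summaries without the field are counted under 'Unspecified' here.
        summary.get("applicationType", "Unspecified")
        for summary in iter_evaluation_jobs(client, **filters)
    )
```

For knowledge base evaluation jobs, the ragIdentifiers and evaluatorModelIdentifiers lists on each yielded summary can be read the same way.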