AWS API Changes

2024/10/01 - Agents for Amazon Bedrock - 1 new, 3 updated API methods

Changes: This release adds support for stopping an ongoing ingestion job with the StopIngestionJob API in Agents for Amazon Bedrock.

StopIngestionJob (new)

Stops a currently running data ingestion job. You can send a StartIngestionJob request again to ingest the rest of your data when you are ready.

See also: AWS API Documentation

Request Syntax

client.stop_ingestion_job(
    dataSourceId='string',
    ingestionJobId='string',
    knowledgeBaseId='string'
)

type dataSourceId:

string

param dataSourceId:

[REQUIRED]

The unique identifier of the data source for the data ingestion job you want to stop.

type ingestionJobId:

string

param ingestionJobId:

[REQUIRED]

The unique identifier of the data ingestion job you want to stop.

type knowledgeBaseId:

string

param knowledgeBaseId:

[REQUIRED]

The unique identifier of the knowledge base for the data ingestion job you want to stop.

rtype:

dict

returns:

Response Syntax

{
    'ingestionJob': {
        'dataSourceId': 'string',
        'description': 'string',
        'failureReasons': [
            'string',
        ],
        'ingestionJobId': 'string',
        'knowledgeBaseId': 'string',
        'startedAt': datetime(2015, 1, 1),
        'statistics': {
            'numberOfDocumentsDeleted': 123,
            'numberOfDocumentsFailed': 123,
            'numberOfDocumentsScanned': 123,
            'numberOfMetadataDocumentsModified': 123,
            'numberOfMetadataDocumentsScanned': 123,
            'numberOfModifiedDocumentsIndexed': 123,
            'numberOfNewDocumentsIndexed': 123
        },
        'status': 'STARTING'|'IN_PROGRESS'|'COMPLETE'|'FAILED'|'STOPPING'|'STOPPED',
        'updatedAt': datetime(2015, 1, 1)
    }
}

Response Structure

(dict) --
- ingestionJob (dict) --
  
  Contains information about the stopped data ingestion job.
  - dataSourceId (string) --
    
    The unique identifier of the data source for the data ingestion job.
  - description (string) --
    
    The description of the data ingestion job.
  - failureReasons (list) --
    
    A list of reasons that the data ingestion job failed.
    - (string) --
  - ingestionJobId (string) --
    
    The unique identifier of the data ingestion job.
  - knowledgeBaseId (string) --
    
    The unique identifier of the knowledge base for the data ingestion job.
  - startedAt (datetime) --
    
    The time the data ingestion job started.
    
    If you stop a data ingestion job, the startedAt time is the time the job was started before the job was stopped.
  - statistics (dict) --
    
    Contains statistics about the data ingestion job.
    - numberOfDocumentsDeleted (integer) --
      
      The number of source documents that were deleted.
    - numberOfDocumentsFailed (integer) --
      
      The number of source documents that failed to be ingested.
    - numberOfDocumentsScanned (integer) --
      
      The total number of source documents that were scanned. Includes new, updated, and unchanged documents.
    - numberOfMetadataDocumentsModified (integer) --
      
      The number of metadata files that were updated or deleted.
    - numberOfMetadataDocumentsScanned (integer) --
      
      The total number of metadata files that were scanned. Includes new, updated, and unchanged files.
    - numberOfModifiedDocumentsIndexed (integer) --
      
      The number of modified source documents in the data source that were successfully indexed.
    - numberOfNewDocumentsIndexed (integer) --
      
      The number of new source documents in the data source that were successfully indexed.
  - status (string) --
    
    The status of the data ingestion job.
  - updatedAt (datetime) --
    
    The time the data ingestion job was last updated.
    
    If you stop a data ingestion job, the updatedAt time is the time the job was stopped.
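
As a rough illustration of the stop flow described above, the sketch below requests a stop and then polls GetIngestionJob until the job settles. The helper name stop_and_wait, the polling parameters, and the set of statuses treated as terminal are assumptions made for this example; they are not part of the API.

```python
import time

# Assumption for this sketch: once a job reaches one of these statuses it no
# longer changes. The API reference lists the values but does not say this.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def stop_and_wait(client, knowledge_base_id, data_source_id, ingestion_job_id,
                  poll_seconds=5.0, max_polls=60):
    """Request a stop, then poll GetIngestionJob until the job settles.

    `client` is expected to behave like a boto3 "bedrock-agent" client, i.e.
    expose stop_ingestion_job and get_ingestion_job with the documented shapes.
    Returns the last 'ingestionJob' dict seen, which may still be non-terminal
    if max_polls is exhausted.
    """
    ids = {
        "knowledgeBaseId": knowledge_base_id,
        "dataSourceId": data_source_id,
        "ingestionJobId": ingestion_job_id,
    }
    job = client.stop_ingestion_job(**ids)["ingestionJob"]
    polls = 0
    while job["status"] not in TERMINAL_STATUSES and polls < max_polls:
        time.sleep(poll_seconds)
        job = client.get_ingestion_job(**ids)["ingestionJob"]
        polls += 1
    return job
```

With boto3 installed and credentials configured, a client can be created with boto3.client("bedrock-agent") and passed as the first argument. The identifiers come from your own knowledge base, data source, and ingestion job.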

GetIngestionJob (updated)

Changes (response)

{'ingestionJob': {'status': {'STOPPED', 'STOPPING'}}}

Gets information about a data ingestion job. Data sources are ingested into your knowledge base so that Large Language Models (LLMs) can use your data.

See also: AWS API Documentation

Request Syntax

client.get_ingestion_job(
    dataSourceId='string',
    ingestionJobId='string',
    knowledgeBaseId='string'
)

type dataSourceId:

string

param dataSourceId:

[REQUIRED]

The unique identifier of the data source for the data ingestion job you want to get information on.

type ingestionJobId:

string

param ingestionJobId:

[REQUIRED]

The unique identifier of the data ingestion job you want to get information on.

type knowledgeBaseId:

string

param knowledgeBaseId:

[REQUIRED]

The unique identifier of the knowledge base for the data ingestion job you want to get information on.

rtype:

dict

returns:

Response Syntax

{
    'ingestionJob': {
        'dataSourceId': 'string',
        'description': 'string',
        'failureReasons': [
            'string',
        ],
        'ingestionJobId': 'string',
        'knowledgeBaseId': 'string',
        'startedAt': datetime(2015, 1, 1),
        'statistics': {
            'numberOfDocumentsDeleted': 123,
            'numberOfDocumentsFailed': 123,
            'numberOfDocumentsScanned': 123,
            'numberOfMetadataDocumentsModified': 123,
            'numberOfMetadataDocumentsScanned': 123,
            'numberOfModifiedDocumentsIndexed': 123,
            'numberOfNewDocumentsIndexed': 123
        },
        'status': 'STARTING'|'IN_PROGRESS'|'COMPLETE'|'FAILED'|'STOPPING'|'STOPPED',
        'updatedAt': datetime(2015, 1, 1)
    }
}

Response Structure

(dict) --
- ingestionJob (dict) --
  
  Contains details about the data ingestion job.
  - dataSourceId (string) --
    
    The unique identifier of the data source for the data ingestion job.
  - description (string) --
    
    The description of the data ingestion job.
  - failureReasons (list) --
    
    A list of reasons that the data ingestion job failed.
    - (string) --
  - ingestionJobId (string) --
    
    The unique identifier of the data ingestion job.
  - knowledgeBaseId (string) --
    
    The unique identifier of the knowledge base for the data ingestion job.
  - startedAt (datetime) --
    
    The time the data ingestion job started.
    
    If you stop a data ingestion job, the startedAt time is the time the job was started before the job was stopped.
  - statistics (dict) --
    
    Contains statistics about the data ingestion job.
    - numberOfDocumentsDeleted (integer) --
      
      The number of source documents that were deleted.
    - numberOfDocumentsFailed (integer) --
      
      The number of source documents that failed to be ingested.
    - numberOfDocumentsScanned (integer) --
      
      The total number of source documents that were scanned. Includes new, updated, and unchanged documents.
    - numberOfMetadataDocumentsModified (integer) --
      
      The number of metadata files that were updated or deleted.
    - numberOfMetadataDocumentsScanned (integer) --
      
      The total number of metadata files that were scanned. Includes new, updated, and unchanged files.
    - numberOfModifiedDocumentsIndexed (integer) --
      
      The number of modified source documents in the data source that were successfully indexed.
    - numberOfNewDocumentsIndexed (integer) --
      
      The number of new source documents in the data source that were successfully indexed.
  - status (string) --
    
    The status of the data ingestion job.
  - updatedAt (datetime) --
    
    The time the data ingestion job was last updated.
    
    If you stop a data ingestion job, the updatedAt time is the time the job was stopped.
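
To show how the response structure above might be read, here is a minimal sketch that condenses a GetIngestionJob-shaped response into a few fields. The function name summarize_ingestion_job and the choice of fields are assumptions for illustration. The elapsed value relies on the documented behavior that updatedAt is the stop time for a stopped job.

```python
def summarize_ingestion_job(response):
    """Return a small dict summarizing a GetIngestionJob-shaped response."""
    job = response["ingestionJob"]
    stats = job.get("statistics", {})
    # New and modified documents are both counted as successfully indexed.
    indexed = (stats.get("numberOfNewDocumentsIndexed", 0)
               + stats.get("numberOfModifiedDocumentsIndexed", 0))
    return {
        "status": job["status"],
        # STOPPING and STOPPED are the two status values added in this release.
        "was_stopped": job["status"] in ("STOPPING", "STOPPED"),
        # For a stopped job this is the time between start and stop.
        "elapsed": job["updatedAt"] - job["startedAt"],
        "documents_indexed": indexed,
        "documents_failed": stats.get("numberOfDocumentsFailed", 0),
        "failure_reasons": list(job.get("failureReasons", [])),
    }
```

The function only reads the dictionary, so it can be applied to the return value of client.get_ingestion_job(...) or to a saved response.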

ListIngestionJobs (updated)

Changes (response)

{'ingestionJobSummaries': {'status': {'STOPPED', 'STOPPING'}}}

Lists the data ingestion jobs for a data source. The list also includes information about each job.

See also: AWS API Documentation

Request Syntax

client.list_ingestion_jobs(
    dataSourceId='string',
    filters=[
        {
            'attribute': 'STATUS',
            'operator': 'EQ',
            'values': [
                'string',
            ]
        },
    ],
    knowledgeBaseId='string',
    maxResults=123,
    nextToken='string',
    sortBy={
        'attribute': 'STATUS'|'STARTED_AT',
        'order': 'ASCENDING'|'DESCENDING'
    }
)

type dataSourceId:

string

param dataSourceId:

[REQUIRED]

The unique identifier of the data source for the list of data ingestion jobs.

type filters:

list

param filters:

Contains the filters to apply to the list of data ingestion jobs.

(dict) --

Defines a single filter to apply to the data.
- attribute (string) -- [REQUIRED]
  
  The name of the field or attribute to apply the filter to.
- operator (string) -- [REQUIRED]
  
  The operation to apply to the field or attribute.
- values (list) -- [REQUIRED]
  
  A list of values that belong to the field or attribute.
  - (string) --

type knowledgeBaseId:

string

param knowledgeBaseId:

[REQUIRED]

The unique identifier of the knowledge base for the list of data ingestion jobs.

type maxResults:

integer

param maxResults:

The maximum number of results to return in the response. If the total number of results is greater than this value, use the token returned in the response in the nextToken field when making another request to return the next batch of results.

type nextToken:

string

param nextToken:

If the total number of results is greater than the maxResults value provided in the request, enter the token returned in the nextToken field in the response in this field to return the next batch of results.

type sortBy:

dict

param sortBy:

Contains details about how to sort the data.

- attribute (string) -- [REQUIRED]
  
  The name of the field or attribute to sort the data by.
- order (string) -- [REQUIRED]
  
  The order for sorting the data.

rtype:

dict

returns:

Response Syntax

{
    'ingestionJobSummaries': [
        {
            'dataSourceId': 'string',
            'description': 'string',
            'ingestionJobId': 'string',
            'knowledgeBaseId': 'string',
            'startedAt': datetime(2015, 1, 1),
            'statistics': {
                'numberOfDocumentsDeleted': 123,
                'numberOfDocumentsFailed': 123,
                'numberOfDocumentsScanned': 123,
                'numberOfMetadataDocumentsModified': 123,
                'numberOfMetadataDocumentsScanned': 123,
                'numberOfModifiedDocumentsIndexed': 123,
                'numberOfNewDocumentsIndexed': 123
            },
            'status': 'STARTING'|'IN_PROGRESS'|'COMPLETE'|'FAILED'|'STOPPING'|'STOPPED',
            'updatedAt': datetime(2015, 1, 1)
        },
    ],
    'nextToken': 'string'
}

Response Structure

(dict) --
- ingestionJobSummaries (list) --
  
  A list of data ingestion jobs with information about each job.
  - (dict) --
    
    Contains details about a data ingestion job.
    - dataSourceId (string) --
      
      The unique identifier of the data source for the data ingestion job.
    - description (string) --
      
      The description of the data ingestion job.
    - ingestionJobId (string) --
      
      The unique identifier of the data ingestion job.
    - knowledgeBaseId (string) --
      
      The unique identifier of the knowledge base for the data ingestion job.
    - startedAt (datetime) --
      
      The time the data ingestion job started.
    - statistics (dict) --
      
      Contains statistics for the data ingestion job.
      - numberOfDocumentsDeleted (integer) --
        
        The number of source documents that were deleted.
      - numberOfDocumentsFailed (integer) --
        
        The number of source documents that failed to be ingested.
      - numberOfDocumentsScanned (integer) --
        
        The total number of source documents that were scanned. Includes new, updated, and unchanged documents.
      - numberOfMetadataDocumentsModified (integer) --
        
        The number of metadata files that were updated or deleted.
      - numberOfMetadataDocumentsScanned (integer) --
        
        The total number of metadata files that were scanned. Includes new, updated, and unchanged files.
      - numberOfModifiedDocumentsIndexed (integer) --
        
        The number of modified source documents in the data source that were successfully indexed.
      - numberOfNewDocumentsIndexed (integer) --
        
        The number of new source documents in the data source that were successfully indexed.
    - status (string) --
      
      The status of the data ingestion job.
    - updatedAt (datetime) --
      
      The time the data ingestion job was last updated.
- nextToken (string) --
  
  If the total number of results is greater than the maxResults value provided in the request, pass this token in the nextToken field of another request to return the next batch of results.
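
The nextToken handling and the STATUS filter described above can be sketched as a small generator. The name iter_ingestion_jobs and its parameters are hypothetical. The sketch passes a single value to the EQ filter and sorts by STARTED_AT, which is one reasonable choice rather than a requirement.

```python
def iter_ingestion_jobs(client, knowledge_base_id, data_source_id,
                        status=None, page_size=None):
    """Yield ingestion job summaries, following nextToken across pages.

    `client` is expected to behave like a boto3 "bedrock-agent" client.
    `status` optionally restricts results to one status, e.g. "STOPPED".
    """
    request = {
        "knowledgeBaseId": knowledge_base_id,
        "dataSourceId": data_source_id,
        "sortBy": {"attribute": "STARTED_AT", "order": "DESCENDING"},
    }
    if status is not None:
        request["filters"] = [
            {"attribute": "STATUS", "operator": "EQ", "values": [status]}
        ]
    if page_size is not None:
        request["maxResults"] = page_size
    while True:
        page = client.list_ingestion_jobs(**request)
        yield from page.get("ingestionJobSummaries", [])
        token = page.get("nextToken")
        if not token:
            break  # no further pages
        request["nextToken"] = token
```

For example, list(iter_ingestion_jobs(client, kb_id, ds_id, status="STOPPED")) would collect every stopped job for a data source, where kb_id and ds_id are your own identifiers.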

StartIngestionJob (updated)

Changes (response)

{'ingestionJob': {'status': {'STOPPED', 'STOPPING'}}}

Begins a data ingestion job. Data sources are ingested into your knowledge base so that Large Language Models (LLMs) can use your data.

See also: AWS API Documentation

Request Syntax

client.start_ingestion_job(
    clientToken='string',
    dataSourceId='string',
    description='string',
    knowledgeBaseId='string'
)

type clientToken:

string

param clientToken:

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.

This field is autopopulated if not provided.

type dataSourceId:

string

param dataSourceId:

[REQUIRED]

The unique identifier of the data source you want to ingest into your knowledge base.

type description:

string

param description:

A description of the data ingestion job.

type knowledgeBaseId:

string

param knowledgeBaseId:

[REQUIRED]

The unique identifier of the knowledge base for the data ingestion job.

rtype:

dict

returns:

Response Syntax

{
    'ingestionJob': {
        'dataSourceId': 'string',
        'description': 'string',
        'failureReasons': [
            'string',
        ],
        'ingestionJobId': 'string',
        'knowledgeBaseId': 'string',
        'startedAt': datetime(2015, 1, 1),
        'statistics': {
            'numberOfDocumentsDeleted': 123,
            'numberOfDocumentsFailed': 123,
            'numberOfDocumentsScanned': 123,
            'numberOfMetadataDocumentsModified': 123,
            'numberOfMetadataDocumentsScanned': 123,
            'numberOfModifiedDocumentsIndexed': 123,
            'numberOfNewDocumentsIndexed': 123
        },
        'status': 'STARTING'|'IN_PROGRESS'|'COMPLETE'|'FAILED'|'STOPPING'|'STOPPED',
        'updatedAt': datetime(2015, 1, 1)
    }
}

Response Structure

(dict) --
- ingestionJob (dict) --
  
  Contains information about the data ingestion job.
  - dataSourceId (string) --
    
    The unique identifier of the data source for the data ingestion job.
  - description (string) --
    
    The description of the data ingestion job.
  - failureReasons (list) --
    
    A list of reasons that the data ingestion job failed.
    - (string) --
  - ingestionJobId (string) --
    
    The unique identifier of the data ingestion job.
  - knowledgeBaseId (string) --
    
    The unique identifier of the knowledge base for the data ingestion job.
  - startedAt (datetime) --
    
    The time the data ingestion job started.
    
    If you stop a data ingestion job, the startedAt time is the time the job was started before the job was stopped.
  - statistics (dict) --
    
    Contains statistics about the data ingestion job.
    - numberOfDocumentsDeleted (integer) --
      
      The number of source documents that were deleted.
    - numberOfDocumentsFailed (integer) --
      
      The number of source documents that failed to be ingested.
    - numberOfDocumentsScanned (integer) --
      
      The total number of source documents that were scanned. Includes new, updated, and unchanged documents.
    - numberOfMetadataDocumentsModified (integer) --
      
      The number of metadata files that were updated or deleted.
    - numberOfMetadataDocumentsScanned (integer) --
      
      The total number of metadata files that were scanned. Includes new, updated, and unchanged files.
    - numberOfModifiedDocumentsIndexed (integer) --
      
      The number of modified source documents in the data source that were successfully indexed.
    - numberOfNewDocumentsIndexed (integer) --
      
      The number of new source documents in the data source that were successfully indexed.
  - status (string) --
    
    The status of the data ingestion job.
  - updatedAt (datetime) --
    
    The time the data ingestion job was last updated.
    
    If you stop a data ingestion job, the updatedAt time is the time the job was stopped.
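
Since a stopped job can be followed by a new StartIngestionJob request to ingest the remaining data, the sketch below wraps that call. The helper name restart_ingestion is hypothetical. Generating the clientToken from a UUID is an assumption made for the example; the field is autopopulated when omitted.

```python
import uuid


def restart_ingestion(client, knowledge_base_id, data_source_id,
                      description=None, client_token=None):
    """Start a new ingestion job, e.g. after an earlier job was stopped.

    `client` is expected to behave like a boto3 "bedrock-agent" client.
    Reusing the same client_token on a retry lets the service recognize the
    repeated request instead of starting a second job.
    """
    request = {
        "knowledgeBaseId": knowledge_base_id,
        "dataSourceId": data_source_id,
        # Assumption: a UUID string is an acceptable idempotency token.
        "clientToken": client_token or str(uuid.uuid4()),
    }
    if description:
        request["description"] = description
    return client.start_ingestion_job(**request)["ingestionJob"]
```

A caller that retries on network errors would generate the token once and pass it as client_token on every attempt, so that a repeated request is not treated as a new job.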