2020/12/08 - AWSKendraFrontendService - 5 updated api methods
Changes Update kendra client to latest version
{'Configuration': {'GoogleDriveConfiguration': {'ExcludeMimeTypes': ['string'], 'ExcludeSharedDrives': ['string'], 'ExcludeUserAccounts': ['string'], 'ExclusionPatterns': ['string'], 'FieldMappings': [{'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string'}], 'InclusionPatterns': ['string'], 'SecretArn': 'string'}}, 'Type': {'GOOGLEDRIVE'}}
Creates a data source that you use with an Amazon Kendra index.
You specify a name, data source connector type and description for your data source. You also specify configuration information such as document metadata (author, source URI, and so on) and user context information.
CreateDataSource is a synchronous operation. The operation returns 200 if the data source was successfully created. Otherwise, an exception is raised.
See also: AWS API Documentation
Request Syntax
client.create_data_source( Name='string', IndexId='string', Type='S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE', Configuration={ 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string', 'DisableLocalGroups': True|False }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' }, 'SqlConfiguration': { 'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE' } }, 'SalesforceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'StandardObjectConfigurations': [ { 'Name': 
'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER', 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ], 'KnowledgeArticleConfiguration': { 'IncludedStates': [ 'DRAFT'|'PUBLISHED'|'ARCHIVED', ], 'StandardKnowledgeArticleTypeConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'CustomKnowledgeArticleTypeConfigurations': [ { 'Name': 'string', 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ] }, 'ChatterFeedConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'IncludeFilterTypes': [ 'ACTIVE_USER'|'STANDARD_USER', ] }, 'CrawlAttachments': True|False, 'StandardObjectAttachmentConfiguration': { 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ] }, 'OneDriveConfiguration': { 'TenantDomain': 'string', 'SecretArn': 'string', 'OneDriveUsers': { 'OneDriveUserList': [ 'string', ], 'OneDriveUserS3Path': { 'Bucket': 'string', 'Key': 'string' } }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DisableLocalGroups': True|False }, 'ServiceNowConfiguration': { 
'HostUrl': 'string', 'SecretArn': 'string', 'ServiceNowBuildVersion': 'LONDON'|'OTHERS', 'KnowledgeArticleConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'ServiceCatalogConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] } }, 'ConfluenceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'Version': 'CLOUD'|'SERVER', 'SpaceConfiguration': { 'CrawlPersonalSpaces': True|False, 'CrawlArchivedSpaces': True|False, 'IncludeSpaces': [ 'string', ], 'ExcludeSpaces': [ 'string', ], 'SpaceFieldMappings': [ { 'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'PageConfiguration': { 'PageFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'BlogConfiguration': { 'BlogFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'AttachmentConfiguration': { 'CrawlAttachments': True|False, 'AttachmentFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 
'string', 'IndexFieldName': 'string' }, ] }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ] }, 'GoogleDriveConfiguration': { 'SecretArn': 'string', 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ExcludeMimeTypes': [ 'string', ], 'ExcludeUserAccounts': [ 'string', ], 'ExcludeSharedDrives': [ 'string', ] } }, Description='string', Schedule='string', RoleArn='string', Tags=[ { 'Key': 'string', 'Value': 'string' }, ], ClientToken='string' )
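As a rough illustration of the request syntax above, the following sketch assembles keyword arguments for a GOOGLEDRIVE data source. The helper name build_google_drive_request, the index ID, the ARNs, and the MIME type are hypothetical placeholders and not part of the API. The sketch only builds the dict; passing it to a real client is left to the caller.

```python
def build_google_drive_request(name, index_id, secret_arn, role_arn,
                               exclude_mime_types=None):
    """Assemble keyword arguments for create_data_source (Type GOOGLEDRIVE)."""
    drive_config = {"SecretArn": secret_arn}
    if exclude_mime_types:
        # Optional list of MIME types to skip, per the request syntax.
        drive_config["ExcludeMimeTypes"] = list(exclude_mime_types)
    return {
        "Name": name,
        "IndexId": index_id,
        "Type": "GOOGLEDRIVE",
        "Configuration": {"GoogleDriveConfiguration": drive_config},
        "RoleArn": role_arn,
    }


# All identifiers below are made-up placeholder values.
kwargs = build_google_drive_request(
    name="example-drive-source",
    index_id="00000000-0000-0000-0000-000000000000",
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:example",
    role_arn="arn:aws:iam::111122223333:role/example-kendra-role",
    exclude_mime_types=["application/vnd.google-apps.form"],
)
# With a boto3 Kendra client this would be: client.create_data_source(**kwargs)
```

Keeping the request as a plain dict makes it easy to inspect or log before the call is made.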
Name (string) -- [REQUIRED]
A unique name for the data source. A data source name can't be changed without deleting and recreating the data source.
IndexId (string) -- [REQUIRED]
The identifier of the index that should be associated with this data source.
Type (string) -- [REQUIRED]
The type of repository that contains the data source.
Configuration (dict) --
The connector configuration information that is required to access the repository.
You can't specify the Configuration parameter when the Type parameter is set to CUSTOM. If you do, you receive a ValidationException exception.
The Configuration parameter is required for all other data sources.
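The rule above (no Configuration for CUSTOM, Configuration required otherwise) can be checked locally before a request is sent. This is a minimal sketch: check_configuration is a hypothetical helper, and it raises a plain ValueError rather than the service's ValidationException.

```python
def check_configuration(source_type, configuration=None):
    """Local pre-check mirroring the Type/Configuration rule.

    Raises ValueError on a mismatch. The service itself reports a
    ValidationException; this helper is only an illustrative stand-in.
    """
    if source_type == "CUSTOM" and configuration is not None:
        raise ValueError("Configuration must be omitted when Type is CUSTOM")
    if source_type != "CUSTOM" and not configuration:
        raise ValueError(f"Configuration is required when Type is {source_type}")
    return True
```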
S3Configuration (dict) --
Provides information to create a data source connector for a document repository in an Amazon S3 bucket.
BucketName (string) -- [REQUIRED]
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
InclusionPatterns (list) --
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the Amazon S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
KeyPath (string) --
Path to the Amazon S3 bucket that contains the ACL files.
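The precedence between S3 inclusion prefixes, inclusion patterns, and exclusion patterns described above can be approximated locally. The sketch below is an assumption-laden illustration, not the service's matcher: it uses Python's fnmatch for glob matching, treats a key as a candidate when it matches any prefix or any inclusion pattern, and treats every key as a candidate when neither is given. The would_index name is hypothetical.

```python
from fnmatch import fnmatchcase


def would_index(key, inclusion_prefixes=(), inclusion_patterns=(),
                exclusion_patterns=()):
    """Approximate whether an S3 object key would be indexed."""
    # An exclusion match always wins over any inclusion match.
    if any(fnmatchcase(key, pattern) for pattern in exclusion_patterns):
        return False
    # Assumption: with no inclusion filters, every key is a candidate.
    if not inclusion_prefixes and not inclusion_patterns:
        return True
    return (any(key.startswith(prefix) for prefix in inclusion_prefixes)
            or any(fnmatchcase(key, pattern) for pattern in inclusion_patterns))
```

For example, with inclusion_prefixes=["docs/"] and exclusion_patterns=["*.tmp"], the key "docs/a.tmp" is rejected even though it sits under an included prefix.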
SharePointConfiguration (dict) --
Provides information necessary to create a data source connector for a Microsoft SharePoint site.
SharePointVersion (string) -- [REQUIRED]
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) -- [REQUIRED]
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE.
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, it may take longer for Amazon Kendra to use the change log than to determine the changed documents using the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
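A SharePointConfiguration dict with the required fields and optional field mappings might be assembled as in the sketch below. The helper name sharepoint_configuration and the idea of passing mappings as a source-to-index dict are illustrative assumptions. The index fields named in the mappings are expected to exist already.

```python
def sharepoint_configuration(urls, secret_arn, field_mappings=None,
                             use_change_log=False):
    """Build a SharePointConfiguration dict from simple Python values."""
    config = {
        "SharePointVersion": "SHAREPOINT_ONLINE",
        "Urls": list(urls),
        "SecretArn": secret_arn,
        "UseChangeLog": use_change_log,
    }
    if field_mappings:
        # Each entry maps a SharePoint attribute to an existing index field.
        config["FieldMappings"] = [
            {"DataSourceFieldName": source, "IndexFieldName": target}
            for source, target in field_mappings.items()
        ]
    return config
```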
DatabaseConfiguration (dict) --
Provides information necessary to create a data source connector for a database.
DatabaseEngineType (string) -- [REQUIRED]
The type of database engine that runs the database.
ConnectionConfiguration (dict) -- [REQUIRED]
The information necessary to connect to a database.
DatabaseHost (string) -- [REQUIRED]
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) -- [REQUIRED]
The port that the database uses for connections.
DatabaseName (string) -- [REQUIRED]
The name of the database containing the document data.
TableName (string) -- [REQUIRED]
The name of the table that contains the document data.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) -- [REQUIRED]
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) -- [REQUIRED]
The column that provides the document's unique identifier.
DocumentDataColumnName (string) -- [REQUIRED]
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChangeDetectingColumns (list) -- [REQUIRED]
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) -- [REQUIRED]
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
SqlConfiguration (dict) --
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
QueryIdentifiersEnclosingOption (string) --
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lowercase in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert their case.
For MySQL databases, you must enable the ansi_quotes option when you set this field to DOUBLE_QUOTES.
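The DatabaseConfiguration constraints above (one to five change-detecting columns, and the DOUBLE_QUOTES or NONE quoting option) can be captured in a small builder. This is a sketch under assumptions: database_configuration is a hypothetical helper, and the connection argument is expected to already contain the ConnectionConfiguration keys.

```python
def database_configuration(engine, connection, id_column, data_column,
                           change_columns, quote_identifiers=False):
    """Build a DatabaseConfiguration dict with basic local validation."""
    change_columns = list(change_columns)
    if not 1 <= len(change_columns) <= 5:
        raise ValueError("ChangeDetectingColumns takes one to five columns")
    return {
        "DatabaseEngineType": engine,
        "ConnectionConfiguration": dict(connection),
        "ColumnConfiguration": {
            "DocumentIdColumnName": id_column,
            "DocumentDataColumnName": data_column,
            "ChangeDetectingColumns": change_columns,
        },
        "SqlConfiguration": {
            # DOUBLE_QUOTES preserves identifier case on PostgreSQL.
            "QueryIdentifiersEnclosingOption":
                "DOUBLE_QUOTES" if quote_identifiers else "NONE",
        },
    }
```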
SalesforceConfiguration (dict) --
Provides configuration information for data sources that connect to a Salesforce site.
ServerUrl (string) -- [REQUIRED]
The instance URL for the Salesforce site that you want to index.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
authenticationUrl - The OAUTH endpoint that Amazon Kendra connects to in order to get an OAUTH token.
consumerKey - The application public key generated when you created your Salesforce application.
consumerSecret - The application private key generated when you created your Salesforce application.
password - The password associated with the user logging in to the Salesforce instance.
securityToken - The token associated with the user account logging in to the Salesforce instance.
username - The user name of the user logging in to the Salesforce instance.
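Before storing the Salesforce secret, it can help to confirm that the JSON structure carries all six keys listed above. The following sketch checks a secret string locally; the helper name missing_salesforce_keys is hypothetical, and fetching or storing the secret in AWS Secrets Manager is outside its scope.

```python
import json

REQUIRED_SALESFORCE_KEYS = {
    "authenticationUrl", "consumerKey", "consumerSecret",
    "password", "securityToken", "username",
}


def missing_salesforce_keys(secret_string):
    """Return the required keys absent from a JSON secret string, sorted."""
    secret = json.loads(secret_string)
    return sorted(REQUIRED_SALESFORCE_KEYS - set(secret))
```

An empty list means every required key is present.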
StandardObjectConfigurations (list) --
Specifies the Salesforce standard objects that Amazon Kendra indexes.
(dict) --
Specifies configuration information for indexing a single standard object.
Name (string) -- [REQUIRED]
The name of the standard object.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field in the standard object table that contains the document contents.
DocumentTitleFieldName (string) --
The name of the field in the standard object table that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the standard object to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
KnowledgeArticleConfiguration (dict) --
Specifies configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
IncludedStates (list) -- [REQUIRED]
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
(string) --
StandardKnowledgeArticleTypeConfiguration (dict) --
Provides configuration information for standard Salesforce knowledge articles.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the knowledge article to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
CustomKnowledgeArticleTypeConfigurations (list) --
Provides configuration information for custom Salesforce knowledge articles.
(dict) --
Provides configuration information for indexing Salesforce custom articles.
Name (string) -- [REQUIRED]
The name of the configuration.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field in the custom knowledge article that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field in the custom knowledge article that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the custom knowledge article to fields in the Amazon Kendra index.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChatterFeedConfiguration (dict) --
Specifies configuration information for Salesforce chatter feeds.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body column.
DocumentTitleFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title column.
FieldMappings (list) --
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
IncludeFilterTypes (list) --
Filters the documents in the feed based on the status of the user. When you specify ACTIVE_USER, only documents from users who have an active account are indexed. When you specify STANDARD_USER, only documents for Salesforce standard users are indexed. You can specify both.
(string) --
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
StandardObjectAttachmentConfiguration (dict) --
Provides configuration information for processing attachments to Salesforce standard objects.
DocumentTitleFieldName (string) --
The name of the field used for the document title.
FieldMappings (list) --
One or more objects that map fields in attachments to Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
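The interaction between IncludeAttachmentFilePatterns and ExcludeAttachmentFilePatterns for Salesforce attachments can be sketched as below. Two points are assumptions rather than documented behavior: patterns are matched with re.search (the text does not say whether a partial or a full match is required), and an empty inclusion list is taken to include every file. The function name attachment_indexed is hypothetical.

```python
import re


def attachment_indexed(file_name, include_patterns=(), exclude_patterns=()):
    """Approximate whether an attached file would be indexed."""
    # A file matching both lists is excluded, so check exclusions first.
    if any(re.search(pattern, file_name) for pattern in exclude_patterns):
        return False
    # Assumption: no inclusion patterns means every file is a candidate.
    if not include_patterns:
        return True
    return any(re.search(pattern, file_name) for pattern in include_patterns)
```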
OneDriveConfiguration (dict) --
Provides configuration for data sources that connect to Microsoft OneDrive.
TenantDomain (string) -- [REQUIRED]
The Azure Active Directory domain of the organization.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password should be the application key for the OneDrive application.
OneDriveUsers (dict) -- [REQUIRED]
A list of user accounts whose documents should be indexed.
OneDriveUserList (list) --
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path field to specify the location of a file containing a list of users.
(string) --
OneDriveUserS3Path (dict) --
The S3 bucket location of a file containing a list of users whose documents should be indexed.
Bucket (string) -- [REQUIRED]
The name of the S3 bucket that contains the file.
Key (string) -- [REQUIRED]
The name of the file.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the pattern are included in the index. Documents that don't match the pattern are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The inclusion pattern is applied to the file name.
(string) --
ExclusionPatterns (list) --
List of regular expressions applied to documents. Items that match the exclusion pattern are not indexed. If you provide both an inclusion pattern and an exclusion pattern, any item that matches the exclusion pattern isn't indexed.
The exclusion pattern is applied to the file name.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft OneDrive fields to custom fields in the Amazon Kendra index. You must first create the index fields before you map OneDrive fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
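For the OneDriveUsers field described above, the choice between an inline OneDriveUserList and an OneDriveUserS3Path file depends on the number of users. A minimal sketch of that decision follows. The helper name one_drive_users is hypothetical, and the sketch assumes the user list file has already been uploaded to the given bucket and key.

```python
def one_drive_users(users, s3_bucket=None, s3_key=None):
    """Build the OneDriveUsers dict, switching to S3 above 100 users."""
    users = list(users)
    if len(users) <= 100:
        return {"OneDriveUserList": users}
    if not (s3_bucket and s3_key):
        raise ValueError(
            "More than 100 users: provide the S3 location of a user list file")
    return {"OneDriveUserS3Path": {"Bucket": s3_bucket, "Key": s3_key}}
```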
ServiceNowConfiguration (dict) --
Provides configuration for data sources that connect to ServiceNow instances.
HostUrl (string) -- [REQUIRED]
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the AWS Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance.
ServiceNowBuildVersion (string) -- [REQUIRED]
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON release, use OTHERS.
KnowledgeArticleConfiguration (dict) --
Provides configuration information for crawling knowledge articles in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to knowledge articles.
IncludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that don't match the inclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
ExcludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that match the exclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
DocumentDataFieldName (string) -- [REQUIRED]
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ServiceCatalogConfiguration (dict) --
Provides configuration information for crawling service catalogs in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should crawl attachments to the service catalog items.
IncludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are included in the index.
(string) --
ExcludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are excluded from the index.
(string) --
DocumentDataFieldName (string) -- [REQUIRED]
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
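The ServiceNowConfiguration fields above can be put together as in the following sketch, which derives HostUrl from an instance name and maps any release other than LONDON to OTHERS. The helper name service_now_configuration and its arguments are illustrative assumptions. Only the knowledge article section is filled in.

```python
def service_now_configuration(instance, secret_arn, release, data_field):
    """Build a minimal ServiceNowConfiguration dict."""
    is_london = release.strip().upper() == "LONDON"
    return {
        # Host endpoint format: {instance}.service-now.com
        "HostUrl": f"{instance}.service-now.com",
        "SecretArn": secret_arn,
        "ServiceNowBuildVersion": "LONDON" if is_london else "OTHERS",
        "KnowledgeArticleConfiguration": {
            "DocumentDataFieldName": data_field,
        },
    }
```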
ConfluenceConfiguration (dict) --
Provides configuration information for connecting to a Confluence data source.
ServerUrl (string) -- [REQUIRED]
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/. You can also use an IP address, for example, https://192.168.1.113/.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Confluence server. The secret must contain a JSON structure with the following keys:
username - The user name or email address of a user with administrative privileges for the Confluence server.
password - The password associated with the user logging in to the Confluence server.
Version (string) -- [REQUIRED]
Specifies the version of the Confluence installation that you are connecting to.
SpaceConfiguration (dict) --
Specifies configuration information for indexing Confluence spaces.
CrawlPersonalSpaces (boolean) --
Specifies whether Amazon Kendra should index personal spaces. Users can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
CrawlArchivedSpaces (boolean) --
Specifies whether Amazon Kendra should index archived spaces.
IncludeSpaces (list) --
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces and the ExcludeSpaces list, the space is excluded.
(string) --
ExcludeSpaces (list) --
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces and the IncludeSpaces list, the space is excluded.
(string) --
SpaceFieldMappings (list) --
Defines how space metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the SpaceFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
PageConfiguration (dict) --
Specifies configuration information for indexing Confluence pages.
PageFieldMappings (list) --
Defines how page metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the PageFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
BlogConfiguration (dict) --
Specifies configuration information for indexing Confluence blogs.
BlogFieldMappings (list) --
Defines how blog metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the BlogFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a blog field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
AttachmentConfiguration (dict) --
Specifies configuration information for indexing attachments to Confluence blogs and pages.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra indexes attachments to the pages and blogs in the Confluence data source.
AttachmentFieldMappings (list) --
Defines how attachment metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the AttachmentFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
You must first create the index field using the UpdateIndex operation.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
VpcConfiguration (dict) --
Specifies the information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
InclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An inclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the patterns are included in the index. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, the item isn't included in the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An exclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the pattern are excluded from the index. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, the item isn't included in the index.
(string) --
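The precedence described for InclusionPatterns and ExclusionPatterns can be modeled locally before a data source is synchronized. The sketch below is only an approximation: the function name is_item_indexed is hypothetical, Python's re.search stands in for whatever regular expression matching Amazon Kendra applies, and treating an empty inclusion list as "include everything" is an assumption that the text above does not state.

```python
import re
from typing import Sequence


def is_item_indexed(
    url: str,
    inclusion_patterns: Sequence[str] = (),
    exclusion_patterns: Sequence[str] = (),
) -> bool:
    """Local model of the documented inclusion/exclusion precedence."""
    # An exclusion match always wins, even if an inclusion pattern also matches.
    if any(re.search(pattern, url) for pattern in exclusion_patterns):
        return False
    # Assumption: with no inclusion patterns, every non-excluded item is kept.
    if not inclusion_patterns:
        return True
    # Otherwise the item must match at least one inclusion pattern.
    return any(re.search(pattern, url) for pattern in inclusion_patterns)


# Placeholder URLs and patterns, for illustration only.
include = [r"/display/ENG/"]
exclude = [r"Draft$"]
design_indexed = is_item_indexed("https://wiki.example.com/display/ENG/Design", include, exclude)
draft_indexed = is_item_indexed("https://wiki.example.com/display/ENG/Draft", include, exclude)
```

Checking the exclusion list first mirrors the rule that an item matching both kinds of pattern isn't included in the index.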
GoogleDriveConfiguration (dict) --
Provides configuration for data sources that connect to Google Drive.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
InclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are included in the index from both shared drives and users' My Drives. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, it is excluded from the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are excluded from the index from both shared drives and users' My Drives. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, it is excluded from the index.
(string) --
FieldMappings (list) --
Defines the mapping between a field in Google Drive and an Amazon Kendra index field.
If you are using the console, you can define index fields when creating the mapping. If you are using the API, you must first create the field using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ExcludeMimeTypes (list) --
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
(string) --
ExcludeUserAccounts (list) --
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
(string) --
ExcludeSharedDrives (list) --
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
(string) --
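As an illustration of how the GoogleDriveConfiguration fields above fit together, the following sketch assembles the Configuration dictionary for a GOOGLEDRIVE data source. The helper name build_google_drive_configuration and all values (ARN, MIME type, email address) are placeholders rather than values taken from this changelog; only SecretArn is marked as required, so the helper leaves out any optional list that is empty.

```python
from typing import Any, Dict, List, Optional


def build_google_drive_configuration(
    secret_arn: str,
    inclusion_patterns: Optional[List[str]] = None,
    exclusion_patterns: Optional[List[str]] = None,
    exclude_mime_types: Optional[List[str]] = None,
    exclude_user_accounts: Optional[List[str]] = None,
    exclude_shared_drives: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble a Configuration dict that contains a GoogleDriveConfiguration."""
    config: Dict[str, Any] = {"SecretArn": secret_arn}
    optional_lists = {
        "InclusionPatterns": inclusion_patterns,
        "ExclusionPatterns": exclusion_patterns,
        "ExcludeMimeTypes": exclude_mime_types,
        "ExcludeUserAccounts": exclude_user_accounts,
        "ExcludeSharedDrives": exclude_shared_drives,
    }
    # Only send the optional lists that were actually provided.
    for key, value in optional_lists.items():
        if value:
            config[key] = value
    return {"GoogleDriveConfiguration": config}


# Placeholder values, for illustration only.
configuration = build_google_drive_configuration(
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:example-drive",
    exclude_mime_types=["application/pdf"],
    exclude_user_accounts=["archive@example.com"],
)
```

The resulting dictionary would be passed as the Configuration argument of create_data_source together with Type='GOOGLEDRIVE'.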
string
A description for the data source.
string
Sets how often Amazon Kendra checks the documents in your repository and updates the index. If you don't set a schedule, Amazon Kendra does not periodically update the index. You can call the StartDataSourceSyncJob operation to update the index.
You can't specify the Schedule parameter when the Type parameter is set to CUSTOM. If you do, you receive a ValidationException exception.
string
The Amazon Resource Name (ARN) of a role with permission to access the data source. For more information, see IAM Roles for Amazon Kendra.
You can't specify the RoleArn parameter when the Type parameter is set to CUSTOM. If you do, you receive a ValidationException exception.
The RoleArn parameter is required for all other data sources.
list
A list of key-value pairs that identify the data source. You can use the tags to identify and organize your resources and to control access to resources.
(dict) --
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
Key (string) -- [REQUIRED]
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
Value (string) -- [REQUIRED]
The value associated with the tag. The value may be an empty string but it can't be null.
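The tag constraints above can be checked on the client side before a request is sent. This is a rough sketch, not the service's own validation: validate_tags is a hypothetical helper, Python's \w and \s classes only approximate "Unicode letters, digits, white space", and rejecting an empty key is an assumption.

```python
import re
from typing import Dict, List, Optional

# Unicode letters and digits (\w also covers "_"), white space, and . : / = + - @
_ALLOWED_TAG_TEXT = re.compile(r"[\w\s.:/=+\-@]*")


def validate_tags(tags: List[Dict[str, Optional[str]]]) -> None:
    """Raise ValueError if the tag list breaks one of the documented rules."""
    seen_keys = set()
    for tag in tags:
        key, value = tag.get("Key"), tag.get("Value")
        if not key:
            # Assumption: an empty or missing key is rejected.
            raise ValueError("every tag needs a non-empty Key")
        if value is None:
            raise ValueError(f"tag {key!r}: Value may be empty but not null")
        for text in (key, value):
            if not _ALLOWED_TAG_TEXT.fullmatch(text):
                raise ValueError(f"tag {key!r}: unsupported character in {text!r}")
        # Keys are not case sensitive, so compare them case-folded.
        folded = key.casefold()
        if folded in seen_keys:
            raise ValueError(f"duplicate tag key: {key!r}")
        seen_keys.add(folded)


# Passes: an empty string is an acceptable value.
validate_tags([{"Key": "team", "Value": "search"}, {"Key": "env", "Value": ""}])
```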
string
A token that you provide to identify the request to create a data source. Multiple calls to the CreateDataSource operation with the same client token will create only one data source.
This field is autopopulated if not provided.
dict
Response Syntax
{ 'Id': 'string' }
Response Structure
(dict) --
Id (string) --
A unique identifier for the data source.
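A minimal sketch of a create call that relies on the client token follows. The wrapper create_data_source_with_token and its arguments are hypothetical; client is assumed to be a Kendra client such as the one returned by boto3.client('kendra'). Generating the token once and reusing it on every retry is what lets repeated calls resolve to a single data source.

```python
import uuid
from typing import Any, Dict


def new_client_token() -> str:
    """Generate a token once and reuse it for every retry of the same request."""
    return str(uuid.uuid4())


def create_data_source_with_token(
    client: Any,
    client_token: str,
    index_id: str,
    name: str,
    source_type: str,
    configuration: Dict[str, Any],
    role_arn: str,
) -> str:
    """Call create_data_source and return the Id of the data source."""
    response = client.create_data_source(
        IndexId=index_id,
        Name=name,
        Type=source_type,
        Configuration=configuration,
        RoleArn=role_arn,
        # Calls that share this token create only one data source.
        ClientToken=client_token,
    )
    return response["Id"]
```

If the token is omitted from the request entirely, the field is autopopulated, so a retry would not be tied to the original request.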
{'Configuration': {'GoogleDriveConfiguration': {'ExcludeMimeTypes': ['string'], 'ExcludeSharedDrives': ['string'], 'ExcludeUserAccounts': ['string'], 'ExclusionPatterns': ['string'], 'FieldMappings': [{'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string'}], 'InclusionPatterns': ['string'], 'SecretArn': 'string'}}, 'Type': {'GOOGLEDRIVE'}}
Gets information about an Amazon Kendra data source.
See also: AWS API Documentation
Request Syntax
client.describe_data_source( Id='string', IndexId='string' )
string
[REQUIRED]
The unique identifier of the data source to describe.
string
[REQUIRED]
The identifier of the index that contains the data source.
dict
Response Syntax
{ 'Id': 'string', 'IndexId': 'string', 'Name': 'string', 'Type': 'S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE', 'Configuration': { 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string', 'DisableLocalGroups': True|False }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' }, 'SqlConfiguration': { 'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE' } }, 'SalesforceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'StandardObjectConfigurations': [ { 'Name': 
'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER', 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ], 'KnowledgeArticleConfiguration': { 'IncludedStates': [ 'DRAFT'|'PUBLISHED'|'ARCHIVED', ], 'StandardKnowledgeArticleTypeConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'CustomKnowledgeArticleTypeConfigurations': [ { 'Name': 'string', 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ] }, 'ChatterFeedConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'IncludeFilterTypes': [ 'ACTIVE_USER'|'STANDARD_USER', ] }, 'CrawlAttachments': True|False, 'StandardObjectAttachmentConfiguration': { 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ] }, 'OneDriveConfiguration': { 'TenantDomain': 'string', 'SecretArn': 'string', 'OneDriveUsers': { 'OneDriveUserList': [ 'string', ], 'OneDriveUserS3Path': { 'Bucket': 'string', 'Key': 'string' } }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DisableLocalGroups': True|False }, 'ServiceNowConfiguration': { 
'HostUrl': 'string', 'SecretArn': 'string', 'ServiceNowBuildVersion': 'LONDON'|'OTHERS', 'KnowledgeArticleConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'ServiceCatalogConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] } }, 'ConfluenceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'Version': 'CLOUD'|'SERVER', 'SpaceConfiguration': { 'CrawlPersonalSpaces': True|False, 'CrawlArchivedSpaces': True|False, 'IncludeSpaces': [ 'string', ], 'ExcludeSpaces': [ 'string', ], 'SpaceFieldMappings': [ { 'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'PageConfiguration': { 'PageFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'BlogConfiguration': { 'BlogFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'AttachmentConfiguration': { 'CrawlAttachments': True|False, 'AttachmentFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 
'string', 'IndexFieldName': 'string' }, ] }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ] }, 'GoogleDriveConfiguration': { 'SecretArn': 'string', 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ExcludeMimeTypes': [ 'string', ], 'ExcludeUserAccounts': [ 'string', ], 'ExcludeSharedDrives': [ 'string', ] } }, 'CreatedAt': datetime(2015, 1, 1), 'UpdatedAt': datetime(2015, 1, 1), 'Description': 'string', 'Status': 'CREATING'|'DELETING'|'FAILED'|'UPDATING'|'ACTIVE', 'Schedule': 'string', 'RoleArn': 'string', 'ErrorMessage': 'string' }
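Because the response carries a Status value and an ErrorMessage, one possible use of describe_data_source is to poll until a data source becomes ACTIVE. The helper below is a sketch under stated assumptions: wait_until_active, the polling interval, and the attempt limit are all hypothetical, and treating FAILED and DELETING as terminal is a design choice of this example rather than documented behavior.

```python
import time
from typing import Any, Callable, Dict


def wait_until_active(
    client: Any,
    index_id: str,
    data_source_id: str,
    poll_seconds: float = 10.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll describe_data_source until Status is ACTIVE, then return the response."""
    for _ in range(max_attempts):
        description = client.describe_data_source(Id=data_source_id, IndexId=index_id)
        status = description["Status"]
        if status == "ACTIVE":
            return description
        if status in ("FAILED", "DELETING"):
            # ErrorMessage may be absent, so fall back to an empty string.
            message = description.get("ErrorMessage", "")
            raise RuntimeError(f"data source is {status}: {message}")
        # CREATING or UPDATING: wait and ask again.
        sleep(poll_seconds)
    raise TimeoutError(f"data source {data_source_id} not ACTIVE after {max_attempts} checks")
```

The sleep function is injectable so that the loop can be exercised in tests without real delays.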
Response Structure
(dict) --
Id (string) --
The identifier of the data source.
IndexId (string) --
The identifier of the index that contains the data source.
Name (string) --
The name that you gave the data source when it was created.
Type (string) --
The type of the data source.
Configuration (dict) --
Information that describes where the data source is located and how the data source is configured. The specific information in the description depends on the data source provider.
S3Configuration (dict) --
Provides information to create a data source connector for a document repository in an Amazon S3 bucket.
BucketName (string) --
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
InclusionPatterns (list) --
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the AWS S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
KeyPath (string) --
Path to the AWS S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides information necessary to create a data source connector for a Microsoft SharePoint site.
SharePointVersion (string) --
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) --
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) --
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE.
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, it may take longer for Amazon Kendra to use the change log than it takes it to determine the changed documents using the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
DatabaseConfiguration (dict) --
Provides information necessary to create a data source connector for a database.
DatabaseEngineType (string) --
The type of database engine that runs the database.
ConnectionConfiguration (dict) --
The information necessary to connect to a database.
DatabaseHost (string) --
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) --
The port that the database uses for connections.
DatabaseName (string) --
The name of the database containing the document data.
TableName (string) --
The name of the table that contains the document data.
SecretArn (string) --
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) --
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) --
The column that provides the document's unique identifier.
DocumentDataColumnName (string) --
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ChangeDetectingColumns (list) --
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) --
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
SqlConfiguration (dict) --
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
QueryIdentifiersEnclosingOption (string) --
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lower case characters in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert the character's case.
For MySQL databases, you must enable the ansi_quotes option when you set this field to DOUBLE_QUOTES.
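The effect of QueryIdentifiersEnclosingOption can be pictured with a small local model. This is not Amazon Kendra's implementation: enclose_identifier is a hypothetical function, and doubling embedded quote characters follows the usual SQL convention rather than anything stated above.

```python
def enclose_identifier(identifier: str, option: str = "NONE") -> str:
    """Model how a table or column name would appear in the generated query."""
    if option == "DOUBLE_QUOTES":
        # Assumption: embedded double quotes are escaped by doubling them.
        escaped = identifier.replace('"', '""')
        return f'"{escaped}"'
    if option == "NONE":
        # Default behavior: the identifier is passed through exactly as entered.
        return identifier
    raise ValueError(f"unsupported option: {option!r}")


quoted = enclose_identifier("DocTitle", "DOUBLE_QUOTES")
unquoted = enclose_identifier("DocTitle")
```

With PostgreSQL, the unquoted form would be folded to lowercase by the database, whereas the quoted form keeps the mixed-case name.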
SalesforceConfiguration (dict) --
Provides configuration information for data sources that connect to a Salesforce site.
ServerUrl (string) --
The instance URL for the Salesforce site that you want to index.
SecretArn (string) --
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
authenticationUrl - The OAUTH endpoint that Amazon Kendra connects to in order to get an OAUTH token.
consumerKey - The application public key generated when you created your Salesforce application.
consumerSecret - The application private key generated when you created your Salesforce application.
password - The password associated with the user logging in to the Salesforce instance.
securityToken - The token associated with the user account logging in to the Salesforce instance.
username - The user name of the user logging in to the Salesforce instance.
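The JSON structure for the Salesforce secret can be assembled and checked before it is stored. In this sketch the helper build_salesforce_secret is hypothetical and every value is a placeholder; only the six key names come from the list above.

```python
import json
from typing import Dict

REQUIRED_SALESFORCE_SECRET_KEYS = (
    "authenticationUrl",
    "consumerKey",
    "consumerSecret",
    "password",
    "securityToken",
    "username",
)


def build_salesforce_secret(values: Dict[str, str]) -> str:
    """Return the secret as a JSON string, failing early if a key is missing."""
    missing = [key for key in REQUIRED_SALESFORCE_SECRET_KEYS if key not in values]
    if missing:
        raise ValueError(f"missing secret keys: {', '.join(missing)}")
    # Keep only the documented keys so unrelated values are not stored.
    return json.dumps({key: values[key] for key in REQUIRED_SALESFORCE_SECRET_KEYS})


# Placeholder credentials, for illustration only.
secret_string = build_salesforce_secret({
    "authenticationUrl": "https://login.example.com/services/oauth2/token",
    "consumerKey": "example-consumer-key",
    "consumerSecret": "example-consumer-secret",
    "password": "example-password",
    "securityToken": "example-security-token",
    "username": "indexer@example.com",
})
```

The resulting string could then be stored as the SecretString of an AWS Secrets Manager secret, whose ARN is what SecretArn expects.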
StandardObjectConfigurations (list) --
Specifies the Salesforce standard objects that Amazon Kendra indexes.
(dict) --
Specifies configuration information for indexing a single standard object.
Name (string) --
The name of the standard object.
DocumentDataFieldName (string) --
The name of the field in the standard object table that contains the document contents.
DocumentTitleFieldName (string) --
The name of the field in the standard object table that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the standard object to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
KnowledgeArticleConfiguration (dict) --
Specifies configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
IncludedStates (list) --
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
(string) --
StandardKnowledgeArticleTypeConfiguration (dict) --
Provides configuration information for standard Salesforce knowledge articles.
DocumentDataFieldName (string) --
The name of the field that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the knowledge article to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
CustomKnowledgeArticleTypeConfigurations (list) --
Provides configuration information for custom Salesforce knowledge articles.
(dict) --
Provides configuration information for indexing Salesforce custom articles.
Name (string) --
The name of the configuration.
DocumentDataFieldName (string) --
The name of the field in the custom knowledge article that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field in the custom knowledge article that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the custom knowledge article to fields in the Amazon Kendra index.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ChatterFeedConfiguration (dict) --
Specifies configuration information for Salesforce chatter feeds.
DocumentDataFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body column.
DocumentTitleFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title column.
FieldMappings (list) --
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
IncludeFilterTypes (list) --
Filters the documents in the feed based on the status of the user. When you specify ACTIVE_USERS, only documents from users who have an active account are indexed. When you specify STANDARD_USER, only documents for Salesforce standard users are indexed. You can specify both.
(string) --
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
StandardObjectAttachmentConfiguration (dict) --
Provides configuration information for processing attachments to Salesforce standard objects.
DocumentTitleFieldName (string) --
The name of the field used for the document title.
FieldMappings (list) --
One or more objects that map fields in attachments to Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
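The precedence rule described for IncludeAttachmentFilePatterns and ExcludeAttachmentFilePatterns (a file that matches both lists is not indexed) can be sketched in plain Python. This is only an illustration of the rule, not Amazon Kendra's implementation: the helper name is_attachment_indexed is hypothetical, and two behaviors are assumptions that this reference does not state, namely that patterns are tested with re.search and that an empty inclusion list means every file is a candidate.

```python
import re


def is_attachment_indexed(file_name, include_patterns=(), exclude_patterns=()):
    """Illustrate the documented precedence: exclusion wins over inclusion.

    Assumptions (not taken from the Kendra documentation): patterns are
    tested with re.search, and an empty inclusion list includes everything.
    """
    # Any exclusion match removes the file, even if it is also included.
    if any(re.search(pattern, file_name) for pattern in exclude_patterns):
        return False
    # With inclusion patterns present, the file must match at least one.
    if include_patterns:
        return any(re.search(pattern, file_name) for pattern in include_patterns)
    return True
```

Checking patterns locally like this before a sync can help catch a pattern pair that unintentionally excludes every attachment.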
OneDriveConfiguration (dict) --
Provides configuration for data sources that connect to Microsoft OneDrive.
TenantDomain (string) --
The Azure Active Directory domain of the organization.
SecretArn (string) --
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password is the application key for the OneDrive application.
OneDriveUsers (dict) --
A list of user accounts whose documents should be indexed.
OneDriveUserList (list) --
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path field to specify the location of a file containing a list of users.
(string) --
OneDriveUserS3Path (dict) --
The S3 bucket location of a file containing a list of users whose documents should be indexed.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the pattern are included in the index. Documents that don't match the pattern are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The inclusion pattern is applied to the file name.
(string) --
ExclusionPatterns (list) --
List of regular expressions applied to documents. Items that match the exclusion pattern are not indexed. If you provide both an inclusion pattern and an exclusion pattern, any item that matches the exclusion pattern isn't indexed.
The exclusion pattern is applied to the file name.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft OneDrive fields to custom fields in the Amazon Kendra index. You must first create the index fields before you map OneDrive fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
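As a minimal sketch of the OneDriveUsers choice described above, the helper below returns an inline OneDriveUserList for up to 100 users and otherwise falls back to OneDriveUserS3Path. The function names and all argument values are hypothetical; only the dictionary keys come from this reference.

```python
def build_onedrive_users(users=None, bucket=None, key=None):
    """Return a OneDriveUsers dict using a user list or an S3 path.

    The 100-user threshold follows the OneDriveUserList description above.
    """
    users = list(users or [])
    if users and len(users) <= 100:
        return {"OneDriveUserList": users}
    if bucket and key:
        return {"OneDriveUserS3Path": {"Bucket": bucket, "Key": key}}
    raise ValueError(
        "Provide at most 100 users, or an S3 bucket and key for a user file"
    )


def build_onedrive_configuration(tenant_domain, secret_arn, one_drive_users):
    """Assemble an OneDriveConfiguration dict from the documented fields."""
    return {
        "TenantDomain": tenant_domain,
        "SecretArn": secret_arn,
        "OneDriveUsers": one_drive_users,
    }
```

The resulting dict would be passed as Configuration={'OneDriveConfiguration': ...} together with Type='ONEDRIVE'.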
ServiceNowConfiguration (dict) --
Provides configuration for data sources that connect to ServiceNow instances.
HostUrl (string) --
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
SecretArn (string) --
The Amazon Resource Name (ARN) of the AWS Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance.
ServiceNowBuildVersion (string) --
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON release, use OTHERS.
KnowledgeArticleConfiguration (dict) --
Provides configuration information for crawling knowledge articles in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to knowledge articles.
IncludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that don't match the inclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
ExcludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that match the exclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
DocumentDataFieldName (string) --
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ServiceCatalogConfiguration (dict) --
Provides configuration information for crawling service catalogs in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should crawl attachments to the service catalog items.
IncludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are included in the index.
(string) --
ExcludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are excluded from the index.
(string) --
DocumentDataFieldName (string) --
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
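A minimal sketch of how the ServiceNowConfiguration fields above fit together, assuming only knowledge articles are crawled. The helper name, its arguments, and any ServiceNow field names passed in are hypothetical; the dictionary keys and the LONDON/OTHERS choice come from this reference.

```python
def build_servicenow_configuration(host_url, secret_arn, is_london_release,
                                   data_field, title_field=None,
                                   field_mappings=None):
    """Assemble a ServiceNowConfiguration dict for knowledge articles only."""
    knowledge = {
        "CrawlAttachments": False,
        "DocumentDataFieldName": data_field,
    }
    if title_field:
        knowledge["DocumentTitleFieldName"] = title_field
    if field_mappings:
        # Each mapping is a dict with DataSourceFieldName and IndexFieldName;
        # the index fields must already exist (see UpdateIndex).
        knowledge["FieldMappings"] = list(field_mappings)
    return {
        "HostUrl": host_url,
        "SecretArn": secret_arn,
        # Hosts that are not on the LONDON release use OTHERS.
        "ServiceNowBuildVersion": "LONDON" if is_london_release else "OTHERS",
        "KnowledgeArticleConfiguration": knowledge,
    }
```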
ConfluenceConfiguration (dict) --
Provides configuration information for connecting to a Confluence data source.
ServerUrl (string) --
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/. You can also use an IP address, for example, https://192.168.1.113/.
SecretArn (string) --
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Confluence server. The secret must contain a JSON structure with the following keys:
username - The user name or email address of a user with administrative privileges for the Confluence server.
password - The password associated with the user logging in to the Confluence server.
Version (string) --
Specifies the version of the Confluence installation that you are connecting to.
SpaceConfiguration (dict) --
Specifies configuration information for indexing Confluence spaces.
CrawlPersonalSpaces (boolean) --
Specifies whether Amazon Kendra should index personal spaces. Users can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
CrawlArchivedSpaces (boolean) --
Specifies whether Amazon Kendra should index archived spaces.
IncludeSpaces (list) --
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces and the ExcludeSpaces list, the space is excluded.
(string) --
ExcludeSpaces (list) --
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces and the IncludeSpaces list, the space is excluded.
(string) --
SpaceFieldMappings (list) --
Defines how space metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the SpaceFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
PageConfiguration (dict) --
Specifies configuration information for indexing Confluence pages.
PageFieldMappings (list) --
Defines how page metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the PageFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
BlogConfiguration (dict) --
Specifies configuration information for indexing Confluence blogs.
BlogFieldMappings (list) --
Defines how blog metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the BlogFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a blog field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
AttachmentConfiguration (dict) --
Specifies configuration information for indexing attachments to Confluence blogs and pages.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra indexes attachments to the pages and blogs in the Confluence data source.
AttachmentFieldMappings (list) --
Defines how attachment metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the AttachmentFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
VpcConfiguration (dict) --
Specifies the information for connecting to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
InclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An inclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the patterns are included in the index. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, the item isn't included in the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An exclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the pattern are excluded from the index. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, the item isn't included in the index.
(string) --
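The interaction between IncludeSpaces and ExcludeSpaces described in SpaceConfiguration can be sketched as a small pure function. This is an illustration only: the helper name spaces_to_index is hypothetical, and treating an empty IncludeSpaces list as "all spaces are candidates" is an assumption rather than something this reference states.

```python
def spaces_to_index(all_space_keys, include_spaces=(), exclude_spaces=()):
    """Return the space keys that would be indexed under the documented rules.

    A space listed in both IncludeSpaces and ExcludeSpaces is excluded.
    Assumption: an empty IncludeSpaces list makes every space a candidate.
    """
    existing = set(all_space_keys)
    excluded = set(exclude_spaces)
    candidates = include_spaces if include_spaces else all_space_keys
    # Keys that do not exist are skipped here; per the text above, Amazon
    # Kendra logs an error for them when the data source is synchronized.
    return [key for key in candidates if key in existing and key not in excluded]
```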
GoogleDriveConfiguration (dict) --
Provides configuration for data sources that connect to Google Drive.
SecretArn (string) --
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
InclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are included in the index from both shared drives and users' My Drives. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, it is excluded from the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are excluded from the index from both shared drives and users' My Drives. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, it is excluded from the index.
(string) --
FieldMappings (list) --
Defines the mapping between a field in Google Drive and an Amazon Kendra index field.
If you are using the console, you can define index fields when creating the mapping. If you are using the API, you must first create the field using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ExcludeMimeTypes (list) --
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
(string) --
ExcludeUserAccounts (list) --
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
(string) --
ExcludeSharedDrives (list) --
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
(string) --
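The GoogleDriveConfiguration structure described above is the addition in this update. A minimal sketch of assembling it and passing it to create_data_source follows. The helper names and every argument value are hypothetical; client is assumed to be a Kendra client such as the one returned by boto3.client('kendra'), and the keyword arguments follow the request syntax shown for create_data_source plus RoleArn.

```python
def build_google_drive_configuration(secret_arn, exclude_mime_types=(),
                                     exclude_user_accounts=(),
                                     exclude_shared_drives=(),
                                     field_mappings=()):
    """Assemble a GoogleDriveConfiguration dict, omitting empty lists."""
    config = {"SecretArn": secret_arn}
    optional = {
        "ExcludeMimeTypes": exclude_mime_types,
        "ExcludeUserAccounts": exclude_user_accounts,
        "ExcludeSharedDrives": exclude_shared_drives,
        "FieldMappings": field_mappings,
    }
    for key, values in optional.items():
        if values:
            config[key] = list(values)
    return config


def create_google_drive_data_source(client, index_id, name, role_arn,
                                    configuration):
    """Create a GOOGLEDRIVE data source with the given configuration."""
    return client.create_data_source(
        Name=name,
        IndexId=index_id,
        Type="GOOGLEDRIVE",
        RoleArn=role_arn,
        Configuration={"GoogleDriveConfiguration": configuration},
    )
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub before calling the real service.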
CreatedAt (datetime) --
The Unix timestamp of when the data source was created.
UpdatedAt (datetime) --
The Unix timestamp of when the data source was last updated.
Description (string) --
The description of the data source.
Status (string) --
The current status of the data source. When the status is ACTIVE the data source is ready to use. When the status is FAILED, the ErrorMessage field contains the reason that the data source failed.
Schedule (string) --
The schedule on which Amazon Kendra updates the data source.
RoleArn (string) --
The Amazon Resource Name (ARN) of the role that enables the data source to access its resources.
ErrorMessage (string) --
When the Status field value is FAILED, the ErrorMessage field contains a description of the error that caused the data source to fail.
{'SummaryItems': {'Type': {'GOOGLEDRIVE'}}}
Lists the data sources that you have created.
See also: AWS API Documentation
Request Syntax
client.list_data_sources( IndexId='string', NextToken='string', MaxResults=123 )
string
[REQUIRED]
The identifier of the index that contains the data source.
string
If the previous response was incomplete (because there is more data to retrieve), Amazon Kendra returns a pagination token in the response. You can use this pagination token to retrieve the next set of data sources (DataSourceSummaryItems).
integer
The maximum number of data sources to return.
dict
Response Syntax
{ 'SummaryItems': [ { 'Name': 'string', 'Id': 'string', 'Type': 'S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE', 'CreatedAt': datetime(2015, 1, 1), 'UpdatedAt': datetime(2015, 1, 1), 'Status': 'CREATING'|'DELETING'|'FAILED'|'UPDATING'|'ACTIVE' }, ], 'NextToken': 'string' }
Response Structure
(dict) --
SummaryItems (list) --
An array of summary information for one or more data sources.
(dict) --
Summary information for an Amazon Kendra data source. Returned in a call to ListDataSources.
Name (string) --
The name of the data source.
Id (string) --
The unique identifier for the data source.
Type (string) --
The type of the data source.
CreatedAt (datetime) --
The UNIX datetime that the data source was created.
UpdatedAt (datetime) --
The UNIX datetime that the data source was last updated.
Status (string) --
The status of the data source. When the status is ACTIVE the data source is ready to use.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of data sources.
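The NextToken handshake described above can be wrapped in a generator. This is a minimal sketch: the function name iter_data_sources and the default page size are hypothetical, and client is assumed to be a Kendra client such as the one returned by boto3.client('kendra').

```python
def iter_data_sources(client, index_id, page_size=50):
    """Yield every data source summary for an index, following NextToken."""
    kwargs = {"IndexId": index_id, "MaxResults": page_size}
    while True:
        response = client.list_data_sources(**kwargs)
        for item in response.get("SummaryItems", []):
            yield item
        token = response.get("NextToken")
        if not token:
            # No token means the last page has been returned.
            break
        kwargs["NextToken"] = token
```

For example, [s['Name'] for s in iter_data_sources(client, index_id) if s['Type'] == 'GOOGLEDRIVE'] would collect the names of the Google Drive data sources in an index.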
{'VisitorId': 'string'}Response
{'ResultItems': {'FeedbackToken': 'string'}}
Searches an active index. Use this API to search your documents using a query. The Query operation enables you to do faceted search and to filter results based on document attributes.
It also enables you to provide user context that Amazon Kendra uses to enforce document access control in the search results.
Amazon Kendra searches your index for text content and question and answer (FAQ) content. By default the response contains three types of results.
Relevant passages
Matching FAQs
Relevant documents
You can specify that the query return only one type of result using the QueryResultTypeConfig parameter.
Each query returns the 100 most relevant results.
See also: AWS API Documentation
Request Syntax
client.query( IndexId='string', QueryText='string', AttributeFilter={ 'AndAllFilters': [ {'... recursive ...'}, ], 'OrAllFilters': [ {'... recursive ...'}, ], 'NotFilter': {'... recursive ...'}, 'EqualsTo': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'ContainsAll': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'ContainsAny': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'GreaterThan': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'GreaterThanOrEquals': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'LessThan': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, 'LessThanOrEquals': { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } } }, Facets=[ { 'DocumentAttributeKey': 'string' }, ], RequestedDocumentAttributes=[ 'string', ], QueryResultTypeFilter='DOCUMENT'|'QUESTION_ANSWER'|'ANSWER', PageNumber=123, PageSize=123, SortingConfiguration={ 'DocumentAttributeKey': 'string', 'SortOrder': 'DESC'|'ASC' }, UserContext={ 'Token': 'string' }, VisitorId='string' )
string
[REQUIRED]
The unique identifier of the index to search. The identifier is returned in the response from the operation.
string
[REQUIRED]
The text to search for.
dict
Enables filtered searches based on document attributes. You can only provide one attribute filter; however, the AndAllFilters, NotFilter, and OrAllFilters parameters contain a list of other filters.
The AttributeFilter parameter enables you to create a set of filtering rules that a document must satisfy to be included in the query results.
AndAllFilters (list) --
Performs a logical AND operation on all supplied filters.
(dict) --
Filters the query results based on document attributes.
When you use the AndAllFilters or OrAllFilters filters, you can use 2 layers under the first attribute filter. For example, you can use:
<AndAllFilters>
<OrAllFilters>
<EqualsTo>
If you use more than 2 layers, you receive a ValidationException exception with the message "AttributeFilter cannot have a depth of more than 2."
OrAllFilters (list) --
Performs a logical OR operation on all supplied filters.
(dict) --
Filters the query results based on document attributes.
When you use the AndAllFilters or OrAllFilters filters, you can use 2 layers under the first attribute filter. For example, you can use:
<AndAllFilters>
<OrAllFilters>
<EqualsTo>
If you use more than 2 layers, you receive a ValidationException exception with the message "AttributeFilter cannot have a depth of more than 2."
NotFilter (dict) --
Performs a logical NOT operation on all supplied filters.
EqualsTo (dict) --
Performs an equals operation on two document attributes.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
ContainsAll (dict) --
Returns true when a document contains all of the specified document attributes. This filter is only applicable to StringListValue metadata.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
ContainsAny (dict) --
Returns true when a document contains any of the specified document attributes. This filter is only applicable to StringListValue metadata.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
GreaterThan (dict) --
Performs a greater than operation on two document attributes. Use with a document attribute of type Integer or Long.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
GreaterThanOrEquals (dict) --
Performs a greater than or equals operation on two document attributes. Use with a document attribute of type Integer or Long.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
LessThan (dict) --
Performs a less than operation on two document attributes. Use with a document attribute of type Integer or Long.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
LessThanOrEquals (dict) --
Performs a less than or equals operation on two document attributes. Use with a document attribute of type Integer or Long.
Key (string) -- [REQUIRED]
The identifier for the attribute.
Value (dict) -- [REQUIRED]
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
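The nesting described for AttributeFilter (AndAllFilters containing OrAllFilters containing EqualsTo) can be built from small helpers. This is a sketch only: the helper names are hypothetical, and "department" and "region" are assumed custom document attributes that would have to exist in your index.

```python
def equals_to(key, string_value):
    """Build an EqualsTo filter on a string attribute."""
    return {"EqualsTo": {"Key": key, "Value": {"StringValue": string_value}}}


def or_all(*filters):
    """Combine filters with a logical OR."""
    return {"OrAllFilters": list(filters)}


def and_all(*filters):
    """Combine filters with a logical AND."""
    return {"AndAllFilters": list(filters)}


# AndAllFilters -> OrAllFilters -> EqualsTo, the nesting shown in the text.
attribute_filter = and_all(
    or_all(equals_to("department", "hr"), equals_to("department", "legal")),
    equals_to("region", "emea"),
)
```

The resulting dict would be passed as AttributeFilter=attribute_filter in a query call.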
list
An array of document attributes. Amazon Kendra returns a count for each attribute key specified. You can use this information to help narrow the search for your user.
(dict) --
Information about a document attribute.
DocumentAttributeKey (string) --
The unique key for the document attribute.
list
An array of document attributes to include in the response. No other document attributes are included in the response. By default all document attributes are included in the response.
(string) --
string
Sets the type of query. Only results for the specified query type are returned.
integer
Query results are returned in pages the size of the PageSize parameter. By default, Amazon Kendra returns the first page of results. Use this parameter to get result pages after the first one.
integer
Sets the number of results that are returned in each page of results. The default page size is 10. The maximum number of results returned is 100. If you ask for more than 100 results, only 100 are returned.
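A minimal sketch of walking result pages with PageNumber and PageSize while respecting the 100-result ceiling described above. The function name iter_query_results is hypothetical, client is assumed to be a Kendra client, and starting the page count at 1 is an assumption that this reference does not state explicitly.

```python
MAX_RESULTS = 100  # ceiling stated in the PageSize description


def iter_query_results(client, index_id, query_text, page_size=10):
    """Yield query result items page by page, up to MAX_RESULTS items."""
    page_number = 1  # assumption: the first page is number 1
    fetched = 0
    while fetched < MAX_RESULTS:
        response = client.query(
            IndexId=index_id,
            QueryText=query_text,
            PageNumber=page_number,
            PageSize=page_size,
        )
        # Trim the page so the total never exceeds the ceiling.
        items = response.get("ResultItems", [])[: MAX_RESULTS - fetched]
        if not items:
            break
        for item in items:
            yield item
        fetched += len(items)
        page_number += 1
```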
dict
Provides information that determines how the results of the query are sorted. You can set the field that Amazon Kendra should sort the results on, and specify whether the results should be sorted in ascending or descending order. In the case of ties in sorting the results, the results are sorted by relevance.
If you don't provide sorting configuration, the results are sorted by the relevance that Amazon Kendra determines for the result.
DocumentAttributeKey (string) -- [REQUIRED]
The name of the document attribute used to sort the response. You can use any field that has the Sortable flag set to true.
You can also sort by any of the following built-in attributes:
_category
_created_at
_last_updated_at
_version
_view_count
SortOrder (string) -- [REQUIRED]
The order that the results should be returned in. In case of ties, the relevance assigned to the result by Amazon Kendra is used as the tie-breaker.
dict
The user context token.
Token (string) --
The user context token. It must be a JWT or a JSON token.
string
Provides an identifier for a specific user. The VisitorId should be a unique identifier, such as a GUID. Don't use personally identifiable information, such as the user's email address, as the VisitorId.
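The SortingConfiguration and VisitorId parameters described above can be combined when assembling the arguments for a query call. This is a sketch under stated assumptions: the helper name build_query_kwargs is hypothetical, and generating the VisitorId with uuid.uuid4 is one way to obtain a GUID that carries no personal information, not a requirement of the API.

```python
import uuid


def build_query_kwargs(index_id, query_text, sort_key=None, descending=True,
                       visitor_id=None):
    """Assemble keyword arguments for a query call.

    A random GUID is generated when no visitor_id is supplied.
    """
    kwargs = {
        "IndexId": index_id,
        "QueryText": query_text,
        "VisitorId": visitor_id or str(uuid.uuid4()),
    }
    if sort_key:
        kwargs["SortingConfiguration"] = {
            "DocumentAttributeKey": sort_key,
            "SortOrder": "DESC" if descending else "ASC",
        }
    return kwargs
```

To keep the identifier stable for one user across queries, generate it once, store it on your side, and pass it back in as visitor_id. A call would then look like client.query(**build_query_kwargs(index_id, 'vacation policy', sort_key='_created_at')).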
dict
Response Syntax
{ 'QueryId': 'string', 'ResultItems': [ { 'Id': 'string', 'Type': 'DOCUMENT'|'QUESTION_ANSWER'|'ANSWER', 'AdditionalAttributes': [ { 'Key': 'string', 'ValueType': 'TEXT_WITH_HIGHLIGHTS_VALUE', 'Value': { 'TextWithHighlightsValue': { 'Text': 'string', 'Highlights': [ { 'BeginOffset': 123, 'EndOffset': 123, 'TopAnswer': True|False }, ] } } }, ], 'DocumentId': 'string', 'DocumentTitle': { 'Text': 'string', 'Highlights': [ { 'BeginOffset': 123, 'EndOffset': 123, 'TopAnswer': True|False }, ] }, 'DocumentExcerpt': { 'Text': 'string', 'Highlights': [ { 'BeginOffset': 123, 'EndOffset': 123, 'TopAnswer': True|False }, ] }, 'DocumentURI': 'string', 'DocumentAttributes': [ { 'Key': 'string', 'Value': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) } }, ], 'ScoreAttributes': { 'ScoreConfidence': 'VERY_HIGH'|'HIGH'|'MEDIUM'|'LOW' }, 'FeedbackToken': 'string' }, ], 'FacetResults': [ { 'DocumentAttributeKey': 'string', 'DocumentAttributeValueType': 'STRING_VALUE'|'STRING_LIST_VALUE'|'LONG_VALUE'|'DATE_VALUE', 'DocumentAttributeValueCountPairs': [ { 'DocumentAttributeValue': { 'StringValue': 'string', 'StringListValue': [ 'string', ], 'LongValue': 123, 'DateValue': datetime(2015, 1, 1) }, 'Count': 123 }, ] }, ], 'TotalNumberOfResults': 123 }
Response Structure
(dict) --
QueryId (string) --
The unique identifier for the search. You use QueryId to identify the search when using the feedback API.
ResultItems (list) --
The results of the search.
(dict) --
A single query result.
A query result contains information about a document returned by the query. This includes the original location of the document, a list of attributes assigned to the document, and relevant text from the document that satisfies the query.
Id (string) --
The unique identifier for the query result.
Type (string) --
The type of document.
AdditionalAttributes (list) --
One or more additional attributes associated with the query result.
(dict) --
An attribute returned from an index query.
Key (string) --
The key that identifies the attribute.
ValueType (string) --
The data type of the Value property.
Value (dict) --
An object that contains the attribute value.
TextWithHighlightsValue (dict) --
The text associated with the attribute and information about the highlight to apply to the text.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
DocumentId (string) --
The unique identifier for the document.
DocumentTitle (dict) --
The title of the document. Contains the text of the title and information for highlighting the relevant terms in the title.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
DocumentExcerpt (dict) --
An extract of the text in the document. Contains information about highlighting the relevant terms in the excerpt.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
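The Text and Highlights fields above can be combined to render a highlighted excerpt. The sketch below wraps each highlighted span in marker strings. It assumes that EndOffset is exclusive (the span is text[BeginOffset:EndOffset]), which the description above does not state, and the function name and marker arguments are hypothetical.

```python
def mark_highlights(text_with_highlights, open_mark="[", close_mark="]"):
    """Wrap each highlighted span of a Text/Highlights dict in markers.

    Assumes EndOffset is exclusive; overlapping highlights are skipped.
    """
    text = text_with_highlights["Text"]
    highlights = sorted(
        text_with_highlights.get("Highlights", []),
        key=lambda h: h["BeginOffset"],
    )
    pieces = []
    cursor = 0
    for highlight in highlights:
        begin, end = highlight["BeginOffset"], highlight["EndOffset"]
        if begin < cursor:
            continue  # overlaps a span that was already emitted
        pieces.append(text[cursor:begin])
        pieces.append(open_mark + text[begin:end] + close_mark)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

The same function works for DocumentTitle, DocumentExcerpt, and TextWithHighlightsValue, since all three share the Text and Highlights layout.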
DocumentURI (string) --
The URI of the original location of the document.
DocumentAttributes (list) --
An array of document attributes for the document that the query result maps to. For example, the document author (Author) or the source URI (SourceUri) of the document.
(dict) --
A custom attribute value assigned to a document.
Key (string) --
The identifier for the attribute.
Value (dict) --
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
ScoreAttributes (dict) --
Indicates the confidence that Amazon Kendra has that a result matches the query that you provided. Each result is placed into a bin that indicates the confidence: VERY_HIGH, HIGH, MEDIUM, or LOW. You can use the score to determine if a response meets the confidence needed for your application.
The field is only set to LOW when the Type field is set to DOCUMENT and Amazon Kendra is not confident that the result matches the query.
ScoreConfidence (string) --
A relative ranking for how well the response matches the query.
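One way to use the confidence bins is to drop results below a threshold before showing them. The following is a sketch only: the numeric ranking of the bins and the function name are assumptions made for illustration, not part of the API.

```python
# Hypothetical ordering of the documented confidence bins, lowest first.
CONFIDENCE_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "VERY_HIGH": 3}


def filter_by_confidence(result_items, minimum="HIGH"):
    """Keep result items whose ScoreConfidence is at least `minimum`."""
    threshold = CONFIDENCE_RANK[minimum]
    kept = []
    for item in result_items:
        confidence = item.get("ScoreAttributes", {}).get("ScoreConfidence")
        # Items without a recognized confidence value are dropped.
        if CONFIDENCE_RANK.get(confidence, -1) >= threshold:
            kept.append(item)
    return kept
```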
FeedbackToken (string) --
A token that identifies a particular result from a particular query. Use this token to provide click-through feedback for the result. For more information, see Submitting feedback.
FacetResults (list) --
Contains the facet results. A FacetResult contains the counts for each attribute key that was specified in the Facets input parameter.
(dict) --
The facet values for the documents in the response.
DocumentAttributeKey (string) --
The key for the facet values. This is the same as the DocumentAttributeKey provided in the query.
DocumentAttributeValueType (string) --
The data type of the facet value. This is the same as the type defined for the index field when it was created.
DocumentAttributeValueCountPairs (list) --
An array of key/value pairs, where the key is the value of the attribute and the count is the number of documents that share the key value.
(dict) --
Provides the count of documents that match a particular attribute when doing a faceted search.
DocumentAttributeValue (dict) --
The value of the attribute. For example, "HR."
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings.
(string) --
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
Count (integer) --
The number of documents in the response that have the attribute value for the key.
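The nested FacetResults structure can be flattened into a plain mapping for display. This sketch assumes that each DocumentAttributeValue carries only the one typed field that is populated; the helper names are hypothetical.

```python
def _facet_value(value):
    """Return the populated typed field of a DocumentAttributeValue dict."""
    for field in ("StringValue", "LongValue", "DateValue"):
        if field in value:
            return value[field]
    # Lists are not hashable, so a string list becomes a tuple.
    return tuple(value.get("StringListValue", ()))


def facet_counts(facet_results):
    """Map each facet key to a {attribute value: document count} dict."""
    counts = {}
    for facet in facet_results:
        per_value = {}
        for pair in facet.get("DocumentAttributeValueCountPairs", []):
            per_value[_facet_value(pair["DocumentAttributeValue"])] = pair["Count"]
        counts[facet["DocumentAttributeKey"]] = per_value
    return counts
```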
TotalNumberOfResults (integer) --
The total number of items found by the search; however, you can only retrieve up to 100 items. For example, if the search found 192 items, you can only retrieve the first 100 of the items.
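The 100-item limit affects how many result pages a client can request. A small sketch of the arithmetic, with a hypothetical function name and page_size argument:

```python
import math

MAX_RETRIEVABLE = 100  # retrieval limit stated in the description above


def retrievable_pages(total_number_of_results, page_size=10):
    """Return (retrievable item count, number of pages needed to read them)."""
    retrievable = min(total_number_of_results, MAX_RETRIEVABLE)
    return retrievable, math.ceil(retrievable / page_size)
```

For the documented example of 192 matches, only the first 100 items are reachable, which is 10 pages at 10 items per page.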
{'Configuration': {'GoogleDriveConfiguration': {'ExcludeMimeTypes': ['string'], 'ExcludeSharedDrives': ['string'], 'ExcludeUserAccounts': ['string'], 'ExclusionPatterns': ['string'], 'FieldMappings': [{'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string'}], 'InclusionPatterns': ['string'], 'SecretArn': 'string'}}}
Updates an existing Amazon Kendra data source.
See also: AWS API Documentation
Request Syntax
client.update_data_source( Id='string', Name='string', IndexId='string', Configuration={ 'S3Configuration': { 'BucketName': 'string', 'InclusionPrefixes': [ 'string', ], 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'DocumentsMetadataConfiguration': { 'S3Prefix': 'string' }, 'AccessControlListConfiguration': { 'KeyPath': 'string' } }, 'SharePointConfiguration': { 'SharePointVersion': 'SHAREPOINT_ONLINE', 'Urls': [ 'string', ], 'SecretArn': 'string', 'CrawlAttachments': True|False, 'UseChangeLog': True|False, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DocumentTitleFieldName': 'string', 'DisableLocalGroups': True|False }, 'DatabaseConfiguration': { 'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL', 'ConnectionConfiguration': { 'DatabaseHost': 'string', 'DatabasePort': 123, 'DatabaseName': 'string', 'TableName': 'string', 'SecretArn': 'string' }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'ColumnConfiguration': { 'DocumentIdColumnName': 'string', 'DocumentDataColumnName': 'string', 'DocumentTitleColumnName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ChangeDetectingColumns': [ 'string', ] }, 'AclConfiguration': { 'AllowedGroupsColumnName': 'string' }, 'SqlConfiguration': { 'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE' } }, 'SalesforceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'StandardObjectConfigurations': [ { 'Name': 'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER', 'DocumentDataFieldName': 
'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ], 'KnowledgeArticleConfiguration': { 'IncludedStates': [ 'DRAFT'|'PUBLISHED'|'ARCHIVED', ], 'StandardKnowledgeArticleTypeConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'CustomKnowledgeArticleTypeConfigurations': [ { 'Name': 'string', 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, ] }, 'ChatterFeedConfiguration': { 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'IncludeFilterTypes': [ 'ACTIVE_USER'|'STANDARD_USER', ] }, 'CrawlAttachments': True|False, 'StandardObjectAttachmentConfiguration': { 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ] }, 'OneDriveConfiguration': { 'TenantDomain': 'string', 'SecretArn': 'string', 'OneDriveUsers': { 'OneDriveUserList': [ 'string', ], 'OneDriveUserS3Path': { 'Bucket': 'string', 'Key': 'string' } }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'DisableLocalGroups': True|False }, 'ServiceNowConfiguration': { 'HostUrl': 'string', 'SecretArn': 'string', 'ServiceNowBuildVersion': 'LONDON'|'OTHERS', 'KnowledgeArticleConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 
'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'ServiceCatalogConfiguration': { 'CrawlAttachments': True|False, 'IncludeAttachmentFilePatterns': [ 'string', ], 'ExcludeAttachmentFilePatterns': [ 'string', ], 'DocumentDataFieldName': 'string', 'DocumentTitleFieldName': 'string', 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] } }, 'ConfluenceConfiguration': { 'ServerUrl': 'string', 'SecretArn': 'string', 'Version': 'CLOUD'|'SERVER', 'SpaceConfiguration': { 'CrawlPersonalSpaces': True|False, 'CrawlArchivedSpaces': True|False, 'IncludeSpaces': [ 'string', ], 'ExcludeSpaces': [ 'string', ], 'SpaceFieldMappings': [ { 'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'PageConfiguration': { 'PageFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'BlogConfiguration': { 'BlogFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'AttachmentConfiguration': { 'CrawlAttachments': True|False, 'AttachmentFieldMappings': [ { 'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ] }, 'VpcConfiguration': { 'SubnetIds': [ 'string', ], 'SecurityGroupIds': [ 'string', ] }, 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 
'string', ] }, 'GoogleDriveConfiguration': { 'SecretArn': 'string', 'InclusionPatterns': [ 'string', ], 'ExclusionPatterns': [ 'string', ], 'FieldMappings': [ { 'DataSourceFieldName': 'string', 'DateFieldFormat': 'string', 'IndexFieldName': 'string' }, ], 'ExcludeMimeTypes': [ 'string', ], 'ExcludeUserAccounts': [ 'string', ], 'ExcludeSharedDrives': [ 'string', ] } }, Description='string', Schedule='string', RoleArn='string' )
string
[REQUIRED]
The unique identifier of the data source to update.
string
The name of the data source to update. The name of the data source can't be updated. To rename a data source you must delete the data source and re-create it.
string
[REQUIRED]
The identifier of the index that contains the data source to update.
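Because only the data source identifier and the index identifier are required, a caller can send just the fields that change. The helper below is a sketch with a hypothetical name; it assembles keyword arguments using the parameter names from the request syntax above and omits optional values that are None.

```python
# Optional top-level parameters shown in the request syntax.
OPTIONAL_UPDATE_PARAMETERS = frozenset(
    ["Name", "Configuration", "Description", "Schedule", "RoleArn"]
)


def build_update_data_source_kwargs(data_source_id, index_id, **optional):
    """Build keyword arguments for update_data_source, skipping None values."""
    if not data_source_id or not index_id:
        raise ValueError("Id and IndexId are required")
    unknown = set(optional) - OPTIONAL_UPDATE_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    kwargs = {"Id": data_source_id, "IndexId": index_id}
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

The result would be passed as client.update_data_source(**kwargs), where client is a Kendra client created by the caller.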
dict
Configuration information for an Amazon Kendra data source.
S3Configuration (dict) --
Provides information to create a data source connector for a document repository in an Amazon S3 bucket.
BucketName (string) -- [REQUIRED]
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
(string) --
InclusionPatterns (list) --
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
For more information about glob patterns, see glob (programming) in Wikipedia.
(string) --
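The precedence between inclusion prefixes, inclusion patterns, and exclusion patterns can be sketched locally before a sync. This is an approximation: it uses Python's fnmatch module, whose glob dialect may differ from the one Amazon Kendra applies, and it assumes that a key is indexed when no inclusion prefix or pattern is given. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def is_indexed(key, inclusion_prefixes=(), inclusion_patterns=(), exclusion_patterns=()):
    """Approximate whether an S3 object key would be indexed."""
    # An exclusion match always wins, as stated for ExclusionPatterns.
    if any(fnmatchcase(key, pattern) for pattern in exclusion_patterns):
        return False
    # Assumption: with no inclusion rules, every remaining key is indexed.
    if not inclusion_prefixes and not inclusion_patterns:
        return True
    return any(key.startswith(prefix) for prefix in inclusion_prefixes) or any(
        fnmatchcase(key, pattern) for pattern in inclusion_patterns
    )
```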
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the AWS S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
KeyPath (string) --
Path to the AWS S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides information necessary to create a data source connector for a Microsoft SharePoint site.
SharePointVersion (string) -- [REQUIRED]
The version of Microsoft SharePoint that you are using as a data source.
Urls (list) -- [REQUIRED]
The URLs of the Microsoft SharePoint site that contains the documents that should be indexed.
(string) --
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Microsoft SharePoint Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
CrawlAttachments (boolean) --
TRUE to include attachments to documents stored in your Microsoft SharePoint site in the index; otherwise, FALSE.
UseChangeLog (boolean) --
Set to TRUE to use the Microsoft SharePoint change log to determine the documents that need to be updated in the index. Depending on the size of the SharePoint change log, it may take longer for Amazon Kendra to use the change log than it takes it to determine the changed documents using the Amazon Kendra document crawler.
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the display URL of the SharePoint document.
(string) --
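For the regular expression variant, the same decision can be checked against a document's display URL. The sketch below assumes that a pattern matches when it is found anywhere in the URL (re.search); whether the service anchors patterns is not stated above. The function name is hypothetical.

```python
import re


def is_included(display_url, inclusion_patterns=(), exclusion_patterns=()):
    """Approximate the include/exclude decision for a SharePoint display URL."""
    # A document matching both kinds of pattern is not included.
    if any(re.search(pattern, display_url) for pattern in exclusion_patterns):
        return False
    if inclusion_patterns:
        return any(re.search(pattern, display_url) for pattern in inclusion_patterns)
    return True
```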
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft SharePoint attributes to custom fields in the Amazon Kendra index. You must first create the index fields using the UpdateIndex operation before you map SharePoint attributes. For more information, see Mapping Data Source Fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
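Since a mapping is only valid when its index field already exists, it can help to check mappings against the known index fields before sending them. A sketch with hypothetical helper names; the list of existing index fields is assumed to be supplied by the caller.

```python
def make_field_mapping(data_source_field, index_field, date_format=None):
    """Build one field-mapping dict; DateFieldFormat is included only if given."""
    if not data_source_field or not index_field:
        raise ValueError("DataSourceFieldName and IndexFieldName are required")
    mapping = {"DataSourceFieldName": data_source_field, "IndexFieldName": index_field}
    if date_format is not None:
        mapping["DateFieldFormat"] = date_format
    return mapping


def missing_index_fields(mappings, existing_index_fields):
    """Return index field names used by mappings that do not exist yet."""
    existing = set(existing_index_fields)
    return sorted({m["IndexFieldName"] for m in mappings} - existing)
```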
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
DatabaseConfiguration (dict) --
Provides information necessary to create a data source connector for a database.
DatabaseEngineType (string) -- [REQUIRED]
The type of database engine that runs the database.
ConnectionConfiguration (dict) -- [REQUIRED]
The information necessary to connect to a database.
DatabaseHost (string) -- [REQUIRED]
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) -- [REQUIRED]
The port that the database uses for connections.
DatabaseName (string) -- [REQUIRED]
The name of the database containing the document data.
TableName (string) -- [REQUIRED]
The name of the table that contains the document data.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of credentials stored in AWS Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about AWS Secrets Manager, see What Is AWS Secrets Manager in the AWS Secrets Manager user guide.
VpcConfiguration (dict) --
Provides information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
ColumnConfiguration (dict) -- [REQUIRED]
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) -- [REQUIRED]
The column that provides the document's unique identifier.
DocumentDataColumnName (string) -- [REQUIRED]
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChangeDetectingColumns (list) -- [REQUIRED]
One to five columns that indicate when a document in the database has changed.
(string) --
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) -- [REQUIRED]
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query operation.
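The semicolon-separated group list can be parsed and compared with a user's groups as follows. This is an illustrative sketch of the filtering rule, not the service's implementation; the function names are hypothetical, and the behavior for an empty column value (treated here as visible to no one) is an assumption.

```python
def parse_allowed_groups(column_value):
    """Split a semicolon-separated group list, ignoring blanks and padding."""
    return {group.strip() for group in column_value.split(";") if group.strip()}


def is_visible_to(column_value, user_groups):
    """True if the user belongs to at least one of the allowed groups."""
    return bool(parse_allowed_groups(column_value) & set(user_groups))
```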
SqlConfiguration (dict) --
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
QueryIdentifiersEnclosingOption (string) --
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lowercase characters in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert the characters' case.
For MySQL databases, you must enable the ansi_quotes option when you set this field to DOUBLE_QUOTES.
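The effect of the two option values can be illustrated with a small helper. It is a sketch with a hypothetical name; doubling an embedded double quote follows standard SQL quoting and is an assumption about, not a description of, what Amazon Kendra does.

```python
def enclose_identifier(identifier, option="NONE"):
    """Return the identifier as it would appear in a query for each option."""
    if option == "NONE":
        return identifier  # passed through exactly as entered
    if option == "DOUBLE_QUOTES":
        # Assumption: embedded double quotes are escaped by doubling them.
        return '"' + identifier.replace('"', '""') + '"'
    raise ValueError("option must be 'DOUBLE_QUOTES' or 'NONE'")
```

With DOUBLE_QUOTES, a PostgreSQL table created as "MyTable" keeps its mixed case in the query; with NONE, the unquoted name MyTable is folded to mytable by PostgreSQL.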
SalesforceConfiguration (dict) --
Provides configuration information for data sources that connect to a Salesforce site.
ServerUrl (string) -- [REQUIRED]
The instance URL for the Salesforce site that you want to index.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
authenticationUrl - The OAUTH endpoint that Amazon Kendra connects to in order to get an OAUTH token.
consumerKey - The application public key generated when you created your Salesforce application.
consumerSecret - The application private key generated when you created your Salesforce application.
password - The password associated with the user logging in to the Salesforce instance.
securityToken - The token associated with the user account logging in to the Salesforce instance.
username - The user name of the user logging in to the Salesforce instance.
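Before storing the secret, the JSON structure can be checked for the six keys listed above. A minimal sketch with a hypothetical function name; it inspects only key presence and does not validate the values.

```python
import json

# Keys that the Salesforce secret must contain, per the list above.
REQUIRED_SALESFORCE_SECRET_KEYS = (
    "authenticationUrl",
    "consumerKey",
    "consumerSecret",
    "password",
    "securityToken",
    "username",
)


def missing_salesforce_secret_keys(secret_string):
    """Return the required keys that are absent from a JSON secret string."""
    secret = json.loads(secret_string)
    return [key for key in REQUIRED_SALESFORCE_SECRET_KEYS if key not in secret]
```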
StandardObjectConfigurations (list) --
Specifies the Salesforce standard objects that Amazon Kendra indexes.
(dict) --
Specifies configuration information for indexing a single standard object.
Name (string) -- [REQUIRED]
The name of the standard object.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field in the standard object table that contains the document contents.
DocumentTitleFieldName (string) --
The name of the field in the standard object table that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the standard object to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
KnowledgeArticleConfiguration (dict) --
Specifies configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
IncludedStates (list) -- [REQUIRED]
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
(string) --
StandardKnowledgeArticleTypeConfiguration (dict) --
Provides configuration information for standard Salesforce knowledge articles.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the knowledge article to Amazon Kendra index fields. The index field must exist before you can map a Salesforce field to it.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
CustomKnowledgeArticleTypeConfigurations (list) --
Provides configuration information for custom Salesforce knowledge articles.
(dict) --
Provides configuration information for indexing Salesforce custom articles.
Name (string) -- [REQUIRED]
The name of the configuration.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the field in the custom knowledge article that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field in the custom knowledge article that contains the document title.
FieldMappings (list) --
One or more objects that map fields in the custom knowledge article to fields in the Amazon Kendra index.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ChatterFeedConfiguration (dict) --
Specifies configuration information for Salesforce chatter feeds.
DocumentDataFieldName (string) -- [REQUIRED]
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body column.
DocumentTitleFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title column.
FieldMappings (list) --
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
IncludeFilterTypes (list) --
Filters the documents in the feed based on the status of the user. When you specify ACTIVE_USER, only documents from users who have an active account are indexed. When you specify STANDARD_USER, only documents for Salesforce standard users are indexed. You can specify both.
(string) --
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
StandardObjectAttachmentConfiguration (dict) --
Provides configuration information for processing attachments to Salesforce standard objects.
DocumentTitleFieldName (string) --
The name of the field used for the document title.
FieldMappings (list) --
One or more objects that map fields in attachments to Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an exclusion pattern and an inclusion pattern, the document is not included in the index.
The regex is applied to the name of the attached file.
(string) --
OneDriveConfiguration (dict) --
Provides configuration for data sources that connect to Microsoft OneDrive.
TenantDomain (string) -- [REQUIRED]
The Azure Active Directory domain of the organization.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password is the application key for the OneDrive application.
OneDriveUsers (dict) -- [REQUIRED]
A list of user accounts whose documents should be indexed.
OneDriveUserList (list) --
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path field to specify the location of a file containing a list of users.
(string) --
OneDriveUserS3Path (dict) --
The S3 bucket location of a file containing a list of users whose documents should be indexed.
Bucket (string) -- [REQUIRED]
The name of the S3 bucket that contains the file.
Key (string) -- [REQUIRED]
The name of the file.
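The choice between an inline user list and an S3 file can be sketched as below. The function name is hypothetical, and preferring the S3 location whenever a bucket and key are supplied is a design choice of this sketch; the 100-user threshold comes from the OneDriveUserList description.

```python
MAX_INLINE_USERS = 100  # above this, the documentation points to OneDriveUserS3Path


def make_onedrive_users(users=(), bucket=None, key=None):
    """Build the OneDriveUsers dict from an inline list or an S3 file location."""
    users = list(users)
    if bucket and key:
        return {"OneDriveUserS3Path": {"Bucket": bucket, "Key": key}}
    if not users:
        raise ValueError("provide a user list or an S3 bucket and key")
    if len(users) > MAX_INLINE_USERS:
        raise ValueError("more than 100 users: supply an S3 bucket and key instead")
    return {"OneDriveUserList": users}
```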
InclusionPatterns (list) --
A list of regular expression patterns. Documents that match the pattern are included in the index. Documents that don't match the pattern are excluded from the index. If a document matches both an inclusion pattern and an exclusion pattern, the document is not included in the index.
The inclusion pattern is applied to the file name.
(string) --
ExclusionPatterns (list) --
List of regular expressions applied to documents. Items that match the exclusion pattern are not indexed. If you provide both an inclusion pattern and an exclusion pattern, any item that matches the exclusion pattern isn't indexed.
The exclusion pattern is applied to the file name.
(string) --
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Microsoft OneDrive fields to custom fields in the Amazon Kendra index. You must first create the index fields before you map OneDrive fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
DisableLocalGroups (boolean) --
A Boolean value that specifies whether local groups are disabled (True) or enabled (False).
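The OneDriveUsers choice described above (an inline list for up to 100 users, an S3 file otherwise) can be sketched as follows. This is a minimal illustration, not the service's own logic: the helper name build_onedrive_users, the tenant domain, and the ARN are hypothetical placeholders, and TenantDomain is a required member of OneDriveConfiguration that is documented outside this excerpt. The helper assumes the user file has already been uploaded to S3.

```python
def build_onedrive_users(users, bucket=None, key=None, max_inline=100):
    """Return a OneDriveUsers dict (hypothetical helper).

    Uses OneDriveUserList for up to `max_inline` users and
    OneDriveUserS3Path for larger sets, as the field description suggests.
    """
    if len(users) <= max_inline:
        return {"OneDriveUserList": list(users)}
    if not (bucket and key):
        raise ValueError(
            "more than %d users: supply the S3 bucket and key of the user file"
            % max_inline
        )
    return {"OneDriveUserS3Path": {"Bucket": bucket, "Key": key}}


onedrive_configuration = {
    "TenantDomain": "example.onmicrosoft.com",  # hypothetical tenant
    # Hypothetical secret ARN; the secret holds the application ID and key.
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:onedrive-example",
    "OneDriveUsers": build_onedrive_users(["alice@example.onmicrosoft.com"]),
    "InclusionPatterns": [r".*\.docx$"],
    "ExclusionPatterns": [r"^draft-.*"],
    "DisableLocalGroups": True,
}

# With boto3 installed and credentials configured, the dict could be passed as:
# boto3.client("kendra").create_data_source(
#     Name="onedrive-example", IndexId="<index-id>", Type="ONEDRIVE",
#     RoleArn="<role-arn>",
#     Configuration={"OneDriveConfiguration": onedrive_configuration})
```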
ServiceNowConfiguration (dict) --
Provides configuration for data sources that connect to ServiceNow instances.
HostUrl (string) -- [REQUIRED]
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the AWS Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance.
ServiceNowBuildVersion (string) -- [REQUIRED]
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON release, use OTHERS.
KnowledgeArticleConfiguration (dict) --
Provides configuration information for crawling knowledge articles in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to knowledge articles.
IncludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that don't match the inclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
ExcludeAttachmentFilePatterns (list) --
List of regular expressions applied to knowledge articles. Items that match the exclusion pattern are not indexed. The regex is applied to the field specified in the PatternTargetField.
(string) --
DocumentDataFieldName (string) -- [REQUIRED]
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ServiceCatalogConfiguration (dict) --
Provides configuration information for crawling service catalogs in the ServiceNow site.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should crawl attachments to the service catalog items.
IncludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are included in the index.
(string) --
ExcludeAttachmentFilePatterns (list) --
Determines the types of file attachments that are excluded from the index.
(string) --
DocumentDataFieldName (string) -- [REQUIRED]
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Mapping between ServiceNow fields and Amazon Kendra index fields. You must create the index field before you map the field.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
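A ServiceNowConfiguration built from the members above might look like the following sketch. The host, ARN, ServiceNow field names, and index field name are hypothetical placeholders chosen for illustration, and field_mapping is a hypothetical helper. Per the FieldMappings description, the index field is assumed to have been created with UpdateIndex beforehand.

```python
def field_mapping(source_field, index_field, date_format=None):
    """Build one field-mapping dict (hypothetical helper).

    DateFieldFormat is optional, so it is only added when supplied.
    """
    mapping = {"DataSourceFieldName": source_field, "IndexFieldName": index_field}
    if date_format is not None:
        mapping["DateFieldFormat"] = date_format
    return mapping


servicenow_configuration = {
    "HostUrl": "example.service-now.com",  # hypothetical instance
    # Hypothetical secret ARN holding the ServiceNow user name and password.
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:servicenow-example",
    "ServiceNowBuildVersion": "OTHERS",  # use "LONDON" only for the LONDON release
    "KnowledgeArticleConfiguration": {
        "CrawlAttachments": True,
        "IncludeAttachmentFilePatterns": [r".*\.pdf$"],
        "DocumentDataFieldName": "text",  # assumed ServiceNow field name
        "DocumentTitleFieldName": "short_description",  # assumed field name
        # "created_by" is a hypothetical index field created via UpdateIndex.
        "FieldMappings": [field_mapping("sys_created_by", "created_by")],
    },
    "ServiceCatalogConfiguration": {
        "CrawlAttachments": False,
        "DocumentDataFieldName": "description",  # assumed field name
    },
}
```

Both KnowledgeArticleConfiguration and ServiceCatalogConfiguration are optional, but each requires DocumentDataFieldName when present.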
ConfluenceConfiguration (dict) --
Provides configuration information for connecting to a Confluence data source.
ServerUrl (string) -- [REQUIRED]
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/. You can also use an IP address, for example, https://192.168.1.113/.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the key/value pairs required to connect to your Confluence server. The secret must contain a JSON structure with the following keys:
username - The user name or email address of a user with administrative privileges for the Confluence server.
password - The password associated with the user logging in to the Confluence server.
Version (string) -- [REQUIRED]
Specifies the version of the Confluence installation that you are connecting to.
SpaceConfiguration (dict) --
Specifies configuration information for indexing Confluence spaces.
CrawlPersonalSpaces (boolean) --
Specifies whether Amazon Kendra should index personal spaces. Users can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
CrawlArchivedSpaces (boolean) --
Specifies whether Amazon Kendra should index archived spaces.
IncludeSpaces (list) --
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces and the ExcludeSpaces list, the space is excluded.
(string) --
ExcludeSpaces (list) --
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces and the IncludeSpaces list, the space is excluded.
(string) --
SpaceFieldMappings (list) --
Defines how space metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the SpaceFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
PageConfiguration (dict) --
Specifies configuration information for indexing Confluence pages.
PageFieldMappings (list) --
Defines how page metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the PageFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
BlogConfiguration (dict) --
Specifies configuration information for indexing Confluence blogs.
BlogFieldMappings (list) --
Defines how blog metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the BlogFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a blog field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
AttachmentConfiguration (dict) --
Specifies configuration information for indexing attachments to Confluence blogs and pages.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra indexes attachments to the pages and blogs in the Confluence data source.
AttachmentFieldMappings (list) --
Defines how attachment metadata fields should be mapped to index fields. Before you can map a field, you must first create an index field with a matching type using the console or the UpdateIndex operation.
If you specify the AttachmentFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Defines the mapping between a field in the Confluence data source and an Amazon Kendra index field.
You must first create the index field using the UpdateIndex operation.
DataSourceFieldName (string) --
The name of the field in the data source.
You must first create the index field using the UpdateIndex operation.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
VpcConfiguration (dict) --
Specifies the information for connecting to an Amazon VPC.
SubnetIds (list) -- [REQUIRED]
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
(string) --
SecurityGroupIds (list) -- [REQUIRED]
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
(string) --
InclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An inclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the patterns are included in the index. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, the item isn't included in the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to a URL on the Confluence server. An exclusion pattern can apply to a blog post, a page, a space, or an attachment. Items that match the pattern are excluded from the index. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, the item isn't included in the index.
(string) --
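The IncludeSpaces and ExcludeSpaces rule for ConfluenceConfiguration (a space in both lists is excluded) can be sketched as below. The function spaces_to_index is a hypothetical local model of that rule, not part of the API. Treating an omitted IncludeSpaces list as "all spaces are candidates" is an assumption, as are the server URL, ARN, and space keys. The Version value is assumed to be one of the values accepted by the request syntax.

```python
def spaces_to_index(all_spaces, include_spaces=None, exclude_spaces=None):
    """Model which space keys would be indexed (hypothetical helper).

    Assumption: when IncludeSpaces is empty or omitted, every space is a
    candidate. A key present in both lists is excluded, per the reference.
    """
    include = set(include_spaces) if include_spaces else set(all_spaces)
    exclude = set(exclude_spaces or [])
    return sorted(s for s in all_spaces if s in include and s not in exclude)


confluence_configuration = {
    "ServerUrl": "https://server.example.com/",  # hypothetical server
    # Hypothetical secret ARN; the secret holds "username" and "password" keys.
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-example",
    "Version": "SERVER",  # assumed valid value; check the request syntax
    "SpaceConfiguration": {
        "CrawlPersonalSpaces": False,
        "CrawlArchivedSpaces": False,
        "IncludeSpaces": ["ENG", "DOCS"],  # hypothetical space keys
        "ExcludeSpaces": ["DOCS"],
    },
    "AttachmentConfiguration": {"CrawlAttachments": True},
}

space_cfg = confluence_configuration["SpaceConfiguration"]
indexed = spaces_to_index(
    ["ENG", "DOCS", "HR"], space_cfg["IncludeSpaces"], space_cfg["ExcludeSpaces"]
)
print(indexed)
```

The field-mapping lists are left out here because each one, if specified, must contain at least one mapping to an index field that already exists.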
GoogleDriveConfiguration (dict) --
Provides configuration for data sources that connect to Google Drive.
SecretArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of an AWS Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
InclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are included in the index from both shared drives and users' My Drives. Items that don't match the pattern are excluded from the index. If an item matches both an inclusion pattern and an exclusion pattern, it is excluded from the index.
(string) --
ExclusionPatterns (list) --
A list of regular expression patterns that apply to the path on Google Drive. Items that match the pattern are excluded from the index from both shared drives and users' My Drives. Items that don't match the pattern are included in the index. If an item matches both an exclusion pattern and an inclusion pattern, it is excluded from the index.
(string) --
FieldMappings (list) --
Defines the mapping between a field in Google Drive and an Amazon Kendra index field.
If you are using the console, you can define index fields when creating the mapping. If you are using the API, you must first create the field using the UpdateIndex operation.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex operation.
DataSourceFieldName (string) -- [REQUIRED]
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) -- [REQUIRED]
The name of the field in the index.
ExcludeMimeTypes (list) --
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
(string) --
ExcludeUserAccounts (list) --
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
(string) --
ExcludeSharedDrives (list) --
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
(string) --
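The inclusion and exclusion precedence described for GoogleDriveConfiguration (an item matching both kinds of pattern is excluded) can be modeled locally as in this sketch. The function is_indexed is hypothetical. Using Python's re.search as the matching rule is an assumption, since the service's regex flavor and anchoring are not stated here. The ARN, paths, email address, and drive identifier are placeholders.

```python
import re


def is_indexed(path, inclusion_patterns, exclusion_patterns):
    """Model the documented precedence (hypothetical helper).

    Exclusion wins over inclusion. With no inclusion patterns, every
    non-excluded path is treated as included (an assumption).
    """
    if any(re.search(p, path) for p in exclusion_patterns):
        return False
    if inclusion_patterns:
        return any(re.search(p, path) for p in inclusion_patterns)
    return True


google_drive_configuration = {
    # Hypothetical secret ARN holding the Google Drive credentials.
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:gdrive-example",
    "InclusionPatterns": [r"\.pdf$"],
    "ExclusionPatterns": [r"^archive/"],
    "ExcludeMimeTypes": ["application/vnd.google-apps.presentation"],
    "ExcludeUserAccounts": ["former.employee@example.com"],  # hypothetical
    "ExcludeSharedDrives": ["<shared-drive-id>"],  # placeholder identifier
}

for candidate in ("reports/q1.pdf", "archive/q1.pdf", "notes.txt"):
    keep = is_indexed(
        candidate,
        google_drive_configuration["InclusionPatterns"],
        google_drive_configuration["ExclusionPatterns"],
    )
    print(candidate, "indexed" if keep else "skipped")
```

Checking patterns locally like this before creating the data source can catch a pattern that unintentionally excludes everything, though the result is only as accurate as the matching assumption.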
string
The new description for the data source.
string
The new update schedule for the data source.
string
The Amazon Resource Name (ARN) of the new role to use when the data source is accessing resources on your behalf.
None