Amazon Translate

2020/01/08 - Amazon Translate - 4 new api methods

Changes  This release adds a new family of APIs for an asynchronous batch translation service that can translate large collections of text or HTML documents stored in an Amazon S3 folder. The service accepts a batch of up to 5 GB per API call, with each document not exceeding 1 MB and no more than 1 million documents per batch. See the documentation for more information.
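The batch limits above can be checked locally before a job is submitted. The following is a minimal sketch, not part of the SDK: the helper name `check_batch_limits` is hypothetical, and the byte values assume decimal units (1 MB = 10**6 bytes), which is the more conservative reading of the limits quoted in the release note.

```python
from typing import Iterable, List

# Limits quoted in the release note. Decimal units are an assumption;
# the note does not say whether GB/MB are decimal or binary.
MAX_BATCH_BYTES = 5 * 10**9
MAX_DOCUMENT_BYTES = 10**6
MAX_DOCUMENTS = 1_000_000


def check_batch_limits(document_sizes: Iterable[int]) -> List[str]:
    """Return a list of limit violations; an empty list means none were found."""
    sizes = list(document_sizes)
    problems = []

    if len(sizes) > MAX_DOCUMENTS:
        problems.append(f"{len(sizes)} documents exceeds the 1 million document limit")

    total = sum(sizes)
    if total > MAX_BATCH_BYTES:
        problems.append(f"batch size of {total} bytes exceeds the 5 GB limit")

    oversized = sum(1 for size in sizes if size > MAX_DOCUMENT_BYTES)
    if oversized:
        problems.append(f"{oversized} document(s) exceed the 1 MB per-document limit")

    return problems
```

The sizes would typically come from listing the objects under the input S3 folder; how they are collected is left out of this sketch.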

DescribeTextTranslationJob (new)

Gets the properties associated with an asynchronous batch translation job, including name, ID, status, source and target languages, input/output S3 buckets, and so on.

See also: AWS API Documentation

Request Syntax

client.describe_text_translation_job(
    JobId='string'
)
type JobId

string

param JobId

[REQUIRED]

The identifier that Amazon Translate generated for the job. The StartTextTranslationJob operation returns this identifier in its response.

rtype

dict

returns

Response Syntax

{
    'TextTranslationJobProperties': {
        'JobId': 'string',
        'JobName': 'string',
        'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERROR'|'FAILED'|'STOP_REQUESTED'|'STOPPED',
        'JobDetails': {
            'TranslatedDocumentsCount': 123,
            'DocumentsWithErrorsCount': 123,
            'InputDocumentsCount': 123
        },
        'SourceLanguageCode': 'string',
        'TargetLanguageCodes': [
            'string',
        ],
        'TerminologyNames': [
            'string',
        ],
        'Message': 'string',
        'SubmittedTime': datetime(2015, 1, 1),
        'EndTime': datetime(2015, 1, 1),
        'InputDataConfig': {
            'S3Uri': 'string',
            'ContentType': 'string'
        },
        'OutputDataConfig': {
            'S3Uri': 'string'
        },
        'DataAccessRoleArn': 'string'
    }
}

Response Structure

  • (dict) --

    • TextTranslationJobProperties (dict) --

      An object that contains the properties associated with an asynchronous batch translation job.

      • JobId (string) --

        The ID of the translation job.

      • JobName (string) --

        The user-defined name of the translation job.

      • JobStatus (string) --

        The status of the translation job.

      • JobDetails (dict) --

        The number of documents successfully and unsuccessfully processed during the translation job.

        • TranslatedDocumentsCount (integer) --

          The number of documents successfully processed during a translation job.

        • DocumentsWithErrorsCount (integer) --

          The number of documents that could not be processed during a translation job.

        • InputDocumentsCount (integer) --

          The number of documents used as input in a translation job.

      • SourceLanguageCode (string) --

        The language code of the source text. The language must be supported by Amazon Translate.

      • TargetLanguageCodes (list) --

        A list containing the language code of the target text. The language must be supported by Amazon Translate.

        • (string) --

      • TerminologyNames (list) --

        A list containing the names of the terminologies applied to a translation job. Only one terminology can be applied per StartTextTranslationJob request at this time.

        • (string) --

      • Message (string) --

        An explanation of any errors that may have occurred during the translation job.

      • SubmittedTime (datetime) --

        The time at which the translation job was submitted.

      • EndTime (datetime) --

        The time at which the translation job ended.

      • InputDataConfig (dict) --

        The input configuration properties that were specified when the job was requested.

        • S3Uri (string) --

          The URI of the AWS S3 folder that contains the input file. The folder must be in the same Region as the API endpoint you are calling.

        • ContentType (string) --

          The multipurpose internet mail extension (MIME) type of the input files. Valid values are text/plain for plaintext files and text/html for HTML files.

      • OutputDataConfig (dict) --

        The output configuration properties that were specified when the job was requested.

        • S3Uri (string) --

          The URI of the S3 folder that contains a translation job's output file. The folder must be in the same Region as the API endpoint that you are calling.

      • DataAccessRoleArn (string) --

        The Amazon Resource Name (ARN) of an AWS Identity and Access Management (IAM) role that granted Amazon Translate read access to the job's input data.
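Because the job runs asynchronously, callers usually poll DescribeTextTranslationJob until the job stops changing state. The sketch below shows one way to do that. The function name `wait_for_job`, the polling interval, and the choice of which statuses count as terminal are assumptions for illustration; `client` is expected to be an Amazon Translate client such as the one returned by `boto3.client("translate")`.

```python
import time

# Assumption: these statuses mean the job will not change state again.
TERMINAL_STATUSES = {"COMPLETED", "COMPLETED_WITH_ERROR", "FAILED", "STOPPED"}


def wait_for_job(client, job_id, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll describe_text_translation_job until the job reaches a terminal status.

    Returns the TextTranslationJobProperties dict of the finished job.
    Raises TimeoutError if the job is still running after max_polls checks.
    """
    for _ in range(max_polls):
        response = client.describe_text_translation_job(JobId=job_id)
        properties = response["TextTranslationJobProperties"]
        if properties["JobStatus"] in TERMINAL_STATUSES:
            return properties
        # Job is still SUBMITTED, IN_PROGRESS, or STOP_REQUESTED; wait and retry.
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use the default is sufficient. The returned properties include JobDetails and Message, which are the first places to look when the status is COMPLETED_WITH_ERROR or FAILED.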

StartTextTranslationJob (new)

Starts an asynchronous batch translation job. Batch translation jobs can be used to translate large volumes of text across multiple documents at once. For more information, see async.

Batch translation jobs can be described with the DescribeTextTranslationJob operation, listed with the ListTextTranslationJobs operation, and stopped with the StopTextTranslationJob operation.

Note

Amazon Translate does not support batch translation of multiple source languages at once.

See also: AWS API Documentation

Request Syntax

client.start_text_translation_job(
    JobName='string',
    InputDataConfig={
        'S3Uri': 'string',
        'ContentType': 'string'
    },
    OutputDataConfig={
        'S3Uri': 'string'
    },
    DataAccessRoleArn='string',
    SourceLanguageCode='string',
    TargetLanguageCodes=[
        'string',
    ],
    TerminologyNames=[
        'string',
    ],
    ClientToken='string'
)
type JobName

string

param JobName

The name of the batch translation job to be performed.

type InputDataConfig

dict

param InputDataConfig

[REQUIRED]

Specifies the format and S3 location of the input documents for the translation job.

  • S3Uri (string) -- [REQUIRED]

    The URI of the AWS S3 folder that contains the input file. The folder must be in the same Region as the API endpoint you are calling.

  • ContentType (string) -- [REQUIRED]

    The multipurpose internet mail extension (MIME) type of the input files. Valid values are text/plain for plaintext files and text/html for HTML files.

type OutputDataConfig

dict

param OutputDataConfig

[REQUIRED]

Specifies the S3 folder to which your job output will be saved.

  • S3Uri (string) -- [REQUIRED]

    The URI of the S3 folder that contains a translation job's output file. The folder must be in the same Region as the API endpoint that you are calling.

type DataAccessRoleArn

string

param DataAccessRoleArn

[REQUIRED]

The Amazon Resource Name (ARN) of an AWS Identity and Access Management (IAM) role that grants Amazon Translate read access to your input data. For more information, see identity-and-access-management.

type SourceLanguageCode

string

param SourceLanguageCode

[REQUIRED]

The language code of the input language. For a list of language codes, see what-is-languages.

Amazon Translate does not automatically detect a source language during batch translation jobs.

type TargetLanguageCodes

list

param TargetLanguageCodes

[REQUIRED]

The language code of the output language.

  • (string) --

type TerminologyNames

list

param TerminologyNames

The name of the terminology to use in the batch translation job. For a list of available terminologies, use the ListTerminologies operation.

  • (string) --

type ClientToken

string

param ClientToken

[REQUIRED]

The client token of the EC2 instance calling the request. This token is auto-generated when using the Amazon Translate SDK. Otherwise, use the DescribeInstances EC2 operation to retrieve an instance's client token. For more information, see Client Tokens in the EC2 User Guide.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{
    'JobId': 'string',
    'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERROR'|'FAILED'|'STOP_REQUESTED'|'STOPPED'
}

Response Structure

  • (dict) --

    • JobId (string) --

      The identifier generated for the job. To get the status of a job, use this ID with the DescribeTextTranslationJob operation.

    • JobStatus (string) --

      The status of the job. Possible values include:

      • SUBMITTED - The job has been received and is queued for processing.

      • IN_PROGRESS - Amazon Translate is processing the job.

      • COMPLETED - The job was successfully completed and the output is available.

      • COMPLETED_WITH_ERROR - The job was completed with errors. The errors can be analyzed in the job's output.

      • FAILED - The job did not complete. To get details, use the DescribeTextTranslationJob operation.

      • STOP_REQUESTED - The user who started the job has requested that it be stopped.

      • STOPPED - The job has been stopped.
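As an illustration of the request described above, the following sketch assembles the required parameters and submits a job. The function name `start_batch_translation` and its argument names are hypothetical. ClientToken is left out on the assumption, taken from the parameter description, that the SDK populates it when it is not provided. `client` is expected to be an Amazon Translate client such as the one returned by `boto3.client("translate")`.

```python
def start_batch_translation(client, input_s3_uri, output_s3_uri, role_arn,
                            source_language, target_language,
                            job_name=None, content_type="text/plain"):
    """Submit a batch translation job and return the JobId from the response."""
    request = {
        "InputDataConfig": {"S3Uri": input_s3_uri, "ContentType": content_type},
        "OutputDataConfig": {"S3Uri": output_s3_uri},
        "DataAccessRoleArn": role_arn,
        # The source language must be given explicitly; batch jobs do not
        # auto-detect it.
        "SourceLanguageCode": source_language,
        "TargetLanguageCodes": [target_language],
    }
    # JobName is optional, so it is only sent when the caller supplies one.
    if job_name is not None:
        request["JobName"] = job_name

    response = client.start_text_translation_job(**request)
    return response["JobId"]
```

The returned JobId is the value that DescribeTextTranslationJob and StopTextTranslationJob expect.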

StopTextTranslationJob (new)

Stops an asynchronous batch translation job that is in progress.

If the job's state is IN_PROGRESS, the job will be marked for termination and put into the STOP_REQUESTED state. If the job completes before it can be stopped, it is put into the COMPLETED state. Otherwise, the job is put into the STOPPED state.

Asynchronous batch translation jobs are started with the StartTextTranslationJob operation. You can use the DescribeTextTranslationJob or ListTextTranslationJobs operations to get a batch translation job's JobId.

See also: AWS API Documentation

Request Syntax

client.stop_text_translation_job(
    JobId='string'
)
type JobId

string

param JobId

[REQUIRED]

The job ID of the job to be stopped.

rtype

dict

returns

Response Syntax

{
    'JobId': 'string',
    'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERROR'|'FAILED'|'STOP_REQUESTED'|'STOPPED'
}

Response Structure

  • (dict) --

    • JobId (string) --

      The job ID of the stopped batch translation job.

    • JobStatus (string) --

      The status of the designated job. Upon successful completion, the job's status will be STOPPED.
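A stop request does not necessarily take effect immediately, so the status in the response is worth inspecting. The sketch below is one possible wrapper; the function name `stop_job` and the returned tuple are illustrative assumptions. `client` is expected to be an Amazon Translate client such as the one returned by `boto3.client("translate")`.

```python
def stop_job(client, job_id):
    """Request that a job be stopped.

    Returns (status, still_stopping). still_stopping is True while the job is
    in STOP_REQUESTED, meaning it may yet end as STOPPED or as COMPLETED.
    """
    response = client.stop_text_translation_job(JobId=job_id)
    status = response["JobStatus"]
    return status, status == "STOP_REQUESTED"
```

When `still_stopping` is true, a later DescribeTextTranslationJob call reports whether the job ended as STOPPED or finished as COMPLETED before the stop took effect.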

ListTextTranslationJobs (new)

Gets a list of the batch translation jobs that you have submitted.

See also: AWS API Documentation

Request Syntax

client.list_text_translation_jobs(
    Filter={
        'JobName': 'string',
        'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERROR'|'FAILED'|'STOP_REQUESTED'|'STOPPED',
        'SubmittedBeforeTime': datetime(2015, 1, 1),
        'SubmittedAfterTime': datetime(2015, 1, 1)
    },
    NextToken='string',
    MaxResults=123
)
type Filter

dict

param Filter

The parameters that specify which batch translation jobs to retrieve. Filters include job name, job status, and submission time. You can only set one filter at a time.

  • JobName (string) --

    Filters the list of jobs by name.

  • JobStatus (string) --

    Filters the list of jobs by job status.

  • SubmittedBeforeTime (datetime) --

    Filters the list of jobs based on the time that the job was submitted for processing and returns only the jobs submitted before the specified time. Jobs are returned in ascending order, oldest to newest.

  • SubmittedAfterTime (datetime) --

    Filters the list of jobs based on the time that the job was submitted for processing and returns only the jobs submitted after the specified time. Jobs are returned in descending order, newest to oldest.

type NextToken

string

param NextToken

The token to request the next page of results.

type MaxResults

integer

param MaxResults

The maximum number of results to return in each page. The default value is 100.

rtype

dict

returns

Response Syntax

{
    'TextTranslationJobPropertiesList': [
        {
            'JobId': 'string',
            'JobName': 'string',
            'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'COMPLETED_WITH_ERROR'|'FAILED'|'STOP_REQUESTED'|'STOPPED',
            'JobDetails': {
                'TranslatedDocumentsCount': 123,
                'DocumentsWithErrorsCount': 123,
                'InputDocumentsCount': 123
            },
            'SourceLanguageCode': 'string',
            'TargetLanguageCodes': [
                'string',
            ],
            'TerminologyNames': [
                'string',
            ],
            'Message': 'string',
            'SubmittedTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1),
            'InputDataConfig': {
                'S3Uri': 'string',
                'ContentType': 'string'
            },
            'OutputDataConfig': {
                'S3Uri': 'string'
            },
            'DataAccessRoleArn': 'string'
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • TextTranslationJobPropertiesList (list) --

      A list containing the properties of each job that is returned.

      • (dict) --

        Provides information about a translation job.

        • JobId (string) --

          The ID of the translation job.

        • JobName (string) --

          The user-defined name of the translation job.

        • JobStatus (string) --

          The status of the translation job.

        • JobDetails (dict) --

          The number of documents successfully and unsuccessfully processed during the translation job.

          • TranslatedDocumentsCount (integer) --

            The number of documents successfully processed during a translation job.

          • DocumentsWithErrorsCount (integer) --

            The number of documents that could not be processed during a translation job.

          • InputDocumentsCount (integer) --

            The number of documents used as input in a translation job.

        • SourceLanguageCode (string) --

          The language code of the source text. The language must be supported by Amazon Translate.

        • TargetLanguageCodes (list) --

          A list containing the language code of the target text. The language must be supported by Amazon Translate.

          • (string) --

        • TerminologyNames (list) --

          A list containing the names of the terminologies applied to a translation job. Only one terminology can be applied per StartTextTranslationJob request at this time.

          • (string) --

        • Message (string) --

          An explanation of any errors that may have occurred during the translation job.

        • SubmittedTime (datetime) --

          The time at which the translation job was submitted.

        • EndTime (datetime) --

          The time at which the translation job ended.

        • InputDataConfig (dict) --

          The input configuration properties that were specified when the job was requested.

          • S3Uri (string) --

            The URI of the AWS S3 folder that contains the input file. The folder must be in the same Region as the API endpoint you are calling.

          • ContentType (string) --

            The multipurpose internet mail extension (MIME) type of the input files. Valid values are text/plain for plaintext files and text/html for HTML files.

        • OutputDataConfig (dict) --

          The output configuration properties that were specified when the job was requested.

          • S3Uri (string) --

            The URI of the S3 folder that contains a translation job's output file. The folder must be in the same Region as the API endpoint that you are calling.

        • DataAccessRoleArn (string) --

          The Amazon Resource Name (ARN) of an AWS Identity and Access Management (IAM) role that granted Amazon Translate read access to the job's input data.

    • NextToken (string) --

      The token to use to retrieve the next page of results. This value is null when there are no more results to return.
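Results are paged, so a caller who wants every job has to follow NextToken until it is absent. The sketch below shows a manual pagination loop. The generator name `iter_translation_jobs` and its arguments are hypothetical; `client` is expected to be an Amazon Translate client such as the one returned by `boto3.client("translate")`.

```python
def iter_translation_jobs(client, job_filter=None, page_size=100):
    """Yield the properties dict of every job, following NextToken across pages."""
    kwargs = {"MaxResults": page_size}
    # Filter is optional; per the parameter description, only one filter key
    # (for example {"JobStatus": "COMPLETED"}) should be set at a time.
    if job_filter:
        kwargs["Filter"] = job_filter

    while True:
        page = client.list_text_translation_jobs(**kwargs)
        for job in page.get("TextTranslationJobPropertiesList", []):
            yield job
        token = page.get("NextToken")
        if not token:
            return  # No token means there are no more pages.
        kwargs["NextToken"] = token
```

For example, `[job["JobId"] for job in iter_translation_jobs(client, {"JobStatus": "FAILED"})]` would collect the IDs of all failed jobs.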