Amazon Transcribe Service

2018/01/19 - Amazon Transcribe Service - 3 new api methods

Changes: Amazon Transcribe Public Preview Release

StartTranscriptionJob (new)

Starts an asynchronous job to transcribe speech to text.

See also: AWS API Documentation

Request Syntax

client.start_transcription_job(
    TranscriptionJobName='string',
    LanguageCode='en-US'|'es-US',
    MediaSampleRateHertz=123,
    MediaFormat='mp3'|'mp4'|'wav'|'flac',
    Media={
        'MediaFileUri': 'string'
    }
)
type TranscriptionJobName

string

param TranscriptionJobName

[REQUIRED]

The name of the job. The name must be unique within an AWS account.

type LanguageCode

string

param LanguageCode

[REQUIRED]

The language code for the language used in the input media file.

type MediaSampleRateHertz

integer

param MediaSampleRateHertz

The sample rate, in Hertz, of the audio track in the input media file.

type MediaFormat

string

param MediaFormat

[REQUIRED]

The format of the input media file.

type Media

dict

param Media

[REQUIRED]

An object that describes the input media for a transcription job.

  • MediaFileUri (string) --

    The S3 location of the input media file. The general form is:

    https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

    For example:

    https://s3-us-west-2.amazonaws.com/examplebucket/example.mp4

    https://s3-us-west-2.amazonaws.com/examplebucket/mediadocs/example.mp4
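As a rough sketch of how these parameters fit together, the helpers below assemble the keyword arguments for start_transcription_job. The names media_file_uri and build_start_request are hypothetical, and inferring MediaFormat from the object key's extension is an assumption made for illustration. The URI layout follows the examples above.

```python
import posixpath

# Formats accepted by MediaFormat in the request syntax above.
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "flac"}


def media_file_uri(region, bucket, key):
    """Build an S3 URI shaped like the examples above (hypothetical helper)."""
    return "https://s3-{}.amazonaws.com/{}/{}".format(region, bucket, key)


def build_start_request(job_name, language_code, region, bucket, key,
                        sample_rate_hertz=None):
    """Assemble keyword arguments for client.start_transcription_job()."""
    # Assumption: the extension of the object key names the media format.
    media_format = posixpath.splitext(key)[1].lstrip(".").lower()
    if media_format not in SUPPORTED_FORMATS:
        raise ValueError("unsupported media format: {!r}".format(media_format))

    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_file_uri(region, bucket, key)},
    }
    # MediaSampleRateHertz is the only optional parameter; omit it when unknown.
    if sample_rate_hertz is not None:
        request["MediaSampleRateHertz"] = sample_rate_hertz
    return request


request = build_start_request(
    "example-job", "en-US", "us-west-2", "examplebucket", "mediadocs/example.mp4"
)
# With boto3 installed and credentials configured, the call would be:
#   import boto3
#   client = boto3.client("transcribe")
#   response = client.start_transcription_job(**request)
```

Building the arguments separately from the call keeps the request easy to inspect or log before a job is started.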

rtype

dict

returns

Response Syntax

{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string'
    }
}

Response Structure

  • (dict) --

    • TranscriptionJob (dict) --

      An object containing details of the asynchronous transcription job.

      • TranscriptionJobName (string) --

        A name to identify the transcription job.

      • TranscriptionJobStatus (string) --

        The status of the transcription job.

      • LanguageCode (string) --

        The language code for the input speech.

      • MediaSampleRateHertz (integer) --

        The sample rate, in Hertz, of the audio track in the input media file.

      • MediaFormat (string) --

        The format of the input media file.

      • Media (dict) --

        An object that describes the input media for a transcription job.

        • MediaFileUri (string) --

          The S3 location of the input media file. The general form is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-west-2.amazonaws.com/examplebucket/example.mp4

          https://s3-us-west-2.amazonaws.com/examplebucket/mediadocs/example.mp4

      • Transcript (dict) --

        An object that describes the output of the transcription job.

        • TranscriptFileUri (string) --

          The S3 location where the transcription result is stored. The general form of this URI is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-west-2.amazonaws.com/examplebucket/example.json

          https://s3-us-west-2.amazonaws.com/examplebucket/mediadocs/example.json

      • CreationTime (datetime) --

        Timestamp of the date and time that the job was created.

      • CompletionTime (datetime) --

        Timestamp of the date and time that the job completed.

      • FailureReason (string) --

        If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.
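One way to act on this response is to branch on TranscriptionJobStatus. The function below is a minimal sketch: the name transcript_uri_or_none and the choice to raise RuntimeError on failure are illustrative assumptions, and it assumes that Transcript is populated only once the job has completed. The sample dictionaries are hand-written, not real service output.

```python
def transcript_uri_or_none(response):
    """Return the transcript URI for a finished job, or None while it runs.

    `response` is a dict shaped like the Response Syntax above.
    """
    job = response["TranscriptionJob"]
    status = job["TranscriptionJobStatus"]

    if status == "COMPLETED":
        return job["Transcript"]["TranscriptFileUri"]
    if status == "FAILED":
        # FailureReason is documented as present when the status is FAILED.
        raise RuntimeError(
            "job {} failed: {}".format(
                job["TranscriptionJobName"], job.get("FailureReason", "unknown")
            )
        )
    # IN_PROGRESS: there is nothing to read yet.
    return None


# Hand-written sample responses for illustration only.
running = {"TranscriptionJob": {
    "TranscriptionJobName": "example-job",
    "TranscriptionJobStatus": "IN_PROGRESS",
}}
done = {"TranscriptionJob": {
    "TranscriptionJobName": "example-job",
    "TranscriptionJobStatus": "COMPLETED",
    "Transcript": {
        "TranscriptFileUri":
            "https://s3-us-west-2.amazonaws.com/examplebucket/example.json"
    },
}}
```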

ListTranscriptionJobs (new)

Lists transcription jobs with the specified status.

See also: AWS API Documentation

Request Syntax

client.list_transcription_jobs(
    Status='IN_PROGRESS'|'FAILED'|'COMPLETED',
    NextToken='string',
    MaxResults=123
)
type Status

string

param Status

[REQUIRED]

Returns only transcription jobs with the specified status.

type NextToken

string

param NextToken

If the result of the previous request to ListTranscriptionJobs was truncated, include the NextToken to fetch the next set of jobs.

type MaxResults

integer

param MaxResults

The maximum number of jobs to return in the response.

rtype

dict

returns

Response Syntax

{
    'Status': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'TranscriptionJobSummaries': [
        {
            'TranscriptionJobName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'CompletionTime': datetime(2015, 1, 1),
            'LanguageCode': 'en-US'|'es-US',
            'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
            'FailureReason': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • Status (string) --

      The requested status of the jobs returned.

    • NextToken (string) --

      The ListTranscriptionJobs operation returns one page of jobs at a time. The page size is set by the MaxResults parameter. If there are more jobs than fit on one page, Amazon Transcribe returns a NextToken value. Include that token in the next request to ListTranscriptionJobs to return the next page of jobs.

    • TranscriptionJobSummaries (list) --

      A list of objects containing summary information for a transcription job.

      • (dict) --

        Provides a summary of information about a transcription job.

        • TranscriptionJobName (string) --

          The name assigned to the transcription job when it was created.

        • CreationTime (datetime) --

          Timestamp of the date and time that the job was created.

        • CompletionTime (datetime) --

          Timestamp of the date and time that the job completed.

        • LanguageCode (string) --

          The language code for the input speech.

        • TranscriptionJobStatus (string) --

          The status of the transcription job. When the status is COMPLETED, use the GetTranscriptionJob operation to get the results of the transcription.

        • FailureReason (string) --

          If the TranscriptionJobStatus field is FAILED, this field contains a description of the error.
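The paging behavior described for NextToken can be sketched as a generator that keeps requesting pages until no token comes back. The function name iter_transcription_jobs is hypothetical, and FakeClient is an offline stand-in (not part of boto3) that mimics the documented response shape so the example runs without AWS access. Its numeric tokens are an invention of the stand-in; real tokens should be treated as opaque.

```python
def iter_transcription_jobs(client, status, page_size=None):
    """Yield every job summary with the given status, following NextToken."""
    kwargs = {"Status": status}
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    while True:
        page = client.list_transcription_jobs(**kwargs)
        for summary in page.get("TranscriptionJobSummaries", []):
            yield summary
        token = page.get("NextToken")
        if not token:
            break  # last page: the service returned no token
        kwargs["NextToken"] = token


class FakeClient:
    """Stand-in for a Transcribe client so the sketch runs offline."""

    def __init__(self, names, page_size):
        self._names = names
        self._page_size = page_size

    def list_transcription_jobs(self, Status, NextToken=None, MaxResults=None):
        start = int(NextToken or 0)
        size = MaxResults or self._page_size
        page = {
            "Status": Status,
            "TranscriptionJobSummaries": [
                {"TranscriptionJobName": n, "TranscriptionJobStatus": Status}
                for n in self._names[start:start + size]
            ],
        }
        # Only hand out a token when more jobs remain.
        if start + size < len(self._names):
            page["NextToken"] = str(start + size)
        return page


fake = FakeClient(["job-a", "job-b", "job-c"], page_size=2)
names = [summary["TranscriptionJobName"]
         for summary in iter_transcription_jobs(fake, "COMPLETED")]
```

With a real client, the same generator would be called as iter_transcription_jobs(boto3.client("transcribe"), "COMPLETED").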

GetTranscriptionJob (new)

Returns information about a transcription job. To see the status of the job, check the TranscriptionJobStatus field. If the status is COMPLETED, the job is finished and you can find the results at the location specified in the TranscriptFileUri field.

See also: AWS API Documentation

Request Syntax

client.get_transcription_job(
    TranscriptionJobName='string'
)
type TranscriptionJobName

string

param TranscriptionJobName

[REQUIRED]

The name of the job.

rtype

dict

returns

Response Syntax

{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string'
    }
}

Response Structure

  • (dict) --

    • TranscriptionJob (dict) --

      An object that contains the results of the transcription job.

      • TranscriptionJobName (string) --

        A name to identify the transcription job.

      • TranscriptionJobStatus (string) --

        The status of the transcription job.

      • LanguageCode (string) --

        The language code for the input speech.

      • MediaSampleRateHertz (integer) --

        The sample rate, in Hertz, of the audio track in the input media file.

      • MediaFormat (string) --

        The format of the input media file.

      • Media (dict) --

        An object that describes the input media for a transcription job.

        • MediaFileUri (string) --

          The S3 location of the input media file. The general form is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-west-2.amazonaws.com/examplebucket/example.mp4

          https://s3-us-west-2.amazonaws.com/examplebucket/mediadocs/example.mp4

      • Transcript (dict) --

        An object that describes the output of the transcription job.

        • TranscriptFileUri (string) --

          The S3 location where the transcription result is stored. The general form of this URI is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-west-2.amazonaws.com/examplebucket/example.json

          https://s3-us-west-2.amazonaws.com/examplebucket/mediadocs/example.json

      • CreationTime (datetime) --

        Timestamp of the date and time that the job was created.

      • CompletionTime (datetime) --

        Timestamp of the date and time that the job completed.

      • FailureReason (string) --

        If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.
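Because transcription jobs are asynchronous, a caller typically polls GetTranscriptionJob until the status leaves IN_PROGRESS. The loop below is one possible sketch, not a prescribed pattern: wait_for_job, its polling interval, and its attempt limit are assumptions, and ScriptedClient is a hypothetical offline stand-in that replays a fixed list of statuses.

```python
import time


def wait_for_job(client, job_name, poll_seconds=5.0, max_attempts=60,
                 sleep=time.sleep):
    """Poll get_transcription_job until the job is COMPLETED or FAILED."""
    for _ in range(max_attempts):
        job = client.get_transcription_job(
            TranscriptionJobName=job_name
        )["TranscriptionJob"]
        if job["TranscriptionJobStatus"] != "IN_PROGRESS":
            return job
        sleep(poll_seconds)
    raise TimeoutError("job {} is still in progress".format(job_name))


class ScriptedClient:
    """Offline stand-in that returns a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_transcription_job(self, TranscriptionJobName):
        return {"TranscriptionJob": {
            "TranscriptionJobName": TranscriptionJobName,
            "TranscriptionJobStatus": next(self._statuses),
        }}


client = ScriptedClient(["IN_PROGRESS", "IN_PROGRESS", "COMPLETED"])
# A no-op sleep keeps the demonstration instant.
job = wait_for_job(client, "example-job", sleep=lambda seconds: None)
```

Passing the sleep function as a parameter is a design choice that makes the loop easy to exercise without real delays.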