Amazon Transcribe Service

2018/04/04 - Amazon Transcribe Service - 5 new, 3 updated API methods

Changes: Update the Transcribe client to the latest version.

GetVocabulary (new)

Gets information about a vocabulary.

See also: AWS API Documentation

Request Syntax

client.get_vocabulary(
    VocabularyName='string'
)
type VocabularyName:

string

param VocabularyName:

[REQUIRED]

The name of the vocabulary to return information about. The name is case-sensitive.

rtype:

dict

returns:

Response Syntax

{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US',
    'VocabularyState': 'PENDING'|'READY'|'FAILED',
    'LastModifiedTime': datetime(2015, 1, 1),
    'FailureReason': 'string',
    'DownloadUri': 'string'
}

Response Structure

  • (dict) --

    • VocabularyName (string) --

      The name of the vocabulary to return.

    • LanguageCode (string) --

      The language code of the vocabulary entries.

    • VocabularyState (string) --

      The processing state of the vocabulary.

    • LastModifiedTime (datetime) --

      The date and time that the vocabulary was last modified.

    • FailureReason (string) --

      If the VocabularyState field is FAILED, this field contains information about why the vocabulary processing failed.

    • DownloadUri (string) --

      The S3 location where the vocabulary is stored. Use this URI to get the contents of the vocabulary. The URI is available for a limited time.
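Because VocabularyState can be PENDING, READY, or FAILED, a caller usually has to poll GetVocabulary before using a vocabulary. The following is a minimal sketch of such a loop, not part of the SDK. The helper name wait_for_vocabulary, the delay, and the attempt limit are assumptions chosen for illustration; client is assumed to be a Transcribe client such as the one returned by boto3.client('transcribe').

```python
import time


def wait_for_vocabulary(client, vocabulary_name, delay_seconds=5.0, max_attempts=60):
    """Poll get_vocabulary until the vocabulary is READY or FAILED.

    Hypothetical helper: the delay and attempt limit are illustrative defaults.
    """
    for _ in range(max_attempts):
        response = client.get_vocabulary(VocabularyName=vocabulary_name)
        state = response["VocabularyState"]
        if state == "READY":
            return response
        if state == "FAILED":
            # FailureReason is only meaningful when the state is FAILED.
            raise RuntimeError(response.get("FailureReason", "unknown failure"))
        time.sleep(delay_seconds)
    raise TimeoutError(
        f"{vocabulary_name} was still PENDING after {max_attempts} polls"
    )
```

On success the returned dict has the response shape shown above, so the caller can read DownloadUri from it while the URI is still valid.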

ListVocabularies (new)

Returns a list of vocabularies that match the specified criteria. If no criteria are specified, returns the entire list of vocabularies.

See also: AWS API Documentation

Request Syntax

client.list_vocabularies(
    NextToken='string',
    MaxResults=123,
    StateEquals='PENDING'|'READY'|'FAILED',
    NameContains='string'
)
type NextToken:

string

param NextToken:

If the result of the previous request to ListVocabularies was truncated, include the NextToken to fetch the next set of vocabularies.

type MaxResults:

integer

param MaxResults:

The maximum number of vocabularies to return in the response. If there are fewer results in the list, this response contains only the actual results.

type StateEquals:

string

param StateEquals:

When specified, only returns vocabularies with the VocabularyState field equal to the specified state.

type NameContains:

string

param NameContains:

When specified, the vocabularies returned in the list are limited to vocabularies whose name contains the specified string. The search is case-insensitive: ListVocabularies returns both "vocabularyname" and "VocabularyName" in the response list.

rtype:

dict

returns:

Response Syntax

{
    'Status': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'Vocabularies': [
        {
            'VocabularyName': 'string',
            'LanguageCode': 'en-US'|'es-US',
            'LastModifiedTime': datetime(2015, 1, 1),
            'VocabularyState': 'PENDING'|'READY'|'FAILED'
        },
    ]
}

Response Structure

  • (dict) --

    • Status (string) --

      The requested vocabulary state.

    • NextToken (string) --

      The ListVocabularies operation returns a page of vocabularies at a time. The maximum size of the page is set by the MaxResults parameter. If there are more vocabularies in the list than the page size, Amazon Transcribe returns the NextToken token. Include the token in the next request to the ListVocabularies operation to return the next page of vocabularies.

    • Vocabularies (list) --

      A list of objects that describe the vocabularies that match the search criteria in the request.

      • (dict) --

        Provides information about a custom vocabulary.

        • VocabularyName (string) --

          The name of the vocabulary.

        • LanguageCode (string) --

          The language code of the vocabulary entries.

        • LastModifiedTime (datetime) --

          The date and time that the vocabulary was last modified.

        • VocabularyState (string) --

          The processing state of the vocabulary. If the state is READY you can use the vocabulary in a StartTranscriptionJob request.
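The NextToken handling described above can be sketched as a generator that keeps requesting pages until no token comes back. This is an illustrative helper, not an SDK function: the name iter_vocabularies and its keyword arguments are assumptions, and client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
def iter_vocabularies(client, state_equals=None, name_contains=None, page_size=None):
    """Yield every vocabulary summary, following NextToken across pages.

    Hypothetical helper: optional filters are only sent when provided.
    """
    kwargs = {}
    if state_equals is not None:
        kwargs["StateEquals"] = state_equals
    if name_contains is not None:
        kwargs["NameContains"] = name_contains
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    while True:
        page = client.list_vocabularies(**kwargs)
        yield from page.get("Vocabularies", [])
        token = page.get("NextToken")
        if not token:
            return
        # Reuse the same filters and ask for the next page.
        kwargs["NextToken"] = token
```

For example, iter_vocabularies(client, state_equals='READY') would walk all pages of vocabularies that are ready for use in a StartTranscriptionJob request.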

DeleteVocabulary (new)

Deletes a vocabulary from Amazon Transcribe.

See also: AWS API Documentation

Request Syntax

client.delete_vocabulary(
    VocabularyName='string'
)
type VocabularyName:

string

param VocabularyName:

[REQUIRED]

The name of the vocabulary to delete.

returns:

None

UpdateVocabulary (new)

Updates an existing vocabulary with new values.

See also: AWS API Documentation

Request Syntax

client.update_vocabulary(
    VocabularyName='string',
    LanguageCode='en-US'|'es-US',
    Phrases=[
        'string',
    ]
)
type VocabularyName:

string

param VocabularyName:

[REQUIRED]

The name of the vocabulary to update. The name is case-sensitive.

type LanguageCode:

string

param LanguageCode:

[REQUIRED]

The language code of the vocabulary entries.

type Phrases:

list

param Phrases:

[REQUIRED]

An array of strings containing the vocabulary entries.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US',
    'LastModifiedTime': datetime(2015, 1, 1),
    'VocabularyState': 'PENDING'|'READY'|'FAILED'
}

Response Structure

  • (dict) --

    • VocabularyName (string) --

      The name of the vocabulary that was updated.

    • LanguageCode (string) --

      The language code of the vocabulary entries.

    • LastModifiedTime (datetime) --

      The date and time that the vocabulary was updated.

    • VocabularyState (string) --

      The processing state of the vocabulary. When the VocabularyState field contains READY the vocabulary is ready to be used in a StartTranscriptionJob request.
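As a rough illustration of the request above, the sketch below tidies a phrase list before sending all three required parameters. The helper name update_vocabulary_phrases, the whitespace trimming, and the de-duplication step are assumptions added for the example and are not requirements stated by the API. client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
def update_vocabulary_phrases(client, vocabulary_name, phrases, language_code="en-US"):
    """Send an UpdateVocabulary request with a cleaned phrase list.

    Hypothetical helper: trimming and de-duplicating are local conveniences.
    """
    # dict.fromkeys keeps the first occurrence of each phrase, in order.
    unique = list(dict.fromkeys(p.strip() for p in phrases if p.strip()))
    if not unique:
        raise ValueError("Phrases is required; no usable entries were supplied")
    return client.update_vocabulary(
        VocabularyName=vocabulary_name,
        LanguageCode=language_code,
        Phrases=unique,
    )
```

The returned dict carries the VocabularyState field, which the caller can check before using the vocabulary in a job.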

CreateVocabulary (new)

Creates a new custom vocabulary that you can use to change the way Amazon Transcribe handles transcription of an audio file.

See also: AWS API Documentation

Request Syntax

client.create_vocabulary(
    VocabularyName='string',
    LanguageCode='en-US'|'es-US',
    Phrases=[
        'string',
    ]
)
type VocabularyName:

string

param VocabularyName:

[REQUIRED]

The name of the vocabulary. The name must be unique within an AWS account. The name is case-sensitive.

type LanguageCode:

string

param LanguageCode:

[REQUIRED]

The language code of the vocabulary entries.

type Phrases:

list

param Phrases:

[REQUIRED]

An array of strings that contains the vocabulary entries.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US',
    'VocabularyState': 'PENDING'|'READY'|'FAILED',
    'LastModifiedTime': datetime(2015, 1, 1),
    'FailureReason': 'string'
}

Response Structure

  • (dict) --

    • VocabularyName (string) --

      The name of the vocabulary.

    • LanguageCode (string) --

      The language code of the vocabulary entries.

    • VocabularyState (string) --

      The processing state of the vocabulary. When the VocabularyState field contains READY the vocabulary is ready to be used in a StartTranscriptionJob request.

    • LastModifiedTime (datetime) --

      The date and time that the vocabulary was created.

    • FailureReason (string) --

      If the VocabularyState field is FAILED, this field contains information about why the vocabulary processing failed.
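Vocabulary names must be unique within an account and are case-sensitive, while the NameContains filter of ListVocabularies is case-insensitive. The sketch below combines the two: it searches by name, compares exactly, and creates the vocabulary only if no exact match exists. The helper names vocabulary_exists and create_if_absent are hypothetical, and the check is not atomic, so a concurrent creation could still cause the create call to fail. client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
def vocabulary_exists(client, name):
    """Return True if a vocabulary with exactly this name is listed."""
    kwargs = {"NameContains": name}
    while True:
        page = client.list_vocabularies(**kwargs)
        # NameContains matches case-insensitively, so compare exactly here.
        if any(v["VocabularyName"] == name for v in page.get("Vocabularies", [])):
            return True
        token = page.get("NextToken")
        if not token:
            return False
        kwargs["NextToken"] = token


def create_if_absent(client, name, phrases, language_code="en-US"):
    """Create the vocabulary unless one with the same name already exists.

    Returns the CreateVocabulary response, or None when nothing was created.
    """
    if vocabulary_exists(client, name):
        return None
    return client.create_vocabulary(
        VocabularyName=name,
        LanguageCode=language_code,
        Phrases=list(phrases),
    )
```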

GetTranscriptionJob (updated)
Changes (response)
{'TranscriptionJob': {'Settings': {'MaxSpeakerLabels': 'integer',
                                   'ShowSpeakerLabels': 'boolean',
                                   'VocabularyName': 'string'}}}

Returns information about a transcription job. To see the status of the job, check the TranscriptionJobStatus field. If the status is COMPLETED, the job is finished and you can find the results at the location specified in the TranscriptFileUri field.

See also: AWS API Documentation

Request Syntax

client.get_transcription_job(
    TranscriptionJobName='string'
)
type TranscriptionJobName:

string

param TranscriptionJobName:

[REQUIRED]

The name of the job.

rtype:

dict

returns:

Response Syntax

{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'VocabularyName': 'string',
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123
        }
    }
}

Response Structure

  • (dict) --

    • TranscriptionJob (dict) --

      An object that contains the results of the transcription job.

      • TranscriptionJobName (string) --

        A name to identify the transcription job.

      • TranscriptionJobStatus (string) --

        The status of the transcription job.

      • LanguageCode (string) --

        The language code for the input speech.

      • MediaSampleRateHertz (integer) --

        The sample rate, in Hertz, of the audio track in the input media file.

      • MediaFormat (string) --

        The format of the input media file.

      • Media (dict) --

        An object that describes the input media for a transcription job.

        • MediaFileUri (string) --

          The S3 location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-east-1.amazonaws.com/examplebucket/example.mp4

          https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4

          For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.

      • Transcript (dict) --

        An object that describes the output of the transcription job.

        • TranscriptFileUri (string) --

          The S3 location where the transcription result is stored. Use this URI to access the results of the transcription job.

      • CreationTime (datetime) --

        Timestamp of the date and time that the job was created.

      • CompletionTime (datetime) --

        Timestamp of the date and time that the job completed.

      • FailureReason (string) --

        If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.

      • Settings (dict) --

        Optional settings for the transcription job.

        • VocabularyName (string) --

          The name of a vocabulary to use when processing the transcription job.

        • ShowSpeakerLabels (boolean) --

          Determines whether the transcription job should use speaker recognition to identify different speakers in the input audio. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.

        • MaxSpeakerLabels (integer) --

          The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers will be identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
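Since transcription runs asynchronously, a typical caller polls GetTranscriptionJob until TranscriptionJobStatus leaves IN_PROGRESS and then reads Transcript.TranscriptFileUri. The following is a minimal sketch under that assumption. The helper name wait_for_transcript_uri and its timing defaults are hypothetical; client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
import time


def wait_for_transcript_uri(client, job_name, delay_seconds=10.0, max_attempts=120):
    """Poll get_transcription_job and return TranscriptFileUri once COMPLETED.

    Hypothetical helper: the delay and attempt limit are illustrative defaults.
    """
    for _ in range(max_attempts):
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(job.get("FailureReason", "unknown failure"))
        time.sleep(delay_seconds)
    raise TimeoutError(
        f"{job_name} was still IN_PROGRESS after {max_attempts} polls"
    )
```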

ListTranscriptionJobs (updated)
Changes (request)
{'JobNameContains': 'string'}

Lists transcription jobs with the specified status.

See also: AWS API Documentation

Request Syntax

client.list_transcription_jobs(
    Status='IN_PROGRESS'|'FAILED'|'COMPLETED',
    JobNameContains='string',
    NextToken='string',
    MaxResults=123
)
type Status:

string

param Status:

When specified, returns only transcription jobs with the specified status.

type JobNameContains:

string

param JobNameContains:

When specified, the jobs returned in the list are limited to jobs whose name contains the specified string.

type NextToken:

string

param NextToken:

If the result of the previous request to ListTranscriptionJobs was truncated, include the NextToken to fetch the next set of jobs.

type MaxResults:

integer

param MaxResults:

The maximum number of jobs to return in the response. If there are fewer results in the list, this response contains only the actual results.

rtype:

dict

returns:

Response Syntax

{
    'Status': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'TranscriptionJobSummaries': [
        {
            'TranscriptionJobName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'CompletionTime': datetime(2015, 1, 1),
            'LanguageCode': 'en-US'|'es-US',
            'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
            'FailureReason': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • Status (string) --

      The requested status of the jobs returned.

    • NextToken (string) --

      The ListTranscriptionJobs operation returns a page of jobs at a time. The maximum size of the page is set by the MaxResults parameter. If there are more jobs in the list than the page size, Amazon Transcribe returns the NextToken token. Include the token in the next request to the ListTranscriptionJobs operation to return the next page of jobs.

    • TranscriptionJobSummaries (list) --

      A list of objects containing summary information for a transcription job.

      • (dict) --

        Provides a summary of information about a transcription job.

        • TranscriptionJobName (string) --

          The name assigned to the transcription job when it was created.

        • CreationTime (datetime) --

          Timestamp of the date and time that the job was created.

        • CompletionTime (datetime) --

          Timestamp of the date and time that the job completed.

        • LanguageCode (string) --

          The language code for the input speech.

        • TranscriptionJobStatus (string) --

          The status of the transcription job. When the status is COMPLETED, use the GetTranscriptionJob operation to get the results of the transcription.

        • FailureReason (string) --

          If the TranscriptionJobStatus field is FAILED, this field contains a description of the error.
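One way to use the Status and JobNameContains filters together is to collect the failure reasons of all failed jobs whose names share a substring. The sketch below is illustrative only: the helper name failed_job_reasons is hypothetical, and client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
def failed_job_reasons(client, name_contains=None, page_size=None):
    """Map each FAILED job name to its FailureReason, across all pages.

    Hypothetical helper built on list_transcription_jobs.
    """
    kwargs = {"Status": "FAILED"}
    if name_contains is not None:
        kwargs["JobNameContains"] = name_contains
    if page_size is not None:
        kwargs["MaxResults"] = page_size

    reasons = {}
    while True:
        page = client.list_transcription_jobs(**kwargs)
        for summary in page.get("TranscriptionJobSummaries", []):
            name = summary["TranscriptionJobName"]
            reasons[name] = summary.get("FailureReason", "")
        token = page.get("NextToken")
        if not token:
            return reasons
        kwargs["NextToken"] = token
```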

StartTranscriptionJob (updated)
Changes (request, response)
Request
{'Settings': {'MaxSpeakerLabels': 'integer',
              'ShowSpeakerLabels': 'boolean',
              'VocabularyName': 'string'}}
Response
{'TranscriptionJob': {'Settings': {'MaxSpeakerLabels': 'integer',
                                   'ShowSpeakerLabels': 'boolean',
                                   'VocabularyName': 'string'}}}

Starts an asynchronous job to transcribe speech to text.

See also: AWS API Documentation

Request Syntax

client.start_transcription_job(
    TranscriptionJobName='string',
    LanguageCode='en-US'|'es-US',
    MediaSampleRateHertz=123,
    MediaFormat='mp3'|'mp4'|'wav'|'flac',
    Media={
        'MediaFileUri': 'string'
    },
    Settings={
        'VocabularyName': 'string',
        'ShowSpeakerLabels': True|False,
        'MaxSpeakerLabels': 123
    }
)
type TranscriptionJobName:

string

param TranscriptionJobName:

[REQUIRED]

The name of the job. The name must be unique within an AWS account.

type LanguageCode:

string

param LanguageCode:

[REQUIRED]

The language code for the language used in the input media file.

type MediaSampleRateHertz:

integer

param MediaSampleRateHertz:

The sample rate, in Hertz, of the audio track in the input media file.

type MediaFormat:

string

param MediaFormat:

[REQUIRED]

The format of the input media file.

type Media:

dict

param Media:

[REQUIRED]

An object that describes the input media for a transcription job.

  • MediaFileUri (string) --

    The S3 location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:

    https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

    For example:

    https://s3-us-east-1.amazonaws.com/examplebucket/example.mp4

    https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4

    For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.

type Settings:

dict

param Settings:

A Settings object that provides optional settings for a transcription job.

  • VocabularyName (string) --

    The name of a vocabulary to use when processing the transcription job.

  • ShowSpeakerLabels (boolean) --

    Determines whether the transcription job should use speaker recognition to identify different speakers in the input audio. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.

  • MaxSpeakerLabels (integer) --

    The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers will be identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
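ShowSpeakerLabels and MaxSpeakerLabels must be supplied together, so it can help to build the Settings dict in one place. The sketch below derives ShowSpeakerLabels from the presence of a speaker count so the two fields cannot disagree. The helper names build_settings and start_job are hypothetical, and client is assumed to be a Transcribe client such as boto3.client('transcribe').

```python
def build_settings(vocabulary_name=None, max_speaker_labels=None):
    """Build a Settings dict that keeps the speaker-label fields consistent."""
    settings = {}
    if vocabulary_name is not None:
        settings["VocabularyName"] = vocabulary_name
    if max_speaker_labels is not None:
        # The two speaker-label fields are only ever set as a pair.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speaker_labels
    return settings


def start_job(client, job_name, media_uri, media_format,
              language_code="en-US", sample_rate_hertz=None, settings=None):
    """Call start_transcription_job, omitting optional fields that are unset."""
    kwargs = {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": media_format,
        "Media": {"MediaFileUri": media_uri},
    }
    if sample_rate_hertz is not None:
        kwargs["MediaSampleRateHertz"] = sample_rate_hertz
    if settings:
        kwargs["Settings"] = settings
    return client.start_transcription_job(**kwargs)["TranscriptionJob"]
```

A call such as start_job(client, 'demo-job', media_uri, 'mp4', settings=build_settings(max_speaker_labels=2)) would then request speaker identification for up to two speakers.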

rtype:

dict

returns:

Response Syntax

{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'VocabularyName': 'string',
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123
        }
    }
}

Response Structure

  • (dict) --

    • TranscriptionJob (dict) --

      An object containing details of the asynchronous transcription job.

      • TranscriptionJobName (string) --

        A name to identify the transcription job.

      • TranscriptionJobStatus (string) --

        The status of the transcription job.

      • LanguageCode (string) --

        The language code for the input speech.

      • MediaSampleRateHertz (integer) --

        The sample rate, in Hertz, of the audio track in the input media file.

      • MediaFormat (string) --

        The format of the input media file.

      • Media (dict) --

        An object that describes the input media for a transcription job.

        • MediaFileUri (string) --

          The S3 location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:

          https://s3-<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>

          For example:

          https://s3-us-east-1.amazonaws.com/examplebucket/example.mp4

          https://s3-us-east-1.amazonaws.com/examplebucket/mediadocs/example.mp4

          For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.

      • Transcript (dict) --

        An object that describes the output of the transcription job.

        • TranscriptFileUri (string) --

          The S3 location where the transcription result is stored. Use this URI to access the results of the transcription job.

      • CreationTime (datetime) --

        Timestamp of the date and time that the job was created.

      • CompletionTime (datetime) --

        Timestamp of the date and time that the job completed.

      • FailureReason (string) --

        If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.

      • Settings (dict) --

        Optional settings for the transcription job.

        • VocabularyName (string) --

          The name of a vocabulary to use when processing the transcription job.

        • ShowSpeakerLabels (boolean) --

          Determines whether the transcription job should use speaker recognition to identify different speakers in the input audio. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.

        • MaxSpeakerLabels (integer) --

          The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers will be identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.