Amazon Voice ID

2022/05/25 - Amazon Voice ID - 4 updated api methods

Changes  Voice ID now automatically expires speakers that have not been accessed for enrollment, re-enrollment, or successful authentication for three years. The speaker APIs now return a "LastAccessedAt" time for speakers, and the EvaluateSession API returns the "SPEAKER_EXPIRED" authentication decision for EXPIRED speakers.
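
As a rough illustration of the expiry rule above, the following sketch estimates how long a speaker has before the three-year cutoff, given its LastAccessedAt value. The helper names, the 90-day warning window, and the approximation of three years as 3 * 365 days are assumptions made for this example; the changelog does not state the exact cutoff the service uses.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "three years" is approximated as 3 * 365 days. The exact
# cutoff applied by the service is not stated in the changelog.
EXPIRY_WINDOW = timedelta(days=3 * 365)


def time_until_expiry(last_accessed_at, now=None):
    """Return the approximate time left before a speaker would expire."""
    now = now or datetime.now(timezone.utc)
    return (last_accessed_at + EXPIRY_WINDOW) - now


def is_near_expiry(last_accessed_at, warn_within=timedelta(days=90), now=None):
    """True if the speaker is within `warn_within` of the approximate cutoff."""
    return time_until_expiry(last_accessed_at, now) <= warn_within
```

Because LastAccessedAt is documented as accurate to one hour, a result like this is best treated as an estimate rather than an exact deadline.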

DescribeSpeaker (updated)
Changes (response)
{'Speaker': {'LastAccessedAt': 'timestamp'}}

Describes the specified speaker.

See also: AWS API Documentation

Request Syntax

client.describe_speaker(
    DomainId='string',
    SpeakerId='string'
)
type DomainId

string

param DomainId

[REQUIRED]

The identifier of the domain that contains the speaker.

type SpeakerId

string

param SpeakerId

[REQUIRED]

The identifier of the speaker you are describing.

rtype

dict

returns

Response Syntax

{
    'Speaker': {
        'CreatedAt': datetime(2015, 1, 1),
        'CustomerSpeakerId': 'string',
        'DomainId': 'string',
        'GeneratedSpeakerId': 'string',
        'LastAccessedAt': datetime(2015, 1, 1),
        'Status': 'ENROLLED'|'EXPIRED'|'OPTED_OUT'|'PENDING',
        'UpdatedAt': datetime(2015, 1, 1)
    }
}

Response Structure

  • (dict) --

    • Speaker (dict) --

      Information about the specified speaker.

      • CreatedAt (datetime) --

        A timestamp showing when the speaker was created.

      • CustomerSpeakerId (string) --

        The client-provided identifier for the speaker.

      • DomainId (string) --

        The identifier of the domain that contains the speaker.

      • GeneratedSpeakerId (string) --

        The service-generated identifier for the speaker.

      • LastAccessedAt (datetime) --

        The timestamp when the speaker was last accessed for enrollment, re-enrollment or a successful authentication. This timestamp is accurate to one hour.

      • Status (string) --

        The current status of the speaker.

      • UpdatedAt (datetime) --

        A timestamp showing the speaker's last update.
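
A minimal sketch of reading the new field from a DescribeSpeaker response. The function name get_last_accessed is hypothetical, and the use of .get() assumes that a field may be absent from a response; the client argument is expected to be a Voice ID client.

```python
def get_last_accessed(client, domain_id, speaker_id):
    """Return (status, last_accessed_at) for a single speaker."""
    response = client.describe_speaker(DomainId=domain_id, SpeakerId=speaker_id)
    speaker = response["Speaker"]
    # Assumption: fields may be missing, so .get() is used to avoid a KeyError.
    return speaker.get("Status"), speaker.get("LastAccessedAt")
```

With a real client this might be called as get_last_accessed(boto3.client('voice-id'), domain_id, speaker_id), which requires AWS credentials and a configured region.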

EvaluateSession (updated)
Changes (response)
{'AuthenticationResult': {'Decision': {'SPEAKER_EXPIRED'}}}

Evaluates a specified session based on audio data accumulated during a streaming Amazon Connect Voice ID call.

See also: AWS API Documentation

Request Syntax

client.evaluate_session(
    DomainId='string',
    SessionNameOrId='string'
)
type DomainId

string

param DomainId

[REQUIRED]

The identifier of the domain where the session started.

type SessionNameOrId

string

param SessionNameOrId

[REQUIRED]

The session identifier, or name of the session, that you want to evaluate. In Voice ID integration, this is the Contact-Id.

rtype

dict

returns

Response Syntax

{
    'AuthenticationResult': {
        'AudioAggregationEndedAt': datetime(2015, 1, 1),
        'AudioAggregationStartedAt': datetime(2015, 1, 1),
        'AuthenticationResultId': 'string',
        'Configuration': {
            'AcceptanceThreshold': 123
        },
        'CustomerSpeakerId': 'string',
        'Decision': 'ACCEPT'|'REJECT'|'NOT_ENOUGH_SPEECH'|'SPEAKER_NOT_ENROLLED'|'SPEAKER_OPTED_OUT'|'SPEAKER_ID_NOT_PROVIDED'|'SPEAKER_EXPIRED',
        'GeneratedSpeakerId': 'string',
        'Score': 123
    },
    'DomainId': 'string',
    'FraudDetectionResult': {
        'AudioAggregationEndedAt': datetime(2015, 1, 1),
        'AudioAggregationStartedAt': datetime(2015, 1, 1),
        'Configuration': {
            'RiskThreshold': 123
        },
        'Decision': 'HIGH_RISK'|'LOW_RISK'|'NOT_ENOUGH_SPEECH',
        'FraudDetectionResultId': 'string',
        'Reasons': [
            'KNOWN_FRAUDSTER',
        ],
        'RiskDetails': {
            'KnownFraudsterRisk': {
                'GeneratedFraudsterId': 'string',
                'RiskScore': 123
            }
        }
    },
    'SessionId': 'string',
    'SessionName': 'string',
    'StreamingStatus': 'PENDING_CONFIGURATION'|'ONGOING'|'ENDED'
}

Response Structure

  • (dict) --

    • AuthenticationResult (dict) --

      Details resulting from the authentication process, such as authentication decision and authentication score.

      • AudioAggregationEndedAt (datetime) --

        A timestamp indicating when audio aggregation ended for this authentication result.

      • AudioAggregationStartedAt (datetime) --

        A timestamp indicating when audio aggregation started for this authentication result.

      • AuthenticationResultId (string) --

        The unique identifier for this authentication result. Because there can be multiple authentications for a given session, this field helps to identify if the returned result is from a previous streaming activity or a new result. Note that in the absence of any new streaming activity, AcceptanceThreshold changes, or SpeakerId changes, Voice ID always returns the cached authentication result for this API.

      • Configuration (dict) --

        The AuthenticationConfiguration used to generate this authentication result.

        • AcceptanceThreshold (integer) --

          The minimum threshold needed to successfully authenticate a speaker.

      • CustomerSpeakerId (string) --

        The client-provided identifier for the speaker whose authentication result is produced. Only present if a SpeakerId is provided for the session.

      • Decision (string) --

        The authentication decision produced by Voice ID, processed against the current session state and streamed audio of the speaker.

      • GeneratedSpeakerId (string) --

        The service-generated identifier for the speaker whose authentication result is produced.

      • Score (integer) --

        The authentication score for the speaker whose authentication result is produced. This value is only present if the authentication decision is either ACCEPT or REJECT.

    • DomainId (string) --

      The identifier of the domain containing the session.

    • FraudDetectionResult (dict) --

      Details resulting from the fraud detection process, such as fraud detection decision and risk score.

      • AudioAggregationEndedAt (datetime) --

        A timestamp indicating when audio aggregation ended for this fraud detection result.

      • AudioAggregationStartedAt (datetime) --

        A timestamp indicating when audio aggregation started for this fraud detection result.

      • Configuration (dict) --

        The FraudDetectionConfiguration used to generate this fraud detection result.

        • RiskThreshold (integer) --

          Threshold value for determining whether the speaker is a fraudster. If the detected risk score calculated by Voice ID is higher than the threshold, the speaker is considered a fraudster.

      • Decision (string) --

        The fraud detection decision produced by Voice ID, processed against the current session state and streamed audio of the speaker.

      • FraudDetectionResultId (string) --

        The unique identifier for this fraud detection result. Because there can be multiple fraud detections for a given session, this field helps to identify if the returned result is from a previous streaming activity or a new result. Note that in the absence of any new streaming activity or risk threshold changes, Voice ID always returns the cached fraud detection result for this API.

      • Reasons (list) --

        The reason the speaker was flagged by the fraud detection system. This is only populated if the fraud detection Decision is HIGH_RISK, and it has only one possible value: KNOWN_FRAUDSTER.

        • (string) --

      • RiskDetails (dict) --

        Details about each risk analyzed for this speaker.

        • KnownFraudsterRisk (dict) --

          The details resulting from 'Known Fraudster Risk' analysis of the speaker.

          • GeneratedFraudsterId (string) --

            The identifier of the fraudster that is the closest match to the speaker. If there are no fraudsters registered in a given domain, or if there are no fraudsters with a non-zero RiskScore, this value is null.

          • RiskScore (integer) --

            The score indicating the likelihood the speaker is a known fraudster.

    • SessionId (string) --

      The service-generated identifier of the session.

    • SessionName (string) --

      The client-provided name of the session.

    • StreamingStatus (string) --

      The current status of audio streaming for this session. This field is useful to infer next steps when the Authentication or Fraud Detection results are empty or the decision is NOT_ENOUGH_SPEECH. In this situation, if the StreamingStatus is ONGOING or PENDING_CONFIGURATION, it can mean that the client should call the API again later, once Voice ID has enough audio to produce a result. If the decision remains NOT_ENOUGH_SPEECH even after StreamingStatus is ENDED, it means that the previously streamed session did not have enough speech to perform evaluation, and a new streaming session is needed to try again.
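
The interplay between the authentication Decision and StreamingStatus described above can be sketched as a small decision function. The function name and the returned action labels are hypothetical. Treating SPEAKER_EXPIRED as a cue to re-enroll, and treating an empty result after streaming has ended like NOT_ENOUGH_SPEECH, are assumptions of this example rather than documented behavior.

```python
def next_step(response):
    """Map an EvaluateSession response to a hypothetical follow-up action."""
    auth = response.get("AuthenticationResult") or {}
    decision = auth.get("Decision")
    streaming = response.get("StreamingStatus")

    if decision in (None, "NOT_ENOUGH_SPEECH"):
        # More audio may still arrive while streaming has not ended.
        if streaming in ("ONGOING", "PENDING_CONFIGURATION"):
            return "retry_later"
        # Otherwise a new streaming session is needed to try again.
        return "start_new_stream"
    if decision == "SPEAKER_EXPIRED":
        # Assumption: an expired speaker is sent through enrollment again.
        return "re_enroll"
    if decision == "ACCEPT":
        return "authenticated"
    # REJECT, SPEAKER_NOT_ENROLLED, SPEAKER_OPTED_OUT, SPEAKER_ID_NOT_PROVIDED
    return "fallback_verification"
```

A caller would typically pass the dict returned by client.evaluate_session(DomainId=..., SessionNameOrId=...) straight into this function.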

ListSpeakers (updated)
Changes (response)
{'SpeakerSummaries': {'LastAccessedAt': 'timestamp'}}

Lists all speakers in a specified domain.

See also: AWS API Documentation

Request Syntax

client.list_speakers(
    DomainId='string',
    MaxResults=123,
    NextToken='string'
)
type DomainId

string

param DomainId

[REQUIRED]

The identifier of the domain.

type MaxResults

integer

param MaxResults

The maximum number of results that are returned per call. You can use NextToken to obtain further pages of results. The default is 100; the maximum allowed page size is also 100.

type NextToken

string

param NextToken

If NextToken is returned, there are more results available. The value of NextToken is a unique pagination token for each page. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours.

rtype

dict

returns

Response Syntax

{
    'NextToken': 'string',
    'SpeakerSummaries': [
        {
            'CreatedAt': datetime(2015, 1, 1),
            'CustomerSpeakerId': 'string',
            'DomainId': 'string',
            'GeneratedSpeakerId': 'string',
            'LastAccessedAt': datetime(2015, 1, 1),
            'Status': 'ENROLLED'|'EXPIRED'|'OPTED_OUT'|'PENDING',
            'UpdatedAt': datetime(2015, 1, 1)
        },
    ]
}

Response Structure

  • (dict) --

    • NextToken (string) --

      If NextToken is returned, there are more results available. The value of NextToken is a unique pagination token for each page. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours.

    • SpeakerSummaries (list) --

      A list containing details about each speaker in the Amazon Web Services account.

      • (dict) --

        Contains a summary of information about a speaker.

        • CreatedAt (datetime) --

          A timestamp showing the speaker's creation time.

        • CustomerSpeakerId (string) --

          The client-provided identifier for the speaker.

        • DomainId (string) --

          The identifier of the domain that contains the speaker.

        • GeneratedSpeakerId (string) --

          The service-generated identifier for the speaker.

        • LastAccessedAt (datetime) --

          The timestamp when the speaker was last accessed for enrollment, re-enrollment or a successful authentication. This timestamp is accurate to one hour.

        • Status (string) --

          The current status of the speaker.

        • UpdatedAt (datetime) --

          A timestamp showing the speaker's last update.
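
The NextToken pagination described above can be followed with a simple loop. This is a sketch: iter_speakers and expired_speakers are hypothetical helper names, and the client argument is expected to be a Voice ID client.

```python
def iter_speakers(client, domain_id, page_size=100):
    """Yield every speaker summary in a domain, following NextToken."""
    kwargs = {"DomainId": domain_id, "MaxResults": page_size}
    while True:
        page = client.list_speakers(**kwargs)
        yield from page.get("SpeakerSummaries", [])
        token = page.get("NextToken")
        if not token:
            break
        # Keep all other arguments unchanged; only the token is added.
        kwargs["NextToken"] = token


def expired_speakers(client, domain_id):
    """Collect the summaries whose Status is EXPIRED."""
    return [s for s in iter_speakers(client, domain_id) if s.get("Status") == "EXPIRED"]
```

The same loop could be used to inspect LastAccessedAt on each summary, for example to find speakers that have not been accessed recently.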

OptOutSpeaker (updated)
Changes (response)
{'Speaker': {'LastAccessedAt': 'timestamp'}}

Opts out a speaker from the Voice ID system. A speaker can be opted out regardless of whether or not they already exist in the system. If they don't yet exist, a new speaker is created in an opted out state. If they already exist, their existing status is overridden and they are opted out. Enrollment and evaluation authentication requests are rejected for opted out speakers, and opted out speakers have no voice embeddings stored in the system.

See also: AWS API Documentation

Request Syntax

client.opt_out_speaker(
    DomainId='string',
    SpeakerId='string'
)
type DomainId

string

param DomainId

[REQUIRED]

The identifier of the domain containing the speaker.

type SpeakerId

string

param SpeakerId

[REQUIRED]

The identifier of the speaker you want to opt out.

rtype

dict

returns

Response Syntax

{
    'Speaker': {
        'CreatedAt': datetime(2015, 1, 1),
        'CustomerSpeakerId': 'string',
        'DomainId': 'string',
        'GeneratedSpeakerId': 'string',
        'LastAccessedAt': datetime(2015, 1, 1),
        'Status': 'ENROLLED'|'EXPIRED'|'OPTED_OUT'|'PENDING',
        'UpdatedAt': datetime(2015, 1, 1)
    }
}

Response Structure

  • (dict) --

    • Speaker (dict) --

      Details about the opted-out speaker.

      • CreatedAt (datetime) --

        A timestamp showing when the speaker was created.

      • CustomerSpeakerId (string) --

        The client-provided identifier for the speaker.

      • DomainId (string) --

        The identifier of the domain that contains the speaker.

      • GeneratedSpeakerId (string) --

        The service-generated identifier for the speaker.

      • LastAccessedAt (datetime) --

        The timestamp when the speaker was last accessed for enrollment, re-enrollment or a successful authentication. This timestamp is accurate to one hour.

      • Status (string) --

        The current status of the speaker.

      • UpdatedAt (datetime) --

        A timestamp showing the speaker's last update.
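
A minimal sketch of opting out a speaker and checking the returned status. The helper name opt_out is hypothetical, and the check assumes that the returned Speaker carries the OPTED_OUT status, as the description of the response suggests.

```python
def opt_out(client, domain_id, speaker_id):
    """Opt a speaker out and report whether the response shows OPTED_OUT."""
    response = client.opt_out_speaker(DomainId=domain_id, SpeakerId=speaker_id)
    speaker = response.get("Speaker", {})
    # Assumption: a successful call returns the speaker with Status OPTED_OUT.
    return speaker.get("Status") == "OPTED_OUT"
```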