Amazon Rekognition

2017/11/22 - Amazon Rekognition - 1 new and 6 updated API methods

Changes: This release includes updates to Amazon Rekognition for the following APIs. The new DetectText API allows you to recognize and extract textual content from images. Face Model Versioning has been added to operations that deal with face detection.

DetectText (new)

Detects text in the input image and converts it into machine-readable text.

Pass the input image as base64-encoded image bytes or as a reference to an image in an Amazon S3 bucket. If you use the AWS CLI to call Amazon Rekognition operations, you must pass it as a reference to an image in an Amazon S3 bucket. For the AWS CLI, passing image bytes is not supported. The image must be either a .png or .jpeg formatted file.

The DetectText operation returns text in an array of elements, TextDetections . Each TextDetection element provides information about a single word or line of text that was detected in the image.

A word is one or more ISO Basic Latin script characters that are not separated by spaces. DetectText can detect up to 50 words in an image.

A line is a string of equally spaced words. A line isn't necessarily a complete sentence. For example, a driver's license number is detected as a line. A line ends when there is no aligned text after it. Also, a line ends when there is a large gap between words, relative to the length of the words. This means, depending on the gap between words, Amazon Rekognition may detect multiple lines in text aligned in the same direction. Periods don't represent the end of a line. If a sentence spans multiple lines, the DetectText operation returns multiple lines.

To determine whether a TextDetection element is a line of text or a word, use the TextDetection object Type field.

To be detected, text must be within +/- 30 degrees orientation of the horizontal axis.

For more information, see text-detection.
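
As a minimal usage sketch (the bucket name and object key below are placeholders, not values from this release), you might call the operation and split the results by the Type field:

import boto3

client = boto3.client('rekognition')

# Placeholder bucket and key; substitute your own image location.
response = client.detect_text(
    Image={'S3Object': {'Bucket': 'my-bucket', 'Name': 'photo.jpg'}}
)

# Each TextDetection is either a whole LINE or a single WORD.
lines = [d for d in response['TextDetections'] if d['Type'] == 'LINE']
words = [d for d in response['TextDetections'] if d['Type'] == 'WORD']

for line in lines:
    print('Line %d: %s (%.1f%% confidence)'
          % (line['Id'], line['DetectedText'], line['Confidence']))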

See also: AWS API Documentation

Request Syntax

client.detect_text(
    Image={
        'Bytes': b'bytes',
        'S3Object': {
            'Bucket': 'string',
            'Name': 'string',
            'Version': 'string'
        }
    }
)
type Image

dict

param Image

[REQUIRED]

The input image as base64-encoded bytes or an Amazon S3 object. If you use the AWS CLI to call Amazon Rekognition operations, you can't pass image bytes.

  • Bytes (bytes) --

    Blob of image bytes up to 5 MBs.

  • S3Object (dict) --

    Identifies an S3 object as the image source.

    • Bucket (string) --

      Name of the S3 bucket.

    • Name (string) --

      S3 object key name.

    • Version (string) --

      If the bucket has versioning enabled, you can specify the object version.

rtype

dict

returns

Response Syntax

{
    'TextDetections': [
        {
            'DetectedText': 'string',
            'Type': 'LINE'|'WORD',
            'Id': 123,
            'ParentId': 123,
            'Confidence': ...,
            'Geometry': {
                'BoundingBox': {
                    'Width': ...,
                    'Height': ...,
                    'Left': ...,
                    'Top': ...
                },
                'Polygon': [
                    {
                        'X': ...,
                        'Y': ...
                    },
                ]
            }
        },
    ]
}

Response Structure

  • (dict) --

    • TextDetections (list) --

      An array of text that was detected in the input image.

      • (dict) --

        Information about a word or line of text detected by DetectText .

        The DetectedText field contains the text that Amazon Rekognition detected in the image.

        Every word and line has an identifier ( Id ). Each word belongs to a line and has a parent identifier ( ParentId ) that identifies the line of text in which the word appears. The word Id is also an index for the word within a line of words.

        For more information, see text-detection.

        • DetectedText (string) --

          The word or line of text recognized by Amazon Rekognition.

        • Type (string) --

          The type of text that was detected.

        • Id (integer) --

          The identifier for the detected text. The identifier is only unique for a single call to DetectText .

        • ParentId (integer) --

          The Parent identifier for the detected text identified by the value of ID . If the type of detected text is LINE , the value of ParentId is Null .

        • Confidence (float) --

          The confidence that Amazon Rekognition has in the accuracy of the detected text and the accuracy of the geometry points around the detected text.

        • Geometry (dict) --

          The location of the detected text on the image. Includes an axis aligned coarse bounding box surrounding the text and a finer grain polygon for more accurate spatial information.

          • BoundingBox (dict) --

            An axis-aligned coarse representation of the detected text's location on the image.

            • Width (float) --

              Width of the bounding box as a ratio of the overall image width.

            • Height (float) --

              Height of the bounding box as a ratio of the overall image height.

            • Left (float) --

              Left coordinate of the bounding box as a ratio of overall image width.

            • Top (float) --

              Top coordinate of the bounding box as a ratio of overall image height.

          • Polygon (list) --

            Within the bounding box, a fine-grained polygon around the detected text.

            • (dict) --

              The X and Y coordinates of a point on an image. The X and Y values returned are ratios of the overall image size. For example, if the input image is 700x200 and the operation returns X=0.5 and Y=0.25, then the point is at the (350,50) pixel coordinate on the image (see the conversion sketch after this response structure).

              An array of Point objects, Polygon , is returned by DetectText . Polygon represents a fine-grained polygon around detected text.

              • X (float) --

                The value of the X coordinate for a point on a Polygon .

              • Y (float) --

                The value of the Y coordinate for a point on a Polygon .
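
Because every geometry value is a ratio of the image dimensions, a small helper is enough to recover pixel coordinates. A sketch, continuing from the detect_text call above and assuming you know the image size (700x200 here, matching the example in the Point description):

def box_to_pixels(box, image_width, image_height):
    # BoundingBox values are ratios of the overall image dimensions.
    return {'Left': int(box['Left'] * image_width),
            'Top': int(box['Top'] * image_height),
            'Width': int(box['Width'] * image_width),
            'Height': int(box['Height'] * image_height)}

def polygon_to_pixels(polygon, image_width, image_height):
    # A Point of X=0.5, Y=0.25 on a 700x200 image is pixel (350, 50).
    return [(int(p['X'] * image_width), int(p['Y'] * image_height))
            for p in polygon]

for detection in response['TextDetections']:
    geometry = detection['Geometry']
    print(detection['DetectedText'],
          box_to_pixels(geometry['BoundingBox'], 700, 200),
          polygon_to_pixels(geometry['Polygon'], 700, 200))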

CreateCollection (updated)
Changes (response)
{'FaceModelVersion': 'string'}

Creates a collection in an AWS Region. You can add faces to the collection using the IndexFaces operation.

For example, you might create one collection for each of your application users. A user can then index faces using the IndexFaces operation and persist results in a specific collection. Then, a user can search the collection for faces in the user-specific container.

Note

Collection names are case-sensitive.

For an example, see example1.

This operation requires permissions to perform the rekognition:CreateCollection action.
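
A short sketch (the collection ID is a placeholder; note that collection names are case-sensitive):

import boto3

client = boto3.client('rekognition')

# 'my-faces' is a placeholder collection ID.
response = client.create_collection(CollectionId='my-faces')

print('Status code:', response['StatusCode'])
print('Collection ARN:', response['CollectionArn'])
# New in this release: the face model version bound to the collection.
print('Face model version:', response['FaceModelVersion'])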

See also: AWS API Documentation

Request Syntax

client.create_collection(
    CollectionId='string'
)
type CollectionId

string

param CollectionId

[REQUIRED]

ID for the collection that you are creating.

rtype

dict

returns

Response Syntax

{
    'StatusCode': 123,
    'CollectionArn': 'string',
    'FaceModelVersion': 'string'
}

Response Structure

  • (dict) --

    • StatusCode (integer) --

      HTTP status code indicating the result of the operation.

    • CollectionArn (string) --

      Amazon Resource Name (ARN) of the collection. You can use this to manage permissions on your resources.

    • FaceModelVersion (string) --

      Version number of the face detection model associated with the collection you are creating.

IndexFaces (updated)
Changes (response)
{'FaceModelVersion': 'string'}

Detects faces in the input image and adds them to the specified collection.

Amazon Rekognition does not save the actual faces detected. Instead, the underlying detection algorithm first detects the faces in the input image, extracts facial features into a feature vector for each face, and stores the vectors in the back-end database. Amazon Rekognition uses the feature vectors when performing face match and search operations using the SearchFaces and SearchFacesByImage operations.

If you are using version 1.0 of the face detection model, IndexFaces indexes the 15 largest faces in the input image. Later versions of the face detection model index the 100 largest faces in the input image. To determine which version of the model you are using, check the value of FaceModelVersion in the response from IndexFaces . For more information, see face-detection-model.

If you provide the optional ExternalImageId for the input image, Amazon Rekognition associates this ID with all faces that it detects. When you call the ListFaces operation, the response returns the external ID. You can use this external image ID to create a client-side index to associate the faces with each image. You can then use the index to find all faces in an image.

In response, the operation returns an array of metadata for all detected faces. This includes the bounding box of the detected face, a confidence value (indicating that the bounding box contains a face), a face ID assigned by the service for each face that is detected and stored, and an image ID assigned by the service for the input image. If you request all facial attributes (using the DetectionAttributes parameter), Amazon Rekognition returns detailed facial attributes such as facial landmarks (for example, location of eye and mouth) and other facial attributes such as gender. If you provide the same image, specify the same collection, and use the same external ID in the IndexFaces operation, Amazon Rekognition doesn't save duplicate face metadata.

The input image is passed either as base64-encoded image bytes or as a reference to an image in an Amazon S3 bucket. If you use the AWS CLI to call Amazon Rekognition operations, passing image bytes is not supported. The image must be either a PNG or JPEG formatted file.

For an example, see example2.

This operation requires permissions to perform the rekognition:IndexFaces action.
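
A sketch that indexes the faces from an S3 image into a placeholder collection, tags them with an external image ID, and requests all facial attributes:

import boto3

client = boto3.client('rekognition')

response = client.index_faces(
    CollectionId='my-faces',  # placeholder collection ID
    Image={'S3Object': {'Bucket': 'my-bucket',        # placeholder bucket
                        'Name': 'group-photo.jpg'}},  # placeholder key
    ExternalImageId='group-photo.jpg',  # your own client-side identifier
    DetectionAttributes=['ALL'],
)

# New in this release: which face detection model the collection uses.
print('Face model version:', response['FaceModelVersion'])

for record in response['FaceRecords']:
    face = record['Face']
    print(face['FaceId'], face['ExternalImageId'], face['Confidence'])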

See also: AWS API Documentation

Request Syntax

client.index_faces(
    CollectionId='string',
    Image={
        'Bytes': b'bytes',
        'S3Object': {
            'Bucket': 'string',
            'Name': 'string',
            'Version': 'string'
        }
    },
    ExternalImageId='string',
    DetectionAttributes=[
        'DEFAULT'|'ALL',
    ]
)
type CollectionId

string

param CollectionId

[REQUIRED]

The ID of an existing collection to which you want to add the faces that are detected in the input images.

type Image

dict

param Image

[REQUIRED]

The input image as base64-encoded bytes or an S3 object. If you use the AWS CLI to call Amazon Rekognition operations, passing base64-encoded image bytes is not supported.

  • Bytes (bytes) --

    Blob of image bytes up to 5 MBs.

  • S3Object (dict) --

    Identifies an S3 object as the image source.

    • Bucket (string) --

      Name of the S3 bucket.

    • Name (string) --

      S3 object key name.

    • Version (string) --

      If the bucket has versioning enabled, you can specify the object version.

type ExternalImageId

string

param ExternalImageId

ID you want to assign to all the faces detected in the image.

type DetectionAttributes

list

param DetectionAttributes

An array of facial attributes that you want to be returned. This can be the default list of attributes or all attributes. If you don't specify a value for Attributes or if you specify ["DEFAULT"] , the API returns the following subset of facial attributes: BoundingBox , Confidence , Pose , Quality and Landmarks . If you provide ["ALL"] , all facial attributes are returned but the operation will take longer to complete.

If you provide both, ["ALL", "DEFAULT"] , the service uses a logical AND operator to determine which attributes to return (in this case, all attributes).

  • (string) --

rtype

dict

returns

Response Syntax

{
    'FaceRecords': [
        {
            'Face': {
                'FaceId': 'string',
                'BoundingBox': {
                    'Width': ...,
                    'Height': ...,
                    'Left': ...,
                    'Top': ...
                },
                'ImageId': 'string',
                'ExternalImageId': 'string',
                'Confidence': ...
            },
            'FaceDetail': {
                'BoundingBox': {
                    'Width': ...,
                    'Height': ...,
                    'Left': ...,
                    'Top': ...
                },
                'AgeRange': {
                    'Low': 123,
                    'High': 123
                },
                'Smile': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'Eyeglasses': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'Sunglasses': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'Gender': {
                    'Value': 'Male'|'Female',
                    'Confidence': ...
                },
                'Beard': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'Mustache': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'EyesOpen': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'MouthOpen': {
                    'Value': True|False,
                    'Confidence': ...
                },
                'Emotions': [
                    {
                        'Type': 'HAPPY'|'SAD'|'ANGRY'|'CONFUSED'|'DISGUSTED'|'SURPRISED'|'CALM'|'UNKNOWN',
                        'Confidence': ...
                    },
                ],
                'Landmarks': [
                    {
                        'Type': 'eyeLeft'|'eyeRight'|'nose'|'mouthLeft'|'mouthRight'|'leftEyeBrowLeft'|'leftEyeBrowRight'|'leftEyeBrowUp'|'rightEyeBrowLeft'|'rightEyeBrowRight'|'rightEyeBrowUp'|'leftEyeLeft'|'leftEyeRight'|'leftEyeUp'|'leftEyeDown'|'rightEyeLeft'|'rightEyeRight'|'rightEyeUp'|'rightEyeDown'|'noseLeft'|'noseRight'|'mouthUp'|'mouthDown'|'leftPupil'|'rightPupil',
                        'X': ...,
                        'Y': ...
                    },
                ],
                'Pose': {
                    'Roll': ...,
                    'Yaw': ...,
                    'Pitch': ...
                },
                'Quality': {
                    'Brightness': ...,
                    'Sharpness': ...
                },
                'Confidence': ...
            }
        },
    ],
    'OrientationCorrection': 'ROTATE_0'|'ROTATE_90'|'ROTATE_180'|'ROTATE_270',
    'FaceModelVersion': 'string'
}

Response Structure

  • (dict) --

    • FaceRecords (list) --

      An array of faces detected and added to the collection. For more information, see howitworks-index-faces.

      • (dict) --

        Object containing both the face metadata (stored in the back-end database) and facial attributes that are detected but aren't stored in the database.

        • Face (dict) --

          Describes the face properties such as the bounding box, face ID, image ID of the input image, and external image ID that you assigned.

          • FaceId (string) --

            Unique identifier that Amazon Rekognition assigns to the face.

          • BoundingBox (dict) --

            Bounding box of the face.

            • Width (float) --

              Width of the bounding box as a ratio of the overall image width.

            • Height (float) --

              Height of the bounding box as a ratio of the overall image height.

            • Left (float) --

              Left coordinate of the bounding box as a ratio of overall image width.

            • Top (float) --

              Top coordinate of the bounding box as a ratio of overall image height.

          • ImageId (string) --

            Unique identifier that Amazon Rekognition assigns to the input image.

          • ExternalImageId (string) --

            Identifier that you assign to all the faces in the input image.

          • Confidence (float) --

            Confidence level that the bounding box contains a face (and not a different object such as a tree).

        • FaceDetail (dict) --

          Structure containing attributes of the face that the algorithm detected.

          • BoundingBox (dict) --

            Bounding box of the face.

            • Width (float) --

              Width of the bounding box as a ratio of the overall image width.

            • Height (float) --

              Height of the bounding box as a ratio of the overall image height.

            • Left (float) --

              Left coordinate of the bounding box as a ratio of overall image width.

            • Top (float) --

              Top coordinate of the bounding box as a ratio of overall image height.

          • AgeRange (dict) --

            The estimated age range, in years, for the face. Low represents the lowest estimated age and High represents the highest estimated age.

            • Low (integer) --

              The lowest estimated age.

            • High (integer) --

              The highest estimated age.

          • Smile (dict) --

            Indicates whether or not the face is smiling, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the face is smiling or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • Eyeglasses (dict) --

            Indicates whether or not the face is wearing eye glasses, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the face is wearing eye glasses or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • Sunglasses (dict) --

            Indicates whether or not the face is wearing sunglasses, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the face is wearing sunglasses or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • Gender (dict) --

            Gender of the face and the confidence level in the determination.

            • Value (string) --

              Gender of the face.

            • Confidence (float) --

              Level of confidence in the determination.

          • Beard (dict) --

            Indicates whether or not the face has a beard, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the face has beard or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • Mustache (dict) --

            Indicates whether or not the face has a mustache, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the face has mustache or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • EyesOpen (dict) --

            Indicates whether or not the eyes on the face are open, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the eyes on the face are open.

            • Confidence (float) --

              Level of confidence in the determination.

          • MouthOpen (dict) --

            Indicates whether or not the mouth on the face is open, and the confidence level in the determination.

            • Value (boolean) --

              Boolean value that indicates whether the mouth on the face is open or not.

            • Confidence (float) --

              Level of confidence in the determination.

          • Emotions (list) --

            The emotions detected on the face, and the confidence level in the determination. For example, HAPPY, SAD, and ANGRY.

            • (dict) --

              The emotions detected on the face, and the confidence level in the determination. For example, HAPPY, SAD, and ANGRY.

              • Type (string) --

                Type of emotion detected.

              • Confidence (float) --

                Level of confidence in the determination.

          • Landmarks (list) --

            Indicates the location of landmarks on the face.

            • (dict) --

              Indicates the location of the landmark on the face.

              • Type (string) --

                Type of the landmark.

              • X (float) --

                x-coordinate from the top left of the landmark expressed as the ratio of the width of the image. For example, if the image is 700x200 and the x-coordinate of the landmark is at 350 pixels, this value is 0.5.

              • Y (float) --

                y-coordinate from the top left of the landmark expressed as the ratio of the height of the image. For example, if the image is 700x200 and the y-coordinate of the landmark is at 100 pixels, this value is 0.5.

          • Pose (dict) --

            Indicates the pose of the face as determined by its pitch, roll, and yaw.

            • Roll (float) --

              Value representing the face rotation on the roll axis.

            • Yaw (float) --

              Value representing the face rotation on the yaw axis.

            • Pitch (float) --

              Value representing the face rotation on the pitch axis.

          • Quality (dict) --

            Identifies image brightness and sharpness.

            • Brightness (float) --

              Value representing brightness of the face. The service returns a value between 0 and 100 (inclusive). A higher value indicates a brighter face image.

            • Sharpness (float) --

              Value representing sharpness of the face. The service returns a value between 0 and 100 (inclusive). A higher value indicates a sharper face image.

          • Confidence (float) --

            Confidence level that the bounding box contains a face (and not a different object such as a tree).

    • OrientationCorrection (string) --

      The orientation of the input image (counterclockwise direction). If your application displays the image, you can use this value to correct image orientation (see the rotation sketch after this response structure). The bounding box coordinates returned in FaceRecords represent face locations before the image orientation is corrected.

      Note

      If the input image is in .jpeg format, it might contain exchangeable image file format (Exif) metadata. If so, and the Exif metadata populates the orientation field, the value of OrientationCorrection is null and the bounding box coordinates in FaceRecords represent face locations after Exif metadata is used to correct the image orientation. Images in .png format don't contain Exif metadata.

    • FaceModelVersion (string) --

      Version number of the face detection model associated with the input collection ( CollectionId ).
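
If you render the image yourself, one way to apply OrientationCorrection is to rotate the image by the reported angle. A minimal sketch using Pillow, which is an assumption (any imaging library would do); verifying the rotation direction against your display pipeline is advisable:

from PIL import Image  # Pillow is an assumption, not part of the SDK

# Map the API's enum to a counterclockwise angle in degrees.
ROTATIONS = {'ROTATE_0': 0, 'ROTATE_90': 90,
             'ROTATE_180': 180, 'ROTATE_270': 270}

def correct_orientation(path, orientation_correction):
    image = Image.open(path)
    # PIL's rotate() is counterclockwise, matching the API's convention;
    # expand=True resizes the canvas to fit the rotated image.
    angle = ROTATIONS.get(orientation_correction or 'ROTATE_0', 0)
    return image.rotate(angle, expand=True)

# e.g. upright = correct_orientation('group-photo.jpg',
#                                    response.get('OrientationCorrection'))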

ListCollections (updated)
Changes (response)
{'FaceModelVersions': ['string']}

Returns a list of collection IDs in your account. If the result is truncated, the response also provides a NextToken that you can use in the subsequent request to fetch the next set of collection IDs.

For an example, see example1.

This operation requires permissions to perform the rekognition:ListCollections action.
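
A pagination sketch that pairs each collection ID with its face model version:

import boto3

client = boto3.client('rekognition')

collections = []
kwargs = {'MaxResults': 10}
while True:
    response = client.list_collections(**kwargs)
    ids = response['CollectionIds']
    # FaceModelVersions[i] is the model version for CollectionIds[i].
    versions = response.get('FaceModelVersions', [None] * len(ids))
    collections.extend(zip(ids, versions))
    if 'NextToken' not in response:
        break
    kwargs['NextToken'] = response['NextToken']

for collection_id, model_version in collections:
    print(collection_id, model_version)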

See also: AWS API Documentation

Request Syntax

client.list_collections(
    NextToken='string',
    MaxResults=123
)
type NextToken

string

param NextToken

Pagination token from the previous response.

type MaxResults

integer

param MaxResults

Maximum number of collection IDs to return.

rtype

dict

returns

Response Syntax

{
    'CollectionIds': [
        'string',
    ],
    'NextToken': 'string',
    'FaceModelVersions': [
        'string',
    ]
}

Response Structure

  • (dict) --

    • CollectionIds (list) --

      An array of collection IDs.

      • (string) --

    • NextToken (string) --

      If the result is truncated, the response provides a NextToken that you can use in the subsequent request to fetch the next set of collection IDs.

    • FaceModelVersions (list) --

      Version numbers of the face detection models associated with the collections in the array CollectionIds . For example, the value of FaceModelVersions[2] is the version number for the face detection model used by the collection in CollectionId[2] .

      • (string) --

ListFaces (updated)
Changes (response)
{'FaceModelVersion': 'string'}

Returns metadata for faces in the specified collection. This metadata includes information such as the bounding box coordinates, the confidence (that the bounding box contains a face), and face ID. For an example, see example3.

This operation requires permissions to perform the rekognition:ListFaces action.
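
A pagination sketch (the collection ID is a placeholder):

import boto3

client = boto3.client('rekognition')

faces = []
kwargs = {'CollectionId': 'my-faces', 'MaxResults': 100}  # placeholder ID
while True:
    response = client.list_faces(**kwargs)
    faces.extend(response['Faces'])
    if 'NextToken' not in response:
        break
    kwargs['NextToken'] = response['NextToken']

print('Face model version:', response['FaceModelVersion'])
for face in faces:
    print(face['FaceId'], face.get('ExternalImageId'))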

See also: AWS API Documentation

Request Syntax

client.list_faces(
    CollectionId='string',
    NextToken='string',
    MaxResults=123
)
type CollectionId

string

param CollectionId

[REQUIRED]

ID of the collection from which to list the faces.

type NextToken

string

param NextToken

If the previous response was incomplete (because there is more data to retrieve), Amazon Rekognition returns a pagination token in the response. You can use this pagination token to retrieve the next set of faces.

type MaxResults

integer

param MaxResults

Maximum number of faces to return.

rtype

dict

returns

Response Syntax

{
    'Faces': [
        {
            'FaceId': 'string',
            'BoundingBox': {
                'Width': ...,
                'Height': ...,
                'Left': ...,
                'Top': ...
            },
            'ImageId': 'string',
            'ExternalImageId': 'string',
            'Confidence': ...
        },
    ],
    'NextToken': 'string',
    'FaceModelVersion': 'string'
}

Response Structure

  • (dict) --

    • Faces (list) --

      An array of Face objects.

      • (dict) --

        Describes the face properties such as the bounding box, face ID, image ID of the input image, and external image ID that you assigned.

        • FaceId (string) --

          Unique identifier that Amazon Rekognition assigns to the face.

        • BoundingBox (dict) --

          Bounding box of the face.

          • Width (float) --

            Width of the bounding box as a ratio of the overall image width.

          • Height (float) --

            Height of the bounding box as a ratio of the overall image height.

          • Left (float) --

            Left coordinate of the bounding box as a ratio of overall image width.

          • Top (float) --

            Top coordinate of the bounding box as a ratio of overall image height.

        • ImageId (string) --

          Unique identifier that Amazon Rekognition assigns to the input image.

        • ExternalImageId (string) --

          Identifier that you assign to all the faces in the input image.

        • Confidence (float) --

          Confidence level that the bounding box contains a face (and not a different object such as a tree).

    • NextToken (string) --

      If the response is truncated, Amazon Rekognition returns this token that you can use in the subsequent request to retrieve the next set of faces.

    • FaceModelVersion (string) --

      Version number of the face detection model associated with the input collection ( CollectionId ).

SearchFaces (updated)
Changes (response)
{'FaceModelVersion': 'string'}

For a given input face ID, searches for matching faces in the collection the face belongs to. You get a face ID when you add a face to the collection using the IndexFaces operation. The operation compares the features of the input face with faces in the specified collection.

Note

You can also search faces without indexing faces by using the SearchFacesByImage operation.

The operation response returns an array of faces that match, ordered by similarity score with the highest similarity first. More specifically, it is an array of metadata for each face match that is found. Along with the metadata, the response also includes a confidence value for each face match, indicating the confidence that the specific face matches the input face.

For an example, see example3.

This operation requires permissions to perform the rekognition:SearchFaces action.
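
A sketch, where the collection ID is a placeholder and the face ID is a value you previously obtained from IndexFaces:

import boto3

client = boto3.client('rekognition')

response = client.search_faces(
    CollectionId='my-faces',                        # placeholder collection
    FaceId='11111111-2222-3333-4444-555555555555',  # a FaceId from IndexFaces
    MaxFaces=10,
    FaceMatchThreshold=70,  # drop matches below 70% confidence
)

print('Searched face:', response['SearchedFaceId'])
for match in response['FaceMatches']:
    print('%s matched with %.1f%% similarity'
          % (match['Face']['FaceId'], match['Similarity']))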

See also: AWS API Documentation

Request Syntax

client.search_faces(
    CollectionId='string',
    FaceId='string',
    MaxFaces=123,
    FaceMatchThreshold=...
)
type CollectionId

string

param CollectionId

[REQUIRED]

ID of the collection the face belongs to.

type FaceId

string

param FaceId

[REQUIRED]

ID of a face to find matches for in the collection.

type MaxFaces

integer

param MaxFaces

Maximum number of faces to return. The operation returns the maximum number of faces with the highest confidence in the match.

type FaceMatchThreshold

float

param FaceMatchThreshold

Optional value specifying the minimum confidence in the face match to return. For example, don't return any matches where the confidence in the match is less than 70%.

rtype

dict

returns

Response Syntax

{
    'SearchedFaceId': 'string',
    'FaceMatches': [
        {
            'Similarity': ...,
            'Face': {
                'FaceId': 'string',
                'BoundingBox': {
                    'Width': ...,
                    'Height': ...,
                    'Left': ...,
                    'Top': ...
                },
                'ImageId': 'string',
                'ExternalImageId': 'string',
                'Confidence': ...
            }
        },
    ],
    'FaceModelVersion': 'string'
}

Response Structure

  • (dict) --

    • SearchedFaceId (string) --

      ID of the face that was searched for matches in a collection.

    • FaceMatches (list) --

      An array of faces that matched the input face, along with the confidence in the match.

      • (dict) --

        Provides face metadata. In addition, it also provides the confidence in the match of this face with the input face.

        • Similarity (float) --

          Confidence in the match of this face with the input face.

        • Face (dict) --

          Describes the face properties such as the bounding box, face ID, image ID of the source image, and external image ID that you assigned.

          • FaceId (string) --

            Unique identifier that Amazon Rekognition assigns to the face.

          • BoundingBox (dict) --

            Bounding box of the face.

            • Width (float) --

              Width of the bounding box as a ratio of the overall image width.

            • Height (float) --

              Height of the bounding box as a ratio of the overall image height.

            • Left (float) --

              Left coordinate of the bounding box as a ratio of overall image width.

            • Top (float) --

              Top coordinate of the bounding box as a ratio of overall image height.

          • ImageId (string) --

            Unique identifier that Amazon Rekognition assigns to the input image.

          • ExternalImageId (string) --

            Identifier that you assign to all the faces in the input image.

          • Confidence (float) --

            Confidence level that the bounding box contains a face (and not a different object such as a tree).

    • FaceModelVersion (string) --

      Version number of the face detection model associated with the input collection ( CollectionId ).

SearchFacesByImage (updated)
Changes (response)
{'FaceModelVersion': 'string'}

For a given input image, first detects the largest face in the image, and then searches the specified collection for matching faces. The operation compares the features of the input face with faces in the specified collection.

Note

To search for all faces in an input image, you might first call the IndexFaces operation, and then use the face IDs returned in subsequent calls to the SearchFaces operation.

You can also call the DetectFaces operation and use the bounding boxes in the response to make face crops, which you can then pass to the SearchFacesByImage operation.

You pass the input image either as base64-encoded image bytes or as a reference to an image in an Amazon S3 bucket. If you use the AWS CLI to call Amazon Rekognition operations, passing image bytes is not supported. The image must be either a PNG or JPEG formatted file.

The response returns an array of faces that match, ordered by similarity score with the highest similarity first. More specifically, it is an array of metadata for each face match found. Along with the metadata, the response also includes a similarity score indicating how similar the face is to the input face. In the response, the operation also returns the bounding box (and a confidence level that the bounding box contains a face) of the face that Amazon Rekognition used for the input image.

For an example, see example3.

This operation requires permissions to perform the rekognition:SearchFacesByImage action.
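
A sketch that passes local image bytes (which works from the SDK, though not from the AWS CLI); the file name and collection ID are placeholders:

import boto3

client = boto3.client('rekognition')

with open('query-face.jpg', 'rb') as f:  # placeholder file name
    image_bytes = f.read()

response = client.search_faces_by_image(
    CollectionId='my-faces',  # placeholder collection ID
    Image={'Bytes': image_bytes},
    FaceMatchThreshold=80,
)

# The operation searches with the largest face it found in the image.
print('Searched face confidence:', response['SearchedFaceConfidence'])
for match in response['FaceMatches']:
    print(match['Face']['FaceId'], match['Similarity'])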

See also: AWS API Documentation

Request Syntax

client.search_faces_by_image(
    CollectionId='string',
    Image={
        'Bytes': b'bytes',
        'S3Object': {
            'Bucket': 'string',
            'Name': 'string',
            'Version': 'string'
        }
    },
    MaxFaces=123,
    FaceMatchThreshold=...
)
type CollectionId

string

param CollectionId

[REQUIRED]

ID of the collection to search.

type Image

dict

param Image

[REQUIRED]

The input image as base64-encoded bytes or an S3 object. If you use the AWS CLI to call Amazon Rekognition operations, passing base64-encoded image bytes is not supported.

  • Bytes (bytes) --

    Blob of image bytes up to 5 MBs.

  • S3Object (dict) --

    Identifies an S3 object as the image source.

    • Bucket (string) --

      Name of the S3 bucket.

    • Name (string) --

      S3 object key name.

    • Version (string) --

      If the bucket has versioning enabled, you can specify the object version.

type MaxFaces

integer

param MaxFaces

Maximum number of faces to return. The operation returns the maximum number of faces with the highest confidence in the match.

type FaceMatchThreshold

float

param FaceMatchThreshold

(Optional) Specifies the minimum confidence in the face match to return. For example, don't return any matches where the confidence in the match is less than 70%.

rtype

dict

returns

Response Syntax

{
    'SearchedFaceBoundingBox': {
        'Width': ...,
        'Height': ...,
        'Left': ...,
        'Top': ...
    },
    'SearchedFaceConfidence': ...,
    'FaceMatches': [
        {
            'Similarity': ...,
            'Face': {
                'FaceId': 'string',
                'BoundingBox': {
                    'Width': ...,
                    'Height': ...,
                    'Left': ...,
                    'Top': ...
                },
                'ImageId': 'string',
                'ExternalImageId': 'string',
                'Confidence': ...
            }
        },
    ],
    'FaceModelVersion': 'string'
}

Response Structure

  • (dict) --

    • SearchedFaceBoundingBox (dict) --

      The bounding box around the face in the input image that Amazon Rekognition used for the search.

      • Width (float) --

        Width of the bounding box as a ratio of the overall image width.

      • Height (float) --

        Height of the bounding box as a ratio of the overall image height.

      • Left (float) --

        Left coordinate of the bounding box as a ratio of overall image width.

      • Top (float) --

        Top coordinate of the bounding box as a ratio of overall image height.

    • SearchedFaceConfidence (float) --

      The level of confidence that the SearchedFaceBoundingBox contains a face.

    • FaceMatches (list) --

      An array of faces that match the input face, along with the confidence in the match.

      • (dict) --

        Provides face metadata. In addition, it also provides the confidence in the match of this face with the input face.

        • Similarity (float) --

          Confidence in the match of this face with the input face.

        • Face (dict) --

          Describes the face properties such as the bounding box, face ID, image ID of the source image, and external image ID that you assigned.

          • FaceId (string) --

            Unique identifier that Amazon Rekognition assigns to the face.

          • BoundingBox (dict) --

            Bounding box of the face.

            • Width (float) --

              Width of the bounding box as a ratio of the overall image width.

            • Height (float) --

              Height of the bounding box as a ratio of the overall image height.

            • Left (float) --

              Left coordinate of the bounding box as a ratio of overall image width.

            • Top (float) --

              Top coordinate of the bounding box as a ratio of overall image height.

          • ImageId (string) --

            Unique identifier that Amazon Rekognition assigns to the input image.

          • ExternalImageId (string) --

            Identifier that you assign to all the faces in the input image.

          • Confidence (float) --

            Confidence level that the bounding box contains a face (and not a different object such as a tree).

    • FaceModelVersion (string) --

      Version number of the face detection model associated with the input collection ( CollectionId ).