Amazon SageMaker Service

2021/02/22 - Amazon SageMaker Service - 4 updated api methods

Changes: Update the sagemaker client to the latest version

CreateEndpointConfig (updated)
Changes (request)
{'ProductionVariants': {'CoreDumpConfig': {'DestinationS3Uri': 'string',
                                           'KmsKeyId': 'string'}}}

Creates an endpoint configuration that Amazon SageMaker hosting services uses to deploy models. In the configuration, you identify one or more models, created using the CreateModel API, to deploy and the resources that you want Amazon SageMaker to provision. Then you call the CreateEndpoint API.

Note

Use this API if you want to use Amazon SageMaker hosting services to deploy models into production.

In the request, you define a ProductionVariant for each model that you want to deploy. Each ProductionVariant parameter also describes the resources that you want Amazon SageMaker to provision. This includes the number and type of ML compute instances to deploy.

If you are hosting multiple models, you also assign a VariantWeight to specify how much traffic you want to allocate to each model. For example, suppose that you want to host two models, A and B, and you assign traffic weight 2 for model A and 1 for model B. Amazon SageMaker distributes two-thirds of the traffic to Model A, and one-third to model B.
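
The arithmetic behind that example can be sketched in a few lines. The helper name traffic_shares is hypothetical and not part of the SDK; it only illustrates the weight-to-share ratio described above.

```python
from fractions import Fraction


def traffic_shares(weights):
    """Return each variant's traffic share: its weight divided by the sum of all weights."""
    # Convert through str() so that float weights such as 0.5 stay exact.
    exact = {name: Fraction(str(weight)) for name, weight in weights.items()}
    total = sum(exact.values())
    if total <= 0:
        raise ValueError("at least one variant weight must be positive")
    return {name: weight / total for name, weight in exact.items()}


# The example from the text: weight 2 for model A and weight 1 for model B.
shares = traffic_shares({"A": 2, "B": 1})
print(shares["A"], shares["B"])  # 2/3 1/3
```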

For an example that calls this method when deploying a model to Amazon SageMaker hosting services, see Deploy the Model to Amazon SageMaker Hosting Services (AWS SDK for Python (Boto 3)).

Note

When you call CreateEndpoint, a load call is made to DynamoDB to verify that your endpoint configuration exists. When you read data from a DynamoDB table supporting Eventually Consistent Reads, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If the dependent entities are not yet in DynamoDB, this causes a validation error. If you repeat your read request after a short time, the response should return the latest data. So retry logic is recommended to handle these possible issues. We also recommend that customers call DescribeEndpointConfig before calling CreateEndpoint to minimize the potential impact of a DynamoDB eventually consistent read.
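
One possible shape for the recommended retry logic is sketched below. The function name, the attempt count, and the delay are assumptions chosen for illustration, and the client is passed in so the sketch does not depend on any particular setup. With boto3, the client would typically come from boto3.client("sagemaker"), and retry_on would likely be narrowed to botocore.exceptions.ClientError.

```python
import time


def wait_for_endpoint_config(client, name, attempts=5, delay_seconds=2.0,
                             retry_on=(Exception,), sleep=time.sleep):
    """Call DescribeEndpointConfig until it succeeds or the attempts run out.

    `client` is any object with a describe_endpoint_config method,
    for example a boto3 SageMaker client.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return client.describe_endpoint_config(EndpointConfigName=name)
        except retry_on as error:
            last_error = error
            # Pause before the next read; skip the pause after the final attempt.
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise last_error
```

Once this call returns, the configuration has been read back successfully, and CreateEndpoint can be called with the same EndpointConfigName.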

See also: AWS API Documentation

Request Syntax

client.create_endpoint_config(
    EndpointConfigName='string',
    ProductionVariants=[
        {
            'VariantName': 'string',
            'ModelName': 'string',
            'InitialInstanceCount': 123,
            'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge',
            'InitialVariantWeight': ...,
            'AcceleratorType': 'ml.eia1.medium'|'ml.eia1.large'|'ml.eia1.xlarge'|'ml.eia2.medium'|'ml.eia2.large'|'ml.eia2.xlarge',
            'CoreDumpConfig': {
                'DestinationS3Uri': 'string',
                'KmsKeyId': 'string'
            }
        },
    ],
    DataCaptureConfig={
        'EnableCapture': True|False,
        'InitialSamplingPercentage': 123,
        'DestinationS3Uri': 'string',
        'KmsKeyId': 'string',
        'CaptureOptions': [
            {
                'CaptureMode': 'Input'|'Output'
            },
        ],
        'CaptureContentTypeHeader': {
            'CsvContentTypes': [
                'string',
            ],
            'JsonContentTypes': [
                'string',
            ]
        }
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ],
    KmsKeyId='string'
)
type EndpointConfigName

string

param EndpointConfigName

[REQUIRED]

The name of the endpoint configuration. You specify this name in a CreateEndpoint request.

type ProductionVariants

list

param ProductionVariants

[REQUIRED]

A list of ProductionVariant objects, one for each model that you want to host at this endpoint.

  • (dict) --

    Identifies a model that you want to host and the resources to deploy for hosting it. If you are deploying multiple models, tell Amazon SageMaker how to distribute traffic among the models by specifying variant weights.

    • VariantName (string) -- [REQUIRED]

      The name of the production variant.

    • ModelName (string) -- [REQUIRED]

      The name of the model that you want to host. This is the name that you specified when creating the model.

    • InitialInstanceCount (integer) -- [REQUIRED]

      Number of instances to launch initially.

    • InstanceType (string) -- [REQUIRED]

      The ML compute instance type.

    • InitialVariantWeight (float) --

      Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the VariantWeight to the sum of all VariantWeight values across all ProductionVariants. If unspecified, it defaults to 1.0.

    • AcceleratorType (string) --

      The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. For more information, see Using Elastic Inference in Amazon SageMaker.

    • CoreDumpConfig (dict) --

      Specifies configuration for a core dump from the model container when the process crashes.

      • DestinationS3Uri (string) -- [REQUIRED]

        The Amazon S3 bucket to send the core dump to.

      • KmsKeyId (string) --

        The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the core dump data at rest using Amazon S3 server-side encryption. The KmsKeyId can be any of the following formats:

        • KMS Key ID: "1234abcd-12ab-34cd-56ef-1234567890ab"

        • Amazon Resource Name (ARN) of a KMS Key: "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

        • KMS Key Alias: "alias/ExampleAlias"

        • Amazon Resource Name (ARN) of a KMS Key Alias: "arn:aws:kms:us-west-2:111122223333:alias/ExampleAlias"

        If you use a KMS key ID or an alias of your master key, the Amazon SageMaker execution role must include permissions to call kms:Encrypt . If you don't provide a KMS key ID, Amazon SageMaker uses the default KMS key for Amazon S3 for your role's account. Amazon SageMaker uses server-side encryption with KMS-managed keys for OutputDataConfig . If you use a bucket policy with an s3:PutObject permission that only allows objects with server-side encryption, set the condition key of s3:x-amz-server-side-encryption to "aws:kms" . For more information, see KMS-Managed Encryption Keys in the Amazon Simple Storage Service Developer Guide.

        The KMS key policy must grant permission to the IAM role that you specify in your CreateEndpoint and UpdateEndpoint requests. For more information, see Using Key Policies in AWS KMS in the AWS Key Management Service Developer Guide .
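
As a rough illustration of the new CoreDumpConfig member, the sketch below assembles one ProductionVariants entry as a plain dict. The helper name, the bucket, the model name, and the alias are placeholders, not values from this changelog. Only the dictionary keys come from the Request Syntax above.

```python
def production_variant(variant_name, model_name, instance_type, instance_count,
                       core_dump_s3_uri=None, core_dump_kms_key_id=None):
    """Build one ProductionVariants entry, adding CoreDumpConfig only when requested."""
    variant = {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InitialInstanceCount": instance_count,
        "InstanceType": instance_type,
    }
    if core_dump_s3_uri is not None:
        # DestinationS3Uri is the only required member of CoreDumpConfig.
        core_dump = {"DestinationS3Uri": core_dump_s3_uri}
        if core_dump_kms_key_id is not None:
            core_dump["KmsKeyId"] = core_dump_kms_key_id
        variant["CoreDumpConfig"] = core_dump
    elif core_dump_kms_key_id is not None:
        raise ValueError("a core dump KMS key needs a DestinationS3Uri as well")
    return variant


# Placeholder values for illustration only.
variant = production_variant(
    "variant-1", "example-model", "ml.m5.large", 1,
    core_dump_s3_uri="s3://example-bucket/core-dumps/",
    core_dump_kms_key_id="alias/ExampleAlias",
)
```

The resulting dict could then be passed in the ProductionVariants list of a create_endpoint_config call.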

type DataCaptureConfig

dict

param DataCaptureConfig
  • EnableCapture (boolean) --

  • InitialSamplingPercentage (integer) -- [REQUIRED]

  • DestinationS3Uri (string) -- [REQUIRED]

  • KmsKeyId (string) --

  • CaptureOptions (list) -- [REQUIRED]

    • (dict) --

      • CaptureMode (string) -- [REQUIRED]

  • CaptureContentTypeHeader (dict) --

    • CsvContentTypes (list) --

      • (string) --

    • JsonContentTypes (list) --

      • (string) --

type Tags

list

param Tags

An array of key-value pairs. You can use tags to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. For more information, see Tagging AWS Resources.

  • (dict) --

    Describes a tag.

    • Key (string) -- [REQUIRED]

      The tag key.

    • Value (string) -- [REQUIRED]

      The tag value.

type KmsKeyId

string

param KmsKeyId

The Amazon Resource Name (ARN) of an AWS Key Management Service key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.

The KmsKeyId can be any of the following formats:

  • Key ID: 1234abcd-12ab-34cd-56ef-1234567890ab

  • Key ARN: arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab

  • Alias name: alias/ExampleAlias

  • Alias name ARN: arn:aws:kms:us-west-2:111122223333:alias/ExampleAlias

The KMS key policy must grant permission to the IAM role that you specify in your CreateEndpoint and UpdateEndpoint requests. For more information, see Using Key Policies in AWS KMS in the AWS Key Management Service Developer Guide.

Note

Certain Nitro-based instances include local storage, dependent on the instance type. Local storage volumes are encrypted using a hardware module on the instance. You can't request a KmsKeyId when using an instance type with local storage. If any of the models that you specify in the ProductionVariants parameter use Nitro-based instances with local storage, do not specify a value for the KmsKeyId parameter. If you specify a value for KmsKeyId when using any Nitro-based instances with local storage, the call to CreateEndpointConfig fails.

For a list of instance types that support local instance storage, see Instance Store Volumes.

For more information about local instance storage encryption, see SSD Instance Store Volumes.
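
A client-side guard for the rule in the note above might look like the following sketch. The helper name is hypothetical, and the set of instance types with local storage must be supplied by the caller from the Instance Store Volumes list. The single entry used here is only an illustrative assumption.

```python
def endpoint_config_kwargs(name, variants, kms_key_id=None,
                           local_storage_types=frozenset()):
    """Build create_endpoint_config arguments, refusing KmsKeyId with local-storage instances."""
    kwargs = {"EndpointConfigName": name, "ProductionVariants": list(variants)}
    if kms_key_id is not None:
        conflicting = sorted(
            v["InstanceType"] for v in variants
            if v["InstanceType"] in local_storage_types
        )
        if conflicting:
            # Fail early instead of letting the CreateEndpointConfig call fail.
            raise ValueError(
                "KmsKeyId cannot be combined with local-storage instance types: "
                + ", ".join(conflicting)
            )
        kwargs["KmsKeyId"] = kms_key_id
    return kwargs


# Assumption for illustration: the caller has determined that this type has local storage.
LOCAL_STORAGE_TYPES = frozenset({"ml.m5d.large"})

kwargs = endpoint_config_kwargs(
    "example-config",
    [{"VariantName": "variant-1", "ModelName": "example-model",
      "InitialInstanceCount": 1, "InstanceType": "ml.m5.large"}],
    kms_key_id="alias/ExampleAlias",
    local_storage_types=LOCAL_STORAGE_TYPES,
)
```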

rtype

dict

returns

Response Syntax

{
    'EndpointConfigArn': 'string'
}

Response Structure

  • (dict) --

    • EndpointConfigArn (string) --

      The Amazon Resource Name (ARN) of the endpoint configuration.

CreateModel (updated)
Changes (request)
{'InferenceExecutionConfig': {'Mode': 'Serial | Direct'}}

Creates a model in Amazon SageMaker. In the request, you name the model and describe a primary container. For the primary container, you specify the Docker image that contains inference code, artifacts (from prior training), and a custom environment map that the inference code uses when you deploy the model for predictions.

Use this API to create a model if you want to use Amazon SageMaker hosting services or run a batch transform job.

To host your model, you create an endpoint configuration with the CreateEndpointConfig API, and then create an endpoint with the CreateEndpoint API. Amazon SageMaker then deploys all of the containers that you defined for the model in the hosting environment.

For an example that calls this method when deploying a model to Amazon SageMaker hosting services, see Deploy the Model to Amazon SageMaker Hosting Services (AWS SDK for Python (Boto 3)).

To run a batch transform using your model, you start a job with the CreateTransformJob API. Amazon SageMaker uses your model and your dataset to get inferences, which are then saved to a specified S3 location.

In the CreateModel request, you must define a container with the PrimaryContainer parameter.

In the request, you also provide an IAM role that Amazon SageMaker can assume to access model artifacts and the Docker image for deployment on ML compute hosting instances or for batch transform jobs. You also use the IAM role to manage the permissions that the inference code needs. For example, if the inference code accesses any other AWS resources, you grant the necessary permissions via this role.

See also: AWS API Documentation

Request Syntax

client.create_model(
    ModelName='string',
    PrimaryContainer={
        'ContainerHostname': 'string',
        'Image': 'string',
        'ImageConfig': {
            'RepositoryAccessMode': 'Platform'|'Vpc'
        },
        'Mode': 'SingleModel'|'MultiModel',
        'ModelDataUrl': 'string',
        'Environment': {
            'string': 'string'
        },
        'ModelPackageName': 'string',
        'MultiModelConfig': {
            'ModelCacheSetting': 'Enabled'|'Disabled'
        }
    },
    Containers=[
        {
            'ContainerHostname': 'string',
            'Image': 'string',
            'ImageConfig': {
                'RepositoryAccessMode': 'Platform'|'Vpc'
            },
            'Mode': 'SingleModel'|'MultiModel',
            'ModelDataUrl': 'string',
            'Environment': {
                'string': 'string'
            },
            'ModelPackageName': 'string',
            'MultiModelConfig': {
                'ModelCacheSetting': 'Enabled'|'Disabled'
            }
        },
    ],
    InferenceExecutionConfig={
        'Mode': 'Serial'|'Direct'
    },
    ExecutionRoleArn='string',
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ],
    VpcConfig={
        'SecurityGroupIds': [
            'string',
        ],
        'Subnets': [
            'string',
        ]
    },
    EnableNetworkIsolation=True|False
)
type ModelName

string

param ModelName

[REQUIRED]

The name of the new model.

type PrimaryContainer

dict

param PrimaryContainer

The location of the primary Docker image containing inference code, associated artifacts, and custom environment map that the inference code uses when the model is deployed for predictions.

  • ContainerHostname (string) --

    This parameter is ignored for models that contain only a PrimaryContainer .

    When a ContainerDefinition is part of an inference pipeline, the value of the parameter uniquely identifies the container for the purposes of logging and metrics. For information, see Use Logs and Metrics to Monitor an Inference Pipeline. If you don't specify a value for this parameter for a ContainerDefinition that is part of an inference pipeline, a unique name is automatically assigned based on the position of the ContainerDefinition in the pipeline. If you specify a value for the ContainerHostName for any ContainerDefinition that is part of an inference pipeline, you must specify a value for the ContainerHostName parameter of every ContainerDefinition in that pipeline.

  • Image (string) --

    The path where inference code is stored. This can be either in Amazon EC2 Container Registry or in a Docker registry that is accessible from the same VPC that you configure for your endpoint. If you are using your own custom algorithm instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements. Amazon SageMaker supports both registry/repository[:tag] and registry/repository[@digest] image path formats. For more information, see Using Your Own Algorithms with Amazon SageMaker.

  • ImageConfig (dict) --

    Specifies whether the model container is in Amazon ECR or a private Docker registry accessible from your Amazon Virtual Private Cloud (VPC). For information about storing containers in a private Docker registry, see Use a Private Docker Registry for Real-Time Inference Containers.

    • RepositoryAccessMode (string) -- [REQUIRED]

      Set this to one of the following values:

      • Platform - The model image is hosted in Amazon ECR.

      • Vpc - The model image is hosted in a private Docker registry in your VPC.

  • Mode (string) --

    Whether the container hosts a single model or multiple models.

  • ModelDataUrl (string) --

    The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix). The S3 path is required for Amazon SageMaker built-in algorithms, but not if you use your own algorithms. For more information on built-in algorithms, see Common Parameters.

    Note

    The model artifacts must be in an S3 bucket that is in the same region as the model or endpoint you are creating.

    If you provide a value for this parameter, Amazon SageMaker uses AWS Security Token Service to download model artifacts from the S3 path you provide. AWS STS is activated in your IAM user account by default. If you previously deactivated AWS STS for a region, you need to reactivate AWS STS for that region. For more information, see Activating and Deactivating AWS STS in an AWS Region in the AWS Identity and Access Management User Guide .

    Warning

    If you use a built-in algorithm to create a model, Amazon SageMaker requires that you provide an S3 path to the model artifacts in ModelDataUrl.

  • Environment (dict) --

    The environment variables to set in the Docker container. Each key and value in the Environment string to string map can have length of up to 1024. We support up to 16 entries in the map.

    • (string) --

      • (string) --

  • ModelPackageName (string) --

    The name or Amazon Resource Name (ARN) of the model package to use to create the model.

  • MultiModelConfig (dict) --

    Specifies additional configuration for multi-model endpoints.

    • ModelCacheSetting (string) --

      Whether to cache models for a multi-model endpoint. By default, multi-model endpoints cache models so that a model does not have to be loaded into memory each time it is invoked. Some use cases do not benefit from model caching. For example, if an endpoint hosts a large number of models that are each invoked infrequently, the endpoint might perform better if you disable model caching. To disable model caching, set the value of this parameter to Disabled .
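
The Environment limits stated above (up to 16 entries, each key and value up to 1024 long) can be checked before the request is sent. This is a minimal sketch with a hypothetical helper name. It assumes the length limit counts characters rather than encoded bytes, which the text does not specify.

```python
MAX_ENTRIES = 16    # "up to 16 entries in the map"
MAX_LENGTH = 1024   # "length of up to 1024" for each key and value


def environment_problems(environment):
    """Return a list of limit violations for an Environment map (empty if none)."""
    problems = []
    if len(environment) > MAX_ENTRIES:
        problems.append(
            f"{len(environment)} entries exceeds the limit of {MAX_ENTRIES}"
        )
    for key, value in environment.items():
        # Assumption: lengths are measured in characters, not bytes.
        if len(key) > MAX_LENGTH:
            problems.append(f"a key starting with {key[:20]!r} is longer than {MAX_LENGTH}")
        if len(value) > MAX_LENGTH:
            problems.append(f"the value for a key starting with {key[:20]!r} is longer than {MAX_LENGTH}")
    return problems


print(environment_problems({"LOG_LEVEL": "info"}))  # []
```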

type Containers

list

param Containers

Specifies the containers in the inference pipeline.

  • (dict) --

    Describes the container, as part of model definition.

    • ContainerHostname (string) --

      This parameter is ignored for models that contain only a PrimaryContainer .

      When a ContainerDefinition is part of an inference pipeline, the value of the parameter uniquely identifies the container for the purposes of logging and metrics. For information, see Use Logs and Metrics to Monitor an Inference Pipeline. If you don't specify a value for this parameter for a ContainerDefinition that is part of an inference pipeline, a unique name is automatically assigned based on the position of the ContainerDefinition in the pipeline. If you specify a value for the ContainerHostName for any ContainerDefinition that is part of an inference pipeline, you must specify a value for the ContainerHostName parameter of every ContainerDefinition in that pipeline.

    • Image (string) --

      The path where inference code is stored. This can be either in Amazon EC2 Container Registry or in a Docker registry that is accessible from the same VPC that you configure for your endpoint. If you are using your own custom algorithm instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements. Amazon SageMaker supports both registry/repository[:tag] and registry/repository[@digest] image path formats. For more information, see Using Your Own Algorithms with Amazon SageMaker.

    • ImageConfig (dict) --

      Specifies whether the model container is in Amazon ECR or a private Docker registry accessible from your Amazon Virtual Private Cloud (VPC). For information about storing containers in a private Docker registry, see Use a Private Docker Registry for Real-Time Inference Containers.

      • RepositoryAccessMode (string) -- [REQUIRED]

        Set this to one of the following values:

        • Platform - The model image is hosted in Amazon ECR.

        • Vpc - The model image is hosted in a private Docker registry in your VPC.

    • Mode (string) --

      Whether the container hosts a single model or multiple models.

    • ModelDataUrl (string) --

      The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip compressed tar archive (.tar.gz suffix). The S3 path is required for Amazon SageMaker built-in algorithms, but not if you use your own algorithms. For more information on built-in algorithms, see Common Parameters.

      Note

      The model artifacts must be in an S3 bucket that is in the same region as the model or endpoint you are creating.

      If you provide a value for this parameter, Amazon SageMaker uses AWS Security Token Service to download model artifacts from the S3 path you provide. AWS STS is activated in your IAM user account by default. If you previously deactivated AWS STS for a region, you need to reactivate AWS STS for that region. For more information, see Activating and Deactivating AWS STS in an AWS Region in the AWS Identity and Access Management User Guide .

      Warning

      If you use a built-in algorithm to create a model, Amazon SageMaker requires that you provide an S3 path to the model artifacts in ModelDataUrl.

    • Environment (dict) --

      The environment variables to set in the Docker container. Each key and value in the Environment string to string map can have length of up to 1024. We support up to 16 entries in the map.

      • (string) --

        • (string) --

    • ModelPackageName (string) --

      The name or Amazon Resource Name (ARN) of the model package to use to create the model.

    • MultiModelConfig (dict) --

      Specifies additional configuration for multi-model endpoints.

      • ModelCacheSetting (string) --

        Whether to cache models for a multi-model endpoint. By default, multi-model endpoints cache models so that a model does not have to be loaded into memory each time it is invoked. Some use cases do not benefit from model caching. For example, if an endpoint hosts a large number of models that are each invoked infrequently, the endpoint might perform better if you disable model caching. To disable model caching, set the value of this parameter to Disabled .
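
The ContainerHostname rule described above for inference pipelines (set it on every container or on none) lends itself to a quick pre-flight check. The helper below is a hypothetical sketch, and the hostnames and image strings are placeholders.

```python
def hostnames_all_or_none(containers):
    """Return True when every container sets ContainerHostname, or none of them do."""
    has_hostname = [bool(c.get("ContainerHostname")) for c in containers]
    return all(has_hostname) or not any(has_hostname)


# Placeholder container definitions for illustration.
consistent = [
    {"ContainerHostname": "preprocess", "Image": "example-registry/preprocess:latest"},
    {"ContainerHostname": "predict", "Image": "example-registry/predict:latest"},
]
mixed = [
    {"ContainerHostname": "preprocess", "Image": "example-registry/preprocess:latest"},
    {"Image": "example-registry/predict:latest"},
]

print(hostnames_all_or_none(consistent))  # True
print(hostnames_all_or_none(mixed))       # False
```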

type InferenceExecutionConfig

dict

param InferenceExecutionConfig

Specifies details of how containers in a multi-container endpoint are called.

  • Mode (string) -- [REQUIRED]

    How containers in a multi-container endpoint are run. The following values are valid.

    • Serial - Containers run as a serial pipeline.

    • Direct - Only the individual container that you specify is run.
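
To show where the new InferenceExecutionConfig parameter sits in a request, the sketch below builds create_model arguments as a plain dict. The helper name, role ARN, and image strings are placeholders. The mode strings follow the 'Serial'|'Direct' spelling shown in the Request Syntax.

```python
VALID_MODES = ("Serial", "Direct")  # spelling taken from the Request Syntax


def create_model_kwargs(model_name, execution_role_arn, containers, mode=None):
    """Build create_model arguments for a multi-container model."""
    kwargs = {
        "ModelName": model_name,
        "ExecutionRoleArn": execution_role_arn,
        "Containers": list(containers),
    }
    if mode is not None:
        if mode not in VALID_MODES:
            raise ValueError(f"Mode must be one of {VALID_MODES}, got {mode!r}")
        kwargs["InferenceExecutionConfig"] = {"Mode": mode}
    return kwargs


# Placeholder names, role, and images for illustration only.
model_kwargs = create_model_kwargs(
    "example-model",
    "arn:aws:iam::111122223333:role/ExampleRole",
    [
        {"ContainerHostname": "first", "Image": "example-registry/first:latest"},
        {"ContainerHostname": "second", "Image": "example-registry/second:latest"},
    ],
    mode="Direct",
)
```

With a SageMaker client in hand, the dict could be unpacked as client.create_model(**model_kwargs).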

type ExecutionRoleArn

string

param ExecutionRoleArn

[REQUIRED]

The Amazon Resource Name (ARN) of the IAM role that Amazon SageMaker can assume to access model artifacts and the Docker image for deployment on ML compute instances or for batch transform jobs. Deploying on ML compute instances is part of model hosting. For more information, see Amazon SageMaker Roles.

Note

To be able to pass this role to Amazon SageMaker, the caller of this API must have the iam:PassRole permission.

type Tags

list

param Tags

An array of key-value pairs. You can use tags to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. For more information, see Tagging AWS Resources.

  • (dict) --

    Describes a tag.

    • Key (string) -- [REQUIRED]

      The tag key.

    • Value (string) -- [REQUIRED]

      The tag value.

type VpcConfig

dict

param VpcConfig

A VpcConfig object that specifies the VPC that you want your model to connect to. Control access to and from your model container by configuring the VPC. VpcConfig is used in hosting services and in batch transform. For more information, see Protect Endpoints by Using an Amazon Virtual Private Cloud and Protect Data in Batch Transform Jobs by Using an Amazon Virtual Private Cloud.

  • SecurityGroupIds (list) -- [REQUIRED]

    The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

    • (string) --

  • Subnets (list) -- [REQUIRED]

    The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

    • (string) --

type EnableNetworkIsolation

boolean

param EnableNetworkIsolation

Isolates the model container. No inbound or outbound network calls can be made to or from the model container.

rtype

dict

returns

Response Syntax

{
    'ModelArn': 'string'
}

Response Structure

  • (dict) --

    • ModelArn (string) --

      The ARN of the model created in Amazon SageMaker.

DescribeEndpointConfig (updated)
Changes (response)
{'ProductionVariants': {'CoreDumpConfig': {'DestinationS3Uri': 'string',
                                           'KmsKeyId': 'string'}}}

Returns the description of an endpoint configuration created using the CreateEndpointConfig API.

See also: AWS API Documentation

Request Syntax

client.describe_endpoint_config(
    EndpointConfigName='string'
)
type EndpointConfigName

string

param EndpointConfigName

[REQUIRED]

The name of the endpoint configuration.

rtype

dict

returns

Response Syntax

{
    'EndpointConfigName': 'string',
    'EndpointConfigArn': 'string',
    'ProductionVariants': [
        {
            'VariantName': 'string',
            'ModelName': 'string',
            'InitialInstanceCount': 123,
            'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge',
            'InitialVariantWeight': ...,
            'AcceleratorType': 'ml.eia1.medium'|'ml.eia1.large'|'ml.eia1.xlarge'|'ml.eia2.medium'|'ml.eia2.large'|'ml.eia2.xlarge',
            'CoreDumpConfig': {
                'DestinationS3Uri': 'string',
                'KmsKeyId': 'string'
            }
        },
    ],
    'DataCaptureConfig': {
        'EnableCapture': True|False,
        'InitialSamplingPercentage': 123,
        'DestinationS3Uri': 'string',
        'KmsKeyId': 'string',
        'CaptureOptions': [
            {
                'CaptureMode': 'Input'|'Output'
            },
        ],
        'CaptureContentTypeHeader': {
            'CsvContentTypes': [
                'string',
            ],
            'JsonContentTypes': [
                'string',
            ]
        }
    },
    'KmsKeyId': 'string',
    'CreationTime': datetime(2015, 1, 1)
}

Response Structure

  • (dict) --

    • EndpointConfigName (string) --

      Name of the Amazon SageMaker endpoint configuration.

    • EndpointConfigArn (string) --

      The Amazon Resource Name (ARN) of the endpoint configuration.

    • ProductionVariants (list) --

      An array of ProductionVariant objects, one for each model that you want to host at this endpoint.

      • (dict) --

        Identifies a model that you want to host and the resources to deploy for hosting it. If you are deploying multiple models, tell Amazon SageMaker how to distribute traffic among the models by specifying variant weights.

        • VariantName (string) --

          The name of the production variant.

        • ModelName (string) --

          The name of the model that you want to host. This is the name that you specified when creating the model.

        • InitialInstanceCount (integer) --

          Number of instances to launch initially.

        • InstanceType (string) --

          The ML compute instance type.

        • InitialVariantWeight (float) --

          Determines initial traffic distribution among all of the models that you specify in the endpoint configuration. The traffic to a production variant is determined by the ratio of the VariantWeight to the sum of all VariantWeight values across all ProductionVariants. If unspecified, it defaults to 1.0.

        • AcceleratorType (string) --

          The size of the Elastic Inference (EI) instance to use for the production variant. EI instances provide on-demand GPU computing for inference. For more information, see Using Elastic Inference in Amazon SageMaker.

        • CoreDumpConfig (dict) --

          Specifies configuration for a core dump from the model container when the process crashes.

          • DestinationS3Uri (string) --

            The Amazon S3 bucket to send the core dump to.

          • KmsKeyId (string) --

            The AWS Key Management Service (AWS KMS) key that Amazon SageMaker uses to encrypt the core dump data at rest using Amazon S3 server-side encryption. The KmsKeyId can be any of the following formats:

            • KMS Key ID: "1234abcd-12ab-34cd-56ef-1234567890ab"

            • Amazon Resource Name (ARN) of a KMS Key: "arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"

            • KMS Key Alias: "alias/ExampleAlias"

            • Amazon Resource Name (ARN) of a KMS Key Alias: "arn:aws:kms:us-west-2:111122223333:alias/ExampleAlias"

            If you use a KMS key ID or an alias of your master key, the Amazon SageMaker execution role must include permissions to call kms:Encrypt . If you don't provide a KMS key ID, Amazon SageMaker uses the default KMS key for Amazon S3 for your role's account. Amazon SageMaker uses server-side encryption with KMS-managed keys for OutputDataConfig . If you use a bucket policy with an s3:PutObject permission that only allows objects with server-side encryption, set the condition key of s3:x-amz-server-side-encryption to "aws:kms" . For more information, see KMS-Managed Encryption Keys in the Amazon Simple Storage Service Developer Guide.

            The KMS key policy must grant permission to the IAM role that you specify in your CreateEndpoint and UpdateEndpoint requests. For more information, see Using Key Policies in AWS KMS in the AWS Key Management Service Developer Guide .

    • DataCaptureConfig (dict) --

      • EnableCapture (boolean) --

      • InitialSamplingPercentage (integer) --

      • DestinationS3Uri (string) --

      • KmsKeyId (string) --

      • CaptureOptions (list) --

        • (dict) --

          • CaptureMode (string) --

      • CaptureContentTypeHeader (dict) --

        • CsvContentTypes (list) --

          • (string) --

        • JsonContentTypes (list) --

          • (string) --

    • KmsKeyId (string) --

      The AWS KMS key ID that Amazon SageMaker uses to encrypt data when storing it on the ML storage volume attached to the instance.

    • CreationTime (datetime) --

      A timestamp that shows when the endpoint configuration was created.
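
Reading the new CoreDumpConfig field back from a DescribeEndpointConfig response could look like the sketch below. The helper name and the sample response are hypothetical. The sample only mimics the shape given in the Response Syntax above.

```python
def core_dump_destinations(response):
    """Map each variant name to its core dump S3 URI, skipping variants without CoreDumpConfig."""
    return {
        variant["VariantName"]: variant["CoreDumpConfig"]["DestinationS3Uri"]
        for variant in response.get("ProductionVariants", [])
        if "CoreDumpConfig" in variant
    }


# Hand-written sample in the shape of the Response Syntax; not real output.
sample_response = {
    "EndpointConfigName": "example-config",
    "ProductionVariants": [
        {"VariantName": "variant-1", "ModelName": "model-a",
         "CoreDumpConfig": {"DestinationS3Uri": "s3://example-bucket/core-dumps/"}},
        {"VariantName": "variant-2", "ModelName": "model-b"},
    ],
}

print(core_dump_destinations(sample_response))
# {'variant-1': 's3://example-bucket/core-dumps/'}
```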

DescribeModel (updated)
Changes (response)
{'InferenceExecutionConfig': {'Mode': 'Serial | Direct'}}

Describes a model that you created using the CreateModel API.

See also: AWS API Documentation

Request Syntax

client.describe_model(
    ModelName='string'
)
type ModelName

string

param ModelName

[REQUIRED]

The name of the model.

rtype

dict

returns

Response Syntax

{
    'ModelName': 'string',
    'PrimaryContainer': {
        'ContainerHostname': 'string',
        'Image': 'string',
        'ImageConfig': {
            'RepositoryAccessMode': 'Platform'|'Vpc'
        },
        'Mode': 'SingleModel'|'MultiModel',
        'ModelDataUrl': 'string',
        'Environment': {
            'string': 'string'
        },
        'ModelPackageName': 'string',
        'MultiModelConfig': {
            'ModelCacheSetting': 'Enabled'|'Disabled'
        }
    },
    'Containers': [
        {
            'ContainerHostname': 'string',
            'Image': 'string',
            'ImageConfig': {
                'RepositoryAccessMode': 'Platform'|'Vpc'
            },
            'Mode': 'SingleModel'|'MultiModel',
            'ModelDataUrl': 'string',
            'Environment': {
                'string': 'string'
            },
            'ModelPackageName': 'string',
            'MultiModelConfig': {
                'ModelCacheSetting': 'Enabled'|'Disabled'
            }
        },
    ],
    'InferenceExecutionConfig': {
        'Mode': 'Serial'|'Direct'
    },
    'ExecutionRoleArn': 'string',
    'VpcConfig': {
        'SecurityGroupIds': [
            'string',
        ],
        'Subnets': [
            'string',
        ]
    },
    'CreationTime': datetime(2015, 1, 1),
    'ModelArn': 'string',
    'EnableNetworkIsolation': True|False
}
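The following sketch shows one way the request and response above might be used from Python. The wrapper names (fetch_model_description, container_definitions) are hypothetical, and the client is created lazily so that the helpers can also be exercised with a stub. It assumes that a model reports its containers either under Containers or under PrimaryContainer.

```python
def fetch_model_description(model_name, client=None):
    """Call DescribeModel and return the raw response dictionary."""
    if client is None:
        # Imported lazily so the helper can be tested without AWS access.
        import boto3

        client = boto3.client("sagemaker")
    return client.describe_model(ModelName=model_name)


def container_definitions(response):
    """Return the model's container definitions as a list.

    Prefers the Containers list (inference pipeline) and falls back to
    the single PrimaryContainer; returns an empty list if neither is set.
    """
    containers = response.get("Containers")
    if containers:
        return list(containers)
    primary = response.get("PrimaryContainer")
    return [primary] if primary else []
```

A caller could then iterate over container_definitions(fetch_model_description("my-model")), where "my-model" is a placeholder model name, and read fields such as Image or ModelDataUrl from each entry.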

Response Structure

  • (dict) --

    • ModelName (string) --

      Name of the Amazon SageMaker model.

    • PrimaryContainer (dict) --

      The location of the primary inference code, associated artifacts, and custom environment map that the inference code uses when it is deployed in production.

      • ContainerHostname (string) --

        This parameter is ignored for models that contain only a PrimaryContainer.

        When a ContainerDefinition is part of an inference pipeline, the value of the parameter uniquely identifies the container for the purposes of logging and metrics. For more information, see Use Logs and Metrics to Monitor an Inference Pipeline. If you don't specify a value for this parameter for a ContainerDefinition that is part of an inference pipeline, a unique name is automatically assigned based on the position of the ContainerDefinition in the pipeline. If you specify a value for ContainerHostname for any ContainerDefinition that is part of an inference pipeline, you must specify a value for the ContainerHostname parameter of every ContainerDefinition in that pipeline.

      • Image (string) --

        The path where inference code is stored. This can be either in Amazon Elastic Container Registry (Amazon ECR) or in a Docker registry that is accessible from the same VPC that you configure for your endpoint. If you are using your own custom algorithm instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements. Amazon SageMaker supports both registry/repository[:tag] and registry/repository[@digest] image path formats. For more information, see Using Your Own Algorithms with Amazon SageMaker.

      • ImageConfig (dict) --

        Specifies whether the model container is in Amazon ECR or a private Docker registry accessible from your Amazon Virtual Private Cloud (VPC). For information about storing containers in a private Docker registry, see Use a Private Docker Registry for Real-Time Inference Containers.

        • RepositoryAccessMode (string) --

          Set this to one of the following values:

          • Platform - The model image is hosted in Amazon ECR.

          • Vpc - The model image is hosted in a private Docker registry in your VPC.

      • Mode (string) --

        Whether the container hosts a single model or multiple models.

      • ModelDataUrl (string) --

        The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip-compressed tar archive (.tar.gz suffix). The S3 path is required for Amazon SageMaker built-in algorithms, but not if you use your own algorithms. For more information on built-in algorithms, see Common Parameters.

        Note

        The model artifacts must be in an S3 bucket that is in the same region as the model or endpoint you are creating.

        If you provide a value for this parameter, Amazon SageMaker uses AWS Security Token Service (AWS STS) to download model artifacts from the S3 path you provide. AWS STS is activated in your IAM user account by default. If you previously deactivated AWS STS for a region, you need to reactivate AWS STS for that region. For more information, see Activating and Deactivating AWS STS in an AWS Region in the AWS Identity and Access Management User Guide.

        Warning

        If you use a built-in algorithm to create a model, Amazon SageMaker requires that you provide an S3 path to the model artifacts in ModelDataUrl.

      • Environment (dict) --

        The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. We support up to 16 entries in the map.

        • (string) --

          • (string) --

      • ModelPackageName (string) --

        The name or Amazon Resource Name (ARN) of the model package to use to create the model.

      • MultiModelConfig (dict) --

        Specifies additional configuration for multi-model endpoints.

        • ModelCacheSetting (string) --

          Whether to cache models for a multi-model endpoint. By default, multi-model endpoints cache models so that a model does not have to be loaded into memory each time it is invoked. Some use cases do not benefit from model caching. For example, if an endpoint hosts a large number of models that are each invoked infrequently, the endpoint might perform better if you disable model caching. To disable model caching, set the value of this parameter to Disabled.
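The Image description names two path formats, registry/repository[:tag] and registry/repository[@digest]. The sketch below splits a path into those parts. The function name parse_image_path and the example paths are hypothetical, and the parsing rules are an assumption based only on the two formats quoted above; this is not how Amazon SageMaker itself validates the value.

```python
def parse_image_path(image):
    """Split an image path into repository, tag, and digest parts.

    Only the final path segment is searched for a ':tag' suffix, so a
    registry host with a port (host:5000/repo) is not mistaken for a tag.
    """
    name, at_sign, digest = image.partition("@")
    if at_sign:
        return {"repository": name, "tag": None, "digest": digest}

    head, slash, last_segment = image.rpartition("/")
    repo_name, colon, tag = last_segment.partition(":")
    if colon:
        return {"repository": head + slash + repo_name, "tag": tag, "digest": None}
    return {"repository": image, "tag": None, "digest": None}


# Hypothetical image paths, for illustration only.
print(parse_image_path("111122223333.dkr.ecr.us-west-2.amazonaws.com/example-repo:latest"))
print(parse_image_path("registry.example.com:5000/team/model@sha256:0123abcd"))
```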

    • Containers (list) --

      The containers in the inference pipeline.

      • (dict) --

        Describes the container, as part of model definition.

        • ContainerHostname (string) --

          This parameter is ignored for models that contain only a PrimaryContainer.

          When a ContainerDefinition is part of an inference pipeline, the value of the parameter uniquely identifies the container for the purposes of logging and metrics. For more information, see Use Logs and Metrics to Monitor an Inference Pipeline. If you don't specify a value for this parameter for a ContainerDefinition that is part of an inference pipeline, a unique name is automatically assigned based on the position of the ContainerDefinition in the pipeline. If you specify a value for ContainerHostname for any ContainerDefinition that is part of an inference pipeline, you must specify a value for the ContainerHostname parameter of every ContainerDefinition in that pipeline.

        • Image (string) --

          The path where inference code is stored. This can be either in Amazon Elastic Container Registry (Amazon ECR) or in a Docker registry that is accessible from the same VPC that you configure for your endpoint. If you are using your own custom algorithm instead of an algorithm provided by Amazon SageMaker, the inference code must meet Amazon SageMaker requirements. Amazon SageMaker supports both registry/repository[:tag] and registry/repository[@digest] image path formats. For more information, see Using Your Own Algorithms with Amazon SageMaker.

        • ImageConfig (dict) --

          Specifies whether the model container is in Amazon ECR or a private Docker registry accessible from your Amazon Virtual Private Cloud (VPC). For information about storing containers in a private Docker registry, see Use a Private Docker Registry for Real-Time Inference Containers.

          • RepositoryAccessMode (string) --

            Set this to one of the following values:

            • Platform - The model image is hosted in Amazon ECR.

            • Vpc - The model image is hosted in a private Docker registry in your VPC.

        • Mode (string) --

          Whether the container hosts a single model or multiple models.

        • ModelDataUrl (string) --

          The S3 path where the model artifacts, which result from model training, are stored. This path must point to a single gzip-compressed tar archive (.tar.gz suffix). The S3 path is required for Amazon SageMaker built-in algorithms, but not if you use your own algorithms. For more information on built-in algorithms, see Common Parameters.

          Note

          The model artifacts must be in an S3 bucket that is in the same region as the model or endpoint you are creating.

          If you provide a value for this parameter, Amazon SageMaker uses AWS Security Token Service (AWS STS) to download model artifacts from the S3 path you provide. AWS STS is activated in your IAM user account by default. If you previously deactivated AWS STS for a region, you need to reactivate AWS STS for that region. For more information, see Activating and Deactivating AWS STS in an AWS Region in the AWS Identity and Access Management User Guide.

          Warning

          If you use a built-in algorithm to create a model, Amazon SageMaker requires that you provide an S3 path to the model artifacts in ModelDataUrl.

        • Environment (dict) --

          The environment variables to set in the Docker container. Each key and value in the Environment string-to-string map can have a length of up to 1024. We support up to 16 entries in the map.

          • (string) --

            • (string) --

        • ModelPackageName (string) --

          The name or Amazon Resource Name (ARN) of the model package to use to create the model.

        • MultiModelConfig (dict) --

          Specifies additional configuration for multi-model endpoints.

          • ModelCacheSetting (string) --

            Whether to cache models for a multi-model endpoint. By default, multi-model endpoints cache models so that a model does not have to be loaded into memory each time it is invoked. Some use cases do not benefit from model caching. For example, if an endpoint hosts a large number of models that are each invoked infrequently, the endpoint might perform better if you disable model caching. To disable model caching, set the value of this parameter to Disabled.
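Two constraints in the container descriptions above lend themselves to a local check: the hostname rule for inference pipelines (set on every container or on none) and the Environment limits (up to 16 entries, keys and values up to 1024 in length). The sketch below is a hypothetical pre-flight helper; the name container_problems is not part of the SDK, and measuring length in characters is an assumption.

```python
MAX_ENV_ENTRIES = 16    # "up to 16 entries in the map"
MAX_ENV_LENGTH = 1024   # assumed to be counted in characters


def container_problems(containers):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []

    # Pipeline rule: a hostname on any container requires one on all of them.
    named = [c for c in containers if c.get("ContainerHostname")]
    if len(containers) > 1 and named and len(named) != len(containers):
        problems.append("ContainerHostname must be set on every container or on none")

    for index, container in enumerate(containers):
        env = container.get("Environment", {})
        if len(env) > MAX_ENV_ENTRIES:
            problems.append(
                f"container {index}: more than {MAX_ENV_ENTRIES} environment entries"
            )
        for key, value in env.items():
            if len(key) > MAX_ENV_LENGTH or len(value) > MAX_ENV_LENGTH:
                problems.append(
                    f"container {index}: entry {key[:20]!r} exceeds {MAX_ENV_LENGTH}"
                )
    return problems
```

Running such a check on the container list before a create call may surface these issues earlier than a service-side validation error would, although the service remains the authority on what is accepted.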

    • InferenceExecutionConfig (dict) --

      Specifies details of how containers in a multi-container endpoint are called.

      • Mode (string) --

        How containers in a multi-container endpoint are run. The following values are valid.

        • Serial - Containers run as a serial pipeline.

        • Direct - Only the individual container that you specify is run.

    • ExecutionRoleArn (string) --

      The Amazon Resource Name (ARN) of the IAM role that you specified for the model.

    • VpcConfig (dict) --

      A VpcConfig object that specifies the VPC that this model has access to. For more information, see Protect Endpoints by Using an Amazon Virtual Private Cloud.

      • SecurityGroupIds (list) --

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) --

        The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • CreationTime (datetime) --

      A timestamp that shows when the model was created.

    • ModelArn (string) --

      The Amazon Resource Name (ARN) of the model.

    • EnableNetworkIsolation (boolean) --

      If True, no inbound or outbound network calls can be made to or from the model container.
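As a rough illustration of how the top-level response fields fit together, the sketch below condenses a DescribeModel response dictionary into a short summary. The function name summarize_model and the sample values are hypothetical. Comparing Mode case-insensitively is a defensive assumption, and treating absent optional fields as empty or False is also an assumption rather than documented behavior.

```python
def summarize_model(response):
    """Condense a DescribeModel response dict into a flat summary."""
    vpc = response.get("VpcConfig") or {}
    mode = (response.get("InferenceExecutionConfig") or {}).get("Mode")
    return {
        "name": response.get("ModelName"),
        "arn": response.get("ModelArn"),
        "execution_mode": mode,
        # True only when the model reports direct container invocation.
        "direct_invocation": mode is not None and mode.lower() == "direct",
        "network_isolated": bool(response.get("EnableNetworkIsolation", False)),
        "subnets": list(vpc.get("Subnets", [])),
        "security_groups": list(vpc.get("SecurityGroupIds", [])),
    }


# Hypothetical response fragment, for illustration only.
sample_model = {
    "ModelName": "example-model",
    "ModelArn": "arn:aws:sagemaker:us-west-2:111122223333:model/example-model",
    "InferenceExecutionConfig": {"Mode": "Direct"},
    "VpcConfig": {"SecurityGroupIds": ["sg-xxxxxxxx"], "Subnets": ["subnet-xxxxxxxx"]},
    "EnableNetworkIsolation": True,
}

print(summarize_model(sample_model))
```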