Amazon SageMaker Service

2025/07/15 - Amazon SageMaker Service - 2 new, 9 updated API methods

Changes: This release adds support for a new Restricted instance group type to enable a specialized environment for running Nova customization jobs on SageMaker HyperPod clusters. This release also adds support for SageMaker pipeline versioning.

ListPipelineVersions (new)

Gets a list of all versions of the pipeline.

See also: AWS API Documentation

Request Syntax

client.list_pipeline_versions(
    PipelineName='string',
    CreatedAfter=datetime(2015, 1, 1),
    CreatedBefore=datetime(2015, 1, 1),
    SortOrder='Ascending'|'Descending',
    NextToken='string',
    MaxResults=123
)
type PipelineName:

string

param PipelineName:

[REQUIRED]

The Amazon Resource Name (ARN) of the pipeline.

type CreatedAfter:

datetime

param CreatedAfter:

A filter that returns the pipeline versions that were created after a specified time.

type CreatedBefore:

datetime

param CreatedBefore:

A filter that returns the pipeline versions that were created before a specified time.

type SortOrder:

string

param SortOrder:

The sort order for the results.

type NextToken:

string

param NextToken:

If the result of the previous ListPipelineVersions request was truncated, the response includes a NextToken. To retrieve the next set of pipeline versions, use this token in your next request.

type MaxResults:

integer

param MaxResults:

The maximum number of pipeline versions to return in the response.

rtype:

dict

returns:

Response Syntax

{
    'PipelineVersionSummaries': [
        {
            'PipelineArn': 'string',
            'PipelineVersionId': 123,
            'CreationTime': datetime(2015, 1, 1),
            'PipelineVersionDescription': 'string',
            'PipelineVersionDisplayName': 'string',
            'LastExecutionPipelineExecutionArn': 'string'
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • PipelineVersionSummaries (list) --

      Contains a sorted list of pipeline version summary objects matching the specified filters. Each version summary includes the pipeline version ID, the creation date, and the last pipeline execution created from that version. This list can be empty.

      • (dict) --

        The summary of the pipeline version.

        • PipelineArn (string) --

          The Amazon Resource Name (ARN) of the pipeline.

        • PipelineVersionId (integer) --

          The ID of the pipeline version.

        • CreationTime (datetime) --

          The creation time of the pipeline version.

        • PipelineVersionDescription (string) --

          The description of the pipeline version.

        • PipelineVersionDisplayName (string) --

          The display name of the pipeline version.

        • LastExecutionPipelineExecutionArn (string) --

          The Amazon Resource Name (ARN) of the most recent pipeline execution created from this pipeline version.

    • NextToken (string) --

      If the result of the previous ListPipelineVersions request was truncated, the response includes a NextToken. To retrieve the next set of pipeline versions, use this token in your next request.
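The NextToken handshake described above can be wrapped in a small generator. This is a minimal sketch, not part of the SDK: the helper name `iter_pipeline_versions` and the default page size of 50 are assumptions made for illustration, and `client` is expected to be a SageMaker client (for example one created with `boto3.client("sagemaker")` on an SDK version that includes this release).

```python
def iter_pipeline_versions(client, pipeline_name, created_after=None, page_size=50):
    """Yield every pipeline version summary, following NextToken until exhausted.

    `page_size` is an illustrative default, not a documented limit.
    """
    kwargs = {
        "PipelineName": pipeline_name,
        "SortOrder": "Descending",
        "MaxResults": page_size,
    }
    if created_after is not None:
        # Optional filter: only versions created after this datetime.
        kwargs["CreatedAfter"] = created_after

    while True:
        response = client.list_pipeline_versions(**kwargs)
        # The summary list can be empty, so fall back to [] instead of indexing.
        yield from response.get("PipelineVersionSummaries", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token back unchanged to fetch the next page.
        kwargs["NextToken"] = token
```

A caller could then loop with `for summary in iter_pipeline_versions(client, "my-pipeline"):` and read `summary["PipelineVersionId"]` from each item; the pipeline name here is a placeholder.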

UpdatePipelineVersion (new)

Updates a pipeline version.

See also: AWS API Documentation

Request Syntax

client.update_pipeline_version(
    PipelineArn='string',
    PipelineVersionId=123,
    PipelineVersionDisplayName='string',
    PipelineVersionDescription='string'
)
type PipelineArn:

string

param PipelineArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the pipeline.

type PipelineVersionId:

integer

param PipelineVersionId:

[REQUIRED]

The pipeline version ID to update.

type PipelineVersionDisplayName:

string

param PipelineVersionDisplayName:

The display name of the pipeline version.

type PipelineVersionDescription:

string

param PipelineVersionDescription:

The description of the pipeline version.

rtype:

dict

returns:

Response Syntax

{
    'PipelineArn': 'string',
    'PipelineVersionId': 123
}

Response Structure

  • (dict) --

    • PipelineArn (string) --

      The Amazon Resource Name (ARN) of the pipeline.

    • PipelineVersionId (integer) --

      The ID of the pipeline version.
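As a rough illustration of the request shape above, the sketch below sends only the optional fields the caller actually sets. The helper name `update_version_metadata` is hypothetical, and `client` is assumed to be a SageMaker client from an SDK version that includes this release.

```python
def update_version_metadata(client, pipeline_arn, version_id,
                            display_name=None, description=None):
    """Update the display name and/or description of one pipeline version.

    Returns the (PipelineArn, PipelineVersionId) pair echoed in the response.
    """
    # PipelineArn and PipelineVersionId are the two required parameters.
    kwargs = {"PipelineArn": pipeline_arn, "PipelineVersionId": version_id}
    # Omit optional parameters entirely rather than passing None.
    if display_name is not None:
        kwargs["PipelineVersionDisplayName"] = display_name
    if description is not None:
        kwargs["PipelineVersionDescription"] = description

    response = client.update_pipeline_version(**kwargs)
    return response["PipelineArn"], response["PipelineVersionId"]
```

Leaving unset fields out of the call avoids client-side parameter validation errors that passing `None` for a string parameter would trigger.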

CreateCluster (updated)
Changes (request)
{'RestrictedInstanceGroups': [{'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer',
                                                                         'SizeInGiB': 'integer'}},
                               'ExecutionRole': 'string',
                               'InstanceCount': 'integer',
                               'InstanceGroupName': 'string',
                               'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}],
                               'InstanceType': 'ml.p4d.24xlarge | '
                                               'ml.p4de.24xlarge | '
                                               'ml.p5.48xlarge | '
                                               'ml.trn1.32xlarge | '
                                               'ml.trn1n.32xlarge | '
                                               'ml.g5.xlarge | ml.g5.2xlarge | '
                                               'ml.g5.4xlarge | ml.g5.8xlarge '
                                               '| ml.g5.12xlarge | '
                                               'ml.g5.16xlarge | '
                                               'ml.g5.24xlarge | '
                                               'ml.g5.48xlarge | ml.c5.large | '
                                               'ml.c5.xlarge | ml.c5.2xlarge | '
                                               'ml.c5.4xlarge | ml.c5.9xlarge '
                                               '| ml.c5.12xlarge | '
                                               'ml.c5.18xlarge | '
                                               'ml.c5.24xlarge | ml.c5n.large '
                                               '| ml.c5n.2xlarge | '
                                               'ml.c5n.4xlarge | '
                                               'ml.c5n.9xlarge | '
                                               'ml.c5n.18xlarge | ml.m5.large '
                                               '| ml.m5.xlarge | ml.m5.2xlarge '
                                               '| ml.m5.4xlarge | '
                                               'ml.m5.8xlarge | ml.m5.12xlarge '
                                               '| ml.m5.16xlarge | '
                                               'ml.m5.24xlarge | ml.t3.medium '
                                               '| ml.t3.large | ml.t3.xlarge | '
                                               'ml.t3.2xlarge | ml.g6.xlarge | '
                                               'ml.g6.2xlarge | ml.g6.4xlarge '
                                               '| ml.g6.8xlarge | '
                                               'ml.g6.16xlarge | '
                                               'ml.g6.12xlarge | '
                                               'ml.g6.24xlarge | '
                                               'ml.g6.48xlarge | '
                                               'ml.gr6.4xlarge | '
                                               'ml.gr6.8xlarge | ml.g6e.xlarge '
                                               '| ml.g6e.2xlarge | '
                                               'ml.g6e.4xlarge | '
                                               'ml.g6e.8xlarge | '
                                               'ml.g6e.16xlarge | '
                                               'ml.g6e.12xlarge | '
                                               'ml.g6e.24xlarge | '
                                               'ml.g6e.48xlarge | '
                                               'ml.p5e.48xlarge | '
                                               'ml.p5en.48xlarge | '
                                               'ml.p6-b200.48xlarge | '
                                               'ml.trn2.48xlarge | '
                                               'ml.c6i.large | ml.c6i.xlarge | '
                                               'ml.c6i.2xlarge | '
                                               'ml.c6i.4xlarge | '
                                               'ml.c6i.8xlarge | '
                                               'ml.c6i.12xlarge | '
                                               'ml.c6i.16xlarge | '
                                               'ml.c6i.24xlarge | '
                                               'ml.c6i.32xlarge | ml.m6i.large '
                                               '| ml.m6i.xlarge | '
                                               'ml.m6i.2xlarge | '
                                               'ml.m6i.4xlarge | '
                                               'ml.m6i.8xlarge | '
                                               'ml.m6i.12xlarge | '
                                               'ml.m6i.16xlarge | '
                                               'ml.m6i.24xlarge | '
                                               'ml.m6i.32xlarge | ml.r6i.large '
                                               '| ml.r6i.xlarge | '
                                               'ml.r6i.2xlarge | '
                                               'ml.r6i.4xlarge | '
                                               'ml.r6i.8xlarge | '
                                               'ml.r6i.12xlarge | '
                                               'ml.r6i.16xlarge | '
                                               'ml.r6i.24xlarge | '
                                               'ml.r6i.32xlarge | '
                                               'ml.i3en.large | ml.i3en.xlarge '
                                               '| ml.i3en.2xlarge | '
                                               'ml.i3en.3xlarge | '
                                               'ml.i3en.6xlarge | '
                                               'ml.i3en.12xlarge | '
                                               'ml.i3en.24xlarge | '
                                               'ml.m7i.large | ml.m7i.xlarge | '
                                               'ml.m7i.2xlarge | '
                                               'ml.m7i.4xlarge | '
                                               'ml.m7i.8xlarge | '
                                               'ml.m7i.12xlarge | '
                                               'ml.m7i.16xlarge | '
                                               'ml.m7i.24xlarge | '
                                               'ml.m7i.48xlarge | ml.r7i.large '
                                               '| ml.r7i.xlarge | '
                                               'ml.r7i.2xlarge | '
                                               'ml.r7i.4xlarge | '
                                               'ml.r7i.8xlarge | '
                                               'ml.r7i.12xlarge | '
                                               'ml.r7i.16xlarge | '
                                               'ml.r7i.24xlarge | '
                                               'ml.r7i.48xlarge',
                               'OnStartDeepHealthChecks': ['InstanceStress | '
                                                           'InstanceConnectivity'],
                               'OverrideVpcConfig': {'SecurityGroupIds': ['string'],
                                                     'Subnets': ['string']},
                               'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}],
                                                                              'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                   '| '
                                                                                                                                   'CAPACITY_PERCENTAGE',
                                                                                                                           'Value': 'integer'},
                                                                                                      'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                           '| '
                                                                                                                                           'CAPACITY_PERCENTAGE',
                                                                                                                                   'Value': 'integer'}},
                                                                              'WaitIntervalInSeconds': 'integer'},
                                                         'ScheduleExpression': 'string'},
                               'ThreadsPerCore': 'integer',
                               'TrainingPlanArn': 'string'}]}

Creates a SageMaker HyperPod cluster. SageMaker HyperPod is a capability of SageMaker for creating and managing persistent clusters for developing large machine learning models, such as large language models (LLMs) and diffusion models. To learn more, see Amazon SageMaker HyperPod in the Amazon SageMaker Developer Guide.

See also: AWS API Documentation

Request Syntax

client.create_cluster(
    ClusterName='string',
    InstanceGroups=[
        {
            'InstanceCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'LifeCycleConfig': {
                'SourceS3Uri': 'string',
                'OnCreate': 'string'
            },
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'TrainingPlanArn': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            }
        },
    ],
    RestrictedInstanceGroups=[
        {
            'InstanceCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'TrainingPlanArn': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            },
            'EnvironmentConfig': {
                'FSxLustreConfig': {
                    'SizeInGiB': 123,
                    'PerUnitStorageThroughput': 123
                }
            }
        },
    ],
    VpcConfig={
        'SecurityGroupIds': [
            'string',
        ],
        'Subnets': [
            'string',
        ]
    },
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ],
    Orchestrator={
        'Eks': {
            'ClusterArn': 'string'
        }
    },
    NodeRecovery='Automatic'|'None'
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

The name for the new SageMaker HyperPod cluster.

type InstanceGroups:

list

param InstanceGroups:

The instance groups to be created in the SageMaker HyperPod cluster.

  • (dict) --

    The specifications of an instance group that you need to define.

    • InstanceCount (integer) -- [REQUIRED]

      Specifies the number of instances to add to the instance group of a SageMaker HyperPod cluster.

    • InstanceGroupName (string) -- [REQUIRED]

      Specifies the name of the instance group.

    • InstanceType (string) -- [REQUIRED]

      Specifies the instance type of the instance group.

    • LifeCycleConfig (dict) -- [REQUIRED]

      Specifies the LifeCycle configuration for the instance group.

      • SourceS3Uri (string) -- [REQUIRED]

        An Amazon S3 bucket path where your lifecycle scripts are stored.

      • OnCreate (string) -- [REQUIRED]

        The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.

    • ExecutionRole (string) -- [REQUIRED]

      Specifies an IAM execution role to be assumed by the instance group.

    • ThreadsPerCore (integer) --

      Specifies the value for Threads per core. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For instance types that don't support multithreading, specify 1. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

    • InstanceStorageConfigs (list) --

      Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.

      • (dict) --

        Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

        • EbsVolumeConfig (dict) --

          Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

          • VolumeSizeInGB (integer) -- [REQUIRED]

            The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

    • OnStartDeepHealthChecks (list) --

      A flag indicating whether deep health checks should be performed when the cluster instance group is created or updated.

      • (string) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group.

      For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

    • OverrideVpcConfig (dict) --

      To configure multi-AZ deployments, customize the Amazon VPC configuration at the instance group level. You can specify different subnets and security groups across different AZs in the instance group specification to override a SageMaker HyperPod cluster's default Amazon VPC configuration. For more information about deploying a cluster in multiple AZs, see Setting up SageMaker HyperPod clusters across multiple AZs.

      • SecurityGroupIds (list) -- [REQUIRED]

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) -- [REQUIRED]

        The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • ScheduledUpdateConfig (dict) --

      The configuration object of the schedule that SageMaker uses to update the AMI.

      • ScheduleExpression (string) -- [REQUIRED]

        A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

      • DeploymentConfig (dict) --

        The configuration to use when updating the AMI versions.

        • RollingUpdatePolicy (dict) --

          The policy that SageMaker uses when updating the AMI versions of the cluster.

          • MaximumBatchSize (dict) -- [REQUIRED]

            The maximum number of instances in the cluster that SageMaker can update at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

          • RollbackMaximumBatchSize (dict) --

            The maximum number of instances in the cluster that SageMaker can roll back at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

        • WaitIntervalInSeconds (integer) --

          The duration in seconds that SageMaker waits before updating more instances in the cluster.

        • AutoRollbackConfiguration (list) --

          An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

          • (dict) --

            The details of the alarm to monitor during the AMI update.

            • AlarmName (string) -- [REQUIRED]

              The name of the alarm.
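The nested ScheduledUpdateConfig structure described above is easy to get wrong by hand, so a small builder can help. This is a sketch under stated assumptions: the function name `build_scheduled_update_config` is hypothetical, it only assembles a plain dict (no AWS call is made), and the caller is responsible for supplying a schedule expression in whatever cron syntax the service accepts.

```python
_BATCH_TYPES = ("INSTANCE_COUNT", "CAPACITY_PERCENTAGE")


def build_scheduled_update_config(schedule_expression, batch_type, batch_value,
                                  wait_seconds=None, alarm_names=()):
    """Assemble a ScheduledUpdateConfig dict for an instance group."""
    if batch_type not in _BATCH_TYPES:
        raise ValueError(f"batch_type must be one of {_BATCH_TYPES}, got {batch_type!r}")

    deployment = {
        "RollingUpdatePolicy": {
            # MaximumBatchSize is required once RollingUpdatePolicy is present;
            # Type and Value are both required inside it.
            "MaximumBatchSize": {"Type": batch_type, "Value": batch_value},
        }
    }
    if wait_seconds is not None:
        deployment["WaitIntervalInSeconds"] = wait_seconds
    if alarm_names:
        # Each alarm entry is a dict with a required AlarmName key.
        deployment["AutoRollbackConfiguration"] = [
            {"AlarmName": name} for name in alarm_names
        ]

    return {
        "ScheduleExpression": schedule_expression,
        "DeploymentConfig": deployment,
    }
```

The returned dict can be placed under the `ScheduledUpdateConfig` key of an entry in `InstanceGroups`.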

type RestrictedInstanceGroups:

list

param RestrictedInstanceGroups:

The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.

  • (dict) --

    The specifications of a restricted instance group that you need to define.

    • InstanceCount (integer) -- [REQUIRED]

      Specifies the number of instances to add to the restricted instance group of a SageMaker HyperPod cluster.

    • InstanceGroupName (string) -- [REQUIRED]

      Specifies the name of the restricted instance group.

    • InstanceType (string) -- [REQUIRED]

      Specifies the instance type of the restricted instance group.

    • ExecutionRole (string) -- [REQUIRED]

      Specifies an IAM execution role to be assumed by the restricted instance group.

    • ThreadsPerCore (integer) --

      The number you specify for ThreadsPerCore in CreateCluster to enable or disable multithreading. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

    • InstanceStorageConfigs (list) --

      Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.

      • (dict) --

        Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

        • EbsVolumeConfig (dict) --

          Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

          • VolumeSizeInGB (integer) -- [REQUIRED]

            The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

    • OnStartDeepHealthChecks (list) --

      A flag indicating whether deep health checks should be performed when the cluster restricted instance group is created or updated.

      • (string) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan to filter clusters by. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

    • OverrideVpcConfig (dict) --

      Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.

      • SecurityGroupIds (list) -- [REQUIRED]

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) -- [REQUIRED]

        The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • ScheduledUpdateConfig (dict) --

      The configuration object of the schedule that SageMaker follows when updating the AMI.

      • ScheduleExpression (string) -- [REQUIRED]

        A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

      • DeploymentConfig (dict) --

        The configuration to use when updating the AMI versions.

        • RollingUpdatePolicy (dict) --

          The policy that SageMaker uses when updating the AMI versions of the cluster.

          • MaximumBatchSize (dict) -- [REQUIRED]

            The maximum number of instances in the cluster that SageMaker can update at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

          • RollbackMaximumBatchSize (dict) --

            The maximum amount of instances in the cluster that SageMaker can roll back at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

        • WaitIntervalInSeconds (integer) --

          The duration in seconds that SageMaker waits before updating more instances in the cluster.

        • AutoRollbackConfiguration (list) --

          An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

          • (dict) --

            The details of the alarm to monitor during the AMI update.

            • AlarmName (string) -- [REQUIRED]

              The name of the alarm.

    • EnvironmentConfig (dict) -- [REQUIRED]

      The configuration for the restricted instance groups (RIG) environment.

      • FSxLustreConfig (dict) --

        Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.

        • SizeInGiB (integer) -- [REQUIRED]

          The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).

        • PerUnitStorageThroughput (integer) -- [REQUIRED]

          The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
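
As a rough illustration of how the ScheduledUpdateConfig and EnvironmentConfig structures described above nest, the following Python sketch assembles them as plain dictionaries. The helper names (batch_size, scheduled_update_config, environment_config), the cron string, and every numeric value are placeholders invented for this example and are not taken from the API documentation. The resulting dictionaries would sit inside one entry of RestrictedInstanceGroups.

```python
VALID_BATCH_TYPES = ("INSTANCE_COUNT", "CAPACITY_PERCENTAGE")


def batch_size(batch_type, value):
    """Build a MaximumBatchSize or RollbackMaximumBatchSize structure."""
    if batch_type not in VALID_BATCH_TYPES:
        raise ValueError(f"Type must be one of {VALID_BATCH_TYPES}")
    return {"Type": batch_type, "Value": int(value)}


def scheduled_update_config(schedule_expression, max_batch,
                            rollback_batch=None, wait_seconds=None,
                            alarm_names=()):
    """Assemble ScheduledUpdateConfig; only ScheduleExpression and
    MaximumBatchSize are always emitted, the rest are optional."""
    rolling_policy = {"MaximumBatchSize": max_batch}
    if rollback_batch is not None:
        rolling_policy["RollbackMaximumBatchSize"] = rollback_batch

    deployment = {"RollingUpdatePolicy": rolling_policy}
    if wait_seconds is not None:
        deployment["WaitIntervalInSeconds"] = int(wait_seconds)
    if alarm_names:
        deployment["AutoRollbackConfiguration"] = [
            {"AlarmName": name} for name in alarm_names
        ]

    return {
        "ScheduleExpression": schedule_expression,
        "DeploymentConfig": deployment,
    }


def environment_config(size_in_gib, per_unit_storage_throughput):
    """Assemble EnvironmentConfig with its FSxLustreConfig member."""
    return {
        "FSxLustreConfig": {
            "SizeInGiB": int(size_in_gib),
            "PerUnitStorageThroughput": int(per_unit_storage_throughput),
        }
    }


# Placeholder values for illustration only.
example_update = scheduled_update_config(
    "cron(0 0 ? * SUN *)",                      # assumed cron syntax
    batch_size("CAPACITY_PERCENTAGE", 25),
    rollback_batch=batch_size("INSTANCE_COUNT", 1),
    wait_seconds=300,
    alarm_names=["example-alarm"],              # hypothetical alarm name
)
example_environment = environment_config(1200, 125)
```

Building the nested dictionaries through small helpers keeps the required members (Type, Value, ScheduleExpression, SizeInGiB, PerUnitStorageThroughput) in one place, so a missing key surfaces before the request is sent.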

type VpcConfig:

dict

param VpcConfig:

Specifies the Amazon Virtual Private Cloud (VPC) that is associated with the Amazon SageMaker HyperPod cluster. You can control access to and from your resources by configuring your VPC. For more information, see Give SageMaker access to resources in your Amazon VPC.

  • SecurityGroupIds (list) -- [REQUIRED]

    The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

    • (string) --

  • Subnets (list) -- [REQUIRED]

    The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

    • (string) --

type Tags:

list

param Tags:

Custom tags for managing the SageMaker HyperPod cluster as an Amazon Web Services resource. You can add tags to your cluster in the same way you add them in other Amazon Web Services services that support tagging. To learn more about tagging Amazon Web Services resources in general, see the Tagging Amazon Web Services Resources User Guide.

  • (dict) --

    A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources.

    You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags.

    For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources. For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy.

    • Key (string) -- [REQUIRED]

      The tag key. Tag keys must be unique per resource.

    • Value (string) -- [REQUIRED]

      The tag value.
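
The Tags parameter is a list of Key/Value dictionaries rather than a plain mapping. A minimal sketch of converting between the two forms is shown below; the helper names to_tag_list and find_duplicate_keys are hypothetical and exist only for this example.

```python
def to_tag_list(tags):
    """Convert {"key": "value"} into the [{"Key": ..., "Value": ...}] form."""
    return [{"Key": str(key), "Value": str(value)} for key, value in tags.items()]


def find_duplicate_keys(tag_list):
    """Return the sorted tag keys that appear more than once in tag_list.

    Tag keys must be unique per resource, so a non-empty result means
    the list should be corrected before it is sent.
    """
    seen = set()
    duplicates = set()
    for tag in tag_list:
        key = tag["Key"]
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return sorted(duplicates)


# Placeholder tag names and values for illustration only.
example_tags = to_tag_list({"team": "ml-platform", "env": "dev"})
```

Starting from a Python dict guarantees unique keys by construction; find_duplicate_keys is only useful when the tag list was assembled by hand or merged from several sources.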

type Orchestrator:

dict

param Orchestrator:

The type of orchestrator to use for the SageMaker HyperPod cluster. Currently, the only supported value is "eks", which uses an Amazon Elastic Kubernetes Service (EKS) cluster as the orchestrator.

  • Eks (dict) -- [REQUIRED]

    The Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.

    • ClusterArn (string) -- [REQUIRED]

      The Amazon Resource Name (ARN) of the Amazon EKS cluster associated with the SageMaker HyperPod cluster.

type NodeRecovery:

string

param NodeRecovery:

The node recovery mode for the SageMaker HyperPod cluster. When set to Automatic, SageMaker HyperPod will automatically reboot or replace faulty nodes when issues are detected. When set to None, cluster administrators will need to manually manage any faulty cluster instances.

rtype:

dict

returns:

Response Syntax

{
    'ClusterArn': 'string'
}

Response Structure

  • (dict) --

    • ClusterArn (string) --

      The Amazon Resource Name (ARN) of the cluster.
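
To show how the top-level VpcConfig, Tags, Orchestrator, and NodeRecovery parameters fit together, here is a hedged Python sketch that assembles keyword arguments for a create_cluster call. The function build_create_cluster_kwargs is hypothetical, the ARN and IDs are placeholders, and the sketch assumes ClusterName is the request member that names the cluster. The instance group definitions are omitted for brevity.

```python
VALID_NODE_RECOVERY = ("Automatic", "None")


def build_create_cluster_kwargs(cluster_name, eks_cluster_arn=None,
                                node_recovery="Automatic",
                                security_group_ids=None, subnets=None,
                                tags=None):
    """Assemble a subset of the create_cluster keyword arguments."""
    # NodeRecovery takes the string "None", not the Python None object.
    if node_recovery not in VALID_NODE_RECOVERY:
        raise ValueError(f"NodeRecovery must be one of {VALID_NODE_RECOVERY}")

    kwargs = {"ClusterName": cluster_name, "NodeRecovery": node_recovery}

    if eks_cluster_arn is not None:
        kwargs["Orchestrator"] = {"Eks": {"ClusterArn": eks_cluster_arn}}

    # Both members of VpcConfig are required when VpcConfig is present.
    if security_group_ids is not None or subnets is not None:
        if not security_group_ids or not subnets:
            raise ValueError("VpcConfig needs both SecurityGroupIds and Subnets")
        kwargs["VpcConfig"] = {
            "SecurityGroupIds": list(security_group_ids),
            "Subnets": list(subnets),
        }

    if tags:
        kwargs["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]

    return kwargs


# Placeholder identifiers for illustration only.
example_kwargs = build_create_cluster_kwargs(
    "example-cluster",
    eks_cluster_arn="arn:aws:eks:us-west-2:111122223333:cluster/example",
    node_recovery="None",
    security_group_ids=["sg-xxxxxxxx"],
    subnets=["subnet-xxxxxxxx"],
    tags={"env": "dev"},
)
# With boto3 installed and credentials configured, the call would look like:
#   response = boto3.client("sagemaker").create_cluster(**example_kwargs, ...)
#   cluster_arn = response["ClusterArn"]
```

Validating NodeRecovery up front guards against a common slip: passing Python's None where the API expects the literal string "None".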

DescribeCluster (updated)
Changes (response)
{'RestrictedInstanceGroups': [{'CurrentCount': 'integer',
                               'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer',
                                                                         'SizeInGiB': 'integer'},
                                                     'S3OutputPath': 'string'},
                               'ExecutionRole': 'string',
                               'InstanceGroupName': 'string',
                               'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}],
                               'InstanceType': 'ml.p4d.24xlarge | '
                                               'ml.p4de.24xlarge | '
                                               'ml.p5.48xlarge | '
                                               'ml.trn1.32xlarge | '
                                               'ml.trn1n.32xlarge | '
                                               'ml.g5.xlarge | ml.g5.2xlarge | '
                                               'ml.g5.4xlarge | ml.g5.8xlarge '
                                               '| ml.g5.12xlarge | '
                                               'ml.g5.16xlarge | '
                                               'ml.g5.24xlarge | '
                                               'ml.g5.48xlarge | ml.c5.large | '
                                               'ml.c5.xlarge | ml.c5.2xlarge | '
                                               'ml.c5.4xlarge | ml.c5.9xlarge '
                                               '| ml.c5.12xlarge | '
                                               'ml.c5.18xlarge | '
                                               'ml.c5.24xlarge | ml.c5n.large '
                                               '| ml.c5n.2xlarge | '
                                               'ml.c5n.4xlarge | '
                                               'ml.c5n.9xlarge | '
                                               'ml.c5n.18xlarge | ml.m5.large '
                                               '| ml.m5.xlarge | ml.m5.2xlarge '
                                               '| ml.m5.4xlarge | '
                                               'ml.m5.8xlarge | ml.m5.12xlarge '
                                               '| ml.m5.16xlarge | '
                                               'ml.m5.24xlarge | ml.t3.medium '
                                               '| ml.t3.large | ml.t3.xlarge | '
                                               'ml.t3.2xlarge | ml.g6.xlarge | '
                                               'ml.g6.2xlarge | ml.g6.4xlarge '
                                               '| ml.g6.8xlarge | '
                                               'ml.g6.16xlarge | '
                                               'ml.g6.12xlarge | '
                                               'ml.g6.24xlarge | '
                                               'ml.g6.48xlarge | '
                                               'ml.gr6.4xlarge | '
                                               'ml.gr6.8xlarge | ml.g6e.xlarge '
                                               '| ml.g6e.2xlarge | '
                                               'ml.g6e.4xlarge | '
                                               'ml.g6e.8xlarge | '
                                               'ml.g6e.16xlarge | '
                                               'ml.g6e.12xlarge | '
                                               'ml.g6e.24xlarge | '
                                               'ml.g6e.48xlarge | '
                                               'ml.p5e.48xlarge | '
                                               'ml.p5en.48xlarge | '
                                               'ml.p6-b200.48xlarge | '
                                               'ml.trn2.48xlarge | '
                                               'ml.c6i.large | ml.c6i.xlarge | '
                                               'ml.c6i.2xlarge | '
                                               'ml.c6i.4xlarge | '
                                               'ml.c6i.8xlarge | '
                                               'ml.c6i.12xlarge | '
                                               'ml.c6i.16xlarge | '
                                               'ml.c6i.24xlarge | '
                                               'ml.c6i.32xlarge | ml.m6i.large '
                                               '| ml.m6i.xlarge | '
                                               'ml.m6i.2xlarge | '
                                               'ml.m6i.4xlarge | '
                                               'ml.m6i.8xlarge | '
                                               'ml.m6i.12xlarge | '
                                               'ml.m6i.16xlarge | '
                                               'ml.m6i.24xlarge | '
                                               'ml.m6i.32xlarge | ml.r6i.large '
                                               '| ml.r6i.xlarge | '
                                               'ml.r6i.2xlarge | '
                                               'ml.r6i.4xlarge | '
                                               'ml.r6i.8xlarge | '
                                               'ml.r6i.12xlarge | '
                                               'ml.r6i.16xlarge | '
                                               'ml.r6i.24xlarge | '
                                               'ml.r6i.32xlarge | '
                                               'ml.i3en.large | ml.i3en.xlarge '
                                               '| ml.i3en.2xlarge | '
                                               'ml.i3en.3xlarge | '
                                               'ml.i3en.6xlarge | '
                                               'ml.i3en.12xlarge | '
                                               'ml.i3en.24xlarge | '
                                               'ml.m7i.large | ml.m7i.xlarge | '
                                               'ml.m7i.2xlarge | '
                                               'ml.m7i.4xlarge | '
                                               'ml.m7i.8xlarge | '
                                               'ml.m7i.12xlarge | '
                                               'ml.m7i.16xlarge | '
                                               'ml.m7i.24xlarge | '
                                               'ml.m7i.48xlarge | ml.r7i.large '
                                               '| ml.r7i.xlarge | '
                                               'ml.r7i.2xlarge | '
                                               'ml.r7i.4xlarge | '
                                               'ml.r7i.8xlarge | '
                                               'ml.r7i.12xlarge | '
                                               'ml.r7i.16xlarge | '
                                               'ml.r7i.24xlarge | '
                                               'ml.r7i.48xlarge',
                               'OnStartDeepHealthChecks': ['InstanceStress | '
                                                           'InstanceConnectivity'],
                               'OverrideVpcConfig': {'SecurityGroupIds': ['string'],
                                                     'Subnets': ['string']},
                               'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}],
                                                                              'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                   '| '
                                                                                                                                   'CAPACITY_PERCENTAGE',
                                                                                                                           'Value': 'integer'},
                                                                                                      'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                           '| '
                                                                                                                                           'CAPACITY_PERCENTAGE',
                                                                                                                                   'Value': 'integer'}},
                                                                              'WaitIntervalInSeconds': 'integer'},
                                                         'ScheduleExpression': 'string'},
                               'Status': 'InService | Creating | Updating | '
                                         'Failed | Degraded | SystemUpdating | '
                                         'Deleting',
                               'TargetCount': 'integer',
                               'ThreadsPerCore': 'integer',
                               'TrainingPlanArn': 'string',
                               'TrainingPlanStatus': 'string'}]}

Retrieves information about a SageMaker HyperPod cluster.

See also: AWS API Documentation

Request Syntax

client.describe_cluster(
    ClusterName='string'
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

The string name or the Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.

rtype:

dict

returns:

Response Syntax

{
    'ClusterArn': 'string',
    'ClusterName': 'string',
    'ClusterStatus': 'Creating'|'Deleting'|'Failed'|'InService'|'RollingBack'|'SystemUpdating'|'Updating',
    'CreationTime': datetime(2015, 1, 1),
    'FailureMessage': 'string',
    'InstanceGroups': [
        {
            'CurrentCount': 123,
            'TargetCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'LifeCycleConfig': {
                'SourceS3Uri': 'string',
                'OnCreate': 'string'
            },
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'Status': 'InService'|'Creating'|'Updating'|'Failed'|'Degraded'|'SystemUpdating'|'Deleting',
            'TrainingPlanArn': 'string',
            'TrainingPlanStatus': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            }
        },
    ],
    'RestrictedInstanceGroups': [
        {
            'CurrentCount': 123,
            'TargetCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'Status': 'InService'|'Creating'|'Updating'|'Failed'|'Degraded'|'SystemUpdating'|'Deleting',
            'TrainingPlanArn': 'string',
            'TrainingPlanStatus': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            },
            'EnvironmentConfig': {
                'FSxLustreConfig': {
                    'SizeInGiB': 123,
                    'PerUnitStorageThroughput': 123
                },
                'S3OutputPath': 'string'
            }
        },
    ],
    'VpcConfig': {
        'SecurityGroupIds': [
            'string',
        ],
        'Subnets': [
            'string',
        ]
    },
    'Orchestrator': {
        'Eks': {
            'ClusterArn': 'string'
        }
    },
    'NodeRecovery': 'Automatic'|'None'
}
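
Because the response now carries both InstanceGroups and RestrictedInstanceGroups, code that inspects group health may need to walk both lists. The following Python sketch does so over a response-shaped dictionary; the function name groups_needing_attention and the sample data are hypothetical, and the rule used (status other than InService, or CurrentCount below TargetCount) is one possible definition of "needs attention", not something the API prescribes.

```python
def groups_needing_attention(response):
    """Return (list_name, group_name, status, current, target) tuples for
    groups that are not InService or are below their target count."""
    findings = []
    for list_name in ("InstanceGroups", "RestrictedInstanceGroups"):
        # Either list may be absent from the response, so default to empty.
        for group in response.get(list_name, []):
            status = group.get("Status")
            current = group.get("CurrentCount", 0)
            target = group.get("TargetCount", 0)
            if status != "InService" or current < target:
                findings.append(
                    (list_name, group.get("InstanceGroupName"),
                     status, current, target)
                )
    return findings


# Hand-written sample shaped like a describe_cluster response.
sample_response = {
    "ClusterName": "example-cluster",
    "ClusterStatus": "InService",
    "InstanceGroups": [
        {"InstanceGroupName": "workers", "Status": "InService",
         "CurrentCount": 4, "TargetCount": 4},
    ],
    "RestrictedInstanceGroups": [
        {"InstanceGroupName": "restricted", "Status": "Degraded",
         "CurrentCount": 1, "TargetCount": 2},
    ],
}

attention = groups_needing_attention(sample_response)
```

With a real client, the same function could be applied to the dictionary returned by client.describe_cluster(ClusterName=...).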

Response Structure

  • (dict) --

    • ClusterArn (string) --

      The Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.

    • ClusterName (string) --

      The name of the SageMaker HyperPod cluster.

    • ClusterStatus (string) --

      The status of the SageMaker HyperPod cluster.

    • CreationTime (datetime) --

      The time when the SageMaker HyperPod cluster was created.

    • FailureMessage (string) --

      The failure message of the SageMaker HyperPod cluster.

    • InstanceGroups (list) --

      The instance groups of the SageMaker HyperPod cluster.

      • (dict) --

        Details of an instance group in a SageMaker HyperPod cluster.

        • CurrentCount (integer) --

          The number of instances that are currently in the instance group of a SageMaker HyperPod cluster.

        • TargetCount (integer) --

          The number of instances you specified to add to the instance group of a SageMaker HyperPod cluster.

        • InstanceGroupName (string) --

          The name of the instance group of a SageMaker HyperPod cluster.

        • InstanceType (string) --

          The instance type of the instance group of a SageMaker HyperPod cluster.

        • LifeCycleConfig (dict) --

          Details of the lifecycle configuration for the instance group.

          • SourceS3Uri (string) --

            An Amazon S3 bucket path where your lifecycle scripts are stored.

          • OnCreate (string) --

            The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.

        • ExecutionRole (string) --

          The execution role for the instance group to assume.

        • ThreadsPerCore (integer) --

          The value you specified for ThreadsPerCore in CreateCluster to enable or disable multithreading. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

        • InstanceStorageConfigs (list) --

          The additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.

          • (dict) --

            Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

            • EbsVolumeConfig (dict) --

              Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

              • VolumeSizeInGB (integer) --

                The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

        • OnStartDeepHealthChecks (list) --

          A list of the deep health checks to perform when the cluster instance group is created or updated.

          • (string) --

        • Status (string) --

          The current status of the cluster instance group.

          • InService: The instance group is active and healthy.

          • Creating: The instance group is being provisioned.

          • Updating: The instance group is being updated.

          • Failed: The instance group has failed to provision or is no longer healthy.

          • Degraded: The instance group is degraded, meaning that some instances have failed to provision or are no longer healthy.

          • Deleting: The instance group is being deleted.

        • TrainingPlanArn (string) --

          The Amazon Resource Name (ARN) of the training plan associated with this cluster instance group.

          For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

        • TrainingPlanStatus (string) --

          The current status of the training plan associated with this cluster instance group.

        • OverrideVpcConfig (dict) --

          The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.

          • SecurityGroupIds (list) --

            The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

            • (string) --

          • Subnets (list) --

            The IDs of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

            • (string) --

        • ScheduledUpdateConfig (dict) --

          The configuration object of the schedule that SageMaker follows when updating the AMI.

          • ScheduleExpression (string) --

            A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

          • DeploymentConfig (dict) --

            The configuration to use when updating the AMI versions.

            • RollingUpdatePolicy (dict) --

              The policy that SageMaker uses when updating the AMI versions of the cluster.

              • MaximumBatchSize (dict) --

                The maximum number of instances in the cluster that SageMaker can update at a time.

                • Type (string) --

                  Specifies whether SageMaker should process the update by number or percentage of instances.

                • Value (integer) --

                  Specifies the number or percentage of instances SageMaker updates at a time.

              • RollbackMaximumBatchSize (dict) --

                The maximum number of instances in the cluster that SageMaker can roll back at a time.

                • Type (string) --

                  Specifies whether SageMaker should process the update by number or percentage of instances.

                • Value (integer) --

                  Specifies the number or percentage of instances SageMaker updates at a time.

            • WaitIntervalInSeconds (integer) --

              The duration in seconds that SageMaker waits before updating more instances in the cluster.

            • AutoRollbackConfiguration (list) --

              An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

              • (dict) --

                The details of the alarm to monitor during the AMI update.

                • AlarmName (string) --

                  The name of the alarm.

    • RestrictedInstanceGroups (list) --

      The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.

      • (dict) --

        The instance group details of the restricted instance group (RIG).

        • CurrentCount (integer) --

          The number of instances that are currently in the restricted instance group of a SageMaker HyperPod cluster.

        • TargetCount (integer) --

          The number of instances you specified to add to the restricted instance group of a SageMaker HyperPod cluster.

        • InstanceGroupName (string) --

          The name of the restricted instance group of a SageMaker HyperPod cluster.

        • InstanceType (string) --

          The instance type of the restricted instance group of a SageMaker HyperPod cluster.

        • ExecutionRole (string) --

          The execution role for the restricted instance group to assume.

        • ThreadsPerCore (integer) --

          The value you specified for ThreadsPerCore in CreateCluster to enable or disable multithreading. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

        • InstanceStorageConfigs (list) --

          The additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.

          • (dict) --

            Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

            • EbsVolumeConfig (dict) --

              Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

              • VolumeSizeInGB (integer) --

                The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

        • OnStartDeepHealthChecks (list) --

          The deep health checks (InstanceStress, InstanceConnectivity) to perform when the cluster's restricted instance group is created or updated.

          • (string) --

        • Status (string) --

          The current status of the cluster's restricted instance group.

          • InService: The restricted instance group is active and healthy.

          • Creating: The restricted instance group is being provisioned.

          • Updating: The restricted instance group is being updated.

          • Failed: The restricted instance group has failed to provision or is no longer healthy.

          • Degraded: The restricted instance group is degraded, meaning that some instances have failed to provision or are no longer healthy.

          • Deleting: The restricted instance group is being deleted.

        • TrainingPlanArn (string) --

          The Amazon Resource Name (ARN) of the training plan to filter clusters by. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

        • TrainingPlanStatus (string) --

          The current status of the training plan associated with this cluster restricted instance group.

        • OverrideVpcConfig (dict) --

          Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.

          • SecurityGroupIds (list) --

            The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

            • (string) --

          • Subnets (list) --

            The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

            • (string) --

        • ScheduledUpdateConfig (dict) --

          The configuration object of the schedule that SageMaker follows when updating the AMI.

          • ScheduleExpression (string) --

            A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

          • DeploymentConfig (dict) --

            The configuration to use when updating the AMI versions.

            • RollingUpdatePolicy (dict) --

              The policy that SageMaker uses when updating the AMI versions of the cluster.

              • MaximumBatchSize (dict) --

                The maximum number of instances in the cluster that SageMaker can update at a time.

                • Type (string) --

                  Specifies whether SageMaker should process the update by amount or percentage of instances.

                • Value (integer) --

                  Specifies the amount or percentage of instances SageMaker updates at a time.

              • RollbackMaximumBatchSize (dict) --

                The maximum number of instances in the cluster that SageMaker can roll back at a time.

                • Type (string) --

                  Specifies whether SageMaker should process the update by amount or percentage of instances.

                • Value (integer) --

                  Specifies the amount or percentage of instances SageMaker updates at a time.

            • WaitIntervalInSeconds (integer) --

              The duration in seconds that SageMaker waits before updating more instances in the cluster.

            • AutoRollbackConfiguration (list) --

              An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

              • (dict) --

                The details of the alarm to monitor during the AMI update.

                • AlarmName (string) --

                  The name of the alarm.

        • EnvironmentConfig (dict) --

          The configuration for the restricted instance groups (RIG) environment.

          • FSxLustreConfig (dict) --

            Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.

            • SizeInGiB (integer) --

              The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).

            • PerUnitStorageThroughput (integer) --

              The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.

          • S3OutputPath (string) --

            The Amazon S3 path where output data from the restricted instance group (RIG) environment will be stored.

    • VpcConfig (dict) --

      Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.

      • SecurityGroupIds (list) --

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) --

        The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • Orchestrator (dict) --

      The type of orchestrator used for the SageMaker HyperPod cluster.

      • Eks (dict) --

        The Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.

        • ClusterArn (string) --

          The Amazon Resource Name (ARN) of the Amazon EKS cluster associated with the SageMaker HyperPod cluster.

    • NodeRecovery (string) --

      The node recovery mode configured for the SageMaker HyperPod cluster.
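
The Status values listed above lend themselves to a simple health filter. The following is a minimal sketch, assuming the structure above is a DescribeCluster-style response and that the restricted instance groups sit under a RestrictedInstanceGroups key (the key name is taken from the UpdateCluster request and is an assumption for the response). The helper name unhealthy_restricted_groups is hypothetical.

```python
from typing import Any, Dict, List


def unhealthy_restricted_groups(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """List restricted instance groups whose Status is not InService.

    Assumes `response` is a DescribeCluster-style dict with the groups
    under the `RestrictedInstanceGroups` key (an assumption, see lead-in).
    """
    groups = response.get("RestrictedInstanceGroups", [])
    return [
        {"name": group.get("InstanceGroupName"), "status": group.get("Status")}
        for group in groups
        # Creating, Updating, Failed, Degraded, and Deleting are all reported.
        if group.get("Status") != "InService"
    ]
```

Because the function only reads a dictionary, it can be run against a saved response without calling the service.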

DescribePipeline (updated)
Changes (request, response)
Request
{'PipelineVersionId': 'long'}
Response
{'PipelineVersionDescription': 'string', 'PipelineVersionDisplayName': 'string'}

Describes the details of a pipeline.

See also: AWS API Documentation

Request Syntax

client.describe_pipeline(
    PipelineName='string',
    PipelineVersionId=123
)
type PipelineName:

string

param PipelineName:

[REQUIRED]

The name or Amazon Resource Name (ARN) of the pipeline to describe.

type PipelineVersionId:

integer

param PipelineVersionId:

The ID of the pipeline version to describe.

rtype:

dict

returns:

Response Syntax

{
    'PipelineArn': 'string',
    'PipelineName': 'string',
    'PipelineDisplayName': 'string',
    'PipelineDefinition': 'string',
    'PipelineDescription': 'string',
    'RoleArn': 'string',
    'PipelineStatus': 'Active'|'Deleting',
    'CreationTime': datetime(2015, 1, 1),
    'LastModifiedTime': datetime(2015, 1, 1),
    'LastRunTime': datetime(2015, 1, 1),
    'CreatedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    },
    'LastModifiedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    },
    'ParallelismConfiguration': {
        'MaxParallelExecutionSteps': 123
    },
    'PipelineVersionDisplayName': 'string',
    'PipelineVersionDescription': 'string'
}

Response Structure

  • (dict) --

    • PipelineArn (string) --

      The Amazon Resource Name (ARN) of the pipeline.

    • PipelineName (string) --

      The name of the pipeline.

    • PipelineDisplayName (string) --

      The display name of the pipeline.

    • PipelineDefinition (string) --

      The JSON pipeline definition.

    • PipelineDescription (string) --

      The description of the pipeline.

    • RoleArn (string) --

      The Amazon Resource Name (ARN) of the role that the pipeline uses to execute.

    • PipelineStatus (string) --

      The status of the pipeline.

    • CreationTime (datetime) --

      The time when the pipeline was created.

    • LastModifiedTime (datetime) --

      The time when the pipeline was last modified.

    • LastRunTime (datetime) --

      The time when the pipeline was last run.

    • CreatedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.

    • LastModifiedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.

    • ParallelismConfiguration (dict) --

      Lists the parallelism configuration applied to the pipeline.

      • MaxParallelExecutionSteps (integer) --

        The max number of steps that can be executed in parallel.

    • PipelineVersionDisplayName (string) --

      The display name of the pipeline version.

    • PipelineVersionDescription (string) --

      The description of the pipeline version.
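
As an illustration of the new PipelineVersionId request parameter, here is a minimal sketch of a wrapper around describe_pipeline. It assumes client is a boto3 SageMaker client (for example boto3.client("sagemaker")); the function name describe_pipeline_version and the keys of the returned summary are hypothetical. What the service returns when PipelineVersionId is omitted is not stated in this section, so the sketch simply leaves the parameter out in that case.

```python
from typing import Any, Dict, Optional


def describe_pipeline_version(
    client: Any, pipeline_name: str, version_id: Optional[int] = None
) -> Dict[str, Any]:
    """Return a small summary of a pipeline, optionally for one version.

    `client` is assumed to be a boto3 SageMaker client.
    """
    kwargs: Dict[str, Any] = {"PipelineName": pipeline_name}
    if version_id is not None:
        # PipelineVersionId is optional, so only send it when requested.
        kwargs["PipelineVersionId"] = version_id
    response = client.describe_pipeline(**kwargs)
    # The version fields may be absent, so read them with .get().
    return {
        "arn": response.get("PipelineArn"),
        "status": response.get("PipelineStatus"),
        "version_display_name": response.get("PipelineVersionDisplayName"),
        "version_description": response.get("PipelineVersionDescription"),
    }
```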

DescribePipelineExecution (updated)
Changes (response)
{'PipelineVersionId': 'long'}

Describes the details of a pipeline execution.

See also: AWS API Documentation

Request Syntax

client.describe_pipeline_execution(
    PipelineExecutionArn='string'
)
type PipelineExecutionArn:

string

param PipelineExecutionArn:

[REQUIRED]

The Amazon Resource Name (ARN) of the pipeline execution.

rtype:

dict

returns:

Response Syntax

{
    'PipelineArn': 'string',
    'PipelineExecutionArn': 'string',
    'PipelineExecutionDisplayName': 'string',
    'PipelineExecutionStatus': 'Executing'|'Stopping'|'Stopped'|'Failed'|'Succeeded',
    'PipelineExecutionDescription': 'string',
    'PipelineExperimentConfig': {
        'ExperimentName': 'string',
        'TrialName': 'string'
    },
    'FailureReason': 'string',
    'CreationTime': datetime(2015, 1, 1),
    'LastModifiedTime': datetime(2015, 1, 1),
    'CreatedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    },
    'LastModifiedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    },
    'ParallelismConfiguration': {
        'MaxParallelExecutionSteps': 123
    },
    'SelectiveExecutionConfig': {
        'SourcePipelineExecutionArn': 'string',
        'SelectedSteps': [
            {
                'StepName': 'string'
            },
        ]
    },
    'PipelineVersionId': 123
}

Response Structure

  • (dict) --

    • PipelineArn (string) --

      The Amazon Resource Name (ARN) of the pipeline.

    • PipelineExecutionArn (string) --

      The Amazon Resource Name (ARN) of the pipeline execution.

    • PipelineExecutionDisplayName (string) --

      The display name of the pipeline execution.

    • PipelineExecutionStatus (string) --

      The status of the pipeline execution.

    • PipelineExecutionDescription (string) --

      The description of the pipeline execution.

    • PipelineExperimentConfig (dict) --

      Specifies the names of the experiment and trial created by a pipeline.

      • ExperimentName (string) --

        The name of the experiment.

      • TrialName (string) --

        The name of the trial.

    • FailureReason (string) --

      If the execution failed, a message describing why.

    • CreationTime (datetime) --

      The time when the pipeline execution was created.

    • LastModifiedTime (datetime) --

      The time when the pipeline execution was last modified.

    • CreatedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.

    • LastModifiedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.

    • ParallelismConfiguration (dict) --

      The parallelism configuration applied to the pipeline.

      • MaxParallelExecutionSteps (integer) --

        The max number of steps that can be executed in parallel.

    • SelectiveExecutionConfig (dict) --

      The selective execution configuration applied to the pipeline run.

      • SourcePipelineExecutionArn (string) --

        The ARN from a reference execution of the current pipeline. Used to copy input collaterals needed for the selected steps to run. The execution status of the pipeline can be either Failed or Succeeded.

        This field is required if the steps you specify for SelectedSteps depend on output collaterals from any non-specified pipeline steps. For more information, see Selective Execution for Pipeline Steps.

      • SelectedSteps (list) --

        A list of pipeline steps to run. All step(s) in all path(s) between two selected steps should be included.

        • (dict) --

          A step selected to run in selective execution mode.

          • StepName (string) --

            The name of the pipeline step.

    • PipelineVersionId (integer) --

      The ID of the pipeline version.
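
The new PipelineVersionId response field makes it possible to tell which pipeline version an execution ran against. A minimal sketch follows, assuming client is a boto3 SageMaker client; the function name execution_status_and_version is hypothetical. The version ID is read with .get() because this section does not say whether the field is always present.

```python
from typing import Any, Optional, Tuple


def execution_status_and_version(
    client: Any, execution_arn: str
) -> Tuple[Optional[str], Optional[int]]:
    """Return (PipelineExecutionStatus, PipelineVersionId) for an execution.

    `client` is assumed to be a boto3 SageMaker client.
    """
    response = client.describe_pipeline_execution(
        PipelineExecutionArn=execution_arn
    )
    # Either field may be missing from a given response, so use .get().
    return (
        response.get("PipelineExecutionStatus"),
        response.get("PipelineVersionId"),
    )
```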

GetSearchSuggestions (updated)
Changes (request)
{'Resource': {'PipelineVersion'}}

An auto-complete API for the search functionality in the SageMaker console. It returns suggestions of possible matches for the property name to use in Search queries. Provides suggestions for HyperParameters, Tags, and Metrics.

See also: AWS API Documentation

Request Syntax

client.get_search_suggestions(
    Resource='TrainingJob'|'Experiment'|'ExperimentTrial'|'ExperimentTrialComponent'|'Endpoint'|'Model'|'ModelPackage'|'ModelPackageGroup'|'Pipeline'|'PipelineExecution'|'FeatureGroup'|'FeatureMetadata'|'Image'|'ImageVersion'|'Project'|'HyperParameterTuningJob'|'ModelCard'|'PipelineVersion',
    SuggestionQuery={
        'PropertyNameQuery': {
            'PropertyNameHint': 'string'
        }
    }
)
type Resource:

string

param Resource:

[REQUIRED]

The name of the SageMaker resource to search for.

type SuggestionQuery:

dict

param SuggestionQuery:

Limits the property names that are included in the response.

  • PropertyNameQuery (dict) --

    Defines a property name hint. Only property names that begin with the specified hint are included in the response.

    • PropertyNameHint (string) -- [REQUIRED]

      Text that begins a property's name.

rtype:

dict

returns:

Response Syntax

{
    'PropertyNameSuggestions': [
        {
            'PropertyName': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • PropertyNameSuggestions (list) --

      A list of property names for a Resource that match a SuggestionQuery.

      • (dict) --

        A property name returned from a GetSearchSuggestions call that specifies a value in the PropertyNameQuery field.

        • PropertyName (string) --

          A suggested property name based on what you entered in the search textbox in the SageMaker console.
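
To show how the new PipelineVersion resource value fits into the request, here is a minimal sketch that asks for property-name suggestions and flattens the response. It assumes client is a boto3 SageMaker client; the function name suggest_property_names and its default argument are hypothetical.

```python
from typing import Any, List


def suggest_property_names(
    client: Any, hint: str, resource: str = "PipelineVersion"
) -> List[str]:
    """Return suggested property names that begin with `hint`.

    `client` is assumed to be a boto3 SageMaker client.
    """
    response = client.get_search_suggestions(
        Resource=resource,
        SuggestionQuery={"PropertyNameQuery": {"PropertyNameHint": hint}},
    )
    # Each suggestion is a dict with a single PropertyName entry.
    return [
        suggestion["PropertyName"]
        for suggestion in response.get("PropertyNameSuggestions", [])
    ]
```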

StartPipelineExecution (updated)
Changes (request)
{'PipelineVersionId': 'long'}

Starts a pipeline execution.

See also: AWS API Documentation

Request Syntax

client.start_pipeline_execution(
    PipelineName='string',
    PipelineExecutionDisplayName='string',
    PipelineParameters=[
        {
            'Name': 'string',
            'Value': 'string'
        },
    ],
    PipelineExecutionDescription='string',
    ClientRequestToken='string',
    ParallelismConfiguration={
        'MaxParallelExecutionSteps': 123
    },
    SelectiveExecutionConfig={
        'SourcePipelineExecutionArn': 'string',
        'SelectedSteps': [
            {
                'StepName': 'string'
            },
        ]
    },
    PipelineVersionId=123
)
type PipelineName:

string

param PipelineName:

[REQUIRED]

The name or Amazon Resource Name (ARN) of the pipeline.

type PipelineExecutionDisplayName:

string

param PipelineExecutionDisplayName:

The display name of the pipeline execution.

type PipelineParameters:

list

param PipelineParameters:

Contains a list of pipeline parameters. This list can be empty.

  • (dict) --

    Assigns a value to a named Pipeline parameter.

    • Name (string) -- [REQUIRED]

      The name of the parameter to assign a value to. This parameter name must match a named parameter in the pipeline definition.

    • Value (string) -- [REQUIRED]

      The literal value for the parameter.

type PipelineExecutionDescription:

string

param PipelineExecutionDescription:

The description of the pipeline execution.

type ClientRequestToken:

string

param ClientRequestToken:

[REQUIRED]

A unique, case-sensitive identifier that you provide to ensure the idempotency of the operation. An idempotent operation completes no more than once.

This field is autopopulated if not provided.

type ParallelismConfiguration:

dict

param ParallelismConfiguration:

This configuration, if specified, overrides the parallelism configuration of the parent pipeline for this specific run.

  • MaxParallelExecutionSteps (integer) -- [REQUIRED]

    The max number of steps that can be executed in parallel.

type SelectiveExecutionConfig:

dict

param SelectiveExecutionConfig:

The selective execution configuration applied to the pipeline run.

  • SourcePipelineExecutionArn (string) --

    The ARN from a reference execution of the current pipeline. Used to copy input collaterals needed for the selected steps to run. The execution status of the pipeline can be either Failed or Succeeded.

    This field is required if the steps you specify for SelectedSteps depend on output collaterals from any non-specified pipeline steps. For more information, see Selective Execution for Pipeline Steps.

  • SelectedSteps (list) -- [REQUIRED]

    A list of pipeline steps to run. All step(s) in all path(s) between two selected steps should be included.

    • (dict) --

      A step selected to run in selective execution mode.

      • StepName (string) -- [REQUIRED]

        The name of the pipeline step.

type PipelineVersionId:

integer

param PipelineVersionId:

The ID of the pipeline version to start execution from.

rtype:

dict

returns:

Response Syntax

{
    'PipelineExecutionArn': 'string'
}

Response Structure

  • (dict) --

    • PipelineExecutionArn (string) --

      The Amazon Resource Name (ARN) of the pipeline execution.
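
The request above can be assembled as in the following minimal sketch, which starts an execution from a specific pipeline version. It assumes client is a boto3 SageMaker client; the function name start_execution_for_version and its arguments are hypothetical. The sketch generates a ClientRequestToken with uuid4 when none is supplied, which means a retried call with a fresh token is treated as a new request; pass the same token explicitly when retrying.

```python
import uuid
from typing import Any, Dict, Optional


def start_execution_for_version(
    client: Any,
    pipeline_name: str,
    version_id: Optional[int] = None,
    parameters: Optional[Dict[str, Any]] = None,
    client_request_token: Optional[str] = None,
) -> str:
    """Start a pipeline execution and return its PipelineExecutionArn.

    `client` is assumed to be a boto3 SageMaker client.
    """
    kwargs: Dict[str, Any] = {
        "PipelineName": pipeline_name,
        # Reuse the caller's token on retries to keep the call idempotent.
        "ClientRequestToken": client_request_token or str(uuid.uuid4()),
    }
    if version_id is not None:
        kwargs["PipelineVersionId"] = version_id
    if parameters:
        # Parameter values are sent as strings (Value is typed 'string').
        kwargs["PipelineParameters"] = [
            {"Name": name, "Value": str(value)}
            for name, value in parameters.items()
        ]
    response = client.start_pipeline_execution(**kwargs)
    return response["PipelineExecutionArn"]
```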

UpdateCluster (updated)
Changes (request)
{'RestrictedInstanceGroups': [{'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer',
                                                                         'SizeInGiB': 'integer'}},
                               'ExecutionRole': 'string',
                               'InstanceCount': 'integer',
                               'InstanceGroupName': 'string',
                               'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}],
                               'InstanceType': 'ml.p4d.24xlarge | '
                                               'ml.p4de.24xlarge | '
                                               'ml.p5.48xlarge | '
                                               'ml.trn1.32xlarge | '
                                               'ml.trn1n.32xlarge | '
                                               'ml.g5.xlarge | ml.g5.2xlarge | '
                                               'ml.g5.4xlarge | ml.g5.8xlarge '
                                               '| ml.g5.12xlarge | '
                                               'ml.g5.16xlarge | '
                                               'ml.g5.24xlarge | '
                                               'ml.g5.48xlarge | ml.c5.large | '
                                               'ml.c5.xlarge | ml.c5.2xlarge | '
                                               'ml.c5.4xlarge | ml.c5.9xlarge '
                                               '| ml.c5.12xlarge | '
                                               'ml.c5.18xlarge | '
                                               'ml.c5.24xlarge | ml.c5n.large '
                                               '| ml.c5n.2xlarge | '
                                               'ml.c5n.4xlarge | '
                                               'ml.c5n.9xlarge | '
                                               'ml.c5n.18xlarge | ml.m5.large '
                                               '| ml.m5.xlarge | ml.m5.2xlarge '
                                               '| ml.m5.4xlarge | '
                                               'ml.m5.8xlarge | ml.m5.12xlarge '
                                               '| ml.m5.16xlarge | '
                                               'ml.m5.24xlarge | ml.t3.medium '
                                               '| ml.t3.large | ml.t3.xlarge | '
                                               'ml.t3.2xlarge | ml.g6.xlarge | '
                                               'ml.g6.2xlarge | ml.g6.4xlarge '
                                               '| ml.g6.8xlarge | '
                                               'ml.g6.16xlarge | '
                                               'ml.g6.12xlarge | '
                                               'ml.g6.24xlarge | '
                                               'ml.g6.48xlarge | '
                                               'ml.gr6.4xlarge | '
                                               'ml.gr6.8xlarge | ml.g6e.xlarge '
                                               '| ml.g6e.2xlarge | '
                                               'ml.g6e.4xlarge | '
                                               'ml.g6e.8xlarge | '
                                               'ml.g6e.16xlarge | '
                                               'ml.g6e.12xlarge | '
                                               'ml.g6e.24xlarge | '
                                               'ml.g6e.48xlarge | '
                                               'ml.p5e.48xlarge | '
                                               'ml.p5en.48xlarge | '
                                               'ml.p6-b200.48xlarge | '
                                               'ml.trn2.48xlarge | '
                                               'ml.c6i.large | ml.c6i.xlarge | '
                                               'ml.c6i.2xlarge | '
                                               'ml.c6i.4xlarge | '
                                               'ml.c6i.8xlarge | '
                                               'ml.c6i.12xlarge | '
                                               'ml.c6i.16xlarge | '
                                               'ml.c6i.24xlarge | '
                                               'ml.c6i.32xlarge | ml.m6i.large '
                                               '| ml.m6i.xlarge | '
                                               'ml.m6i.2xlarge | '
                                               'ml.m6i.4xlarge | '
                                               'ml.m6i.8xlarge | '
                                               'ml.m6i.12xlarge | '
                                               'ml.m6i.16xlarge | '
                                               'ml.m6i.24xlarge | '
                                               'ml.m6i.32xlarge | ml.r6i.large '
                                               '| ml.r6i.xlarge | '
                                               'ml.r6i.2xlarge | '
                                               'ml.r6i.4xlarge | '
                                               'ml.r6i.8xlarge | '
                                               'ml.r6i.12xlarge | '
                                               'ml.r6i.16xlarge | '
                                               'ml.r6i.24xlarge | '
                                               'ml.r6i.32xlarge | '
                                               'ml.i3en.large | ml.i3en.xlarge '
                                               '| ml.i3en.2xlarge | '
                                               'ml.i3en.3xlarge | '
                                               'ml.i3en.6xlarge | '
                                               'ml.i3en.12xlarge | '
                                               'ml.i3en.24xlarge | '
                                               'ml.m7i.large | ml.m7i.xlarge | '
                                               'ml.m7i.2xlarge | '
                                               'ml.m7i.4xlarge | '
                                               'ml.m7i.8xlarge | '
                                               'ml.m7i.12xlarge | '
                                               'ml.m7i.16xlarge | '
                                               'ml.m7i.24xlarge | '
                                               'ml.m7i.48xlarge | ml.r7i.large '
                                               '| ml.r7i.xlarge | '
                                               'ml.r7i.2xlarge | '
                                               'ml.r7i.4xlarge | '
                                               'ml.r7i.8xlarge | '
                                               'ml.r7i.12xlarge | '
                                               'ml.r7i.16xlarge | '
                                               'ml.r7i.24xlarge | '
                                               'ml.r7i.48xlarge',
                               'OnStartDeepHealthChecks': ['InstanceStress | '
                                                           'InstanceConnectivity'],
                               'OverrideVpcConfig': {'SecurityGroupIds': ['string'],
                                                     'Subnets': ['string']},
                               'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}],
                                                                              'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                   '| '
                                                                                                                                   'CAPACITY_PERCENTAGE',
                                                                                                                           'Value': 'integer'},
                                                                                                      'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT '
                                                                                                                                           '| '
                                                                                                                                           'CAPACITY_PERCENTAGE',
                                                                                                                                   'Value': 'integer'}},
                                                                              'WaitIntervalInSeconds': 'integer'},
                                                         'ScheduleExpression': 'string'},
                               'ThreadsPerCore': 'integer',
                               'TrainingPlanArn': 'string'}]}

Updates a SageMaker HyperPod cluster.

See also: AWS API Documentation

Request Syntax

client.update_cluster(
    ClusterName='string',
    InstanceGroups=[
        {
            'InstanceCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'LifeCycleConfig': {
                'SourceS3Uri': 'string',
                'OnCreate': 'string'
            },
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'TrainingPlanArn': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            }
        },
    ],
    RestrictedInstanceGroups=[
        {
            'InstanceCount': 123,
            'InstanceGroupName': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'ExecutionRole': 'string',
            'ThreadsPerCore': 123,
            'InstanceStorageConfigs': [
                {
                    'EbsVolumeConfig': {
                        'VolumeSizeInGB': 123
                    }
                },
            ],
            'OnStartDeepHealthChecks': [
                'InstanceStress'|'InstanceConnectivity',
            ],
            'TrainingPlanArn': 'string',
            'OverrideVpcConfig': {
                'SecurityGroupIds': [
                    'string',
                ],
                'Subnets': [
                    'string',
                ]
            },
            'ScheduledUpdateConfig': {
                'ScheduleExpression': 'string',
                'DeploymentConfig': {
                    'RollingUpdatePolicy': {
                        'MaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        },
                        'RollbackMaximumBatchSize': {
                            'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE',
                            'Value': 123
                        }
                    },
                    'WaitIntervalInSeconds': 123,
                    'AutoRollbackConfiguration': [
                        {
                            'AlarmName': 'string'
                        },
                    ]
                }
            },
            'EnvironmentConfig': {
                'FSxLustreConfig': {
                    'SizeInGiB': 123,
                    'PerUnitStorageThroughput': 123
                }
            }
        },
    ],
    NodeRecovery='Automatic'|'None',
    InstanceGroupsToDelete=[
        'string',
    ]
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

Specify the name of the SageMaker HyperPod cluster you want to update.

type InstanceGroups:

list

param InstanceGroups:

Specify the instance groups to update.

  • (dict) --

    The specifications of an instance group that you need to define.

    • InstanceCount (integer) -- [REQUIRED]

      Specifies the number of instances to add to the instance group of a SageMaker HyperPod cluster.

    • InstanceGroupName (string) -- [REQUIRED]

      Specifies the name of the instance group.

    • InstanceType (string) -- [REQUIRED]

      Specifies the instance type of the instance group.

    • LifeCycleConfig (dict) -- [REQUIRED]

      Specifies the LifeCycle configuration for the instance group.

      • SourceS3Uri (string) -- [REQUIRED]

        An Amazon S3 bucket path where your lifecycle scripts are stored.

      • OnCreate (string) -- [REQUIRED]

        The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.

    • ExecutionRole (string) -- [REQUIRED]

      Specifies an IAM execution role to be assumed by the instance group.

    • ThreadsPerCore (integer) --

      Specifies the value for threads per core. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For instance types that don't support multithreading, specify 1. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

    • InstanceStorageConfigs (list) --

      Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.

      • (dict) --

        Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

        • EbsVolumeConfig (dict) --

          Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

          • VolumeSizeInGB (integer) -- [REQUIRED]

            The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

    • OnStartDeepHealthChecks (list) --

      A flag indicating whether deep health checks should be performed when the cluster instance group is created or updated.

      • (string) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group.

      For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

    • OverrideVpcConfig (dict) --

      To configure multi-AZ deployments, customize the Amazon VPC configuration at the instance group level. You can specify different subnets and security groups across different AZs in the instance group specification to override a SageMaker HyperPod cluster's default Amazon VPC configuration. For more information about deploying a cluster in multiple AZs, see Setting up SageMaker HyperPod clusters across multiple AZs.

      • SecurityGroupIds (list) -- [REQUIRED]

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) -- [REQUIRED]

        The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • ScheduledUpdateConfig (dict) --

      The configuration object of the schedule that SageMaker uses to update the AMI.

      • ScheduleExpression (string) -- [REQUIRED]

        A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

      • DeploymentConfig (dict) --

        The configuration to use when updating the AMI versions.

        • RollingUpdatePolicy (dict) --

          The policy that SageMaker uses when updating the AMI versions of the cluster.

          • MaximumBatchSize (dict) -- [REQUIRED]

            The maximum number of instances in the cluster that SageMaker can update at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

          • RollbackMaximumBatchSize (dict) --

            The maximum number of instances in the cluster that SageMaker can roll back at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

        • WaitIntervalInSeconds (integer) --

          The duration in seconds that SageMaker waits before updating more instances in the cluster.

        • AutoRollbackConfiguration (list) --

          An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

          • (dict) --

            The details of the alarm to monitor during the AMI update.

            • AlarmName (string) -- [REQUIRED]

              The name of the alarm.
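
As an illustration of how the nested InstanceGroups fields described above fit together, the following minimal sketch assembles one entry as a plain Python dict. The helper names (build_scheduled_update_config, build_instance_group) and every argument value are hypothetical; only the dictionary keys and the allowed Type and ThreadsPerCore values are taken from the request syntax above. The schedule expression is passed through unchanged because its exact cron format is not spelled out here.

```python
from typing import Any, Dict, List, Optional


def build_scheduled_update_config(
    schedule_expression: str,
    batch_type: str,
    batch_value: int,
    wait_interval_seconds: Optional[int] = None,
    alarm_names: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble a ScheduledUpdateConfig dict (hypothetical helper)."""
    if batch_type not in ("INSTANCE_COUNT", "CAPACITY_PERCENTAGE"):
        raise ValueError("Type must be INSTANCE_COUNT or CAPACITY_PERCENTAGE")
    deployment: Dict[str, Any] = {
        "RollingUpdatePolicy": {
            "MaximumBatchSize": {"Type": batch_type, "Value": batch_value}
        }
    }
    if wait_interval_seconds is not None:
        deployment["WaitIntervalInSeconds"] = wait_interval_seconds
    if alarm_names:
        # One {'AlarmName': ...} dict per alarm to monitor during the update.
        deployment["AutoRollbackConfiguration"] = [
            {"AlarmName": name} for name in alarm_names
        ]
    return {
        "ScheduleExpression": schedule_expression,
        "DeploymentConfig": deployment,
    }


def build_instance_group(
    name: str,
    instance_type: str,
    instance_count: int,
    execution_role_arn: str,
    lifecycle_s3_uri: str,
    on_create_script: str,
    threads_per_core: Optional[int] = None,
    scheduled_update_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble one InstanceGroups entry (hypothetical helper)."""
    if threads_per_core is not None and threads_per_core not in (1, 2):
        raise ValueError("ThreadsPerCore must be 1 or 2")
    group: Dict[str, Any] = {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "ExecutionRole": execution_role_arn,
        "LifeCycleConfig": {
            "SourceS3Uri": lifecycle_s3_uri,
            "OnCreate": on_create_script,
        },
    }
    if threads_per_core is not None:
        group["ThreadsPerCore"] = threads_per_core
    if scheduled_update_config is not None:
        group["ScheduledUpdateConfig"] = scheduled_update_config
    return group
```

The resulting dict can be passed as one element of the InstanceGroups list. Optional keys are omitted rather than set to None, since the SDK validates parameter types before sending the request.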

type RestrictedInstanceGroups:

list

param RestrictedInstanceGroups:

The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.

  • (dict) --

    The specifications of a restricted instance group that you need to define.

    • InstanceCount (integer) -- [REQUIRED]

      Specifies the number of instances to add to the restricted instance group of a SageMaker HyperPod cluster.

    • InstanceGroupName (string) -- [REQUIRED]

      Specifies the name of the restricted instance group.

    • InstanceType (string) -- [REQUIRED]

      Specifies the instance type of the restricted instance group.

    • ExecutionRole (string) -- [REQUIRED]

      Specifies an IAM execution role to be assumed by the restricted instance group.

    • ThreadsPerCore (integer) --

      The number you specified for ThreadsPerCore in CreateCluster to enable or disable multithreading. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.

    • InstanceStorageConfigs (list) --

      Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.

      • (dict) --

        Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.

        • EbsVolumeConfig (dict) --

          Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

          • VolumeSizeInGB (integer) -- [REQUIRED]

            The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.

    • OnStartDeepHealthChecks (list) --

      A flag indicating whether deep health checks should be performed when the cluster restricted instance group is created or updated.

      • (string) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan to use for this cluster restricted instance group. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

    • OverrideVpcConfig (dict) --

      Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.

      • SecurityGroupIds (list) -- [REQUIRED]

        The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.

        • (string) --

      • Subnets (list) -- [REQUIRED]

        The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.

        • (string) --

    • ScheduledUpdateConfig (dict) --

      The configuration object of the schedule that SageMaker follows when updating the AMI.

      • ScheduleExpression (string) -- [REQUIRED]

        A cron expression that specifies the schedule that SageMaker follows when updating the AMI.

      • DeploymentConfig (dict) --

        The configuration to use when updating the AMI versions.

        • RollingUpdatePolicy (dict) --

          The policy that SageMaker uses when updating the AMI versions of the cluster.

          • MaximumBatchSize (dict) -- [REQUIRED]

            The maximum number of instances in the cluster that SageMaker can update at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

          • RollbackMaximumBatchSize (dict) --

            The maximum number of instances in the cluster that SageMaker can roll back at a time.

            • Type (string) -- [REQUIRED]

              Specifies whether SageMaker should process the update by amount or percentage of instances.

            • Value (integer) -- [REQUIRED]

              Specifies the amount or percentage of instances SageMaker updates at a time.

        • WaitIntervalInSeconds (integer) --

          The duration in seconds that SageMaker waits before updating more instances in the cluster.

        • AutoRollbackConfiguration (list) --

          An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.

          • (dict) --

            The details of the alarm to monitor during the AMI update.

            • AlarmName (string) -- [REQUIRED]

              The name of the alarm.

    • EnvironmentConfig (dict) -- [REQUIRED]

      The configuration for the restricted instance groups (RIG) environment.

      • FSxLustreConfig (dict) --

        Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.

        • SizeInGiB (integer) -- [REQUIRED]

          The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).

        • PerUnitStorageThroughput (integer) -- [REQUIRED]

          The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
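
The restricted instance group specification differs from a regular instance group mainly in that it has no LifeCycleConfig and requires an EnvironmentConfig. The sketch below shows one way to assemble such an entry; the helper name build_restricted_instance_group and all argument values are hypothetical, and the check that both FSx for Lustre fields are supplied together simply mirrors the [REQUIRED] markers above rather than any documented service-side validation.

```python
from typing import Any, Dict, Optional


def build_restricted_instance_group(
    name: str,
    instance_type: str,
    instance_count: int,
    execution_role_arn: str,
    fsx_size_gib: Optional[int] = None,
    fsx_throughput_per_tib: Optional[int] = None,
    threads_per_core: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble one RestrictedInstanceGroups entry (hypothetical helper)."""
    if threads_per_core is not None and threads_per_core not in (1, 2):
        raise ValueError("ThreadsPerCore must be 1 or 2")

    # FSxLustreConfig is optional, but both of its fields are marked
    # [REQUIRED], so reject a half-specified configuration early.
    if (fsx_size_gib is None) != (fsx_throughput_per_tib is None):
        raise ValueError(
            "SizeInGiB and PerUnitStorageThroughput must be given together"
        )

    environment_config: Dict[str, Any] = {}
    if fsx_size_gib is not None:
        environment_config["FSxLustreConfig"] = {
            "SizeInGiB": fsx_size_gib,
            "PerUnitStorageThroughput": fsx_throughput_per_tib,
        }

    group: Dict[str, Any] = {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "ExecutionRole": execution_role_arn,
        # EnvironmentConfig itself is [REQUIRED], even if it is empty.
        "EnvironmentConfig": environment_config,
    }
    if threads_per_core is not None:
        group["ThreadsPerCore"] = threads_per_core
    return group
```

The size and throughput values accepted by the service are not listed in this reference, so any numbers passed to this helper should be checked against the Amazon FSx for Lustre documentation.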

type NodeRecovery:

string

param NodeRecovery:

The node recovery mode to be applied to the SageMaker HyperPod cluster.

type InstanceGroupsToDelete:

list

param InstanceGroupsToDelete:

Specify the names of the instance groups to delete. Use a single comma (,) as the separator between multiple names.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'ClusterArn': 'string'
}

Response Structure

  • (dict) --

    • ClusterArn (string) --

      The Amazon Resource Name (ARN) of the updated SageMaker HyperPod cluster.
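
To show how the top-level update_cluster parameters combine, here is a minimal sketch of a wrapper that builds the keyword arguments and returns the ClusterArn from the response. The function name update_cluster_groups is hypothetical, and the client is passed in so that the code can be exercised without AWS credentials; in real use it would typically be a boto3 SageMaker client. Note that NodeRecovery takes the string 'None', not Python's None.

```python
from typing import Any, Dict, List, Optional

# Allowed NodeRecovery values, taken from the request syntax above.
_NODE_RECOVERY_MODES = ("Automatic", "None")


def update_cluster_groups(
    client: Any,
    cluster_name: str,
    instance_groups: Optional[List[Dict[str, Any]]] = None,
    restricted_instance_groups: Optional[List[Dict[str, Any]]] = None,
    node_recovery: Optional[str] = None,
    instance_groups_to_delete: Optional[List[str]] = None,
) -> str:
    """Call update_cluster and return the cluster ARN (hypothetical wrapper)."""
    kwargs: Dict[str, Any] = {"ClusterName": cluster_name}
    if instance_groups:
        kwargs["InstanceGroups"] = instance_groups
    if restricted_instance_groups:
        kwargs["RestrictedInstanceGroups"] = restricted_instance_groups
    if node_recovery is not None:
        # The disabled mode is the literal string 'None'.
        if node_recovery not in _NODE_RECOVERY_MODES:
            raise ValueError("NodeRecovery must be 'Automatic' or 'None'")
        kwargs["NodeRecovery"] = node_recovery
    if instance_groups_to_delete:
        kwargs["InstanceGroupsToDelete"] = list(instance_groups_to_delete)

    response = client.update_cluster(**kwargs)
    return response["ClusterArn"]
```

Parameters that are not supplied are left out of the call entirely, which keeps the request limited to the groups and settings the caller actually wants to change.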

UpdatePipeline (updated)
Changes (response)
{'PipelineVersionId': 'long'}

Updates a pipeline.

See also: AWS API Documentation

Request Syntax

client.update_pipeline(
    PipelineName='string',
    PipelineDisplayName='string',
    PipelineDefinition='string',
    PipelineDefinitionS3Location={
        'Bucket': 'string',
        'ObjectKey': 'string',
        'VersionId': 'string'
    },
    PipelineDescription='string',
    RoleArn='string',
    ParallelismConfiguration={
        'MaxParallelExecutionSteps': 123
    }
)
type PipelineName:

string

param PipelineName:

[REQUIRED]

The name of the pipeline to update.

type PipelineDisplayName:

string

param PipelineDisplayName:

The display name of the pipeline.

type PipelineDefinition:

string

param PipelineDefinition:

The JSON pipeline definition.

type PipelineDefinitionS3Location:

dict

param PipelineDefinitionS3Location:

The location of the pipeline definition stored in Amazon S3. If specified, SageMaker will retrieve the pipeline definition from this location.

  • Bucket (string) -- [REQUIRED]

    Name of the S3 bucket.

  • ObjectKey (string) -- [REQUIRED]

    The object key (or key name) uniquely identifies the object in an S3 bucket.

  • VersionId (string) --

    Version Id of the pipeline definition file. If not specified, Amazon SageMaker will retrieve the latest version.
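
The pipeline definition can be supplied inline through PipelineDefinition or by reference through PipelineDefinitionS3Location. The sketch below builds the corresponding keyword arguments. The helper name pipeline_definition_kwargs is hypothetical, and its insistence on exactly one source is an assumption made for this example, not a rule stated in this reference.

```python
from typing import Any, Dict, Optional


def pipeline_definition_kwargs(
    definition_json: Optional[str] = None,
    bucket: Optional[str] = None,
    object_key: Optional[str] = None,
    version_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the definition-related kwargs for update_pipeline (hypothetical)."""
    use_s3 = bucket is not None or object_key is not None
    # Assumption for this sketch: the caller picks exactly one source.
    if use_s3 == (definition_json is not None):
        raise ValueError("Provide either definition_json or an S3 location")

    if not use_s3:
        return {"PipelineDefinition": definition_json}

    # Bucket and ObjectKey are both marked [REQUIRED] in the S3 location.
    if bucket is None or object_key is None:
        raise ValueError("Both bucket and object_key are required")
    location: Dict[str, str] = {"Bucket": bucket, "ObjectKey": object_key}
    if version_id is not None:
        # Without VersionId, the latest version of the object is retrieved.
        location["VersionId"] = version_id
    return {"PipelineDefinitionS3Location": location}
```

The returned dict can be merged into the other arguments, for example `client.update_pipeline(PipelineName=name, **pipeline_definition_kwargs(...))`.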

type PipelineDescription:

string

param PipelineDescription:

The description of the pipeline.

type RoleArn:

string

param RoleArn:

The Amazon Resource Name (ARN) of the role that the pipeline uses to execute.

type ParallelismConfiguration:

dict

param ParallelismConfiguration:

If specified, it applies to all executions of this pipeline by default.

  • MaxParallelExecutionSteps (integer) -- [REQUIRED]

    The max number of steps that can be executed in parallel.

rtype:

dict

returns:

Response Syntax

{
    'PipelineArn': 'string',
    'PipelineVersionId': 123
}

Response Structure

  • (dict) --

    • PipelineArn (string) --

      The Amazon Resource Name (ARN) of the updated pipeline.

    • PipelineVersionId (integer) --

      The ID of the pipeline version.
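
Because this release adds PipelineVersionId to the response, a caller can record which version an update produced. The following is a minimal sketch under stated assumptions: the function name update_pipeline_definition is hypothetical, the client is injected so the example runs without AWS access, and the definition string is treated as an opaque JSON document.

```python
from typing import Any, Dict, Optional, Tuple


def update_pipeline_definition(
    client: Any,
    pipeline_name: str,
    definition_json: str,
    max_parallel_steps: Optional[int] = None,
) -> Tuple[str, Optional[int]]:
    """Update a pipeline and return (PipelineArn, PipelineVersionId)."""
    kwargs: Dict[str, Any] = {
        "PipelineName": pipeline_name,
        "PipelineDefinition": definition_json,
    }
    if max_parallel_steps is not None:
        kwargs["ParallelismConfiguration"] = {
            "MaxParallelExecutionSteps": max_parallel_steps
        }

    response = client.update_pipeline(**kwargs)
    # .get() tolerates responses that lack the newly added field,
    # for example from an SDK build that predates this release.
    return response["PipelineArn"], response.get("PipelineVersionId")
```

With a real client, the call might look like `update_pipeline_definition(boto3.client("sagemaker"), "my-pipeline", definition)`, where the pipeline name and definition are placeholders. The returned version ID can then be matched against the entries returned by ListPipelineVersions.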