2025/07/15 - Amazon SageMaker Service - 2 new, 9 updated API methods
Changes: This release adds support for a new Restricted instance group type to enable a specialized environment for running Nova customization jobs on SageMaker HyperPod clusters. This release also adds support for SageMaker pipeline versioning.
Gets a list of all versions of the pipeline.
See also: AWS API Documentation
Request Syntax
client.list_pipeline_versions( PipelineName='string', CreatedAfter=datetime(2015, 1, 1), CreatedBefore=datetime(2015, 1, 1), SortOrder='Ascending'|'Descending', NextToken='string', MaxResults=123 )
PipelineName (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the pipeline.
CreatedAfter (datetime) --
A filter that returns the pipeline versions that were created after a specified time.
CreatedBefore (datetime) --
A filter that returns the pipeline versions that were created before a specified time.
SortOrder (string) --
The sort order for the results.
NextToken (string) --
If the result of the previous ListPipelineVersions request was truncated, the response includes a NextToken. To retrieve the next set of pipeline versions, use this token in your next request.
MaxResults (integer) --
The maximum number of pipeline versions to return in the response.
Return type: dict
Response Syntax
{ 'PipelineVersionSummaries': [ { 'PipelineArn': 'string', 'PipelineVersionId': 123, 'CreationTime': datetime(2015, 1, 1), 'PipelineVersionDescription': 'string', 'PipelineVersionDisplayName': 'string', 'LastExecutionPipelineExecutionArn': 'string' }, ], 'NextToken': 'string' }
Response Structure
(dict) --
PipelineVersionSummaries (list) --
Contains a sorted list of pipeline version summary objects matching the specified filters. Each version summary includes the pipeline version ID, the creation date, and the last pipeline execution created from that version. This list can be empty.
(dict) --
The summary of the pipeline version.
PipelineArn (string) --
The Amazon Resource Name (ARN) of the pipeline.
PipelineVersionId (integer) --
The ID of the pipeline version.
CreationTime (datetime) --
The creation time of the pipeline version.
PipelineVersionDescription (string) --
The description of the pipeline version.
PipelineVersionDisplayName (string) --
The display name of the pipeline version.
LastExecutionPipelineExecutionArn (string) --
The Amazon Resource Name (ARN) of the most recent pipeline execution created from this pipeline version.
NextToken (string) --
If the result of the previous ListPipelineVersions request was truncated, the response includes a NextToken. To retrieve the next set of pipeline versions, use this token in your next request.
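The NextToken handshake described above can be wrapped in a small generator. This is a minimal sketch rather than an official helper: `iter_pipeline_versions` is a hypothetical name, and `client` is assumed to be a boto3 SageMaker client from an SDK release that includes this operation.

```python
def iter_pipeline_versions(client, pipeline_name, max_results=None, **filters):
    """Yield every pipeline version summary, following NextToken until exhausted.

    `filters` may carry the optional CreatedAfter, CreatedBefore and SortOrder
    request parameters; they are passed through unchanged.
    """
    kwargs = {"PipelineName": pipeline_name, **filters}
    if max_results is not None:
        kwargs["MaxResults"] = max_results
    while True:
        page = client.list_pipeline_versions(**kwargs)
        # The summaries list can be empty, so fall back to an empty list.
        yield from page.get("PipelineVersionSummaries", [])
        token = page.get("NextToken")
        if not token:
            return  # no NextToken means the result was not truncated
        kwargs["NextToken"] = token
```

Because the helper only relies on the request and response keys shown above, it can be exercised against a stub object before pointing it at a real client.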
Updates a pipeline version.
See also: AWS API Documentation
Request Syntax
client.update_pipeline_version( PipelineArn='string', PipelineVersionId=123, PipelineVersionDisplayName='string', PipelineVersionDescription='string' )
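PipelineArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the pipeline.
PipelineVersionId (integer) -- [REQUIRED]
The pipeline version ID to update.
PipelineVersionDisplayName (string) --
The display name of the pipeline version.
PipelineVersionDescription (string) --
The description of the pipeline version.
Return type: dict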
Response Syntax
{ 'PipelineArn': 'string', 'PipelineVersionId': 123 }
Response Structure
(dict) --
PipelineArn (string) --
The Amazon Resource Name (ARN) of the pipeline.
PipelineVersionId (integer) --
The ID of the pipeline version.
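Since only PipelineArn and PipelineVersionId are required, a caller may want to send just the optional fields it actually intends to change. The following is a sketch under that assumption; `update_pipeline_version_metadata` is a hypothetical wrapper name and `client` is assumed to be a boto3 SageMaker client.

```python
def update_pipeline_version_metadata(client, pipeline_arn, version_id,
                                     display_name=None, description=None):
    """Call UpdatePipelineVersion, sending only the optional fields that were given."""
    kwargs = {"PipelineArn": pipeline_arn, "PipelineVersionId": version_id}
    if display_name is not None:
        kwargs["PipelineVersionDisplayName"] = display_name
    if description is not None:
        kwargs["PipelineVersionDescription"] = description
    response = client.update_pipeline_version(**kwargs)
    # The response carries the pipeline ARN and the version ID.
    return response["PipelineArn"], response["PipelineVersionId"]
```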
{'RestrictedInstanceGroups': [{'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer', 'SizeInGiB': 'integer'}}, 'ExecutionRole': 'string', 'InstanceCount': 'integer', 'InstanceGroupName': 'string', 'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}], 'InstanceType': 'ml.p4d.24xlarge | ' 'ml.p4de.24xlarge | ' 'ml.p5.48xlarge | ' 'ml.trn1.32xlarge | ' 'ml.trn1n.32xlarge | ' 'ml.g5.xlarge | ml.g5.2xlarge | ' 'ml.g5.4xlarge | ml.g5.8xlarge ' '| ml.g5.12xlarge | ' 'ml.g5.16xlarge | ' 'ml.g5.24xlarge | ' 'ml.g5.48xlarge | ml.c5.large | ' 'ml.c5.xlarge | ml.c5.2xlarge | ' 'ml.c5.4xlarge | ml.c5.9xlarge ' '| ml.c5.12xlarge | ' 'ml.c5.18xlarge | ' 'ml.c5.24xlarge | ml.c5n.large ' '| ml.c5n.2xlarge | ' 'ml.c5n.4xlarge | ' 'ml.c5n.9xlarge | ' 'ml.c5n.18xlarge | ml.m5.large ' '| ml.m5.xlarge | ml.m5.2xlarge ' '| ml.m5.4xlarge | ' 'ml.m5.8xlarge | ml.m5.12xlarge ' '| ml.m5.16xlarge | ' 'ml.m5.24xlarge | ml.t3.medium ' '| ml.t3.large | ml.t3.xlarge | ' 'ml.t3.2xlarge | ml.g6.xlarge | ' 'ml.g6.2xlarge | ml.g6.4xlarge ' '| ml.g6.8xlarge | ' 'ml.g6.16xlarge | ' 'ml.g6.12xlarge | ' 'ml.g6.24xlarge | ' 'ml.g6.48xlarge | ' 'ml.gr6.4xlarge | ' 'ml.gr6.8xlarge | ml.g6e.xlarge ' '| ml.g6e.2xlarge | ' 'ml.g6e.4xlarge | ' 'ml.g6e.8xlarge | ' 'ml.g6e.16xlarge | ' 'ml.g6e.12xlarge | ' 'ml.g6e.24xlarge | ' 'ml.g6e.48xlarge | ' 'ml.p5e.48xlarge | ' 'ml.p5en.48xlarge | ' 'ml.p6-b200.48xlarge | ' 'ml.trn2.48xlarge | ' 'ml.c6i.large | ml.c6i.xlarge | ' 'ml.c6i.2xlarge | ' 'ml.c6i.4xlarge | ' 'ml.c6i.8xlarge | ' 'ml.c6i.12xlarge | ' 'ml.c6i.16xlarge | ' 'ml.c6i.24xlarge | ' 'ml.c6i.32xlarge | ml.m6i.large ' '| ml.m6i.xlarge | ' 'ml.m6i.2xlarge | ' 'ml.m6i.4xlarge | ' 'ml.m6i.8xlarge | ' 'ml.m6i.12xlarge | ' 'ml.m6i.16xlarge | ' 'ml.m6i.24xlarge | ' 'ml.m6i.32xlarge | ml.r6i.large ' '| ml.r6i.xlarge | ' 'ml.r6i.2xlarge | ' 'ml.r6i.4xlarge | ' 'ml.r6i.8xlarge | ' 'ml.r6i.12xlarge | ' 'ml.r6i.16xlarge | ' 'ml.r6i.24xlarge | ' 'ml.r6i.32xlarge | ' 
'ml.i3en.large | ml.i3en.xlarge ' '| ml.i3en.2xlarge | ' 'ml.i3en.3xlarge | ' 'ml.i3en.6xlarge | ' 'ml.i3en.12xlarge | ' 'ml.i3en.24xlarge | ' 'ml.m7i.large | ml.m7i.xlarge | ' 'ml.m7i.2xlarge | ' 'ml.m7i.4xlarge | ' 'ml.m7i.8xlarge | ' 'ml.m7i.12xlarge | ' 'ml.m7i.16xlarge | ' 'ml.m7i.24xlarge | ' 'ml.m7i.48xlarge | ml.r7i.large ' '| ml.r7i.xlarge | ' 'ml.r7i.2xlarge | ' 'ml.r7i.4xlarge | ' 'ml.r7i.8xlarge | ' 'ml.r7i.12xlarge | ' 'ml.r7i.16xlarge | ' 'ml.r7i.24xlarge | ' 'ml.r7i.48xlarge', 'OnStartDeepHealthChecks': ['InstanceStress | ' 'InstanceConnectivity'], 'OverrideVpcConfig': {'SecurityGroupIds': ['string'], 'Subnets': ['string']}, 'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}], 'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}, 'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}}, 'WaitIntervalInSeconds': 'integer'}, 'ScheduleExpression': 'string'}, 'ThreadsPerCore': 'integer', 'TrainingPlanArn': 'string'}]}
Creates a SageMaker HyperPod cluster. SageMaker HyperPod is a capability of SageMaker for creating and managing persistent clusters for developing large machine learning models, such as large language models (LLMs) and diffusion models. To learn more, see Amazon SageMaker HyperPod in the Amazon SageMaker Developer Guide.
See also: AWS API Documentation
Request Syntax
client.create_cluster( ClusterName='string', InstanceGroups=[ { 'InstanceCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'m
l.r7i.24xlarge'|'ml.r7i.48xlarge', 'LifeCycleConfig': { 'SourceS3Uri': 'string', 'OnCreate': 'string' }, 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'TrainingPlanArn': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } } }, ], RestrictedInstanceGroups=[ { 'InstanceCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'m
l.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge', 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'TrainingPlanArn': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } }, 'EnvironmentConfig': { 'FSxLustreConfig': { 'SizeInGiB': 123, 'PerUnitStorageThroughput': 123 } } }, ], VpcConfig={ 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, Tags=[ { 'Key': 'string', 'Value': 'string' }, ], Orchestrator={ 'Eks': { 'ClusterArn': 'string' } }, NodeRecovery='Automatic'|'None' )
ClusterName (string) -- [REQUIRED]
The name for the new SageMaker HyperPod cluster.
InstanceGroups (list) --
The instance groups to be created in the SageMaker HyperPod cluster.
(dict) --
The specifications of an instance group that you need to define.
InstanceCount (integer) -- [REQUIRED]
Specifies the number of instances to add to the instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) -- [REQUIRED]
Specifies the name of the instance group.
InstanceType (string) -- [REQUIRED]
Specifies the instance type of the instance group.
LifeCycleConfig (dict) -- [REQUIRED]
Specifies the LifeCycle configuration for the instance group.
SourceS3Uri (string) -- [REQUIRED]
An Amazon S3 bucket path where your lifecycle scripts are stored.
OnCreate (string) -- [REQUIRED]
The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.
ExecutionRole (string) -- [REQUIRED]
Specifies an IAM execution role to be assumed by the instance group.
ThreadsPerCore (integer) --
Specifies the value for Threads per core. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For instance types that don't support multithreading, specify 1. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) -- [REQUIRED]
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster instance group is created or updated.
(string) --
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group.
For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
OverrideVpcConfig (dict) --
To configure multi-AZ deployments, customize the Amazon VPC configuration at the instance group level. You can specify different subnets and security groups across different AZs in the instance group specification to override a SageMaker HyperPod cluster's default Amazon VPC configuration. For more information about deploying a cluster in multiple AZs, see Setting up SageMaker HyperPod clusters across multiple AZs.
SecurityGroupIds (list) -- [REQUIRED]
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) -- [REQUIRED]
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker uses to update the AMI.
ScheduleExpression (string) -- [REQUIRED]
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) -- [REQUIRED]
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) -- [REQUIRED]
The name of the alarm.
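The nesting of ScheduledUpdateConfig is easy to get wrong by hand, so it can help to assemble it from small pieces. The helpers below are a sketch with hypothetical names (`batch_size`, `scheduled_update_config`); they only build the dictionary shape documented above and do not validate the cron expression.

```python
VALID_BATCH_TYPES = {"INSTANCE_COUNT", "CAPACITY_PERCENTAGE"}


def batch_size(size_type, value):
    """Build a MaximumBatchSize / RollbackMaximumBatchSize entry."""
    if size_type not in VALID_BATCH_TYPES:
        raise ValueError(f"Type must be one of {sorted(VALID_BATCH_TYPES)}")
    return {"Type": size_type, "Value": value}


def scheduled_update_config(schedule_expression, max_batch, rollback_batch=None,
                            wait_seconds=None, alarm_names=()):
    """Assemble a ScheduledUpdateConfig dict, omitting optional keys that are unset."""
    rolling = {"MaximumBatchSize": max_batch}
    if rollback_batch is not None:
        rolling["RollbackMaximumBatchSize"] = rollback_batch
    deployment = {"RollingUpdatePolicy": rolling}
    if wait_seconds is not None:
        deployment["WaitIntervalInSeconds"] = wait_seconds
    if alarm_names:
        deployment["AutoRollbackConfiguration"] = [
            {"AlarmName": name} for name in alarm_names
        ]
    return {"ScheduleExpression": schedule_expression, "DeploymentConfig": deployment}
```

The result can be placed under the ScheduledUpdateConfig key of an instance group entry.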
RestrictedInstanceGroups (list) --
The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.
(dict) --
The specifications of a restricted instance group that you need to define.
InstanceCount (integer) -- [REQUIRED]
Specifies the number of instances to add to the restricted instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) -- [REQUIRED]
Specifies the name of the restricted instance group.
InstanceType (string) -- [REQUIRED]
Specifies the instance type of the restricted instance group.
ExecutionRole (string) -- [REQUIRED]
Specifies an IAM execution role to be assumed by the restricted instance group.
ThreadsPerCore (integer) --
The number you specified to ThreadsPerCore in CreateCluster for enabling or disabling multithreading. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) -- [REQUIRED]
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster restricted instance group is created or updated.
(string) --
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan to filter clusters by. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
OverrideVpcConfig (dict) --
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
SecurityGroupIds (list) -- [REQUIRED]
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) -- [REQUIRED]
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker follows when updating the AMI.
ScheduleExpression (string) -- [REQUIRED]
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) -- [REQUIRED]
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) -- [REQUIRED]
The name of the alarm.
EnvironmentConfig (dict) -- [REQUIRED]
The configuration for the restricted instance groups (RIG) environment.
FSxLustreConfig (dict) --
Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
SizeInGiB (integer) -- [REQUIRED]
The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).
PerUnitStorageThroughput (integer) -- [REQUIRED]
The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
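A restricted instance group entry differs from a regular one in that the request syntax above has no LifeCycleConfig key and requires EnvironmentConfig. The sketch below builds one such entry; `restricted_instance_group` is a hypothetical helper, and always including FSxLustreConfig is a choice made here for illustration, not a documented requirement.

```python
def restricted_instance_group(name, instance_type, instance_count, execution_role,
                              fsx_size_gib, fsx_throughput_per_tib, **optional):
    """Build one RestrictedInstanceGroups entry for create_cluster.

    `optional` may carry keys such as ThreadsPerCore, TrainingPlanArn,
    OverrideVpcConfig or ScheduledUpdateConfig; they are merged in unchanged.
    """
    group = {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "ExecutionRole": execution_role,
        "EnvironmentConfig": {
            "FSxLustreConfig": {
                "SizeInGiB": fsx_size_gib,  # storage capacity in GiB
                "PerUnitStorageThroughput": fsx_throughput_per_tib,  # MB/s per TiB
            }
        },
    }
    group.update(optional)
    return group
```

The returned dictionary would be passed as an element of the RestrictedInstanceGroups list; which FSx sizes and throughput values the service accepts is not stated here and should be checked against the service documentation.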
VpcConfig (dict) --
Specifies the Amazon Virtual Private Cloud (VPC) that is associated with the Amazon SageMaker HyperPod cluster. You can control access to and from your resources by configuring your VPC. For more information, see Give SageMaker access to resources in your Amazon VPC.
SecurityGroupIds (list) -- [REQUIRED]
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) -- [REQUIRED]
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
Tags (list) --
Custom tags for managing the SageMaker HyperPod cluster as an Amazon Web Services resource. You can add tags to your cluster in the same way you add them in other Amazon Web Services services that support tagging. To learn more about tagging Amazon Web Services resources in general, see the Tagging Amazon Web Services Resources User Guide.
(dict) --
A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources.
You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags.
For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources. For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy.
Key (string) -- [REQUIRED]
The tag key. Tag keys must be unique per resource.
Value (string) -- [REQUIRED]
The tag value.
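Because tag keys must be unique per resource, a plain mapping is a convenient source for the Tags list. A minimal sketch, with `to_tag_list` as a hypothetical helper name:

```python
def to_tag_list(tags):
    """Convert a {key: value} mapping into the list-of-dicts shape used by Tags."""
    # A dict cannot hold duplicate keys, which matches the uniqueness rule for tag keys.
    return [{"Key": str(key), "Value": str(value)} for key, value in tags.items()]
```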
Orchestrator (dict) --
The type of orchestrator to use for the SageMaker HyperPod cluster. Currently, the only supported value is "eks", which is to use an Amazon Elastic Kubernetes Service (EKS) cluster as the orchestrator.
Eks (dict) -- [REQUIRED]
The Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.
ClusterArn (string) -- [REQUIRED]
The Amazon Resource Name (ARN) of the Amazon EKS cluster associated with the SageMaker HyperPod cluster.
NodeRecovery (string) --
The node recovery mode for the SageMaker HyperPod cluster. When set to Automatic, SageMaker HyperPod will automatically reboot or replace faulty nodes when issues are detected. When set to None, cluster administrators will need to manually manage any faulty cluster instances.
Return type: dict
Response Syntax
{ 'ClusterArn': 'string' }
Response Structure
(dict) --
ClusterArn (string) --
The Amazon Resource Name (ARN) of the cluster.
{'RestrictedInstanceGroups': [{'CurrentCount': 'integer', 'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer', 'SizeInGiB': 'integer'}, 'S3OutputPath': 'string'}, 'ExecutionRole': 'string', 'InstanceGroupName': 'string', 'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}], 'InstanceType': 'ml.p4d.24xlarge | ' 'ml.p4de.24xlarge | ' 'ml.p5.48xlarge | ' 'ml.trn1.32xlarge | ' 'ml.trn1n.32xlarge | ' 'ml.g5.xlarge | ml.g5.2xlarge | ' 'ml.g5.4xlarge | ml.g5.8xlarge ' '| ml.g5.12xlarge | ' 'ml.g5.16xlarge | ' 'ml.g5.24xlarge | ' 'ml.g5.48xlarge | ml.c5.large | ' 'ml.c5.xlarge | ml.c5.2xlarge | ' 'ml.c5.4xlarge | ml.c5.9xlarge ' '| ml.c5.12xlarge | ' 'ml.c5.18xlarge | ' 'ml.c5.24xlarge | ml.c5n.large ' '| ml.c5n.2xlarge | ' 'ml.c5n.4xlarge | ' 'ml.c5n.9xlarge | ' 'ml.c5n.18xlarge | ml.m5.large ' '| ml.m5.xlarge | ml.m5.2xlarge ' '| ml.m5.4xlarge | ' 'ml.m5.8xlarge | ml.m5.12xlarge ' '| ml.m5.16xlarge | ' 'ml.m5.24xlarge | ml.t3.medium ' '| ml.t3.large | ml.t3.xlarge | ' 'ml.t3.2xlarge | ml.g6.xlarge | ' 'ml.g6.2xlarge | ml.g6.4xlarge ' '| ml.g6.8xlarge | ' 'ml.g6.16xlarge | ' 'ml.g6.12xlarge | ' 'ml.g6.24xlarge | ' 'ml.g6.48xlarge | ' 'ml.gr6.4xlarge | ' 'ml.gr6.8xlarge | ml.g6e.xlarge ' '| ml.g6e.2xlarge | ' 'ml.g6e.4xlarge | ' 'ml.g6e.8xlarge | ' 'ml.g6e.16xlarge | ' 'ml.g6e.12xlarge | ' 'ml.g6e.24xlarge | ' 'ml.g6e.48xlarge | ' 'ml.p5e.48xlarge | ' 'ml.p5en.48xlarge | ' 'ml.p6-b200.48xlarge | ' 'ml.trn2.48xlarge | ' 'ml.c6i.large | ml.c6i.xlarge | ' 'ml.c6i.2xlarge | ' 'ml.c6i.4xlarge | ' 'ml.c6i.8xlarge | ' 'ml.c6i.12xlarge | ' 'ml.c6i.16xlarge | ' 'ml.c6i.24xlarge | ' 'ml.c6i.32xlarge | ml.m6i.large ' '| ml.m6i.xlarge | ' 'ml.m6i.2xlarge | ' 'ml.m6i.4xlarge | ' 'ml.m6i.8xlarge | ' 'ml.m6i.12xlarge | ' 'ml.m6i.16xlarge | ' 'ml.m6i.24xlarge | ' 'ml.m6i.32xlarge | ml.r6i.large ' '| ml.r6i.xlarge | ' 'ml.r6i.2xlarge | ' 'ml.r6i.4xlarge | ' 'ml.r6i.8xlarge | ' 'ml.r6i.12xlarge | ' 'ml.r6i.16xlarge | ' 'ml.r6i.24xlarge | 
' 'ml.r6i.32xlarge | ' 'ml.i3en.large | ml.i3en.xlarge ' '| ml.i3en.2xlarge | ' 'ml.i3en.3xlarge | ' 'ml.i3en.6xlarge | ' 'ml.i3en.12xlarge | ' 'ml.i3en.24xlarge | ' 'ml.m7i.large | ml.m7i.xlarge | ' 'ml.m7i.2xlarge | ' 'ml.m7i.4xlarge | ' 'ml.m7i.8xlarge | ' 'ml.m7i.12xlarge | ' 'ml.m7i.16xlarge | ' 'ml.m7i.24xlarge | ' 'ml.m7i.48xlarge | ml.r7i.large ' '| ml.r7i.xlarge | ' 'ml.r7i.2xlarge | ' 'ml.r7i.4xlarge | ' 'ml.r7i.8xlarge | ' 'ml.r7i.12xlarge | ' 'ml.r7i.16xlarge | ' 'ml.r7i.24xlarge | ' 'ml.r7i.48xlarge', 'OnStartDeepHealthChecks': ['InstanceStress | ' 'InstanceConnectivity'], 'OverrideVpcConfig': {'SecurityGroupIds': ['string'], 'Subnets': ['string']}, 'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}], 'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}, 'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}}, 'WaitIntervalInSeconds': 'integer'}, 'ScheduleExpression': 'string'}, 'Status': 'InService | Creating | Updating | ' 'Failed | Degraded | SystemUpdating | ' 'Deleting', 'TargetCount': 'integer', 'ThreadsPerCore': 'integer', 'TrainingPlanArn': 'string', 'TrainingPlanStatus': 'string'}]}
Retrieves information about a SageMaker HyperPod cluster.
See also: AWS API Documentation
Request Syntax
client.describe_cluster( ClusterName='string' )
ClusterName (string) -- [REQUIRED]
The string name or the Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.
Return type: dict
Response Syntax
{ 'ClusterArn': 'string', 'ClusterName': 'string', 'ClusterStatus': 'Creating'|'Deleting'|'Failed'|'InService'|'RollingBack'|'SystemUpdating'|'Updating', 'CreationTime': datetime(2015, 1, 1), 'FailureMessage': 'string', 'InstanceGroups': [ { 'CurrentCount': 123, 'TargetCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xla
rge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge', 'LifeCycleConfig': { 'SourceS3Uri': 'string', 'OnCreate': 'string' }, 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'Status': 'InService'|'Creating'|'Updating'|'Failed'|'Degraded'|'SystemUpdating'|'Deleting', 'TrainingPlanArn': 'string', 'TrainingPlanStatus': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } } }, ], 'RestrictedInstanceGroups': [ { 'CurrentCount': 123, 'TargetCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 
'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge', 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 
'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'Status': 'InService'|'Creating'|'Updating'|'Failed'|'Degraded'|'SystemUpdating'|'Deleting', 'TrainingPlanArn': 'string', 'TrainingPlanStatus': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } }, 'EnvironmentConfig': { 'FSxLustreConfig': { 'SizeInGiB': 123, 'PerUnitStorageThroughput': 123 }, 'S3OutputPath': 'string' } }, ], 'VpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'Orchestrator': { 'Eks': { 'ClusterArn': 'string' } }, 'NodeRecovery': 'Automatic'|'None' }
Response Structure
(dict) --
ClusterArn (string) --
The Amazon Resource Name (ARN) of the SageMaker HyperPod cluster.
ClusterName (string) --
The name of the SageMaker HyperPod cluster.
ClusterStatus (string) --
The status of the SageMaker HyperPod cluster.
CreationTime (datetime) --
The time when the SageMaker HyperPod cluster was created.
FailureMessage (string) --
The failure message of the SageMaker HyperPod cluster.
InstanceGroups (list) --
The instance groups of the SageMaker HyperPod cluster.
(dict) --
Details of an instance group in a SageMaker HyperPod cluster.
CurrentCount (integer) --
The number of instances that are currently in the instance group of a SageMaker HyperPod cluster.
TargetCount (integer) --
The number of instances you specified to add to the instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) --
The name of the instance group of a SageMaker HyperPod cluster.
InstanceType (string) --
The instance type of the instance group of a SageMaker HyperPod cluster.
LifeCycleConfig (dict) --
Details of LifeCycle configuration for the instance group.
SourceS3Uri (string) --
An Amazon S3 bucket path where your lifecycle scripts are stored.
OnCreate (string) --
The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.
ExecutionRole (string) --
The execution role for the instance group to assume.
ThreadsPerCore (integer) --
The number you specified for ThreadsPerCore in CreateCluster for enabling or disabling multithreading. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
The additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) --
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster instance group is created or updated.
(string) --
Status (string) --
The current status of the cluster instance group.
InService: The instance group is active and healthy.
Creating: The instance group is being provisioned.
Updating: The instance group is being updated.
Failed: The instance group has failed to provision or is no longer healthy.
Degraded: The instance group is degraded, meaning that some instances have failed to provision or are no longer healthy.
Deleting: The instance group is being deleted.
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan associated with this cluster instance group.
For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
TrainingPlanStatus (string) --
The current status of the training plan associated with this cluster instance group.
OverrideVpcConfig (dict) --
The customized Amazon VPC configuration at the instance group level that overrides the default Amazon VPC configuration of the SageMaker HyperPod cluster.
SecurityGroupIds (list) --
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) --
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker follows when updating the AMI.
ScheduleExpression (string) --
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) --
Specifies whether SageMaker should process the update by number or percentage of instances.
Value (integer) --
Specifies the number or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) --
Specifies whether SageMaker should process the update by number or percentage of instances.
Value (integer) --
Specifies the number or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) --
The name of the alarm.
RestrictedInstanceGroups (list) --
The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.
(dict) --
The instance group details of the restricted instance group (RIG).
CurrentCount (integer) --
The number of instances that are currently in the restricted instance group of a SageMaker HyperPod cluster.
TargetCount (integer) --
The number of instances you specified to add to the restricted instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) --
The name of the restricted instance group of a SageMaker HyperPod cluster.
InstanceType (string) --
The instance type of the restricted instance group of a SageMaker HyperPod cluster.
ExecutionRole (string) --
The execution role for the restricted instance group to assume.
ThreadsPerCore (integer) --
The number you specified for ThreadsPerCore in CreateCluster for enabling or disabling multithreading. For instance types that support multithreading, you can specify 1 for disabling multithreading and 2 for enabling multithreading. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
The additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) --
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster's restricted instance group is created or updated.
(string) --
Status (string) --
The current status of the cluster's restricted instance group.
InService: The restricted instance group is active and healthy.
Creating: The restricted instance group is being provisioned.
Updating: The restricted instance group is being updated.
Failed: The restricted instance group has failed to provision or is no longer healthy.
Degraded: The restricted instance group is degraded, meaning that some instances have failed to provision or are no longer healthy.
Deleting: The restricted instance group is being deleted.
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan associated with this cluster restricted instance group. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
TrainingPlanStatus (string) --
The current status of the training plan associated with this cluster restricted instance group.
OverrideVpcConfig (dict) --
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
SecurityGroupIds (list) --
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) --
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker follows when updating the AMI.
ScheduleExpression (string) --
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) --
Specifies whether SageMaker should process the update by number or percentage of instances.
Value (integer) --
Specifies the number or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) --
Specifies whether SageMaker should process the update by number or percentage of instances.
Value (integer) --
Specifies the number or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) --
The name of the alarm.
EnvironmentConfig (dict) --
The configuration for the restricted instance groups (RIG) environment.
FSxLustreConfig (dict) --
Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
SizeInGiB (integer) --
The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).
PerUnitStorageThroughput (integer) --
The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
S3OutputPath (string) --
The Amazon S3 path where output data from the restricted instance group (RIG) environment will be stored.
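Because SizeInGiB is given in GiB while PerUnitStorageThroughput is given in MB/s per TiB, a unit conversion is needed to estimate the total throughput of the file system. The sketch below assumes that aggregate throughput is simply storage in TiB multiplied by the per-unit value; the helper name `aggregate_throughput_mbps` and the sample numbers are hypothetical and are not taken from this API.

```python
def aggregate_throughput_mbps(fsx_config):
    """Estimate total throughput (MB/s) from an FSxLustreConfig-shaped dict.

    Hypothetical helper: assumes total = storage in TiB * MB/s per TiB.
    """
    size_tib = fsx_config["SizeInGiB"] / 1024  # 1 TiB = 1024 GiB
    return size_tib * fsx_config["PerUnitStorageThroughput"]


# Illustrative values only, shaped like EnvironmentConfig.FSxLustreConfig.
sample_fsx_config = {"SizeInGiB": 12288, "PerUnitStorageThroughput": 250}

total_mbps = aggregate_throughput_mbps(sample_fsx_config)
print(total_mbps)  # 3000.0
```

With these sample values, 12288 GiB is 12 TiB, so the estimate is 12 x 250 = 3000 MB/s.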
VpcConfig (dict) --
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
SecurityGroupIds (list) --
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) --
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
Orchestrator (dict) --
The type of orchestrator used for the SageMaker HyperPod cluster.
Eks (dict) --
The Amazon EKS cluster used as the orchestrator for the SageMaker HyperPod cluster.
ClusterArn (string) --
The Amazon Resource Name (ARN) of the Amazon EKS cluster associated with the SageMaker HyperPod cluster.
NodeRecovery (string) --
The node recovery mode configured for the SageMaker HyperPod cluster.
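As a rough illustration of how the response structure above might be consumed, the sketch below walks the InstanceGroups and RestrictedInstanceGroups lists of a response-shaped dict and reports how many instances each group is still missing relative to TargetCount. The helper name `summarize_groups` and the sample data are hypothetical; only the field names come from the response syntax.

```python
def summarize_groups(response):
    """Return one summary row per instance group in a cluster description dict.

    `summarize_groups` is a hypothetical helper, not part of boto3.
    """
    rows = []
    # Treat either list as possibly absent and default to an empty list.
    for key, restricted in (("InstanceGroups", False), ("RestrictedInstanceGroups", True)):
        for group in response.get(key, []):
            current = group.get("CurrentCount", 0)
            target = group.get("TargetCount", 0)
            rows.append({
                "name": group.get("InstanceGroupName"),
                "restricted": restricted,
                "status": group.get("Status"),
                "missing": max(target - current, 0),
            })
    return rows


# Hand-written sample shaped like the response syntax (not real output).
sample_response = {
    "ClusterName": "demo-cluster",
    "InstanceGroups": [
        {"InstanceGroupName": "workers", "CurrentCount": 4,
         "TargetCount": 4, "Status": "InService"},
    ],
    "RestrictedInstanceGroups": [
        {"InstanceGroupName": "nova-rig", "CurrentCount": 1,
         "TargetCount": 2, "Status": "Degraded"},
    ],
}

group_rows = summarize_groups(sample_response)
for row in group_rows:
    print(row)
```

Assuming the structure above is the DescribeCluster response, the same helper could be applied to the dict returned by a real client call.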
{'PipelineVersionId': 'long'}
Response
{'PipelineVersionDescription': 'string', 'PipelineVersionDisplayName': 'string'}
Describes the details of a pipeline.
See also: AWS API Documentation
Request Syntax
client.describe_pipeline( PipelineName='string', PipelineVersionId=123 )
string
[REQUIRED]
The name or Amazon Resource Name (ARN) of the pipeline to describe.
integer
The ID of the pipeline version to describe.
dict
Response Syntax
{ 'PipelineArn': 'string', 'PipelineName': 'string', 'PipelineDisplayName': 'string', 'PipelineDefinition': 'string', 'PipelineDescription': 'string', 'RoleArn': 'string', 'PipelineStatus': 'Active'|'Deleting', 'CreationTime': datetime(2015, 1, 1), 'LastModifiedTime': datetime(2015, 1, 1), 'LastRunTime': datetime(2015, 1, 1), 'CreatedBy': { 'UserProfileArn': 'string', 'UserProfileName': 'string', 'DomainId': 'string', 'IamIdentity': { 'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string' } }, 'LastModifiedBy': { 'UserProfileArn': 'string', 'UserProfileName': 'string', 'DomainId': 'string', 'IamIdentity': { 'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string' } }, 'ParallelismConfiguration': { 'MaxParallelExecutionSteps': 123 }, 'PipelineVersionDisplayName': 'string', 'PipelineVersionDescription': 'string' }
Response Structure
(dict) --
PipelineArn (string) --
The Amazon Resource Name (ARN) of the pipeline.
PipelineName (string) --
The name of the pipeline.
PipelineDisplayName (string) --
The display name of the pipeline.
PipelineDefinition (string) --
The JSON pipeline definition.
PipelineDescription (string) --
The description of the pipeline.
RoleArn (string) --
The Amazon Resource Name (ARN) of the IAM role that the pipeline uses to execute.
PipelineStatus (string) --
The status of the pipeline.
CreationTime (datetime) --
The time when the pipeline was created.
LastModifiedTime (datetime) --
The time when the pipeline was last modified.
LastRunTime (datetime) --
The time when the pipeline was last run.
CreatedBy (dict) --
Information about the user who created or modified a SageMaker resource.
UserProfileArn (string) --
The Amazon Resource Name (ARN) of the user's profile.
UserProfileName (string) --
The name of the user's profile.
DomainId (string) --
The domain associated with the user.
IamIdentity (dict) --
The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.
Arn (string) --
The Amazon Resource Name (ARN) of the IAM identity.
PrincipalId (string) --
The ID of the principal that assumes the IAM identity.
SourceIdentity (string) --
The person or application which assumes the IAM identity.
LastModifiedBy (dict) --
Information about the user who created or modified a SageMaker resource.
UserProfileArn (string) --
The Amazon Resource Name (ARN) of the user's profile.
UserProfileName (string) --
The name of the user's profile.
DomainId (string) --
The domain associated with the user.
IamIdentity (dict) --
The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.
Arn (string) --
The Amazon Resource Name (ARN) of the IAM identity.
PrincipalId (string) --
The ID of the principal that assumes the IAM identity.
SourceIdentity (string) --
The person or application which assumes the IAM identity.
ParallelismConfiguration (dict) --
Lists the parallelism configuration applied to the pipeline.
MaxParallelExecutionSteps (integer) --
The max number of steps that can be executed in parallel.
PipelineVersionDisplayName (string) --
The display name of the pipeline version.
PipelineVersionDescription (string) --
The description of the pipeline version.
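Since PipelineVersionId is an optional request parameter, one way to call describe_pipeline is to build the keyword arguments conditionally. This is a minimal sketch; `describe_pipeline_kwargs` and the pipeline name "my-pipeline" are hypothetical, and the behavior when the version ID is omitted is not specified here.

```python
def describe_pipeline_kwargs(pipeline_name, version_id=None):
    """Build keyword arguments for describe_pipeline (hypothetical helper)."""
    kwargs = {"PipelineName": pipeline_name}
    # Only send PipelineVersionId when the caller asks for a specific version.
    if version_id is not None:
        kwargs["PipelineVersionId"] = int(version_id)
    return kwargs


default_kwargs = describe_pipeline_kwargs("my-pipeline")
pinned_kwargs = describe_pipeline_kwargs("my-pipeline", version_id=3)
print(default_kwargs)
print(pinned_kwargs)
```

With a configured client, the result would be passed as `client.describe_pipeline(**pinned_kwargs)`, and the version fields read from the response with `.get("PipelineVersionDisplayName")` and `.get("PipelineVersionDescription")`.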
{'PipelineVersionId': 'long'}
Describes the details of a pipeline execution.
See also: AWS API Documentation
Request Syntax
client.describe_pipeline_execution( PipelineExecutionArn='string' )
string
[REQUIRED]
The Amazon Resource Name (ARN) of the pipeline execution.
dict
Response Syntax
{ 'PipelineArn': 'string', 'PipelineExecutionArn': 'string', 'PipelineExecutionDisplayName': 'string', 'PipelineExecutionStatus': 'Executing'|'Stopping'|'Stopped'|'Failed'|'Succeeded', 'PipelineExecutionDescription': 'string', 'PipelineExperimentConfig': { 'ExperimentName': 'string', 'TrialName': 'string' }, 'FailureReason': 'string', 'CreationTime': datetime(2015, 1, 1), 'LastModifiedTime': datetime(2015, 1, 1), 'CreatedBy': { 'UserProfileArn': 'string', 'UserProfileName': 'string', 'DomainId': 'string', 'IamIdentity': { 'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string' } }, 'LastModifiedBy': { 'UserProfileArn': 'string', 'UserProfileName': 'string', 'DomainId': 'string', 'IamIdentity': { 'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string' } }, 'ParallelismConfiguration': { 'MaxParallelExecutionSteps': 123 }, 'SelectiveExecutionConfig': { 'SourcePipelineExecutionArn': 'string', 'SelectedSteps': [ { 'StepName': 'string' }, ] }, 'PipelineVersionId': 123 }
Response Structure
(dict) --
PipelineArn (string) --
The Amazon Resource Name (ARN) of the pipeline.
PipelineExecutionArn (string) --
The Amazon Resource Name (ARN) of the pipeline execution.
PipelineExecutionDisplayName (string) --
The display name of the pipeline execution.
PipelineExecutionStatus (string) --
The status of the pipeline execution.
PipelineExecutionDescription (string) --
The description of the pipeline execution.
PipelineExperimentConfig (dict) --
Specifies the names of the experiment and trial created by a pipeline.
ExperimentName (string) --
The name of the experiment.
TrialName (string) --
The name of the trial.
FailureReason (string) --
If the execution failed, a message describing why.
CreationTime (datetime) --
The time when the pipeline execution was created.
LastModifiedTime (datetime) --
The time when the pipeline execution was last modified.
CreatedBy (dict) --
Information about the user who created or modified a SageMaker resource.
UserProfileArn (string) --
The Amazon Resource Name (ARN) of the user's profile.
UserProfileName (string) --
The name of the user's profile.
DomainId (string) --
The domain associated with the user.
IamIdentity (dict) --
The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.
Arn (string) --
The Amazon Resource Name (ARN) of the IAM identity.
PrincipalId (string) --
The ID of the principal that assumes the IAM identity.
SourceIdentity (string) --
The person or application which assumes the IAM identity.
LastModifiedBy (dict) --
Information about the user who created or modified a SageMaker resource.
UserProfileArn (string) --
The Amazon Resource Name (ARN) of the user's profile.
UserProfileName (string) --
The name of the user's profile.
DomainId (string) --
The domain associated with the user.
IamIdentity (dict) --
The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.
Arn (string) --
The Amazon Resource Name (ARN) of the IAM identity.
PrincipalId (string) --
The ID of the principal that assumes the IAM identity.
SourceIdentity (string) --
The person or application which assumes the IAM identity.
ParallelismConfiguration (dict) --
The parallelism configuration applied to the pipeline.
MaxParallelExecutionSteps (integer) --
The max number of steps that can be executed in parallel.
SelectiveExecutionConfig (dict) --
The selective execution configuration applied to the pipeline run.
SourcePipelineExecutionArn (string) --
The ARN from a reference execution of the current pipeline. Used to copy input collaterals needed for the selected steps to run. The execution status of the pipeline can be either Failed or Succeeded.
This field is required if the steps you specify for SelectedSteps depend on output collaterals from any non-specified pipeline steps. For more information, see Selective Execution for Pipeline Steps.
SelectedSteps (list) --
A list of pipeline steps to run. All step(s) in all path(s) between two selected steps should be included.
(dict) --
A step selected to run in selective execution mode.
StepName (string) --
The name of the pipeline step.
PipelineVersionId (integer) --
The ID of the pipeline version.
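The fields above can be condensed into a small summary that ties an execution back to the pipeline version it ran. The sketch below is illustrative only: `execution_summary` is a hypothetical helper and the sample dict is hand-written in the shape of the response syntax.

```python
def execution_summary(response):
    """Extract status, version and selected steps from a response-shaped dict.

    Hypothetical helper; field names follow the response syntax above.
    """
    # SelectiveExecutionConfig may be absent for a full pipeline run.
    selective = response.get("SelectiveExecutionConfig") or {}
    return {
        "status": response.get("PipelineExecutionStatus"),
        "version_id": response.get("PipelineVersionId"),
        "selected_steps": [s["StepName"] for s in selective.get("SelectedSteps", [])],
        "failure_reason": response.get("FailureReason"),
    }


# Hand-written sample (not real output).
sample_execution = {
    "PipelineExecutionStatus": "Succeeded",
    "PipelineVersionId": 3,
    "SelectiveExecutionConfig": {
        "SelectedSteps": [{"StepName": "Train"}, {"StepName": "Evaluate"}],
    },
}

summary = execution_summary(sample_execution)
print(summary)
```

In practice the input would be the dict returned by `client.describe_pipeline_execution(PipelineExecutionArn=...)`.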
{'Resource': {'PipelineVersion'}}
An auto-complete API for the search functionality in the SageMaker console. It returns suggestions of possible matches for the property name to use in Search queries. Provides suggestions for HyperParameters, Tags, and Metrics.
See also: AWS API Documentation
Request Syntax
client.get_search_suggestions( Resource='TrainingJob'|'Experiment'|'ExperimentTrial'|'ExperimentTrialComponent'|'Endpoint'|'Model'|'ModelPackage'|'ModelPackageGroup'|'Pipeline'|'PipelineExecution'|'FeatureGroup'|'FeatureMetadata'|'Image'|'ImageVersion'|'Project'|'HyperParameterTuningJob'|'ModelCard'|'PipelineVersion', SuggestionQuery={ 'PropertyNameQuery': { 'PropertyNameHint': 'string' } } )
string
[REQUIRED]
The name of the SageMaker resource to search for.
dict
Limits the property names that are included in the response.
PropertyNameQuery (dict) --
Defines a property name hint. Only property names that begin with the specified hint are included in the response.
PropertyNameHint (string) -- [REQUIRED]
Text that begins a property's name.
dict
Response Syntax
{ 'PropertyNameSuggestions': [ { 'PropertyName': 'string' }, ] }
Response Structure
(dict) --
PropertyNameSuggestions (list) --
A list of property names for a Resource that match a SuggestionQuery.
(dict) --
A property name returned from a GetSearchSuggestions call that specifies a value in the PropertyNameQuery field.
PropertyName (string) --
A suggested property name based on what you entered in the search textbox in the SageMaker console.
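To show how the nested SuggestionQuery request and the PropertyNameSuggestions response fit together, here is a small sketch. The helper names `build_suggestion_request` and `property_names` are hypothetical, and the sample response is hand-written rather than captured from the service.

```python
def build_suggestion_request(resource, hint):
    """Build keyword arguments for get_search_suggestions (hypothetical helper)."""
    return {
        "Resource": resource,
        "SuggestionQuery": {"PropertyNameQuery": {"PropertyNameHint": hint}},
    }


def property_names(response):
    """Flatten PropertyNameSuggestions into a plain list of names."""
    return [item["PropertyName"] for item in response.get("PropertyNameSuggestions", [])]


suggestion_request = build_suggestion_request("PipelineVersion", "PipelineVersion")

# Hand-written sample in the shape of the response syntax (not real output).
sample_suggestions = {
    "PropertyNameSuggestions": [
        {"PropertyName": "PipelineVersionId"},
        {"PropertyName": "PipelineVersionDisplayName"},
    ]
}

suggested_names = property_names(sample_suggestions)
print(suggestion_request)
print(suggested_names)
```

With a configured client, the request would be sent as `client.get_search_suggestions(**suggestion_request)`.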
{'Resource': {'PipelineVersion'}}
Response
{'Results': {'PipelineExecution': {'PipelineVersionDisplayName': 'string', 'PipelineVersionId': 'long'}, 'PipelineVersion': {'CreatedBy': {'DomainId': 'string', 'IamIdentity': {'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string'}, 'UserProfileArn': 'string', 'UserProfileName': 'string'}, 'CreationTime': 'timestamp', 'LastExecutedPipelineExecutionArn': 'string', 'LastExecutedPipelineExecutionDisplayName': 'string', 'LastExecutedPipelineExecutionStatus': 'Executing | Stopping | Stopped | Failed | Succeeded', 'LastModifiedBy': {'DomainId': 'string', 'IamIdentity': {'Arn': 'string', 'PrincipalId': 'string', 'SourceIdentity': 'string'}, 'UserProfileArn': 'string', 'UserProfileName': 'string'}, 'LastModifiedTime': 'timestamp', 'PipelineArn': 'string', 'PipelineVersionDescription': 'string', 'PipelineVersionDisplayName': 'string', 'PipelineVersionId': 'long'}}}
Finds SageMaker resources that match a search query. Matching resources are returned as a list of SearchRecord objects in the response. You can sort the search results by any resource property in ascending or descending order.
You can query against the following value types: numeric, text, Boolean, and timestamp.
See also: AWS API Documentation
Request Syntax
client.search( Resource='TrainingJob'|'Experiment'|'ExperimentTrial'|'ExperimentTrialComponent'|'Endpoint'|'Model'|'ModelPackage'|'ModelPackageGroup'|'Pipeline'|'PipelineExecution'|'FeatureGroup'|'FeatureMetadata'|'Image'|'ImageVersion'|'Project'|'HyperParameterTuningJob'|'ModelCard'|'PipelineVersion', SearchExpression={ 'Filters': [ { 'Name': 'string', 'Operator': 'Equals'|'NotEquals'|'GreaterThan'|'GreaterThanOrEqualTo'|'LessThan'|'LessThanOrEqualTo'|'Contains'|'Exists'|'NotExists'|'In', 'Value': 'string' }, ], 'NestedFilters': [ { 'NestedPropertyName': 'string', 'Filters': [ { 'Name': 'string', 'Operator': 'Equals'|'NotEquals'|'GreaterThan'|'GreaterThanOrEqualTo'|'LessThan'|'LessThanOrEqualTo'|'Contains'|'Exists'|'NotExists'|'In', 'Value': 'string' }, ] }, ], 'SubExpressions': [ {'... recursive ...'}, ], 'Operator': 'And'|'Or' }, SortBy='string', SortOrder='Ascending'|'Descending', NextToken='string', MaxResults=123, CrossAccountFilterOption='SameAccount'|'CrossAccount', VisibilityConditions=[ { 'Key': 'string', 'Value': 'string' }, ] )
string
[REQUIRED]
The name of the SageMaker resource to search for.
dict
A Boolean conditional statement. Resources must satisfy this condition to be included in search results. You must provide at least one subexpression, filter, or nested filter. The maximum number of recursive SubExpressions, NestedFilters, and Filters that can be included in a SearchExpression object is 50.
Filters (list) --
A list of filter objects.
(dict) --
A conditional statement for a search expression that includes a resource property, a Boolean operator, and a value. Resources that match the statement are returned in the results from the Search API.
If you specify a Value, but not an Operator, SageMaker uses the equals operator.
In search, there are several property types:
Metrics
To define a metric filter, enter a value using the form "Metrics.<name>", where <name> is a metric name. For example, the following filter searches for training jobs with an "accuracy" metric greater than "0.9":
{
"Name": "Metrics.accuracy",
"Operator": "GreaterThan",
"Value": "0.9"
}
HyperParameters
To define a hyperparameter filter, enter a value with the form "HyperParameters.<name>". Decimal hyperparameter values are treated as a decimal in a comparison if the specified Value is also a decimal value. If the specified Value is an integer, the decimal hyperparameter values are treated as integers. For example, the following filter is satisfied by training jobs with a "learning_rate" hyperparameter that is less than "0.5":
{
"Name": "HyperParameters.learning_rate",
"Operator": "LessThan",
"Value": "0.5"
}
Tags
To define a tag filter, enter a value with the form Tags.<key>.
Name (string) -- [REQUIRED]
A resource property name. For example, TrainingJobName. For valid property names, see SearchRecord. You must specify a valid property for the resource.
Operator (string) --
A Boolean binary operator that is used to evaluate the filter. The operator field contains one of the following values:
Equals
The value of Name equals Value.
NotEquals
The value of Name doesn't equal Value.
Exists
The Name property exists.
NotExists
The Name property does not exist.
GreaterThan
The value of Name is greater than Value. Not supported for text properties.
GreaterThanOrEqualTo
The value of Name is greater than or equal to Value. Not supported for text properties.
LessThan
The value of Name is less than Value. Not supported for text properties.
LessThanOrEqualTo
The value of Name is less than or equal to Value. Not supported for text properties.
In
The value of Name is one of the comma delimited strings in Value. Only supported for text properties.
Contains
The value of Name contains the string Value. Only supported for text properties.
A SearchExpression can include the Contains operator multiple times when the value of Name is one of the following:
Experiment.DisplayName
Experiment.ExperimentName
Experiment.Tags
Trial.DisplayName
Trial.TrialName
Trial.Tags
TrialComponent.DisplayName
TrialComponent.TrialComponentName
TrialComponent.Tags
TrialComponent.InputArtifacts
TrialComponent.OutputArtifacts
A SearchExpression can include only one Contains operator for all other values of Name. In these cases, if you include multiple Contains operators in the SearchExpression, the result is the following error message: " 'CONTAINS' operator usage limit of 1 exceeded."
Value (string) --
A value used with Name and Operator to determine which resources satisfy the filter's condition. For numerical properties, Value must be an integer or floating-point decimal. For timestamp properties, Value must be an ISO 8601 date-time string of the following format: YYYY-mm-dd'T'HH:MM:SS.
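Because Value is always passed as a string, numeric and timestamp values need to be converted before they go into a filter. The sketch below shows one way to do that, formatting timestamps in the YYYY-mm-dd'T'HH:MM:SS form described above. The helper `make_filter` is hypothetical, and using CreationTime as the timestamp property is an assumption made for illustration.

```python
from datetime import datetime


def make_filter(name, operator, value):
    """Build one Filters entry, converting the value to a string (hypothetical helper)."""
    if isinstance(value, datetime):
        # Timestamp values use the YYYY-mm-dd'T'HH:MM:SS format.
        value = value.strftime("%Y-%m-%dT%H:%M:%S")
    return {"Name": name, "Operator": operator, "Value": str(value)}


search_expression = {
    "Filters": [
        make_filter("Metrics.accuracy", "GreaterThan", 0.9),
        make_filter("CreationTime", "GreaterThan", datetime(2025, 7, 1, 12, 30, 0)),
    ],
    # Both filters must match.
    "Operator": "And",
}

print(search_expression)
```

The resulting dict would be passed as the SearchExpression argument, for example `client.search(Resource='TrainingJob', SearchExpression=search_expression)`.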
NestedFilters (list) --
A list of nested filter objects.
(dict) --
A list of nested Filter objects. A resource must satisfy the conditions of all filters to be included in the results returned from the Search API.
For example, to filter on a training job's InputDataConfig property with a specific channel name and S3Uri prefix, define the following filters:
'{"Name":"InputDataConfig.ChannelName", "Operator":"Equals", "Value":"train"}',
'{"Name":"InputDataConfig.DataSource.S3DataSource.S3Uri", "Operator":"Contains", "Value":"mybucket/catdata"}'
NestedPropertyName (string) -- [REQUIRED]
The name of the property to use in the nested filters. The value must match a listed property name, such as InputDataConfig.
Filters (list) -- [REQUIRED]
A list of filters. Each filter acts on a property. Filters must contain at least one Filters value. For example, a NestedFilters call might include a filter on the PropertyName parameter of the InputDataConfig property: InputDataConfig.DataSource.S3DataSource.S3Uri.
(dict) --
A conditional statement for a search expression that includes a resource property, a Boolean operator, and a value. Resources that match the statement are returned in the results from the Search API.
If you specify a Value, but not an Operator, SageMaker uses the equals operator.
In search, there are several property types:
Metrics
To define a metric filter, enter a value using the form "Metrics.<name>", where <name> is a metric name. For example, the following filter searches for training jobs with an "accuracy" metric greater than "0.9":
{
"Name": "Metrics.accuracy",
"Operator": "GreaterThan",
"Value": "0.9"
}
HyperParameters
To define a hyperparameter filter, enter a value with the form "HyperParameters.<name>". Decimal hyperparameter values are treated as a decimal in a comparison if the specified Value is also a decimal value. If the specified Value is an integer, the decimal hyperparameter values are treated as integers. For example, the following filter is satisfied by training jobs with a "learning_rate" hyperparameter that is less than "0.5":
{
"Name": "HyperParameters.learning_rate",
"Operator": "LessThan",
"Value": "0.5"
}
Tags
To define a tag filter, enter a value with the form Tags.<key>.
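The three property forms above share one naming pattern, which a small helper can capture. This is a minimal sketch; property_filter is a hypothetical name, and the tag key and value are placeholders.

```python
def property_filter(kind, key, operator, value):
    """Build a Filter whose Name has the form "<kind>.<key>"."""
    allowed = {"Metrics", "HyperParameters", "Tags"}
    if kind not in allowed:
        raise ValueError(f"kind must be one of {sorted(allowed)}")
    # Value is documented as a string, so numbers are converted explicitly.
    return {"Name": f"{kind}.{key}", "Operator": operator, "Value": str(value)}


accuracy = property_filter("Metrics", "accuracy", "GreaterThan", 0.9)
learning_rate = property_filter("HyperParameters", "learning_rate", "LessThan", 0.5)
team_tag = property_filter("Tags", "team", "Equals", "vision")  # hypothetical tag
```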
Name (string) -- [REQUIRED]
A resource property name. For example, TrainingJobName. For valid property names, see SearchRecord. You must specify a valid property for the resource.
Operator (string) --
A Boolean binary operator that is used to evaluate the filter. The operator field contains one of the following values:
Equals
The value of Name equals Value.
NotEquals
The value of Name doesn't equal Value.
Exists
The Name property exists.
NotExists
The Name property does not exist.
GreaterThan
The value of Name is greater than Value. Not supported for text properties.
GreaterThanOrEqualTo
The value of Name is greater than or equal to Value. Not supported for text properties.
LessThan
The value of Name is less than Value. Not supported for text properties.
LessThanOrEqualTo
The value of Name is less than or equal to Value. Not supported for text properties.
In
The value of Name is one of the comma delimited strings in Value. Only supported for text properties.
Contains
The value of Name contains the string Value. Only supported for text properties.
A SearchExpression can include the Contains operator multiple times when the value of Name is one of the following:
Experiment.DisplayName
Experiment.ExperimentName
Experiment.Tags
Trial.DisplayName
Trial.TrialName
Trial.Tags
TrialComponent.DisplayName
TrialComponent.TrialComponentName
TrialComponent.Tags
TrialComponent.InputArtifacts
TrialComponent.OutputArtifacts
A SearchExpression can include only one Contains operator for all other values of Name. In these cases, if you include multiple Contains operators in the SearchExpression, the result is the following error message: " 'CONTAINS' operator usage limit of 1 exceeded."
Value (string) --
A value used with Name and Operator to determine which resources satisfy the filter's condition. For numerical properties, Value must be an integer or floating-point decimal. For timestamp properties, Value must be an ISO 8601 date-time string of the following format: YYYY-mm-dd'T'HH:MM:SS.
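The InputDataConfig example given for NestedFilters can be assembled programmatically. The sketch below assumes a hypothetical helper, nested_filter, that prefixes each sub-property with the nested property name; the channel name and S3 prefix are the ones used in the example above.

```python
def nested_filter(property_name, conditions):
    """Build one NestedFilters entry from (sub_property, operator, value) tuples."""
    if not conditions:
        # Filters is required and must contain at least one entry.
        raise ValueError("Filters must contain at least one entry")
    return {
        "NestedPropertyName": property_name,
        "Filters": [
            {"Name": f"{property_name}.{sub}", "Operator": op, "Value": value}
            for sub, op, value in conditions
        ],
    }


input_data_filter = nested_filter(
    "InputDataConfig",
    [
        ("ChannelName", "Equals", "train"),
        ("DataSource.S3DataSource.S3Uri", "Contains", "mybucket/catdata"),
    ],
)
```

Because a nested filter is satisfied only when a single object in the list meets every condition, both conditions here must hold for the same input channel.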
SubExpressions (list) --
A list of search expression objects.
(dict) --
A multi-expression that searches for the specified resource or resources in a search. All resource objects that satisfy the expression's condition are included in the search results. You must specify at least one subexpression, filter, or nested filter. A SearchExpression can contain up to twenty elements.
A SearchExpression contains the following components:
A list of Filter objects. Each filter defines a simple Boolean expression comprised of a resource property name, Boolean operator, and value.
A list of NestedFilter objects. Each nested filter defines a list of Boolean expressions using a list of resource properties. A nested filter is satisfied if a single object in the list satisfies all Boolean expressions.
A list of SearchExpression objects. A search expression object can be nested in a list of search expression objects.
A Boolean operator: And or Or.
Operator (string) --
A Boolean operator used to evaluate the search expression. If you want every conditional statement in all lists to be satisfied for the entire search expression to be true, specify And. If only a single conditional statement needs to be true for the entire search expression to be true, specify Or. The default value is And.
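To show how And and Or combine, the following sketch nests two subexpressions under an Or. The helpers all_of and any_of are hypothetical, and TrainingJobStatus is used only as an example property name.

```python
def all_of(*filters):
    """Expression satisfied only when every filter matches (Operator And)."""
    return {"Operator": "And", "Filters": list(filters)}


def any_of(*expressions):
    """Expression satisfied when at least one subexpression matches (Operator Or)."""
    return {"Operator": "Or", "SubExpressions": list(expressions)}


good_run = all_of(
    {"Name": "Metrics.accuracy", "Operator": "GreaterThan", "Value": "0.9"},
    {"Name": "HyperParameters.learning_rate", "Operator": "LessThan", "Value": "0.5"},
)
failed_run = all_of(
    {"Name": "TrainingJobStatus", "Operator": "Equals", "Value": "Failed"},
)

# Matches jobs that are either accurate with a low learning rate, or failed.
search_expression = any_of(good_run, failed_run)
```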
string
The name of the resource property used to sort the SearchResults. The default is LastModifiedTime.
string
How SearchResults are ordered. Valid values are Ascending or Descending. The default is Descending.
string
If more than MaxResults resources match the specified SearchExpression, the response includes a NextToken. The NextToken can be passed to the next SearchRequest to continue retrieving results.
integer
The maximum number of results to return.
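The NextToken and MaxResults parameters work together as a paging loop. The sketch below is one way to follow the token until it is absent; iter_search_results and fake_search are hypothetical names, and fake_search stands in for a real call so the example runs without AWS credentials.

```python
def iter_search_results(search, **request):
    """Yield every item in Results, following NextToken until it is absent."""
    while True:
        page = search(**request)
        yield from page.get("Results", [])
        token = page.get("NextToken")
        if not token:
            break
        # Pass the token from this page into the next request.
        request["NextToken"] = token


def fake_search(**request):
    """Stand-in for client.search that returns two canned pages."""
    pages = {
        None: {"Results": [1, 2], "NextToken": "t1"},
        "t1": {"Results": [3]},
    }
    return pages[request.get("NextToken")]


results = list(iter_search_results(fake_search, Resource="TrainingJob", MaxResults=2))
```

With a real client, the first argument would be the client's search method, for example iter_search_results(client.search, Resource="TrainingJob", MaxResults=50).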
string
A cross-account filter option. When the value is "CrossAccount", the search results include only resources made discoverable to you from other accounts. When the value is "SameAccount" or null, the search results include only resources from your account. The default is null. For more information on searching for resources made discoverable to your account, see Search discoverable resources in the SageMaker Developer Guide. The maximum number of ResourceCatalog objects viewable is 1000.
list
Limits the results of your search request to the resources that you can access.
(dict) --
The list of key-value pairs used to filter your search results. If a search result contains a key from your list, it is included in the final search response if the value associated with the key in the result matches the value you specified. If the value doesn't match, the result is excluded from the search response. Any resources that don't have a key from the list that you've provided will also be included in the search response.
Key (string) --
The key that specifies the tag that you're using to filter the search results. It must be in the following format: Tags.<key>.
Value (string) --
The value for the tag that you're using to filter the search results.
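The matching rule for visibility conditions can be modeled locally to make it concrete. This is only one possible reading of the description above, not the service's implementation; passes_visibility is a hypothetical helper and the tag names are placeholders.

```python
def passes_visibility(resource_tags, conditions):
    """Return True if a resource would be kept under the rule described above."""
    for condition in conditions:
        key = condition["Key"]
        # Keys are given in the form Tags.<key>; strip the prefix to compare.
        if key.startswith("Tags."):
            key = key[len("Tags."):]
        # A resource lacking the key is kept; a mismatched value excludes it.
        if key in resource_tags and resource_tags[key] != condition["Value"]:
            return False
    return True


visibility_conditions = [{"Key": "Tags.team", "Value": "vision"}]

kept = passes_visibility({"team": "vision"}, visibility_conditions)
dropped = passes_visibility({"team": "nlp"}, visibility_conditions)
untagged = passes_visibility({}, visibility_conditions)
```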
dict
Response Syntax
# This section is too large to render. # Please see the AWS API Documentation linked below.
Response Structure
# This section is too large to render. # Please see the AWS API Documentation linked below.
{'PipelineVersionId': 'long'}
Starts a pipeline execution.
See also: AWS API Documentation
Request Syntax
client.start_pipeline_execution( PipelineName='string', PipelineExecutionDisplayName='string', PipelineParameters=[ { 'Name': 'string', 'Value': 'string' }, ], PipelineExecutionDescription='string', ClientRequestToken='string', ParallelismConfiguration={ 'MaxParallelExecutionSteps': 123 }, SelectiveExecutionConfig={ 'SourcePipelineExecutionArn': 'string', 'SelectedSteps': [ { 'StepName': 'string' }, ] }, PipelineVersionId=123 )
string
[REQUIRED]
The name or Amazon Resource Name (ARN) of the pipeline.
string
The display name of the pipeline execution.
list
Contains a list of pipeline parameters. This list can be empty.
(dict) --
Assigns a value to a named Pipeline parameter.
Name (string) -- [REQUIRED]
The name of the parameter to assign a value to. This parameter name must match a named parameter in the pipeline definition.
Value (string) -- [REQUIRED]
The literal value for the parameter.
string
The description of the pipeline execution.
string
[REQUIRED]
A unique, case-sensitive identifier that you provide to ensure the idempotency of the operation. An idempotent operation completes no more than once.
This field is autopopulated if not provided.
dict
This configuration, if specified, overrides the parallelism configuration of the parent pipeline for this specific run.
MaxParallelExecutionSteps (integer) -- [REQUIRED]
The max number of steps that can be executed in parallel.
dict
The selective execution configuration applied to the pipeline run.
SourcePipelineExecutionArn (string) --
The ARN from a reference execution of the current pipeline. Used to copy input collaterals needed for the selected steps to run. The execution status of the pipeline can be either Failed or Success.
This field is required if the steps you specify for SelectedSteps depend on output collaterals from any non-specified pipeline steps. For more information, see Selective Execution for Pipeline Steps.
SelectedSteps (list) -- [REQUIRED]
A list of pipeline steps to run. All step(s) in all path(s) between two selected steps should be included.
(dict) --
A step selected to run in selective execution mode.
StepName (string) -- [REQUIRED]
The name of the pipeline step.
integer
The ID of the pipeline version to start execution from.
dict
Response Syntax
{ 'PipelineExecutionArn': 'string' }
Response Structure
(dict) --
PipelineExecutionArn (string) --
The Amazon Resource Name (ARN) of the pipeline execution.
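A request for a specific pipeline version might be assembled as follows. This is a sketch under stated assumptions: start_execution_request is a hypothetical helper, the pipeline and parameter names are placeholders, and the token is generated explicitly so that a retry can reuse it.

```python
import uuid


def start_execution_request(pipeline_name, version_id=None, parameters=None, token=None):
    """Build keyword arguments for start_pipeline_execution."""
    request = {
        "PipelineName": pipeline_name,
        # Reusing the same token on a retry keeps the operation idempotent.
        "ClientRequestToken": token or str(uuid.uuid4()),
    }
    if parameters:
        # Parameter values are passed as literal strings.
        request["PipelineParameters"] = [
            {"Name": name, "Value": str(value)} for name, value in parameters.items()
        ]
    if version_id is not None:
        request["PipelineVersionId"] = version_id
    return request


request = start_execution_request(
    "my-pipeline", version_id=3, parameters={"TrainingInstanceCount": 2}
)
# With a real client: response = client.start_pipeline_execution(**request)
```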
{'RestrictedInstanceGroups': [{'EnvironmentConfig': {'FSxLustreConfig': {'PerUnitStorageThroughput': 'integer', 'SizeInGiB': 'integer'}}, 'ExecutionRole': 'string', 'InstanceCount': 'integer', 'InstanceGroupName': 'string', 'InstanceStorageConfigs': [{'EbsVolumeConfig': {'VolumeSizeInGB': 'integer'}}], 'InstanceType': 'ml.p4d.24xlarge | ' 'ml.p4de.24xlarge | ' 'ml.p5.48xlarge | ' 'ml.trn1.32xlarge | ' 'ml.trn1n.32xlarge | ' 'ml.g5.xlarge | ml.g5.2xlarge | ' 'ml.g5.4xlarge | ml.g5.8xlarge ' '| ml.g5.12xlarge | ' 'ml.g5.16xlarge | ' 'ml.g5.24xlarge | ' 'ml.g5.48xlarge | ml.c5.large | ' 'ml.c5.xlarge | ml.c5.2xlarge | ' 'ml.c5.4xlarge | ml.c5.9xlarge ' '| ml.c5.12xlarge | ' 'ml.c5.18xlarge | ' 'ml.c5.24xlarge | ml.c5n.large ' '| ml.c5n.2xlarge | ' 'ml.c5n.4xlarge | ' 'ml.c5n.9xlarge | ' 'ml.c5n.18xlarge | ml.m5.large ' '| ml.m5.xlarge | ml.m5.2xlarge ' '| ml.m5.4xlarge | ' 'ml.m5.8xlarge | ml.m5.12xlarge ' '| ml.m5.16xlarge | ' 'ml.m5.24xlarge | ml.t3.medium ' '| ml.t3.large | ml.t3.xlarge | ' 'ml.t3.2xlarge | ml.g6.xlarge | ' 'ml.g6.2xlarge | ml.g6.4xlarge ' '| ml.g6.8xlarge | ' 'ml.g6.16xlarge | ' 'ml.g6.12xlarge | ' 'ml.g6.24xlarge | ' 'ml.g6.48xlarge | ' 'ml.gr6.4xlarge | ' 'ml.gr6.8xlarge | ml.g6e.xlarge ' '| ml.g6e.2xlarge | ' 'ml.g6e.4xlarge | ' 'ml.g6e.8xlarge | ' 'ml.g6e.16xlarge | ' 'ml.g6e.12xlarge | ' 'ml.g6e.24xlarge | ' 'ml.g6e.48xlarge | ' 'ml.p5e.48xlarge | ' 'ml.p5en.48xlarge | ' 'ml.p6-b200.48xlarge | ' 'ml.trn2.48xlarge | ' 'ml.c6i.large | ml.c6i.xlarge | ' 'ml.c6i.2xlarge | ' 'ml.c6i.4xlarge | ' 'ml.c6i.8xlarge | ' 'ml.c6i.12xlarge | ' 'ml.c6i.16xlarge | ' 'ml.c6i.24xlarge | ' 'ml.c6i.32xlarge | ml.m6i.large ' '| ml.m6i.xlarge | ' 'ml.m6i.2xlarge | ' 'ml.m6i.4xlarge | ' 'ml.m6i.8xlarge | ' 'ml.m6i.12xlarge | ' 'ml.m6i.16xlarge | ' 'ml.m6i.24xlarge | ' 'ml.m6i.32xlarge | ml.r6i.large ' '| ml.r6i.xlarge | ' 'ml.r6i.2xlarge | ' 'ml.r6i.4xlarge | ' 'ml.r6i.8xlarge | ' 'ml.r6i.12xlarge | ' 'ml.r6i.16xlarge | ' 'ml.r6i.24xlarge | ' 'ml.r6i.32xlarge | ' 
'ml.i3en.large | ml.i3en.xlarge ' '| ml.i3en.2xlarge | ' 'ml.i3en.3xlarge | ' 'ml.i3en.6xlarge | ' 'ml.i3en.12xlarge | ' 'ml.i3en.24xlarge | ' 'ml.m7i.large | ml.m7i.xlarge | ' 'ml.m7i.2xlarge | ' 'ml.m7i.4xlarge | ' 'ml.m7i.8xlarge | ' 'ml.m7i.12xlarge | ' 'ml.m7i.16xlarge | ' 'ml.m7i.24xlarge | ' 'ml.m7i.48xlarge | ml.r7i.large ' '| ml.r7i.xlarge | ' 'ml.r7i.2xlarge | ' 'ml.r7i.4xlarge | ' 'ml.r7i.8xlarge | ' 'ml.r7i.12xlarge | ' 'ml.r7i.16xlarge | ' 'ml.r7i.24xlarge | ' 'ml.r7i.48xlarge', 'OnStartDeepHealthChecks': ['InstanceStress | ' 'InstanceConnectivity'], 'OverrideVpcConfig': {'SecurityGroupIds': ['string'], 'Subnets': ['string']}, 'ScheduledUpdateConfig': {'DeploymentConfig': {'AutoRollbackConfiguration': [{'AlarmName': 'string'}], 'RollingUpdatePolicy': {'MaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}, 'RollbackMaximumBatchSize': {'Type': 'INSTANCE_COUNT ' '| ' 'CAPACITY_PERCENTAGE', 'Value': 'integer'}}, 'WaitIntervalInSeconds': 'integer'}, 'ScheduleExpression': 'string'}, 'ThreadsPerCore': 'integer', 'TrainingPlanArn': 'string'}]}
Updates a SageMaker HyperPod cluster.
See also: AWS API Documentation
Request Syntax
client.update_cluster( ClusterName='string', InstanceGroups=[ { 'InstanceCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge', 'LifeCycleConfig': { 'SourceS3Uri': 'string', 'OnCreate': 'string' }, 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'TrainingPlanArn': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } } }, ], RestrictedInstanceGroups=[ { 'InstanceCount': 123, 'InstanceGroupName': 'string', 'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge', 'ExecutionRole': 'string', 'ThreadsPerCore': 123, 'InstanceStorageConfigs': [ { 'EbsVolumeConfig': { 'VolumeSizeInGB': 123 } }, ], 'OnStartDeepHealthChecks': [ 'InstanceStress'|'InstanceConnectivity', ], 'TrainingPlanArn': 'string', 'OverrideVpcConfig': { 'SecurityGroupIds': [ 'string', ], 'Subnets': [ 'string', ] }, 'ScheduledUpdateConfig': { 'ScheduleExpression': 'string', 'DeploymentConfig': { 'RollingUpdatePolicy': { 'MaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 }, 'RollbackMaximumBatchSize': { 'Type': 'INSTANCE_COUNT'|'CAPACITY_PERCENTAGE', 'Value': 123 } }, 'WaitIntervalInSeconds': 123, 'AutoRollbackConfiguration': [ { 'AlarmName': 'string' }, ] } }, 'EnvironmentConfig': { 'FSxLustreConfig': { 'SizeInGiB': 123, 'PerUnitStorageThroughput': 123 } } }, ], NodeRecovery='Automatic'|'None', InstanceGroupsToDelete=[ 'string', ] )
string
[REQUIRED]
Specify the name of the SageMaker HyperPod cluster you want to update.
list
Specify the instance groups to update.
(dict) --
The specifications of an instance group that you need to define.
InstanceCount (integer) -- [REQUIRED]
Specifies the number of instances to add to the instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) -- [REQUIRED]
Specifies the name of the instance group.
InstanceType (string) -- [REQUIRED]
Specifies the instance type of the instance group.
LifeCycleConfig (dict) -- [REQUIRED]
Specifies the LifeCycle configuration for the instance group.
SourceS3Uri (string) -- [REQUIRED]
An Amazon S3 bucket path where your lifecycle scripts are stored.
OnCreate (string) -- [REQUIRED]
The file name of the entrypoint script of lifecycle scripts under SourceS3Uri. This entrypoint script runs during cluster creation.
ExecutionRole (string) -- [REQUIRED]
Specifies an IAM execution role to be assumed by the instance group.
ThreadsPerCore (integer) --
Specifies the value for Threads per core. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For instance types that don't support multithreading, specify 1. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) -- [REQUIRED]
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster instance group is created or updated.
(string) --
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan to use for this cluster instance group.
For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
OverrideVpcConfig (dict) --
To configure multi-AZ deployments, customize the Amazon VPC configuration at the instance group level. You can specify different subnets and security groups across different AZs in the instance group specification to override a SageMaker HyperPod cluster's default Amazon VPC configuration. For more information about deploying a cluster in multiple AZs, see Setting up SageMaker HyperPod clusters across multiple AZs.
SecurityGroupIds (list) -- [REQUIRED]
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) -- [REQUIRED]
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker uses to update the AMI.
ScheduleExpression (string) -- [REQUIRED]
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) -- [REQUIRED]
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) -- [REQUIRED]
The name of the alarm.
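The nested ScheduledUpdateConfig structure described above can be easier to follow when built step by step. The sketch below uses a hypothetical helper, scheduled_update_config; the schedule string is a deliberate placeholder rather than a real cron expression, and the alarm name is invented for illustration.

```python
BATCH_TYPES = {"INSTANCE_COUNT", "CAPACITY_PERCENTAGE"}


def scheduled_update_config(schedule_expression, batch_type, batch_value,
                            wait_seconds=None, alarm_names=()):
    """Build a ScheduledUpdateConfig dict with a rolling update policy."""
    if batch_type not in BATCH_TYPES:
        raise ValueError(f"batch_type must be one of {sorted(BATCH_TYPES)}")
    deployment = {
        "RollingUpdatePolicy": {
            # MaximumBatchSize is required; RollbackMaximumBatchSize is omitted here.
            "MaximumBatchSize": {"Type": batch_type, "Value": batch_value},
        }
    }
    if wait_seconds is not None:
        deployment["WaitIntervalInSeconds"] = wait_seconds
    if alarm_names:
        deployment["AutoRollbackConfiguration"] = [
            {"AlarmName": name} for name in alarm_names
        ]
    return {"ScheduleExpression": schedule_expression, "DeploymentConfig": deployment}


update_config = scheduled_update_config(
    "<your-cron-expression>",  # placeholder, supply a valid cron expression
    "CAPACITY_PERCENTAGE",
    25,
    wait_seconds=300,
    alarm_names=["example-health-alarm"],  # hypothetical alarm name
)
```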
list
The specialized instance groups for training models like Amazon Nova to be created in the SageMaker HyperPod cluster.
(dict) --
The specifications of a restricted instance group that you need to define.
InstanceCount (integer) -- [REQUIRED]
Specifies the number of instances to add to the restricted instance group of a SageMaker HyperPod cluster.
InstanceGroupName (string) -- [REQUIRED]
Specifies the name of the restricted instance group.
InstanceType (string) -- [REQUIRED]
Specifies the instance type of the restricted instance group.
ExecutionRole (string) -- [REQUIRED]
Specifies an IAM execution role to be assumed by the restricted instance group.
ThreadsPerCore (integer) --
The number you specified for ThreadsPerCore in CreateCluster to enable or disable multithreading. For instance types that support multithreading, you can specify 1 to disable multithreading and 2 to enable it. For more information, see the reference table of CPU cores and threads per CPU core per instance type in the Amazon Elastic Compute Cloud User Guide.
InstanceStorageConfigs (list) --
Specifies the additional storage configurations for the instances in the SageMaker HyperPod cluster restricted instance group.
(dict) --
Defines the configuration for attaching additional storage to the instances in the SageMaker HyperPod cluster instance group. To learn more, see SageMaker HyperPod release notes: June 20, 2024.
EbsVolumeConfig (dict) --
Defines the configuration for attaching additional Amazon Elastic Block Store (EBS) volumes to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
VolumeSizeInGB (integer) -- [REQUIRED]
The size in gigabytes (GB) of the additional EBS volume to be attached to the instances in the SageMaker HyperPod cluster instance group. The additional EBS volume is attached to each instance within the SageMaker HyperPod cluster instance group and mounted to /opt/sagemaker.
OnStartDeepHealthChecks (list) --
A flag indicating whether deep health checks should be performed when the cluster restricted instance group is created or updated.
(string) --
TrainingPlanArn (string) --
The Amazon Resource Name (ARN) of the training plan to filter clusters by. For more information about reserving GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.
OverrideVpcConfig (dict) --
Specifies an Amazon Virtual Private Cloud (VPC) that your SageMaker jobs, hosted models, and compute resources have access to. You can control access to and from your resources by configuring a VPC. For more information, see Give SageMaker Access to Resources in your Amazon VPC.
SecurityGroupIds (list) -- [REQUIRED]
The VPC security group IDs, in the form sg-xxxxxxxx. Specify the security groups for the VPC that is specified in the Subnets field.
(string) --
Subnets (list) -- [REQUIRED]
The ID of the subnets in the VPC to which you want to connect your training job or model. For information about the availability of specific instance types, see Supported Instance Types and Availability Zones.
(string) --
ScheduledUpdateConfig (dict) --
The configuration object of the schedule that SageMaker follows when updating the AMI.
ScheduleExpression (string) -- [REQUIRED]
A cron expression that specifies the schedule that SageMaker follows when updating the AMI.
DeploymentConfig (dict) --
The configuration to use when updating the AMI versions.
RollingUpdatePolicy (dict) --
The policy that SageMaker uses when updating the AMI versions of the cluster.
MaximumBatchSize (dict) -- [REQUIRED]
The maximum number of instances in the cluster that SageMaker can update at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
RollbackMaximumBatchSize (dict) --
The maximum number of instances in the cluster that SageMaker can roll back at a time.
Type (string) -- [REQUIRED]
Specifies whether SageMaker should process the update by amount or percentage of instances.
Value (integer) -- [REQUIRED]
Specifies the amount or percentage of instances SageMaker updates at a time.
WaitIntervalInSeconds (integer) --
The duration in seconds that SageMaker waits before updating more instances in the cluster.
AutoRollbackConfiguration (list) --
An array that contains the alarms that SageMaker monitors to know whether to roll back the AMI update.
(dict) --
The details of the alarm to monitor during the AMI update.
AlarmName (string) -- [REQUIRED]
The name of the alarm.
EnvironmentConfig (dict) -- [REQUIRED]
The configuration for the restricted instance groups (RIG) environment.
FSxLustreConfig (dict) --
Configuration settings for an Amazon FSx for Lustre file system to be used with the cluster.
SizeInGiB (integer) -- [REQUIRED]
The storage capacity of the Amazon FSx for Lustre file system, specified in gibibytes (GiB).
PerUnitStorageThroughput (integer) -- [REQUIRED]
The throughput capacity of the Amazon FSx for Lustre file system, measured in MB/s per TiB of storage.
string
The node recovery mode to be applied to the SageMaker HyperPod cluster.
list
Specify the names of the instance groups to delete. Use a single comma (,) as the separator between multiple names.
(string) --
dict
Response Syntax
{ 'ClusterArn': 'string' }
Response Structure
(dict) --
ClusterArn (string) --
The Amazon Resource Name (ARN) of the updated SageMaker HyperPod cluster.
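As an illustration of the new RestrictedInstanceGroups parameter, the following sketch assembles an update_cluster request containing only the required fields of a restricted instance group. The helper restricted_instance_group is hypothetical, and the cluster name, role ARN, and FSx for Lustre sizes are placeholder values that have not been checked against service limits.

```python
def restricted_instance_group(name, instance_type, count, role_arn,
                              fsx_size_gib, fsx_throughput):
    """Build one RestrictedInstanceGroups entry with its required fields."""
    return {
        "InstanceGroupName": name,
        "InstanceType": instance_type,
        "InstanceCount": count,
        "ExecutionRole": role_arn,
        # EnvironmentConfig is required for restricted instance groups.
        "EnvironmentConfig": {
            "FSxLustreConfig": {
                "SizeInGiB": fsx_size_gib,
                "PerUnitStorageThroughput": fsx_throughput,  # MB/s per TiB
            }
        },
    }


update_request = {
    "ClusterName": "example-hyperpod-cluster",  # placeholder
    "RestrictedInstanceGroups": [
        restricted_instance_group(
            "nova-customization",  # placeholder group name
            "ml.p5.48xlarge",
            2,
            "arn:aws:iam::123456789012:role/ExampleHyperPodRole",  # placeholder
            fsx_size_gib=1200,   # placeholder size
            fsx_throughput=125,  # placeholder throughput
        )
    ],
}
# With a real client:
#   response = client.update_cluster(**update_request)
#   cluster_arn = response["ClusterArn"]
```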
{'PipelineVersionId': 'long'}
Updates a pipeline.
See also: AWS API Documentation
Request Syntax
client.update_pipeline( PipelineName='string', PipelineDisplayName='string', PipelineDefinition='string', PipelineDefinitionS3Location={ 'Bucket': 'string', 'ObjectKey': 'string', 'VersionId': 'string' }, PipelineDescription='string', RoleArn='string', ParallelismConfiguration={ 'MaxParallelExecutionSteps': 123 } )
string
[REQUIRED]
The name of the pipeline to update.
string
The display name of the pipeline.
string
The JSON pipeline definition.
dict
The location of the pipeline definition stored in Amazon S3. If specified, SageMaker will retrieve the pipeline definition from this location.
Bucket (string) -- [REQUIRED]
Name of the S3 bucket.
ObjectKey (string) -- [REQUIRED]
The object key (or key name) uniquely identifies the object in an S3 bucket.
VersionId (string) --
Version Id of the pipeline definition file. If not specified, Amazon SageMaker will retrieve the latest version.
string
The description of the pipeline.
string
The Amazon Resource Name (ARN) that the pipeline uses to execute.
dict
If specified, it applies to all executions of this pipeline by default.
MaxParallelExecutionSteps (integer) -- [REQUIRED]
The max number of steps that can be executed in parallel.
dict
Response Syntax
{ 'PipelineArn': 'string', 'PipelineVersionId': 123 }
Response Structure
(dict) --
PipelineArn (string) --
The Amazon Resource Name (ARN) of the updated pipeline.
PipelineVersionId (integer) --
The ID of the pipeline version.
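Because update_pipeline now returns a PipelineVersionId, that ID can be passed straight to start_pipeline_execution. The sketch below shows one way to chain the two calls; update_and_run is a hypothetical helper, and StubClient is a stand-in with canned responses so the example runs without AWS credentials.

```python
import uuid


def update_and_run(client, pipeline_name, definition_json):
    """Update a pipeline, then start an execution from the new version."""
    updated = client.update_pipeline(
        PipelineName=pipeline_name,
        PipelineDefinition=definition_json,
    )
    version_id = updated["PipelineVersionId"]
    started = client.start_pipeline_execution(
        PipelineName=pipeline_name,
        ClientRequestToken=str(uuid.uuid4()),
        PipelineVersionId=version_id,
    )
    return version_id, started["PipelineExecutionArn"]


class StubClient:
    """Stand-in for a SageMaker client; returns canned responses."""

    def update_pipeline(self, **kwargs):
        return {
            "PipelineArn": "arn:aws:sagemaker:us-east-1:123456789012:pipeline/demo",
            "PipelineVersionId": 2,
        }

    def start_pipeline_execution(self, **kwargs):
        return {
            "PipelineExecutionArn": (
                "arn:aws:sagemaker:us-east-1:123456789012:"
                f"pipeline/demo/execution/v{kwargs['PipelineVersionId']}"
            )
        }


version_id, execution_arn = update_and_run(StubClient(), "demo", "{}")
```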