Amazon SageMaker Service

2025/11/20 - Amazon SageMaker Service - 2 new, 8 updated api methods

Changes  Added training plan support for inference endpoints. Added HyperPod task governance with accelerator partition-based quota allocation. Added BatchRebootClusterNodes and BatchReplaceClusterNodes APIs. Updated ListClusterNodes to include privateDnsHostName.

BatchReplaceClusterNodes (new)

Replaces specific nodes within a SageMaker HyperPod cluster with new hardware. BatchReplaceClusterNodes terminates the specified instances and provisions new replacement instances with the same configuration but fresh hardware. The Amazon Machine Image (AMI) and instance configuration remain the same.

This operation is useful for recovering from hardware failures or persistent issues that cannot be resolved through a reboot.

See also: AWS API Documentation

Request Syntax

client.batch_replace_cluster_nodes(
    ClusterName='string',
    NodeIds=[
        'string',
    ],
    NodeLogicalIds=[
        'string',
    ]
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

The name or Amazon Resource Name (ARN) of the SageMaker HyperPod cluster containing the nodes to replace.

type NodeIds:

list

param NodeIds:

A list of EC2 instance IDs to replace with new hardware. You can specify between 1 and 25 instance IDs.

  • (string) --

type NodeLogicalIds:

list

param NodeLogicalIds:

A list of logical node IDs to replace with new hardware. You can specify between 1 and 25 logical node IDs.

The NodeLogicalId is a unique identifier that persists throughout the node's lifecycle and can be used to track nodes that are still being provisioned and don't yet have an EC2 instance ID assigned.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'Successful': [
        'string',
    ],
    'Failed': [
        {
            'NodeId': 'string',
            'ErrorCode': 'InstanceIdNotFound'|'InvalidInstanceStatus'|'InstanceIdInUse'|'InternalServerError',
            'Message': 'string'
        },
    ],
    'FailedNodeLogicalIds': [
        {
            'NodeLogicalId': 'string',
            'ErrorCode': 'InstanceIdNotFound'|'InvalidInstanceStatus'|'InstanceIdInUse'|'InternalServerError',
            'Message': 'string'
        },
    ],
    'SuccessfulNodeLogicalIds': [
        'string',
    ]
}

Response Structure

  • (dict) --

    • Successful (list) --

      A list of EC2 instance IDs for which the replacement operation was successfully initiated.

      • (string) --

    • Failed (list) --

      A list of errors encountered for EC2 instance IDs that could not be replaced. Each error includes the instance ID, an error code, and a descriptive message.

      • (dict) --

        Represents an error encountered when replacing a node in a SageMaker HyperPod cluster.

        • NodeId (string) --

          The EC2 instance ID of the node that encountered an error during the replacement operation.

        • ErrorCode (string) --

          The error code associated with the error encountered when replacing a node.

          Possible values:

          • InstanceIdNotFound: The instance does not exist in the specified cluster.

          • InvalidInstanceStatus: The instance is in a state that does not allow replacement. Wait for the instance to finish any ongoing changes before retrying.

          • InstanceIdInUse: Another operation is already in progress for this node. Wait for the operation to complete before retrying.

          • InternalServerError: An internal error occurred while processing this node.

        • Message (string) --

          A human-readable message describing the error encountered when replacing a node.

    • FailedNodeLogicalIds (list) --

      A list of errors encountered for logical node IDs that could not be replaced. Each error includes the logical node ID, an error code, and a descriptive message. This field is only present when NodeLogicalIds were provided in the request.

      • (dict) --

        Represents an error encountered when replacing a node (identified by its logical node ID) in a SageMaker HyperPod cluster.

        • NodeLogicalId (string) --

          The logical node ID of the node that encountered an error during the replacement operation.

        • ErrorCode (string) --

          The error code associated with the error encountered when replacing a node by logical node ID.

          Possible values:

          • InstanceIdNotFound: The node does not exist in the specified cluster.

          • InvalidInstanceStatus: The node is in a state that does not allow replacement. Wait for the node to finish any ongoing changes before retrying.

          • InstanceIdInUse: Another operation is already in progress for this node. Wait for the operation to complete before retrying.

          • InternalServerError: An internal error occurred while processing this node.

        • Message (string) --

          A human-readable message describing the error encountered when replacing a node by logical node ID.

    • SuccessfulNodeLogicalIds (list) --

      A list of logical node IDs for which the replacement operation was successfully initiated. This field is only present when NodeLogicalIds were provided in the request.

      • (string) --
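
The response reports per-node outcomes in separate success and failure lists, so callers usually need to inspect both failure lists. The following is a minimal sketch of that handling. The helper name `split_replace_failures` and the sample response are hypothetical, and treating only `InvalidInstanceStatus` and `InstanceIdInUse` as worth retrying (the two codes whose descriptions above say to wait before retrying) is an assumption, not documented service behavior.

```python
# Codes whose descriptions say to wait and retry (assumption: others are not retried).
RETRY_LATER_CODES = {"InvalidInstanceStatus", "InstanceIdInUse"}


def split_replace_failures(response):
    """Split failed entries of a BatchReplaceClusterNodes response.

    Returns (retry_later, give_up): two lists of (identifier, error_code) tuples.
    """
    retry_later, give_up = [], []
    # 'Failed' entries are keyed by NodeId, 'FailedNodeLogicalIds' by NodeLogicalId.
    entries = [(e["NodeId"], e["ErrorCode"]) for e in response.get("Failed", [])]
    entries += [
        (e["NodeLogicalId"], e["ErrorCode"])
        for e in response.get("FailedNodeLogicalIds", [])
    ]
    for identifier, code in entries:
        target = retry_later if code in RETRY_LATER_CODES else give_up
        target.append((identifier, code))
    return retry_later, give_up


# Hypothetical response shaped like the Response Syntax above.
sample_response = {
    "Successful": ["i-0aaa"],
    "Failed": [
        {"NodeId": "i-0bbb", "ErrorCode": "InstanceIdInUse", "Message": "busy"},
        {"NodeId": "i-0ccc", "ErrorCode": "InstanceIdNotFound", "Message": "missing"},
    ],
}
retry_later, give_up = split_replace_failures(sample_response)
```

With a real client, the response would come from client.batch_replace_cluster_nodes(ClusterName=..., NodeIds=[...]) instead of the sample dictionary.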

BatchRebootClusterNodes (new)

Reboots specific nodes within a SageMaker HyperPod cluster using a soft recovery mechanism. BatchRebootClusterNodes performs a graceful reboot of the specified nodes by calling the Amazon Elastic Compute Cloud RebootInstances API, which attempts to cleanly shut down the operating system before restarting the instance.

This operation is useful for recovering from transient issues or applying certain configuration changes that require a restart.

See also: AWS API Documentation

Request Syntax

client.batch_reboot_cluster_nodes(
    ClusterName='string',
    NodeIds=[
        'string',
    ],
    NodeLogicalIds=[
        'string',
    ]
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

The name or Amazon Resource Name (ARN) of the SageMaker HyperPod cluster containing the nodes to reboot.

type NodeIds:

list

param NodeIds:

A list of EC2 instance IDs to reboot using soft recovery. You can specify between 1 and 25 instance IDs.

  • (string) --

type NodeLogicalIds:

list

param NodeLogicalIds:

A list of logical node IDs to reboot using soft recovery. You can specify between 1 and 25 logical node IDs.

The NodeLogicalId is a unique identifier that persists throughout the node's lifecycle and can be used to track nodes that are still being provisioned and don't yet have an EC2 instance ID assigned.

  • (string) --

rtype:

dict

returns:

Response Syntax

{
    'Successful': [
        'string',
    ],
    'Failed': [
        {
            'NodeId': 'string',
            'ErrorCode': 'InstanceIdNotFound'|'InvalidInstanceStatus'|'InstanceIdInUse'|'InternalServerError',
            'Message': 'string'
        },
    ],
    'FailedNodeLogicalIds': [
        {
            'NodeLogicalId': 'string',
            'ErrorCode': 'InstanceIdNotFound'|'InvalidInstanceStatus'|'InstanceIdInUse'|'InternalServerError',
            'Message': 'string'
        },
    ],
    'SuccessfulNodeLogicalIds': [
        'string',
    ]
}

Response Structure

  • (dict) --

    • Successful (list) --

      A list of EC2 instance IDs for which the reboot operation was successfully initiated.

      • (string) --

    • Failed (list) --

      A list of errors encountered for EC2 instance IDs that could not be rebooted. Each error includes the instance ID, an error code, and a descriptive message.

      • (dict) --

        Represents an error encountered when rebooting a node in a SageMaker HyperPod cluster.

        • NodeId (string) --

          The EC2 instance ID of the node that encountered an error during the reboot operation.

        • ErrorCode (string) --

          The error code associated with the error encountered when rebooting a node.

          Possible values:

          • InstanceIdNotFound: The instance does not exist in the specified cluster.

          • InvalidInstanceStatus: The instance is in a state that does not allow rebooting. Wait for the instance to finish any ongoing changes before retrying.

          • InstanceIdInUse: Another operation is already in progress for this node. Wait for the operation to complete before retrying.

          • InternalServerError: An internal error occurred while processing this node.

        • Message (string) --

          A human-readable message describing the error encountered when rebooting a node.

    • FailedNodeLogicalIds (list) --

      A list of errors encountered for logical node IDs that could not be rebooted. Each error includes the logical node ID, an error code, and a descriptive message. This field is only present when NodeLogicalIds were provided in the request.

      • (dict) --

        Represents an error encountered when rebooting a node (identified by its logical node ID) in a SageMaker HyperPod cluster.

        • NodeLogicalId (string) --

          The logical node ID of the node that encountered an error during the reboot operation.

        • ErrorCode (string) --

          The error code associated with the error encountered when rebooting a node by logical node ID.

          Possible values:

          • InstanceIdNotFound: The node does not exist in the specified cluster.

          • InvalidInstanceStatus: The node is in a state that does not allow rebooting. Wait for the node to finish any ongoing changes before retrying.

          • InstanceIdInUse: Another operation is already in progress for this node. Wait for the operation to complete before retrying.

          • InternalServerError: An internal error occurred while processing this node.

        • Message (string) --

          A human-readable message describing the error encountered when rebooting a node by logical node ID.

    • SuccessfulNodeLogicalIds (list) --

      A list of logical node IDs for which the reboot operation was successfully initiated. This field is only present when NodeLogicalIds were provided in the request.

      • (string) --
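
Since a single request accepts between 1 and 25 instance IDs, rebooting a larger set of nodes means splitting the list into batches. The sketch below shows one way to do that. The names `chunked`, `reboot_in_batches`, and `FakeClient` are hypothetical, and the fake client only stands in for a boto3 SageMaker client so the example runs without AWS access.

```python
MAX_IDS_PER_CALL = 25  # documented upper bound for NodeIds in one request


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def reboot_in_batches(client, cluster_name, node_ids):
    """Call batch_reboot_cluster_nodes once per batch and merge the results."""
    successful, failed = [], []
    for batch in chunked(node_ids, MAX_IDS_PER_CALL):
        response = client.batch_reboot_cluster_nodes(
            ClusterName=cluster_name, NodeIds=batch
        )
        successful.extend(response.get("Successful", []))
        failed.extend(response.get("Failed", []))
    return successful, failed


class FakeClient:
    """Offline stand-in for a SageMaker client (assumption for this sketch)."""

    def __init__(self):
        self.batch_sizes = []

    def batch_reboot_cluster_nodes(self, ClusterName, NodeIds):
        # Record the batch size and report every node as successfully initiated.
        self.batch_sizes.append(len(NodeIds))
        return {"Successful": list(NodeIds), "Failed": []}


fake = FakeClient()
node_ids = [f"i-{n:04d}" for n in range(60)]  # hypothetical instance IDs
ok, errors = reboot_in_batches(fake, "my-cluster", node_ids)
```

To run this against a real cluster, pass a boto3 SageMaker client in place of FakeClient.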

CreateComputeQuota (updated)
Changes (request)
{'ComputeQuotaConfig': {'ComputeQuotaResources': {'AcceleratorPartition': {'Count': 'integer',
                                                                           'Type': 'mig-1g.5gb | mig-1g.10gb | mig-1g.18gb | mig-1g.20gb | mig-1g.23gb | mig-1g.35gb | mig-1g.45gb | mig-1g.47gb | mig-2g.10gb | mig-2g.20gb | mig-2g.35gb | mig-2g.45gb | mig-2g.47gb | mig-3g.20gb | mig-3g.40gb | mig-3g.71gb | mig-3g.90gb | mig-3g.93gb | mig-4g.20gb | mig-4g.40gb | mig-4g.71gb | mig-4g.90gb | mig-4g.93gb | mig-7g.40gb | mig-7g.80gb | mig-7g.141gb | mig-7g.180gb | mig-7g.186gb'}}}}

Creates a compute allocation definition. This defines how compute is allocated, shared, and borrowed for the specified entities, including how idle compute is lent and borrowed and the fair-share weight assigned to those entities.

See also: AWS API Documentation

Request Syntax

client.create_compute_quota(
    Name='string',
    Description='string',
    ClusterArn='string',
    ComputeQuotaConfig={
        'ComputeQuotaResources': [
            {
                'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.p6e-gb200.36xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.3xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
                'Count': 123,
                'Accelerators': 123,
                'VCpu': ...,
                'MemoryInGiB': ...,
                'AcceleratorPartition': {
                    'Type': 'mig-1g.5gb'|'mig-1g.10gb'|'mig-1g.18gb'|'mig-1g.20gb'|'mig-1g.23gb'|'mig-1g.35gb'|'mig-1g.45gb'|'mig-1g.47gb'|'mig-2g.10gb'|'mig-2g.20gb'|'mig-2g.35gb'|'mig-2g.45gb'|'mig-2g.47gb'|'mig-3g.20gb'|'mig-3g.40gb'|'mig-3g.71gb'|'mig-3g.90gb'|'mig-3g.93gb'|'mig-4g.20gb'|'mig-4g.40gb'|'mig-4g.71gb'|'mig-4g.90gb'|'mig-4g.93gb'|'mig-7g.40gb'|'mig-7g.80gb'|'mig-7g.141gb'|'mig-7g.180gb'|'mig-7g.186gb',
                    'Count': 123
                }
            },
        ],
        'ResourceSharingConfig': {
            'Strategy': 'Lend'|'DontLend'|'LendAndBorrow',
            'BorrowLimit': 123
        },
        'PreemptTeamTasks': 'Never'|'LowerPriority'
    },
    ComputeQuotaTarget={
        'TeamName': 'string',
        'FairShareWeight': 123
    },
    ActivationState='Enabled'|'Disabled',
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
type Name:

string

param Name:

[REQUIRED]

Name of the compute allocation definition.

type Description:

string

param Description:

Description of the compute allocation definition.

type ClusterArn:

string

param ClusterArn:

[REQUIRED]

ARN of the cluster.

type ComputeQuotaConfig:

dict

param ComputeQuotaConfig:

[REQUIRED]

Configuration of the compute allocation definition. This includes the resource sharing option and the setting to preempt low-priority tasks.

  • ComputeQuotaResources (list) --

    Allocate compute resources by instance types.

    • (dict) --

      Configuration of the resources used for the compute allocation definition.

      • InstanceType (string) -- [REQUIRED]

        The instance type of the instance group for the cluster.

      • Count (integer) --

        The number of instances to add to the instance group of a SageMaker HyperPod cluster.

      • Accelerators (integer) --

        The number of accelerators to allocate. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the number of accelerators you provide. For example, if you allocate 16 out of 32 total accelerators, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU and MemoryInGiB.

      • VCpu (float) --

        The number of vCPU to allocate. If you specify a value only for vCPU, SageMaker AI automatically allocates ratio-based values for MemoryInGiB based on this vCPU parameter. For example, if you allocate 20 out of 40 total vCPU, SageMaker AI uses the ratio of 0.5 and allocates values to MemoryInGiB. Accelerators are set to 0.

      • MemoryInGiB (float) --

        The amount of memory in GiB to allocate. If you specify a value only for this parameter, SageMaker AI automatically allocates a ratio-based value for vCPU based on this memory that you provide. For example, if you allocate 200 out of 400 total memory in GiB, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU. Accelerators are set to 0.

      • AcceleratorPartition (dict) --

        The accelerator partition configuration for fractional GPU allocation.

        • Type (string) -- [REQUIRED]

          The Multi-Instance GPU (MIG) profile type that defines the partition configuration. The profile specifies the compute and memory allocation for each partition instance. The available profile types depend on the instance type specified in the compute quota configuration.

        • Count (integer) -- [REQUIRED]

          The number of accelerator partitions to allocate with the specified partition type. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the accelerator partition count you provide.

  • ResourceSharingConfig (dict) --

    Resource sharing configuration. This defines how an entity can lend and borrow idle compute with other entities within the cluster.

    • Strategy (string) -- [REQUIRED]

      The strategy for sharing idle compute within the cluster. The options are:

      • DontLend: entities do not lend idle compute.

      • Lend: entities can lend idle compute to entities that can borrow.

      • LendAndBorrow: entities can lend idle compute and borrow idle compute from other entities.

      Default is LendAndBorrow.

    • BorrowLimit (integer) --

      The limit on how much idle compute can be borrowed. The value can be 1 - 500 percent of idle compute that the team is allowed to borrow.

      Default is 50.

  • PreemptTeamTasks (string) --

    Allows workloads from within an entity to preempt same-team workloads. When set to LowerPriority, the entity's lower priority tasks are preempted by their own higher priority tasks.

    Default is LowerPriority.
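
The ratio-based defaults described for Accelerators, VCpu, and MemoryInGiB can be illustrated with a small calculation. This is only a sketch of the arithmetic in the example above (16 of 32 accelerators gives a ratio of 0.5). The function name and the instance totals of 192 vCPU and 2048 GiB are hypothetical, and any rounding SageMaker AI applies is not stated here.

```python
def ratio_based_allocation(requested_accelerators, total_accelerators,
                           total_vcpu, total_memory_gib):
    """Scale vCPU and memory by the fraction of accelerators requested."""
    ratio = requested_accelerators / total_accelerators
    return ratio, total_vcpu * ratio, total_memory_gib * ratio


# 16 of 32 accelerators, with hypothetical totals of 192 vCPU and 2048 GiB.
ratio, vcpu, memory_gib = ratio_based_allocation(16, 32, 192, 2048)
```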

type ComputeQuotaTarget:

dict

param ComputeQuotaTarget:

[REQUIRED]

The target entity to allocate compute resources to.

  • TeamName (string) -- [REQUIRED]

    Name of the team to allocate compute resources to.

  • FairShareWeight (integer) --

    Assigned entity fair-share weight. Idle compute will be shared across entities based on these assigned weights. This weight is only used when FairShare is enabled.

    A weight of 0 is the lowest priority and 100 is the highest. Weight 0 is the default.

type ActivationState:

string

param ActivationState:

The state of the compute allocation being described. Use this parameter to enable or disable the compute allocation.

Default is Enabled.

type Tags:

list

param Tags:

Tags of the compute allocation definition.

  • (dict) --

    A tag object that consists of a key and an optional value, used to manage metadata for SageMaker Amazon Web Services resources.

    You can add tags to notebook instances, training jobs, hyperparameter tuning jobs, batch transform jobs, models, labeling jobs, work teams, endpoint configurations, and endpoints. For more information on adding tags to SageMaker resources, see AddTags.

    For more information on adding metadata to your Amazon Web Services resources with tagging, see Tagging Amazon Web Services resources. For advice on best practices for managing Amazon Web Services resources with tagging, see Tagging Best Practices: Implement an Effective Amazon Web Services Resource Tagging Strategy.

    • Key (string) -- [REQUIRED]

      The tag key. Tag keys must be unique per resource.

    • Value (string) -- [REQUIRED]

      The tag value.

rtype:

dict

returns:

Response Syntax

{
    'ComputeQuotaArn': 'string',
    'ComputeQuotaId': 'string'
}

Response Structure

  • (dict) --

    • ComputeQuotaArn (string) --

      ARN of the compute allocation definition.

    • ComputeQuotaId (string) --

      ID of the compute allocation definition.
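
As an illustration of the new AcceleratorPartition field, the following sketch assembles keyword arguments for create_compute_quota with a partition-based quota. The helper name, the ARN, the team name, and the pairing of `ml.p4d.24xlarge` with `mig-1g.5gb` are assumptions for the example; the profile types actually available depend on the instance type, as noted above. Only the documented 1 - 500 range for BorrowLimit is validated.

```python
def build_partition_quota_request(name, cluster_arn, team_name, instance_type,
                                  partition_type, partition_count,
                                  borrow_limit=50):
    """Assemble keyword arguments for create_compute_quota."""
    if not 1 <= borrow_limit <= 500:
        raise ValueError("BorrowLimit must be between 1 and 500 percent")
    return {
        "Name": name,
        "ClusterArn": cluster_arn,
        "ComputeQuotaConfig": {
            "ComputeQuotaResources": [
                {
                    "InstanceType": instance_type,
                    # Fractional GPU quota expressed as MIG partitions.
                    "AcceleratorPartition": {
                        "Type": partition_type,
                        "Count": partition_count,
                    },
                }
            ],
            "ResourceSharingConfig": {
                "Strategy": "LendAndBorrow",
                "BorrowLimit": borrow_limit,
            },
            "PreemptTeamTasks": "LowerPriority",
        },
        "ComputeQuotaTarget": {"TeamName": team_name},
    }


# All values below are placeholders for illustration.
request = build_partition_quota_request(
    name="team-a-quota",
    cluster_arn="arn:aws:sagemaker:us-west-2:111122223333:cluster/example",
    team_name="team-a",
    instance_type="ml.p4d.24xlarge",
    partition_type="mig-1g.5gb",
    partition_count=4,
)
```

The resulting dictionary could then be passed as client.create_compute_quota(**request), which returns the ComputeQuotaArn and ComputeQuotaId shown above.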

DescribeComputeQuota (updated)
Changes (response)
{'ComputeQuotaConfig': {'ComputeQuotaResources': {'AcceleratorPartition': {'Count': 'integer',
                                                                           'Type': 'mig-1g.5gb | mig-1g.10gb | mig-1g.18gb | mig-1g.20gb | mig-1g.23gb | mig-1g.35gb | mig-1g.45gb | mig-1g.47gb | mig-2g.10gb | mig-2g.20gb | mig-2g.35gb | mig-2g.45gb | mig-2g.47gb | mig-3g.20gb | mig-3g.40gb | mig-3g.71gb | mig-3g.90gb | mig-3g.93gb | mig-4g.20gb | mig-4g.40gb | mig-4g.71gb | mig-4g.90gb | mig-4g.93gb | mig-7g.40gb | mig-7g.80gb | mig-7g.141gb | mig-7g.180gb | mig-7g.186gb'}}}}

Description of the compute allocation definition.

See also: AWS API Documentation

Request Syntax

client.describe_compute_quota(
    ComputeQuotaId='string',
    ComputeQuotaVersion=123
)
type ComputeQuotaId:

string

param ComputeQuotaId:

[REQUIRED]

ID of the compute allocation definition.

type ComputeQuotaVersion:

integer

param ComputeQuotaVersion:

Version of the compute allocation definition.

rtype:

dict

returns:

Response Syntax

{
    'ComputeQuotaArn': 'string',
    'ComputeQuotaId': 'string',
    'Name': 'string',
    'Description': 'string',
    'ComputeQuotaVersion': 123,
    'Status': 'Creating'|'CreateFailed'|'CreateRollbackFailed'|'Created'|'Updating'|'UpdateFailed'|'UpdateRollbackFailed'|'Updated'|'Deleting'|'DeleteFailed'|'DeleteRollbackFailed'|'Deleted',
    'FailureReason': 'string',
    'ClusterArn': 'string',
    'ComputeQuotaConfig': {
        'ComputeQuotaResources': [
            {
                'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.p6e-gb200.36xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.3xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
                'Count': 123,
                'Accelerators': 123,
                'VCpu': ...,
                'MemoryInGiB': ...,
                'AcceleratorPartition': {
                    'Type': 'mig-1g.5gb'|'mig-1g.10gb'|'mig-1g.18gb'|'mig-1g.20gb'|'mig-1g.23gb'|'mig-1g.35gb'|'mig-1g.45gb'|'mig-1g.47gb'|'mig-2g.10gb'|'mig-2g.20gb'|'mig-2g.35gb'|'mig-2g.45gb'|'mig-2g.47gb'|'mig-3g.20gb'|'mig-3g.40gb'|'mig-3g.71gb'|'mig-3g.90gb'|'mig-3g.93gb'|'mig-4g.20gb'|'mig-4g.40gb'|'mig-4g.71gb'|'mig-4g.90gb'|'mig-4g.93gb'|'mig-7g.40gb'|'mig-7g.80gb'|'mig-7g.141gb'|'mig-7g.180gb'|'mig-7g.186gb',
                    'Count': 123
                }
            },
        ],
        'ResourceSharingConfig': {
            'Strategy': 'Lend'|'DontLend'|'LendAndBorrow',
            'BorrowLimit': 123
        },
        'PreemptTeamTasks': 'Never'|'LowerPriority'
    },
    'ComputeQuotaTarget': {
        'TeamName': 'string',
        'FairShareWeight': 123
    },
    'ActivationState': 'Enabled'|'Disabled',
    'CreationTime': datetime(2015, 1, 1),
    'CreatedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    },
    'LastModifiedTime': datetime(2015, 1, 1),
    'LastModifiedBy': {
        'UserProfileArn': 'string',
        'UserProfileName': 'string',
        'DomainId': 'string',
        'IamIdentity': {
            'Arn': 'string',
            'PrincipalId': 'string',
            'SourceIdentity': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • ComputeQuotaArn (string) --

      ARN of the compute allocation definition.

    • ComputeQuotaId (string) --

      ID of the compute allocation definition.

    • Name (string) --

      Name of the compute allocation definition.

    • Description (string) --

      Description of the compute allocation definition.

    • ComputeQuotaVersion (integer) --

      Version of the compute allocation definition.

    • Status (string) --

      Status of the compute allocation definition.

    • FailureReason (string) --

      Failure reason of the compute allocation definition.

    • ClusterArn (string) --

      ARN of the cluster.

    • ComputeQuotaConfig (dict) --

      Configuration of the compute allocation definition. This includes the resource sharing option and the setting to preempt low-priority tasks.

      • ComputeQuotaResources (list) --

        Allocate compute resources by instance types.

        • (dict) --

          Configuration of the resources used for the compute allocation definition.

          • InstanceType (string) --

            The instance type of the instance group for the cluster.

          • Count (integer) --

            The number of instances to add to the instance group of a SageMaker HyperPod cluster.

          • Accelerators (integer) --

            The number of accelerators to allocate. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the number of accelerators you provide. For example, if you allocate 16 out of 32 total accelerators, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU and MemoryInGiB.

          • VCpu (float) --

            The number of vCPU to allocate. If you specify a value only for vCPU, SageMaker AI automatically allocates ratio-based values for MemoryInGiB based on this vCPU parameter. For example, if you allocate 20 out of 40 total vCPU, SageMaker AI uses the ratio of 0.5 and allocates values to MemoryInGiB. Accelerators are set to 0.

          • MemoryInGiB (float) --

            The amount of memory in GiB to allocate. If you specify a value only for this parameter, SageMaker AI automatically allocates a ratio-based value for vCPU based on this memory that you provide. For example, if you allocate 200 out of 400 total memory in GiB, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU. Accelerators are set to 0.

          • AcceleratorPartition (dict) --

            The accelerator partition configuration for fractional GPU allocation.

            • Type (string) --

              The Multi-Instance GPU (MIG) profile type that defines the partition configuration. The profile specifies the compute and memory allocation for each partition instance. The available profile types depend on the instance type specified in the compute quota configuration.

            • Count (integer) --

              The number of accelerator partitions to allocate with the specified partition type. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the accelerator partition count you provide.

      • ResourceSharingConfig (dict) --

        Resource sharing configuration. This defines how an entity can lend and borrow idle compute with other entities within the cluster.

        • Strategy (string) --

          The strategy of how idle compute is shared within the cluster. The following are the options of strategies.

          • DontLend: entities do not lend idle compute.

          • Lend: entities can lend idle compute to entities that can borrow.

          • LendAndBorrow: entities can lend idle compute and borrow idle compute from other entities.

          Default is LendAndBorrow.

        • BorrowLimit (integer) --

          The limit on how much idle compute can be borrowed. The value can be 1 to 500 percent of the idle compute that the team is allowed to borrow.

          Default is 50.

      • PreemptTeamTasks (string) --

        Allows workloads from within an entity to preempt same-team workloads. When set to LowerPriority, the entity's lower priority tasks are preempted by their own higher priority tasks.

        Default is LowerPriority.

    • ComputeQuotaTarget (dict) --

      The target entity to allocate compute resources to.

      • TeamName (string) --

        Name of the team to allocate compute resources to.

      • FairShareWeight (integer) --

        Assigned entity fair-share weight. Idle compute will be shared across entities based on these assigned weights. This weight is only used when FairShare is enabled.

        A weight of 0 is the lowest priority and 100 is the highest. Weight 0 is the default.

    • ActivationState (string) --

      The state of the compute allocation being described. Use it to enable or disable the compute allocation.

      Default is Enabled.

    • CreationTime (datetime) --

      Creation time of the compute allocation configuration.

    • CreatedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.

    • LastModifiedTime (datetime) --

      Last modified time of the compute allocation configuration.

    • LastModifiedBy (dict) --

      Information about the user who created or modified a SageMaker resource.

      • UserProfileArn (string) --

        The Amazon Resource Name (ARN) of the user's profile.

      • UserProfileName (string) --

        The name of the user's profile.

      • DomainId (string) --

        The domain associated with the user.

      • IamIdentity (dict) --

        The IAM Identity details associated with the user. These details are associated with model package groups, model packages, and project entities only.

        • Arn (string) --

          The Amazon Resource Name (ARN) of the IAM identity.

        • PrincipalId (string) --

          The ID of the principal that assumes the IAM identity.

        • SourceIdentity (string) --

          The person or application which assumes the IAM identity.
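The ratio-based defaults described above for Accelerators, VCpu, and MemoryInGiB amount to proportional arithmetic. The following is a minimal sketch of that arithmetic only. The helper name, the instance totals, and the lack of rounding are assumptions made for illustration; how the service rounds or validates these values is not stated in this reference.

```python
def ratio_based_defaults(requested, total, other_totals):
    """Scale each value in other_totals by requested / total.

    Hypothetical helper: mirrors the documented example in which
    allocating 16 of 32 accelerators gives a ratio of 0.5.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= requested <= total:
        raise ValueError("requested must be between 0 and total")
    ratio = requested / total
    return {name: value * ratio for name, value in other_totals.items()}


# Assumed per-instance totals, chosen only to make the arithmetic visible.
instance_totals = {"VCpu": 192, "MemoryInGiB": 2048}

# 16 of 32 accelerators -> ratio 0.5 applied to vCPU and memory.
defaults = ratio_based_defaults(16, 32, instance_totals)
print(defaults)
```

The same helper covers the VCpu-only and MemoryInGiB-only cases by changing which quantity is passed as `requested` and which totals are scaled.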

DescribeTrainingPlan (updated)
Changes (response)
{'TargetResources': {'endpoint'}}

Retrieves detailed information about a specific training plan.

See also: AWS API Documentation

Request Syntax

client.describe_training_plan(
    TrainingPlanName='string'
)
type TrainingPlanName:

string

param TrainingPlanName:

[REQUIRED]

The name of the training plan to describe.

rtype:

dict

returns:

Response Syntax

{
    'TrainingPlanArn': 'string',
    'TrainingPlanName': 'string',
    'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
    'StatusMessage': 'string',
    'DurationHours': 123,
    'DurationMinutes': 123,
    'StartTime': datetime(2015, 1, 1),
    'EndTime': datetime(2015, 1, 1),
    'UpfrontFee': 'string',
    'CurrencyCode': 'string',
    'TotalInstanceCount': 123,
    'AvailableInstanceCount': 123,
    'InUseInstanceCount': 123,
    'UnhealthyInstanceCount': 123,
    'AvailableSpareInstanceCount': 123,
    'TotalUltraServerCount': 123,
    'TargetResources': [
        'training-job'|'hyperpod-cluster'|'endpoint',
    ],
    'ReservedCapacitySummaries': [
        {
            'ReservedCapacityArn': 'string',
            'ReservedCapacityType': 'UltraServer'|'Instance',
            'UltraServerType': 'string',
            'UltraServerCount': 123,
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge'|'ml.p6-b200.48xlarge'|'ml.p4de.24xlarge'|'ml.p6e-gb200.36xlarge'|'ml.p5.4xlarge',
            'TotalInstanceCount': 123,
            'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
            'AvailabilityZone': 'string',
            'DurationHours': 123,
            'DurationMinutes': 123,
            'StartTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1)
        },
    ]
}

Response Structure

  • (dict) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan.

    • TrainingPlanName (string) --

      The name of the training plan.

    • Status (string) --

      The current status of the training plan (e.g., Pending, Active, Expired). To see the complete list of status values available for a training plan, refer to the Status attribute within the TrainingPlanSummary object.

    • StatusMessage (string) --

      A message providing additional information about the current status of the training plan.

    • DurationHours (integer) --

      The number of whole hours in the total duration for this training plan.

    • DurationMinutes (integer) --

      The additional minutes beyond whole hours in the total duration for this training plan.

    • StartTime (datetime) --

      The start time of the training plan.

    • EndTime (datetime) --

      The end time of the training plan.

    • UpfrontFee (string) --

      The upfront fee for the training plan.

    • CurrencyCode (string) --

      The currency code for the upfront fee (e.g., USD).

    • TotalInstanceCount (integer) --

      The total number of instances reserved in this training plan.

    • AvailableInstanceCount (integer) --

      The number of instances currently available for use in this training plan.

    • InUseInstanceCount (integer) --

      The number of instances currently in use from this training plan.

    • UnhealthyInstanceCount (integer) --

      The number of instances in the training plan that are currently in an unhealthy state.

    • AvailableSpareInstanceCount (integer) --

      The number of available spare instances in the training plan.

    • TotalUltraServerCount (integer) --

      The total number of UltraServers reserved for this training plan.

    • TargetResources (list) --

      The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod, SageMaker Endpoints) that can use this training plan.

      Training plans are specific to their target resource.

      • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

      • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

      • A training plan for SageMaker endpoints can be used exclusively to provide compute resources to SageMaker endpoints for model deployment.

      • (string) --

    • ReservedCapacitySummaries (list) --

      The list of Reserved Capacity providing the underlying compute resources of the plan.

      • (dict) --

        Details of a reserved capacity for the training plan.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

        • ReservedCapacityArn (string) --

          The Amazon Resource Name (ARN) of the reserved capacity.

        • ReservedCapacityType (string) --

          The type of reserved capacity.

        • UltraServerType (string) --

          The type of UltraServer included in this reserved capacity, such as ml.u-p6e-gb200x72.

        • UltraServerCount (integer) --

          The number of UltraServers included in this reserved capacity.

        • InstanceType (string) --

          The instance type for the reserved capacity.

        • TotalInstanceCount (integer) --

          The total number of instances in the reserved capacity.

        • Status (string) --

          The current status of the reserved capacity.

        • AvailabilityZone (string) --

          The availability zone for the reserved capacity.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this reserved capacity.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this reserved capacity.

        • StartTime (datetime) --

          The start time of the reserved capacity.

        • EndTime (datetime) --

          The end time of the reserved capacity.
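As an illustration of reading this response, the sketch below checks whether a plan lists the new 'endpoint' target and combines DurationHours and DurationMinutes into a single duration. The helper name and the sample dictionary are hypothetical; the sample is hand-written to match the response syntax above and is not real service output.

```python
from datetime import timedelta


def summarize_training_plan(response):
    """Extract a few fields from a DescribeTrainingPlan response dict."""
    # DurationHours holds whole hours; DurationMinutes holds the remainder.
    duration = timedelta(
        hours=response.get("DurationHours", 0),
        minutes=response.get("DurationMinutes", 0),
    )
    targets = response.get("TargetResources", [])
    return {
        "name": response["TrainingPlanName"],
        "status": response["Status"],
        "duration": duration,
        "supports_endpoints": "endpoint" in targets,
    }


# Hand-written sample shaped like the response syntax (not real data).
sample = {
    "TrainingPlanName": "example-plan",
    "Status": "Active",
    "DurationHours": 48,
    "DurationMinutes": 30,
    "TargetResources": ["endpoint"],
}
summary = summarize_training_plan(sample)
print(summary["supports_endpoints"], summary["duration"])
```

With a real client, the same helper could presumably be fed the result of `boto3.client("sagemaker").describe_training_plan(TrainingPlanName="example-plan")`, where the plan name is a placeholder.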

ListClusterNodes (updated)
Changes (response)
{'ClusterNodeSummaries': {'PrivateDnsHostname': 'string'}}

Retrieves the list of instances (also called nodes interchangeably) in a SageMaker HyperPod cluster.

See also: AWS API Documentation

Request Syntax

client.list_cluster_nodes(
    ClusterName='string',
    CreationTimeAfter=datetime(2015, 1, 1),
    CreationTimeBefore=datetime(2015, 1, 1),
    InstanceGroupNameContains='string',
    MaxResults=123,
    NextToken='string',
    SortBy='CREATION_TIME'|'NAME',
    SortOrder='Ascending'|'Descending',
    IncludeNodeLogicalIds=True|False
)
type ClusterName:

string

param ClusterName:

[REQUIRED]

The string name or the Amazon Resource Name (ARN) of the SageMaker HyperPod cluster in which you want to retrieve the list of nodes.

type CreationTimeAfter:

datetime

param CreationTimeAfter:

A filter that returns nodes in a SageMaker HyperPod cluster created after the specified time. Timestamps are formatted according to the ISO 8601 standard.

Acceptable formats include:

  • YYYY-MM-DDThh:mm:ss.sssTZD (UTC), for example, 2014-10-01T20:30:00.000Z

  • YYYY-MM-DDThh:mm:ss.sssTZD (with offset), for example, 2014-10-01T12:30:00.000-08:00

  • YYYY-MM-DD, for example, 2014-10-01

  • Unix time in seconds, for example, 1412195400. This is also referred to as Unix Epoch time and represents the number of seconds since midnight, January 1, 1970 UTC.

For more information about the timestamp format, see Timestamp in the Amazon Web Services Command Line Interface User Guide.

type CreationTimeBefore:

datetime

param CreationTimeBefore:

A filter that returns nodes in a SageMaker HyperPod cluster created before the specified time. The acceptable formats are the same as the timestamp formats for CreationTimeAfter. For more information about the timestamp format, see Timestamp in the Amazon Web Services Command Line Interface User Guide.

type InstanceGroupNameContains:

string

param InstanceGroupNameContains:

A filter that returns the instance groups whose names contain a specified string.

type MaxResults:

integer

param MaxResults:

The maximum number of nodes to return in the response.

type NextToken:

string

param NextToken:

If the result of the previous ListClusterNodes request was truncated, the response includes a NextToken. To retrieve the next set of cluster nodes, use the token in the next request.

type SortBy:

string

param SortBy:

The field by which to sort results. The default value is CREATION_TIME.

type SortOrder:

string

param SortOrder:

The sort order for results. The default value is Ascending.

type IncludeNodeLogicalIds:

boolean

param IncludeNodeLogicalIds:

Specifies whether to include nodes that are still being provisioned in the response. When set to True, the response includes all nodes regardless of their provisioning status. When set to False (default), only nodes with assigned InstanceIds are returned.

rtype:

dict

returns:

Response Syntax

{
    'NextToken': 'string',
    'ClusterNodeSummaries': [
        {
            'InstanceGroupName': 'string',
            'InstanceId': 'string',
            'NodeLogicalId': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.p6e-gb200.36xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.3xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
            'LaunchTime': datetime(2015, 1, 1),
            'LastSoftwareUpdateTime': datetime(2015, 1, 1),
            'InstanceStatus': {
                'Status': 'Running'|'Failure'|'Pending'|'ShuttingDown'|'SystemUpdating'|'DeepHealthCheckInProgress'|'NotFound',
                'Message': 'string'
            },
            'UltraServerInfo': {
                'Id': 'string'
            },
            'PrivateDnsHostname': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • NextToken (string) --

      The next token specified for listing instances in a SageMaker HyperPod cluster.

    • ClusterNodeSummaries (list) --

      The summaries of listed instances in a SageMaker HyperPod cluster.

      • (dict) --

        Lists a summary of the properties of an instance (also called a node interchangeably) of a SageMaker HyperPod cluster.

        • InstanceGroupName (string) --

          The name of the instance group that the instance belongs to.

        • InstanceId (string) --

          The ID of the instance.

        • NodeLogicalId (string) --

          A unique identifier for the node that persists throughout its lifecycle, from provisioning request to termination. This identifier can be used to track the node even before it has an assigned InstanceId. This field is only included when IncludeNodeLogicalIds is set to True in the ListClusterNodes request.

        • InstanceType (string) --

          The type of the instance.

        • LaunchTime (datetime) --

          The time when the instance was launched.

        • LastSoftwareUpdateTime (datetime) --

          The time when SageMaker last updated the software of the instances in the cluster.

        • InstanceStatus (dict) --

          The status of the instance.

          • Status (string) --

            The status of an instance in a SageMaker HyperPod cluster.

          • Message (string) --

            The message from an instance in a SageMaker HyperPod cluster.

        • UltraServerInfo (dict) --

          Contains information about the UltraServer.

          • Id (string) --

            The unique identifier of the UltraServer.

        • PrivateDnsHostname (string) --

          The private DNS hostname of the SageMaker HyperPod cluster node.
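A minimal sketch of how the NextToken pagination and the new PrivateDnsHostname field might be used together. The function names and the `StubClient` class are hypothetical; the stub stands in for a real SageMaker client so the example runs offline, and its instance IDs and hostnames are made up.

```python
def iter_cluster_nodes(client, cluster_name, include_logical_ids=False):
    """Yield every ClusterNodeSummaries entry, following NextToken."""
    kwargs = {
        "ClusterName": cluster_name,
        "IncludeNodeLogicalIds": include_logical_ids,
    }
    while True:
        page = client.list_cluster_nodes(**kwargs)
        yield from page.get("ClusterNodeSummaries", [])
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def hostnames_by_instance(client, cluster_name):
    """Map InstanceId to PrivateDnsHostname for nodes that report both."""
    return {
        node["InstanceId"]: node["PrivateDnsHostname"]
        for node in iter_cluster_nodes(client, cluster_name)
        if "InstanceId" in node and "PrivateDnsHostname" in node
    }


class StubClient:
    """Offline stand-in for a SageMaker client; returns two canned pages."""

    def list_cluster_nodes(self, **kwargs):
        if "NextToken" not in kwargs:
            return {
                "ClusterNodeSummaries": [
                    {"InstanceId": "i-aaa", "PrivateDnsHostname": "ip-10-0-0-1.example.internal"},
                ],
                "NextToken": "page-2",
            }
        return {
            "ClusterNodeSummaries": [
                {"InstanceId": "i-bbb", "PrivateDnsHostname": "ip-10-0-0-2.example.internal"},
            ]
        }


hostnames = hostnames_by_instance(StubClient(), "example-cluster")
print(hostnames)
```

To run this against a real cluster, the stub would be replaced with `boto3.client("sagemaker")`. Nodes that are still being provisioned may lack an InstanceId, which is why the comprehension checks for both keys.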

ListComputeQuotas (updated)
Changes (response)
{'ComputeQuotaSummaries': {'ComputeQuotaConfig': {'ComputeQuotaResources': {'AcceleratorPartition': {'Count': 'integer', 'Type': 'mig-1g.5gb | mig-1g.10gb | mig-1g.18gb | mig-1g.20gb | mig-1g.23gb | mig-1g.35gb | mig-1g.45gb | mig-1g.47gb | mig-2g.10gb | mig-2g.20gb | mig-2g.35gb | mig-2g.45gb | mig-2g.47gb | mig-3g.20gb | mig-3g.40gb | mig-3g.71gb | mig-3g.90gb | mig-3g.93gb | mig-4g.20gb | mig-4g.40gb | mig-4g.71gb | mig-4g.90gb | mig-4g.93gb | mig-7g.40gb | mig-7g.80gb | mig-7g.141gb | mig-7g.180gb | mig-7g.186gb'}}}}}

Lists the resource allocation definitions.

See also: AWS API Documentation

Request Syntax

client.list_compute_quotas(
    CreatedAfter=datetime(2015, 1, 1),
    CreatedBefore=datetime(2015, 1, 1),
    NameContains='string',
    Status='Creating'|'CreateFailed'|'CreateRollbackFailed'|'Created'|'Updating'|'UpdateFailed'|'UpdateRollbackFailed'|'Updated'|'Deleting'|'DeleteFailed'|'DeleteRollbackFailed'|'Deleted',
    ClusterArn='string',
    SortBy='Name'|'CreationTime'|'Status'|'ClusterArn',
    SortOrder='Ascending'|'Descending',
    NextToken='string',
    MaxResults=123
)
type CreatedAfter:

datetime

param CreatedAfter:

Filter for compute allocation definitions created after this time. The input for this parameter is a Unix timestamp. To convert a date and time into a Unix timestamp, see EpochConverter.

type CreatedBefore:

datetime

param CreatedBefore:

Filter for compute allocation definitions created before this time. The input for this parameter is a Unix timestamp. To convert a date and time into a Unix timestamp, see EpochConverter.

type NameContains:

string

param NameContains:

Filter for name containing this string.

type Status:

string

param Status:

Filter for status.

type ClusterArn:

string

param ClusterArn:

Filter for ARN of the cluster.

type SortBy:

string

param SortBy:

Filter for sorting the list by a given value. For example, sort by name, creation time, or status.

type SortOrder:

string

param SortOrder:

The order of the list. By default, results are listed in Descending order according to SortBy. To change the list order, set SortOrder to Ascending.

type NextToken:

string

param NextToken:

If the previous response was truncated, you will receive this token. Use it in your next request to receive the next set of results.

type MaxResults:

integer

param MaxResults:

The maximum number of compute allocation definitions to list.

rtype:

dict

returns:

Response Syntax

{
    'ComputeQuotaSummaries': [
        {
            'ComputeQuotaArn': 'string',
            'ComputeQuotaId': 'string',
            'Name': 'string',
            'ComputeQuotaVersion': 123,
            'Status': 'Creating'|'CreateFailed'|'CreateRollbackFailed'|'Created'|'Updating'|'UpdateFailed'|'UpdateRollbackFailed'|'Updated'|'Deleting'|'DeleteFailed'|'DeleteRollbackFailed'|'Deleted',
            'ClusterArn': 'string',
            'ComputeQuotaConfig': {
                'ComputeQuotaResources': [
                    {
                        'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.p6e-gb200.36xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.3xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
                        'Count': 123,
                        'Accelerators': 123,
                        'VCpu': ...,
                        'MemoryInGiB': ...,
                        'AcceleratorPartition': {
                            'Type': 'mig-1g.5gb'|'mig-1g.10gb'|'mig-1g.18gb'|'mig-1g.20gb'|'mig-1g.23gb'|'mig-1g.35gb'|'mig-1g.45gb'|'mig-1g.47gb'|'mig-2g.10gb'|'mig-2g.20gb'|'mig-2g.35gb'|'mig-2g.45gb'|'mig-2g.47gb'|'mig-3g.20gb'|'mig-3g.40gb'|'mig-3g.71gb'|'mig-3g.90gb'|'mig-3g.93gb'|'mig-4g.20gb'|'mig-4g.40gb'|'mig-4g.71gb'|'mig-4g.90gb'|'mig-4g.93gb'|'mig-7g.40gb'|'mig-7g.80gb'|'mig-7g.141gb'|'mig-7g.180gb'|'mig-7g.186gb',
                            'Count': 123
                        }
                    },
                ],
                'ResourceSharingConfig': {
                    'Strategy': 'Lend'|'DontLend'|'LendAndBorrow',
                    'BorrowLimit': 123
                },
                'PreemptTeamTasks': 'Never'|'LowerPriority'
            },
            'ComputeQuotaTarget': {
                'TeamName': 'string',
                'FairShareWeight': 123
            },
            'ActivationState': 'Enabled'|'Disabled',
            'CreationTime': datetime(2015, 1, 1),
            'LastModifiedTime': datetime(2015, 1, 1)
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • ComputeQuotaSummaries (list) --

      Summaries of the compute allocation definitions.

      • (dict) --

        Summary of the compute allocation definition.

        • ComputeQuotaArn (string) --

          ARN of the compute allocation definition.

        • ComputeQuotaId (string) --

          ID of the compute allocation definition.

        • Name (string) --

          Name of the compute allocation definition.

        • ComputeQuotaVersion (integer) --

          Version of the compute allocation definition.

        • Status (string) --

          Status of the compute allocation definition.

        • ClusterArn (string) --

          ARN of the cluster.

        • ComputeQuotaConfig (dict) --

          Configuration of the compute allocation definition. This includes the resource sharing option and the setting to preempt low-priority tasks.

          • ComputeQuotaResources (list) --

            Allocate compute resources by instance types.

            • (dict) --

              Configuration of the resources used for the compute allocation definition.

              • InstanceType (string) --

                The instance type of the instance group for the cluster.

              • Count (integer) --

                The number of instances to add to the instance group of a SageMaker HyperPod cluster.

              • Accelerators (integer) --

                The number of accelerators to allocate. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the number of accelerators you provide. For example, if you allocate 16 out of 32 total accelerators, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU and MemoryInGiB.

              • VCpu (float) --

                The number of vCPU to allocate. If you specify a value only for vCPU, SageMaker AI automatically allocates ratio-based values for MemoryInGiB based on this vCPU parameter. For example, if you allocate 20 out of 40 total vCPU, SageMaker AI uses the ratio of 0.5 and allocates values to MemoryInGiB. Accelerators are set to 0.

              • MemoryInGiB (float) --

                The amount of memory in GiB to allocate. If you specify a value only for this parameter, SageMaker AI automatically allocates a ratio-based value for vCPU based on this memory that you provide. For example, if you allocate 200 out of 400 total memory in GiB, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU. Accelerators are set to 0.

              • AcceleratorPartition (dict) --

                The accelerator partition configuration for fractional GPU allocation.

                • Type (string) --

                  The Multi-Instance GPU (MIG) profile type that defines the partition configuration. The profile specifies the compute and memory allocation for each partition instance. The available profile types depend on the instance type specified in the compute quota configuration.

                • Count (integer) --

                  The number of accelerator partitions to allocate with the specified partition type. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the accelerator partition count you provide.

          • ResourceSharingConfig (dict) --

            Resource sharing configuration. This defines how an entity can lend and borrow idle compute with other entities within the cluster.

            • Strategy (string) --

              The strategy for sharing idle compute within the cluster. The available strategies are:

              • DontLend: entities do not lend idle compute.

              • Lend: entities can lend idle compute to entities that can borrow.

              • LendAndBorrow: entities can lend idle compute and borrow idle compute from other entities.

              Default is LendAndBorrow.

            • BorrowLimit (integer) --

              The limit on how much idle compute can be borrowed. Valid values are 1 to 500, expressed as a percentage of the idle compute that the team is allowed to borrow.

              Default is 50.

          • PreemptTeamTasks (string) --

            Allows workloads from within an entity to preempt same-team workloads. When set to LowerPriority, the entity's lower priority tasks are preempted by their own higher priority tasks.

            Default is LowerPriority.

        • ComputeQuotaTarget (dict) --

          The target entity to allocate compute resources to.

          • TeamName (string) --

            Name of the team to allocate compute resources to.

          • FairShareWeight (integer) --

            Assigned entity fair-share weight. Idle compute will be shared across entities based on these assigned weights. This weight is only used when FairShare is enabled.

            A weight of 0 is the lowest priority and 100 is the highest. Weight 0 is the default.

        • ActivationState (string) --

          The state of the compute allocation being described. Use it to enable or disable the compute allocation.

          Default is Enabled.

        • CreationTime (datetime) --

          Creation time of the compute allocation definition.

        • LastModifiedTime (datetime) --

          Last modified time of the compute allocation definition.

    • NextToken (string) --

      If the previous response was truncated, you will receive this token. Use it in your next request to receive the next set of results.
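The ratio-based defaults described above for Accelerators, VCpu, and MemoryInGiB come down to simple proportional arithmetic. The sketch below only illustrates that arithmetic: the function name and the instance totals (32 accelerators, 128 vCPU, 1024 GiB) are hypothetical, no rounding is applied, and the allocation logic SageMaker AI actually uses may differ.

```python
def ratio_based_defaults(requested, total, other_totals):
    """Scale the unspecified resources by requested / total.

    requested    -- amount of the resource you specified (e.g. 16 accelerators)
    total        -- total amount of that resource available (e.g. 32)
    other_totals -- hypothetical totals for the resources left unspecified
    """
    if total <= 0 or not 0 <= requested <= total:
        raise ValueError("requested must be between 0 and total")
    ratio = requested / total
    return {name: ratio * value for name, value in other_totals.items()}


# 16 of 32 accelerators gives a ratio of 0.5, applied to vCPU and memory.
defaults = ratio_based_defaults(16, 32, {"VCpu": 128, "MemoryInGiB": 1024})
print(defaults)
```

The same helper covers the vCPU-only case from the text: 20 of 40 vCPU yields a ratio of 0.5 that is then applied to the memory total.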

ListTrainingPlans (updated)
Changes (response)
{'TrainingPlanSummaries': {'TargetResources': {'endpoint'}}}

Retrieves a list of training plans for the current account.

See also: AWS API Documentation

Request Syntax

client.list_training_plans(
    NextToken='string',
    MaxResults=123,
    StartTimeAfter=datetime(2015, 1, 1),
    StartTimeBefore=datetime(2015, 1, 1),
    SortBy='TrainingPlanName'|'StartTime'|'Status',
    SortOrder='Ascending'|'Descending',
    Filters=[
        {
            'Name': 'Status',
            'Value': 'string'
        },
    ]
)
type NextToken:

string

param NextToken:

A token to continue pagination if more results are available.

type MaxResults:

integer

param MaxResults:

The maximum number of results to return in the response.

type StartTimeAfter:

datetime

param StartTimeAfter:

Filter to list only training plans with an actual start time after this date.

type StartTimeBefore:

datetime

param StartTimeBefore:

Filter to list only training plans with an actual start time before this date.

type SortBy:

string

param SortBy:

The training plan field to sort the results by (e.g., StartTime, Status).

type SortOrder:

string

param SortOrder:

The order to sort the results (Ascending or Descending).

type Filters:

list

param Filters:

Additional filters to apply to the list of training plans.

  • (dict) --

    A filter to apply when listing or searching for training plans.

    For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

    • Name (string) -- [REQUIRED]

      The name of the filter field (e.g., Status, InstanceType).

    • Value (string) -- [REQUIRED]

      The value to filter by for the specified field.
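As an illustration of how the request parameters above fit together, the following sketch assembles keyword arguments for list_training_plans. The helper name, the default sort settings, and the example values are assumptions made for this example; only the parameter names and the filter shape come from the request syntax.

```python
from datetime import datetime, timezone


def build_list_training_plans_request(status=None, started_after=None,
                                      started_before=None, max_results=None):
    """Assemble keyword arguments for client.list_training_plans()."""
    kwargs = {"SortBy": "StartTime", "SortOrder": "Descending"}
    if status is not None:
        # Name and Value are both required in each filter entry.
        kwargs["Filters"] = [{"Name": "Status", "Value": status}]
    if started_after is not None:
        kwargs["StartTimeAfter"] = started_after
    if started_before is not None:
        kwargs["StartTimeBefore"] = started_before
    if max_results is not None:
        kwargs["MaxResults"] = max_results
    return kwargs


request = build_list_training_plans_request(
    status="Active",
    started_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
    max_results=50,
)
# With a configured boto3 SageMaker client: client.list_training_plans(**request)
```

Optional parameters are omitted rather than passed as None, since the request syntax does not describe null values.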

rtype:

dict

returns:

Response Syntax

{
    'NextToken': 'string',
    'TrainingPlanSummaries': [
        {
            'TrainingPlanArn': 'string',
            'TrainingPlanName': 'string',
            'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
            'StatusMessage': 'string',
            'DurationHours': 123,
            'DurationMinutes': 123,
            'StartTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1),
            'UpfrontFee': 'string',
            'CurrencyCode': 'string',
            'TotalInstanceCount': 123,
            'AvailableInstanceCount': 123,
            'InUseInstanceCount': 123,
            'TotalUltraServerCount': 123,
            'TargetResources': [
                'training-job'|'hyperpod-cluster'|'endpoint',
            ],
            'ReservedCapacitySummaries': [
                {
                    'ReservedCapacityArn': 'string',
                    'ReservedCapacityType': 'UltraServer'|'Instance',
                    'UltraServerType': 'string',
                    'UltraServerCount': 123,
                    'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge'|'ml.p6-b200.48xlarge'|'ml.p4de.24xlarge'|'ml.p6e-gb200.36xlarge'|'ml.p5.4xlarge',
                    'TotalInstanceCount': 123,
                    'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
                    'AvailabilityZone': 'string',
                    'DurationHours': 123,
                    'DurationMinutes': 123,
                    'StartTime': datetime(2015, 1, 1),
                    'EndTime': datetime(2015, 1, 1)
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • NextToken (string) --

      A token to continue pagination if more results are available.

    • TrainingPlanSummaries (list) --

      A list of summary information for the training plans.

      • (dict) --

        Details of the training plan.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

        • TrainingPlanArn (string) --

          The Amazon Resource Name (ARN) of the training plan.

        • TrainingPlanName (string) --

          The name of the training plan.

        • Status (string) --

          The current status of the training plan (e.g., Pending, Active, Expired). To see the complete list of status values available for a training plan, refer to the Status attribute within the TrainingPlanSummary object.

        • StatusMessage (string) --

          A message providing additional information about the current status of the training plan.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this training plan.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this training plan.

        • StartTime (datetime) --

          The start time of the training plan.

        • EndTime (datetime) --

          The end time of the training plan.

        • UpfrontFee (string) --

          The upfront fee for the training plan.

        • CurrencyCode (string) --

          The currency code for the upfront fee (e.g., USD).

        • TotalInstanceCount (integer) --

          The total number of instances reserved in this training plan.

        • AvailableInstanceCount (integer) --

          The number of instances currently available for use in this training plan.

        • InUseInstanceCount (integer) --

          The number of instances currently in use from this training plan.

        • TotalUltraServerCount (integer) --

          The total number of UltraServers allocated to this training plan.

        • TargetResources (list) --

          The target resources (e.g., training jobs, HyperPod clusters, Endpoints) that can use this training plan.

          Training plans are specific to their target resource.

          • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

          • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

          • A training plan for SageMaker endpoints can be used exclusively to provide compute resources to SageMaker endpoints for model deployment.

          • (string) --

        • ReservedCapacitySummaries (list) --

          A list of reserved capacities associated with this training plan, including details such as instance types, counts, and availability zones.

          • (dict) --

            Details of a reserved capacity for the training plan.

            For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

            • ReservedCapacityArn (string) --

              The Amazon Resource Name (ARN) of the reserved capacity.

            • ReservedCapacityType (string) --

              The type of reserved capacity.

            • UltraServerType (string) --

              The type of UltraServer included in this reserved capacity, such as ml.u-p6e-gb200x72.

            • UltraServerCount (integer) --

              The number of UltraServers included in this reserved capacity.

            • InstanceType (string) --

              The instance type for the reserved capacity.

            • TotalInstanceCount (integer) --

              The total number of instances in the reserved capacity.

            • Status (string) --

              The current status of the reserved capacity.

            • AvailabilityZone (string) --

              The availability zone for the reserved capacity.

            • DurationHours (integer) --

              The number of whole hours in the total duration for this reserved capacity.

            • DurationMinutes (integer) --

              The additional minutes beyond whole hours in the total duration for this reserved capacity.

            • StartTime (datetime) --

              The start time of the reserved capacity.

            • EndTime (datetime) --

              The end time of the reserved capacity.
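The NextToken field in the response can be fed back into the next request to walk through all pages. The sketch below shows one way to do that and to pick out plans that target a given resource, such as the newly added endpoint value. It assumes `client` is a boto3 SageMaker client (or any object exposing list_training_plans); both helper names are hypothetical.

```python
def iter_training_plans(client, **request):
    """Yield every TrainingPlanSummary, following NextToken across pages."""
    while True:
        page = client.list_training_plans(**request)
        yield from page.get("TrainingPlanSummaries", [])
        token = page.get("NextToken")
        if not token:
            break
        # Pass the token back to fetch the next page of results.
        request["NextToken"] = token


def plans_for_target(client, target):
    """Return the names of plans whose TargetResources include `target`."""
    return [
        plan["TrainingPlanName"]
        for plan in iter_training_plans(client)
        if target in plan.get("TargetResources", [])
    ]
```

For example, `plans_for_target(client, "endpoint")` would list the plans usable by SageMaker endpoints. Filtering is done client-side here because the request syntax shows only a Status filter.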

SearchTrainingPlanOfferings (updated)
Changes (request, response)
Request
{'TargetResources': {'endpoint'}}
Response
{'TrainingPlanOfferings': {'TargetResources': {'endpoint'}}}

Searches for available training plan offerings based on specified criteria.

  • Users search for available plan offerings based on their requirements (e.g., instance type, count, start time, duration).

  • They then create a plan that best matches their needs, using the ID of the chosen plan offering.

For more information about how to reserve GPU capacity for your SageMaker training jobs or SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

See also: AWS API Documentation

Request Syntax

client.search_training_plan_offerings(
    InstanceType='ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge'|'ml.p6-b200.48xlarge'|'ml.p4de.24xlarge'|'ml.p6e-gb200.36xlarge'|'ml.p5.4xlarge',
    InstanceCount=123,
    UltraServerType='string',
    UltraServerCount=123,
    StartTimeAfter=datetime(2015, 1, 1),
    EndTimeBefore=datetime(2015, 1, 1),
    DurationHours=123,
    TargetResources=[
        'training-job'|'hyperpod-cluster'|'endpoint',
    ]
)
type InstanceType:

string

param InstanceType:

The type of instance you want to search for in the available training plan offerings. This field allows you to filter the search results based on the specific compute resources you require for your SageMaker training jobs or SageMaker HyperPod clusters. When searching for training plan offerings, specifying the instance type helps you find Reserved Instances that match your computational needs.

type InstanceCount:

integer

param InstanceCount:

The number of instances you want to reserve in the training plan offerings. This allows you to specify the quantity of compute resources needed for your SageMaker training jobs or SageMaker HyperPod clusters, helping you find reserved capacity offerings that match your requirements.

type UltraServerType:

string

param UltraServerType:

The type of UltraServer to search for, such as ml.u-p6e-gb200x72.

type UltraServerCount:

integer

param UltraServerCount:

The number of UltraServers to search for.

type StartTimeAfter:

datetime

param StartTimeAfter:

A filter to search for training plan offerings with a start time after a specified date.

type EndTimeBefore:

datetime

param EndTimeBefore:

A filter to search for reserved capacity offerings with an end time before a specified date.

type DurationHours:

integer

param DurationHours:

[REQUIRED]

The desired duration in hours for the training plan offerings.

type TargetResources:

list

param TargetResources:

[REQUIRED]

The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod, SageMaker Endpoints) to search for in the offerings.

Training plans are specific to their target resource.

  • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

  • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

  • A training plan for SageMaker endpoints can be used exclusively to provide compute resources to SageMaker endpoints for model deployment.

  • (string) --
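A search request needs at least DurationHours and TargetResources. The sketch below assembles such a request and checks the target values against the three listed in the request syntax. The helper name, the positive-duration check, and the example values are assumptions; whether a given instance type is offered for a given target resource is not stated in this reference.

```python
from datetime import datetime, timezone

TARGET_RESOURCES = {"training-job", "hyperpod-cluster", "endpoint"}


def build_offering_search(duration_hours, target_resources, **optional):
    """Assemble keyword arguments for client.search_training_plan_offerings().

    `optional` may carry InstanceType, InstanceCount, UltraServerType,
    UltraServerCount, StartTimeAfter, or EndTimeBefore.
    """
    targets = list(target_resources)
    if not targets:
        raise ValueError("TargetResources is required")
    unknown = sorted(set(targets) - TARGET_RESOURCES)
    if unknown:
        raise ValueError(f"unsupported target resources: {unknown}")
    if duration_hours <= 0:
        # Assumption: a non-positive duration is never meaningful.
        raise ValueError("DurationHours must be positive")
    return {"DurationHours": duration_hours, "TargetResources": targets, **optional}


search = build_offering_search(
    72,
    ["endpoint"],
    InstanceType="ml.p5.48xlarge",
    InstanceCount=2,
    StartTimeAfter=datetime(2025, 12, 1, tzinfo=timezone.utc),
)
# With a configured client: client.search_training_plan_offerings(**search)
```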

rtype:

dict

returns:

Response Syntax

{
    'TrainingPlanOfferings': [
        {
            'TrainingPlanOfferingId': 'string',
            'TargetResources': [
                'training-job'|'hyperpod-cluster'|'endpoint',
            ],
            'RequestedStartTimeAfter': datetime(2015, 1, 1),
            'RequestedEndTimeBefore': datetime(2015, 1, 1),
            'DurationHours': 123,
            'DurationMinutes': 123,
            'UpfrontFee': 'string',
            'CurrencyCode': 'string',
            'ReservedCapacityOfferings': [
                {
                    'ReservedCapacityType': 'UltraServer'|'Instance',
                    'UltraServerType': 'string',
                    'UltraServerCount': 123,
                    'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge'|'ml.p6-b200.48xlarge'|'ml.p4de.24xlarge'|'ml.p6e-gb200.36xlarge'|'ml.p5.4xlarge',
                    'InstanceCount': 123,
                    'AvailabilityZone': 'string',
                    'DurationHours': 123,
                    'DurationMinutes': 123,
                    'StartTime': datetime(2015, 1, 1),
                    'EndTime': datetime(2015, 1, 1)
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • TrainingPlanOfferings (list) --

      A list of training plan offerings that match the search criteria.

      • (dict) --

        Details about a training plan offering.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

        • TrainingPlanOfferingId (string) --

          The unique identifier for this training plan offering.

        • TargetResources (list) --

          The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod, SageMaker Endpoints) for this training plan offering.

          Training plans are specific to their target resource.

          • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

          • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

          • A training plan for SageMaker endpoints can be used exclusively to provide compute resources to SageMaker endpoints for model deployment.

          • (string) --

        • RequestedStartTimeAfter (datetime) --

          The requested start time that the user specified when searching for the training plan offering.

        • RequestedEndTimeBefore (datetime) --

          The requested end time that the user specified when searching for the training plan offering.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this training plan offering.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this training plan offering.

        • UpfrontFee (string) --

          The upfront fee for this training plan offering.

        • CurrencyCode (string) --

          The currency code for the upfront fee (e.g., USD).

        • ReservedCapacityOfferings (list) --

          A list of reserved capacity offerings associated with this training plan offering.

          • (dict) --

            Details about a reserved capacity offering for a training plan offering.

            For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see CreateTrainingPlan.

            • ReservedCapacityType (string) --

              The type of reserved capacity offering.

            • UltraServerType (string) --

              The type of UltraServer included in this reserved capacity offering, such as ml.u-p6e-gb200x72.

            • UltraServerCount (integer) --

              The number of UltraServers included in this reserved capacity offering.

            • InstanceType (string) --

              The instance type for the reserved capacity offering.

            • InstanceCount (integer) --

              The number of instances in the reserved capacity offering.

            • AvailabilityZone (string) --

              The availability zone for the reserved capacity offering.

            • DurationHours (integer) --

              The number of whole hours in the total duration for this reserved capacity offering.

            • DurationMinutes (integer) --

              The additional minutes beyond whole hours in the total duration for this reserved capacity offering.

            • StartTime (datetime) --

              The start time of the reserved capacity offering.

            • EndTime (datetime) --

              The end time of the reserved capacity offering.
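Once offerings are returned, a caller typically compares them before creating a plan. The sketch below picks the lowest-fee offering for a target resource and combines DurationHours and DurationMinutes into a single duration. It assumes UpfrontFee is a plain decimal string and that all compared offerings share one CurrencyCode; the helper names and the sample response are made up for illustration.

```python
from datetime import timedelta
from decimal import Decimal


def offering_duration(offering):
    """Combine DurationHours and DurationMinutes into one timedelta."""
    return timedelta(hours=offering.get("DurationHours", 0),
                     minutes=offering.get("DurationMinutes", 0))


def cheapest_offering(response, target):
    """Return the lowest-fee offering that supports `target`, or None."""
    candidates = [
        o for o in response.get("TrainingPlanOfferings", [])
        if target in o.get("TargetResources", [])
    ]
    if not candidates:
        return None
    # Assumption: UpfrontFee parses as a decimal number and all candidates
    # are priced in the same currency.
    return min(candidates, key=lambda o: Decimal(o["UpfrontFee"]))


# Made-up response in the shape shown under Response Syntax.
sample = {"TrainingPlanOfferings": [
    {"TrainingPlanOfferingId": "offer-a", "TargetResources": ["endpoint"],
     "UpfrontFee": "1200.50", "CurrencyCode": "USD",
     "DurationHours": 72, "DurationMinutes": 30},
    {"TrainingPlanOfferingId": "offer-b", "TargetResources": ["endpoint"],
     "UpfrontFee": "980.00", "CurrencyCode": "USD",
     "DurationHours": 48, "DurationMinutes": 0},
]}
best = cheapest_offering(sample, "endpoint")
print(best["TrainingPlanOfferingId"], offering_duration(best))
```

Decimal is used instead of float so that fee comparisons are not affected by binary rounding.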

UpdateComputeQuota (updated)
Changes (request)
{'ComputeQuotaConfig': {'ComputeQuotaResources': {'AcceleratorPartition': {'Count': 'integer', 'Type': 'mig-1g.5gb | mig-1g.10gb | mig-1g.18gb | mig-1g.20gb | mig-1g.23gb | mig-1g.35gb | mig-1g.45gb | mig-1g.47gb | mig-2g.10gb | mig-2g.20gb | mig-2g.35gb | mig-2g.45gb | mig-2g.47gb | mig-3g.20gb | mig-3g.40gb | mig-3g.71gb | mig-3g.90gb | mig-3g.93gb | mig-4g.20gb | mig-4g.40gb | mig-4g.71gb | mig-4g.90gb | mig-4g.93gb | mig-7g.40gb | mig-7g.80gb | mig-7g.141gb | mig-7g.180gb | mig-7g.186gb'}}}}

Update the compute allocation definition.

See also: AWS API Documentation

Request Syntax

client.update_compute_quota(
    ComputeQuotaId='string',
    TargetVersion=123,
    ComputeQuotaConfig={
        'ComputeQuotaResources': [
            {
                'InstanceType': 'ml.p4d.24xlarge'|'ml.p4de.24xlarge'|'ml.p5.48xlarge'|'ml.p6e-gb200.36xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.12xlarge'|'ml.c5.18xlarge'|'ml.c5.24xlarge'|'ml.c5n.large'|'ml.c5n.2xlarge'|'ml.c5n.4xlarge'|'ml.c5n.9xlarge'|'ml.c5n.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.8xlarge'|'ml.m5.12xlarge'|'ml.m5.16xlarge'|'ml.m5.24xlarge'|'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.16xlarge'|'ml.g6.12xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.gr6.4xlarge'|'ml.gr6.8xlarge'|'ml.g6e.xlarge'|'ml.g6e.2xlarge'|'ml.g6e.4xlarge'|'ml.g6e.8xlarge'|'ml.g6e.16xlarge'|'ml.g6e.12xlarge'|'ml.g6e.24xlarge'|'ml.g6e.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.p6-b200.48xlarge'|'ml.trn2.3xlarge'|'ml.trn2.48xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.m6i.large'|'ml.m6i.xlarge'|'ml.m6i.2xlarge'|'ml.m6i.4xlarge'|'ml.m6i.8xlarge'|'ml.m6i.12xlarge'|'ml.m6i.16xlarge'|'ml.m6i.24xlarge'|'ml.m6i.32xlarge'|'ml.r6i.large'|'ml.r6i.xlarge'|'ml.r6i.2xlarge'|'ml.r6i.4xlarge'|'ml.r6i.8xlarge'|'ml.r6i.12xlarge'|'ml.r6i.16xlarge'|'ml.r6i.24xlarge'|'ml.r6i.32xlarge'|'ml.i3en.large'|'ml.i3en.xlarge'|'ml.i3en.2xlarge'|'ml.i3en.3xlarge'|'ml.i3en.6xlarge'|'ml.i3en.12xlarge'|'ml.i3en.24xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
                'Count': 123,
                'Accelerators': 123,
                'VCpu': ...,
                'MemoryInGiB': ...,
                'AcceleratorPartition': {
                    'Type': 'mig-1g.5gb'|'mig-1g.10gb'|'mig-1g.18gb'|'mig-1g.20gb'|'mig-1g.23gb'|'mig-1g.35gb'|'mig-1g.45gb'|'mig-1g.47gb'|'mig-2g.10gb'|'mig-2g.20gb'|'mig-2g.35gb'|'mig-2g.45gb'|'mig-2g.47gb'|'mig-3g.20gb'|'mig-3g.40gb'|'mig-3g.71gb'|'mig-3g.90gb'|'mig-3g.93gb'|'mig-4g.20gb'|'mig-4g.40gb'|'mig-4g.71gb'|'mig-4g.90gb'|'mig-4g.93gb'|'mig-7g.40gb'|'mig-7g.80gb'|'mig-7g.141gb'|'mig-7g.180gb'|'mig-7g.186gb',
                    'Count': 123
                }
            },
        ],
        'ResourceSharingConfig': {
            'Strategy': 'Lend'|'DontLend'|'LendAndBorrow',
            'BorrowLimit': 123
        },
        'PreemptTeamTasks': 'Never'|'LowerPriority'
    },
    ComputeQuotaTarget={
        'TeamName': 'string',
        'FairShareWeight': 123
    },
    ActivationState='Enabled'|'Disabled',
    Description='string'
)
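To show how the new AcceleratorPartition field might be used, the sketch below builds an update_compute_quota request that allocates MIG partitions instead of whole accelerators. The helper name, the quota ID, and the example values are hypothetical. The code checks the partition type only against the enum in the request syntax, not against what a particular instance type supports, and it sends BorrowLimit only with the LendAndBorrow strategy, which is an assumption rather than documented behavior.

```python
# Memory sizes (GB) per GPU-slice count, taken from the Type enum above.
_MIG_MEMORY_GB = {
    1: [5, 10, 18, 20, 23, 35, 45, 47],
    2: [10, 20, 35, 45, 47],
    3: [20, 40, 71, 90, 93],
    4: [20, 40, 71, 90, 93],
    7: [40, 80, 141, 180, 186],
}
MIG_PROFILES = {f"mig-{g}g.{m}gb" for g, sizes in _MIG_MEMORY_GB.items() for m in sizes}


def build_partition_quota_update(quota_id, target_version, instance_type,
                                 partition_type, partition_count,
                                 strategy="LendAndBorrow", borrow_limit=50):
    """Assemble keyword arguments for client.update_compute_quota()."""
    if partition_type not in MIG_PROFILES:
        raise ValueError(f"unknown MIG profile: {partition_type}")
    if strategy not in {"Lend", "DontLend", "LendAndBorrow"}:
        raise ValueError(f"unknown strategy: {strategy}")
    sharing = {"Strategy": strategy}
    if strategy == "LendAndBorrow":
        # BorrowLimit is documented as 1 to 500 percent of idle compute.
        if not 1 <= borrow_limit <= 500:
            raise ValueError("BorrowLimit must be between 1 and 500")
        sharing["BorrowLimit"] = borrow_limit
    return {
        "ComputeQuotaId": quota_id,
        "TargetVersion": target_version,
        "ComputeQuotaConfig": {
            "ComputeQuotaResources": [{
                "InstanceType": instance_type,
                "AcceleratorPartition": {"Type": partition_type,
                                         "Count": partition_count},
            }],
            "ResourceSharingConfig": sharing,
        },
    }


update = build_partition_quota_update(
    "example-quota-id", 1, "ml.p5.48xlarge", "mig-3g.40gb", 4)
# With a configured client: client.update_compute_quota(**update)
```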
type ComputeQuotaId:

string

param ComputeQuotaId:

[REQUIRED]

ID of the compute allocation definition.

type TargetVersion:

integer

param TargetVersion:

[REQUIRED]

Target version of the compute allocation definition.

type ComputeQuotaConfig:

dict

param ComputeQuotaConfig:

Configuration of the compute allocation definition. This includes the resource sharing option and the setting to preempt low-priority tasks.

  • ComputeQuotaResources (list) --

    Allocate compute resources by instance types.

    • (dict) --

      Configuration of the resources used for the compute allocation definition.

      • InstanceType (string) -- [REQUIRED]

        The instance type of the instance group for the cluster.

      • Count (integer) --

        The number of instances to add to the instance group of a SageMaker HyperPod cluster.

      • Accelerators (integer) --

        The number of accelerators to allocate. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the number of accelerators you provide. For example, if you allocate 16 out of 32 total accelerators, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU and MemoryInGiB.

      • VCpu (float) --

        The number of vCPUs to allocate. If you specify a value only for vCPU, SageMaker AI automatically allocates a ratio-based value for MemoryInGiB based on this vCPU parameter. For example, if you allocate 20 out of 40 total vCPUs, SageMaker AI uses the ratio of 0.5 and allocates values to MemoryInGiB. Accelerators are set to 0.

      • MemoryInGiB (float) --

        The amount of memory in GiB to allocate. If you specify a value only for this parameter, SageMaker AI automatically allocates a ratio-based value for vCPU based on the memory that you provide. For example, if you allocate 200 GiB out of 400 GiB of total memory, SageMaker AI uses the ratio of 0.5 and allocates values to vCPU. Accelerators are set to 0.

      • AcceleratorPartition (dict) --

        The accelerator partition configuration for fractional GPU allocation.

        • Type (string) -- [REQUIRED]

          The Multi-Instance GPU (MIG) profile type that defines the partition configuration. The profile specifies the compute and memory allocation for each partition instance. The available profile types depend on the instance type specified in the compute quota configuration.

        • Count (integer) -- [REQUIRED]

          The number of accelerator partitions to allocate with the specified partition type. If you don't specify a value for vCPU and MemoryInGiB, SageMaker AI automatically allocates ratio-based values for those parameters based on the accelerator partition count you provide.

  • ResourceSharingConfig (dict) --

    Resource sharing configuration. This defines how an entity can lend and borrow idle compute with other entities within the cluster.

    • Strategy (string) -- [REQUIRED]

      The strategy for sharing idle compute within the cluster. The available strategies are:

      • DontLend: entities do not lend idle compute.

      • Lend: entities can lend idle compute to entities that can borrow.

      • LendAndBorrow: entities can lend idle compute and borrow idle compute from other entities.

      Default is LendAndBorrow.

    • BorrowLimit (integer) --

      The limit on how much idle compute can be borrowed. The value can be 1 - 500 percent of idle compute that the team is allowed to borrow.

      Default is 50.

  • PreemptTeamTasks (string) --

    Allows workloads from within an entity to preempt same-team workloads. When set to LowerPriority, the entity's lower-priority tasks are preempted by its own higher-priority tasks.

    Default is LowerPriority.
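
The ratio-based defaults described for Accelerators, VCpu, and MemoryInGiB come down to simple proportional arithmetic. The following is a minimal sketch of that arithmetic only, not SageMaker AI's implementation: the helper name ratio_based_defaults and the capacity totals (32 accelerators, 192 vCPUs, 2048 GiB) are hypothetical, and the service may round or cap values differently.

```python
def ratio_based_defaults(requested, total, other_totals):
    """Scale each value in other_totals by requested / total.

    Hypothetical helper: it mirrors the documented examples (16 of 32
    accelerators gives a ratio of 0.5), not the service's actual code.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= requested <= total:
        raise ValueError("requested must be between 0 and total")
    ratio = requested / total
    return {name: value * ratio for name, value in other_totals.items()}


# Assumed capacity, for illustration only.
TOTAL_ACCELERATORS = 32
TOTAL_VCPU = 192
TOTAL_MEMORY_GIB = 2048

# Only Accelerators given: vCPU and memory follow the 16 / 32 = 0.5 ratio.
from_accelerators = ratio_based_defaults(
    16, TOTAL_ACCELERATORS, {"VCpu": TOTAL_VCPU, "MemoryInGiB": TOTAL_MEMORY_GIB}
)
print(from_accelerators)

# Only VCpu given: memory follows the 96 / 192 = 0.5 ratio, and
# Accelerators are set to 0, as the VCpu description states.
from_vcpu = ratio_based_defaults(96, TOTAL_VCPU, {"MemoryInGiB": TOTAL_MEMORY_GIB})
from_vcpu["Accelerators"] = 0
print(from_vcpu)
```

With these assumed totals, half of the accelerators maps to 96 vCPUs and 1024 GiB of memory.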

type ComputeQuotaTarget:

dict

param ComputeQuotaTarget:

The target entity to allocate compute resources to.

  • TeamName (string) -- [REQUIRED]

    Name of the team to allocate compute resources to.

  • FairShareWeight (integer) --

    Assigned entity fair-share weight. Idle compute will be shared across entities based on these assigned weights. This weight is only used when FairShare is enabled.

    A weight of 0 is the lowest priority and 100 is the highest. Weight 0 is the default.

type ActivationState:

string

param ActivationState:

The state of the compute allocation being described. Use this parameter to enable or disable the compute allocation.

Default is Enabled.

type Description:

string

param Description:

Description of the compute allocation definition.
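
As an illustration of how the parameters above fit together, the sketch below assembles a request dictionary and checks the documented ranges (BorrowLimit 1 - 500, FairShareWeight 0 - 100) before any call is made. The helper build_update_request, the quota ID, the team name, and the instance/partition pairing are all hypothetical. Sending BorrowLimit only with the LendAndBorrow strategy is an assumption, as is the guess that this section documents the update_compute_quota client method.

```python
VALID_STRATEGIES = {"Lend", "DontLend", "LendAndBorrow"}
VALID_PREEMPT = {"Never", "LowerPriority"}
VALID_STATES = {"Enabled", "Disabled"}


def build_update_request(compute_quota_id, target_version, team_name, resources,
                         strategy="LendAndBorrow", borrow_limit=50,
                         preempt="LowerPriority", fair_share_weight=0,
                         activation_state="Enabled", description=None):
    """Return keyword arguments shaped like the request syntax.

    Hypothetical helper; the checks restate the documented limits.
    """
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown Strategy: {strategy!r}")
    if preempt not in VALID_PREEMPT:
        raise ValueError(f"unknown PreemptTeamTasks: {preempt!r}")
    if activation_state not in VALID_STATES:
        raise ValueError(f"unknown ActivationState: {activation_state!r}")
    if not 1 <= borrow_limit <= 500:
        raise ValueError("BorrowLimit must be 1 - 500 percent")
    if not 0 <= fair_share_weight <= 100:
        raise ValueError("FairShareWeight must be 0 - 100")

    sharing = {"Strategy": strategy}
    if strategy == "LendAndBorrow":
        # Assumption: a borrow limit only matters when borrowing is allowed.
        sharing["BorrowLimit"] = borrow_limit

    request = {
        "ComputeQuotaId": compute_quota_id,
        "TargetVersion": target_version,
        "ComputeQuotaConfig": {
            "ComputeQuotaResources": resources,
            "ResourceSharingConfig": sharing,
            "PreemptTeamTasks": preempt,
        },
        "ComputeQuotaTarget": {
            "TeamName": team_name,
            "FairShareWeight": fair_share_weight,
        },
        "ActivationState": activation_state,
    }
    if description is not None:
        request["Description"] = description
    return request


request = build_update_request(
    compute_quota_id="example-quota-id",  # placeholder, not a real ID
    target_version=1,
    team_name="research",
    resources=[{
        "InstanceType": "ml.p5.48xlarge",
        # Illustrative pairing; confirm the profile is available for the type.
        "AcceleratorPartition": {"Type": "mig-1g.10gb", "Count": 4},
    }],
)
print(request["ComputeQuotaConfig"]["ResourceSharingConfig"])

# With credentials configured, the dictionary could then be passed on, e.g.:
# boto3.client("sagemaker").update_compute_quota(**request)
```

Validating locally keeps obviously out-of-range values from reaching the service, though the service remains the authority on what it accepts.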

rtype:

dict

returns:

Response Syntax

{
    'ComputeQuotaArn': 'string',
    'ComputeQuotaVersion': 123
}

Response Structure

  • (dict) --

    • ComputeQuotaArn (string) --

      ARN of the compute allocation definition.

    • ComputeQuotaVersion (integer) --

      Version of the compute allocation definition.
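
A caller will typically want to pull the two response fields apart for logging or bookkeeping. The sketch below does that against a made-up response dictionary. The helper summarize_response and the sample ARN (account ID, Region, and resource path) are hypothetical, and the split relies only on the general arn:partition:service:region:account-id:resource layout.

```python
def summarize_response(response):
    """Extract the ARN components and version from a response-shaped dict.

    Hypothetical helper for illustration.
    """
    arn = response["ComputeQuotaArn"]
    version = response["ComputeQuotaVersion"]
    # General ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return {
        "region": parts[3],
        "account": parts[4],
        "resource": parts[5],
        "version": version,
    }


# Made-up sample shaped like the Response Syntax.
sample = {
    "ComputeQuotaArn": "arn:aws:sagemaker:us-west-2:111122223333:compute-quota/example-id",
    "ComputeQuotaVersion": 2,
}
summary = summarize_response(sample)
print(summary)
```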