Amazon SageMaker Service

2025/01/16 - Amazon SageMaker Service - 3 updated api methods

Changes  Added support for the ml.trn1.32xlarge instance type in Reserved Capacity Offering

DescribeTrainingPlan (updated)
Changes (response)
{'ReservedCapacitySummaries': {'InstanceType': {'ml.trn1.32xlarge'}}}

Retrieves detailed information about a specific training plan.

See also: AWS API Documentation

Request Syntax

client.describe_training_plan(
    TrainingPlanName='string'
)
type TrainingPlanName:

string

param TrainingPlanName:

[REQUIRED]

The name of the training plan to describe.

rtype:

dict

returns:

Response Syntax

{
    'TrainingPlanArn': 'string',
    'TrainingPlanName': 'string',
    'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
    'StatusMessage': 'string',
    'DurationHours': 123,
    'DurationMinutes': 123,
    'StartTime': datetime(2015, 1, 1),
    'EndTime': datetime(2015, 1, 1),
    'UpfrontFee': 'string',
    'CurrencyCode': 'string',
    'TotalInstanceCount': 123,
    'AvailableInstanceCount': 123,
    'InUseInstanceCount': 123,
    'TargetResources': [
        'training-job'|'hyperpod-cluster',
    ],
    'ReservedCapacitySummaries': [
        {
            'ReservedCapacityArn': 'string',
            'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge',
            'TotalInstanceCount': 123,
            'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
            'AvailabilityZone': 'string',
            'DurationHours': 123,
            'DurationMinutes': 123,
            'StartTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1)
        },
    ]
}

Response Structure

  • (dict) --

    • TrainingPlanArn (string) --

      The Amazon Resource Name (ARN) of the training plan.

    • TrainingPlanName (string) --

      The name of the training plan.

    • Status (string) --

      The current status of the training plan (e.g., Pending, Active, Expired). To see the complete list of status values available for a training plan, refer to the Status attribute within the TrainingPlanSummary object.

    • StatusMessage (string) --

      A message providing additional information about the current status of the training plan.

    • DurationHours (integer) --

      The number of whole hours in the total duration for this training plan.

    • DurationMinutes (integer) --

      The additional minutes beyond whole hours in the total duration for this training plan.

    • StartTime (datetime) --

      The start time of the training plan.

    • EndTime (datetime) --

      The end time of the training plan.

    • UpfrontFee (string) --

      The upfront fee for the training plan.

    • CurrencyCode (string) --

      The currency code for the upfront fee (e.g., USD).

    • TotalInstanceCount (integer) --

      The total number of instances reserved in this training plan.

    • AvailableInstanceCount (integer) --

      The number of instances currently available for use in this training plan.

    • InUseInstanceCount (integer) --

      The number of instances currently in use from this training plan.

    • TargetResources (list) --

      The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod) that can use this training plan.

      Training plans are specific to their target resource.

      • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

      • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

      • (string) --

    • ReservedCapacitySummaries (list) --

      The list of reserved capacities that provide the underlying compute resources of the plan.

      • (dict) --

        Details of a reserved capacity for the training plan.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

        • ReservedCapacityArn (string) --

          The Amazon Resource Name (ARN) of the reserved capacity.

        • InstanceType (string) --

          The instance type for the reserved capacity.

        • TotalInstanceCount (integer) --

          The total number of instances in the reserved capacity.

        • Status (string) --

          The current status of the reserved capacity.

        • AvailabilityZone (string) --

          The availability zone for the reserved capacity.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this reserved capacity.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this reserved capacity.

        • StartTime (datetime) --

          The start time of the reserved capacity.

        • EndTime (datetime) --

          The end time of the reserved capacity.
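
The response fields above can be condensed into a few derived values. The following is a minimal sketch, not part of the API: the helper name summarize_training_plan and the sample dictionary are hypothetical, and the sample values are hand-written rather than real output. It combines DurationHours and DurationMinutes into one duration and totals TotalInstanceCount per InstanceType.

```python
from collections import Counter
from datetime import timedelta


def summarize_training_plan(response):
    """Condense a describe_training_plan response dict into derived values."""
    # DurationHours holds whole hours; DurationMinutes holds the remainder.
    duration = timedelta(
        hours=response.get("DurationHours", 0),
        minutes=response.get("DurationMinutes", 0),
    )
    # Add up reserved instances per instance type across all reserved capacities.
    instances_by_type = Counter()
    for capacity in response.get("ReservedCapacitySummaries", []):
        instances_by_type[capacity["InstanceType"]] += capacity.get(
            "TotalInstanceCount", 0
        )
    return {
        "name": response.get("TrainingPlanName"),
        "status": response.get("Status"),
        "duration": duration,
        "instances_by_type": dict(instances_by_type),
    }


# Hand-written sample shaped like the response syntax above (not real data).
sample_response = {
    "TrainingPlanName": "example-plan",
    "Status": "Active",
    "DurationHours": 72,
    "DurationMinutes": 30,
    "ReservedCapacitySummaries": [
        {"InstanceType": "ml.trn1.32xlarge", "TotalInstanceCount": 2},
        {"InstanceType": "ml.trn1.32xlarge", "TotalInstanceCount": 2},
    ],
}
summary = summarize_training_plan(sample_response)
print(summary["duration"], summary["instances_by_type"])

# With credentials configured, a real response would come from:
#   import boto3
#   response = boto3.client("sagemaker").describe_training_plan(
#       TrainingPlanName="example-plan")
```

The helper only reads keys shown in the response syntax, so it can be applied unchanged to a dictionary returned by client.describe_training_plan.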

ListTrainingPlans (updated)
Changes (response)
{'TrainingPlanSummaries': {'ReservedCapacitySummaries': {'InstanceType': {'ml.trn1.32xlarge'}}}}

Retrieves a list of training plans for the current account.

See also: AWS API Documentation

Request Syntax

client.list_training_plans(
    NextToken='string',
    MaxResults=123,
    StartTimeAfter=datetime(2015, 1, 1),
    StartTimeBefore=datetime(2015, 1, 1),
    SortBy='TrainingPlanName'|'StartTime'|'Status',
    SortOrder='Ascending'|'Descending',
    Filters=[
        {
            'Name': 'Status',
            'Value': 'string'
        },
    ]
)
type NextToken:

string

param NextToken:

A token to continue pagination if more results are available.

type MaxResults:

integer

param MaxResults:

The maximum number of results to return in the response.

type StartTimeAfter:

datetime

param StartTimeAfter:

Filter to list only training plans with an actual start time after this date.

type StartTimeBefore:

datetime

param StartTimeBefore:

Filter to list only training plans with an actual start time before this date.

type SortBy:

string

param SortBy:

The training plan field to sort the results by (e.g., StartTime, Status).

type SortOrder:

string

param SortOrder:

The order to sort the results (Ascending or Descending).

type Filters:

list

param Filters:

Additional filters to apply to the list of training plans.

  • (dict) --

    A filter to apply when listing or searching for training plans.

    For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

    • Name (string) -- [REQUIRED]

      The name of the filter field (e.g., Status, InstanceType).

    • Value (string) -- [REQUIRED]

      The value to filter by for the specified field.
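
The request parameters above can be assembled programmatically. This is a minimal sketch under stated assumptions: build_list_request is a hypothetical helper, and the chosen defaults (sorting by StartTime in descending order) are illustrative only. It leaves out any parameter that was not supplied, so only fields from the request syntax with real values are sent.

```python
from datetime import datetime, timezone


def build_list_request(status=None, started_after=None, max_results=None):
    """Assemble keyword arguments for list_training_plans, omitting unset fields."""
    request = {"SortBy": "StartTime", "SortOrder": "Descending"}
    if status is not None:
        # Each filter is a dict carrying both Name and Value (both are required).
        request["Filters"] = [{"Name": "Status", "Value": status}]
    if started_after is not None:
        request["StartTimeAfter"] = started_after
    if max_results is not None:
        request["MaxResults"] = max_results
    return request


request = build_list_request(
    status="Active",
    started_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
    max_results=25,
)
print(request)

# A client created with boto3.client("sagemaker") would then be called as:
#   client.list_training_plans(**request)
```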

rtype:

dict

returns:

Response Syntax

{
    'NextToken': 'string',
    'TrainingPlanSummaries': [
        {
            'TrainingPlanArn': 'string',
            'TrainingPlanName': 'string',
            'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
            'StatusMessage': 'string',
            'DurationHours': 123,
            'DurationMinutes': 123,
            'StartTime': datetime(2015, 1, 1),
            'EndTime': datetime(2015, 1, 1),
            'UpfrontFee': 'string',
            'CurrencyCode': 'string',
            'TotalInstanceCount': 123,
            'AvailableInstanceCount': 123,
            'InUseInstanceCount': 123,
            'TargetResources': [
                'training-job'|'hyperpod-cluster',
            ],
            'ReservedCapacitySummaries': [
                {
                    'ReservedCapacityArn': 'string',
                    'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge',
                    'TotalInstanceCount': 123,
                    'Status': 'Pending'|'Active'|'Scheduled'|'Expired'|'Failed',
                    'AvailabilityZone': 'string',
                    'DurationHours': 123,
                    'DurationMinutes': 123,
                    'StartTime': datetime(2015, 1, 1),
                    'EndTime': datetime(2015, 1, 1)
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • NextToken (string) --

      A token to continue pagination if more results are available.

    • TrainingPlanSummaries (list) --

      A list of summary information for the training plans.

      • (dict) --

        Details of the training plan.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

        • TrainingPlanArn (string) --

          The Amazon Resource Name (ARN) of the training plan.

        • TrainingPlanName (string) --

          The name of the training plan.

        • Status (string) --

          The current status of the training plan (e.g., Pending, Active, Expired). To see the complete list of status values available for a training plan, refer to the Status attribute within the TrainingPlanSummary object.

        • StatusMessage (string) --

          A message providing additional information about the current status of the training plan.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this training plan.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this training plan.

        • StartTime (datetime) --

          The start time of the training plan.

        • EndTime (datetime) --

          The end time of the training plan.

        • UpfrontFee (string) --

          The upfront fee for the training plan.

        • CurrencyCode (string) --

          The currency code for the upfront fee (e.g., USD).

        • TotalInstanceCount (integer) --

          The total number of instances reserved in this training plan.

        • AvailableInstanceCount (integer) --

          The number of instances currently available for use in this training plan.

        • InUseInstanceCount (integer) --

          The number of instances currently in use from this training plan.

        • TargetResources (list) --

          The target resources (e.g., training jobs, HyperPod clusters) that can use this training plan.

          Training plans are specific to their target resource.

          • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

          • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

          • (string) --

        • ReservedCapacitySummaries (list) --

          A list of reserved capacities associated with this training plan, including details such as instance types, counts, and availability zones.

          • (dict) --

            Details of a reserved capacity for the training plan.

            For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

            • ReservedCapacityArn (string) --

              The Amazon Resource Name (ARN) of the reserved capacity.

            • InstanceType (string) --

              The instance type for the reserved capacity.

            • TotalInstanceCount (integer) --

              The total number of instances in the reserved capacity.

            • Status (string) --

              The current status of the reserved capacity.

            • AvailabilityZone (string) --

              The availability zone for the reserved capacity.

            • DurationHours (integer) --

              The number of whole hours in the total duration for this reserved capacity.

            • DurationMinutes (integer) --

              The additional minutes beyond whole hours in the total duration for this reserved capacity.

            • StartTime (datetime) --

              The start time of the reserved capacity.

            • EndTime (datetime) --

              The end time of the reserved capacity.
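
Because the response carries a NextToken whenever more results are available, a caller typically loops until the token is absent. The sketch below illustrates that loop with a hypothetical generator, iter_training_plans, and a hypothetical FakeClient that returns two canned pages so the example runs without AWS credentials. boto3 may also offer a built-in paginator for this operation; that is not confirmed here.

```python
def iter_training_plans(client, **request):
    """Yield every training plan summary, following NextToken until exhausted."""
    while True:
        page = client.list_training_plans(**request)
        yield from page.get("TrainingPlanSummaries", [])
        token = page.get("NextToken")
        if not token:
            break
        # Pass the token back unchanged to fetch the next page.
        request["NextToken"] = token


class FakeClient:
    """Stand-in for a boto3 SageMaker client, returning two canned pages."""

    def __init__(self):
        self.calls = []

    def list_training_plans(self, **kwargs):
        self.calls.append(dict(kwargs))
        if "NextToken" not in kwargs:
            return {
                "TrainingPlanSummaries": [{"TrainingPlanName": "plan-a"}],
                "NextToken": "page-2",
            }
        return {"TrainingPlanSummaries": [{"TrainingPlanName": "plan-b"}]}


fake = FakeClient()
names = [plan["TrainingPlanName"] for plan in iter_training_plans(fake, MaxResults=1)]
print(names)
```

Replacing FakeClient() with boto3.client("sagemaker") should be the only change needed for real use, since the generator relies solely on the request and response fields documented above.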

SearchTrainingPlanOfferings (updated)
Changes (request, response)
Request
{'InstanceType': {'ml.trn1.32xlarge'}}
Response
{'TrainingPlanOfferings': {'ReservedCapacityOfferings': {'InstanceType': {'ml.trn1.32xlarge'}}}}

Searches for available training plan offerings based on specified criteria.

  • Users search for available plan offerings based on their requirements (e.g., instance type, count, start time, duration).

  • They then create a plan that best matches their needs, using the ID of the plan offering they chose.

For more information about how to reserve GPU capacity for your SageMaker training jobs or SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.
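
The two-step workflow described above (search, then create) can be sketched as follows. Everything here is illustrative: reserve_first_offering and StubClient are hypothetical names, the stub returns placeholder values, and the create_training_plan parameters TrainingPlanName and TrainingPlanOfferingId are assumptions, because CreateTrainingPlan is not documented in this changelog entry. Check that operation's reference before relying on them.

```python
def reserve_first_offering(client, plan_name, **search_criteria):
    """Search for offerings and create a plan from the first match, if any."""
    response = client.search_training_plan_offerings(**search_criteria)
    offerings = response.get("TrainingPlanOfferings", [])
    if not offerings:
        return None
    offering_id = offerings[0]["TrainingPlanOfferingId"]
    # Assumption: CreateTrainingPlan takes the plan name and the offering ID.
    # Against a real account this call reserves capacity and may incur a fee.
    return client.create_training_plan(
        TrainingPlanName=plan_name,
        TrainingPlanOfferingId=offering_id,
    )


class StubClient:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self, offerings):
        self._offerings = offerings
        self.create_kwargs = None

    def search_training_plan_offerings(self, **kwargs):
        return {"TrainingPlanOfferings": self._offerings}

    def create_training_plan(self, **kwargs):
        self.create_kwargs = kwargs
        return {"TrainingPlanArn": "placeholder-arn"}


stub = StubClient([{"TrainingPlanOfferingId": "offering-1"}])
result = reserve_first_offering(
    stub,
    "example-plan",
    InstanceType="ml.trn1.32xlarge",
    InstanceCount=2,
    TargetResources=["training-job"],
)
empty_result = reserve_first_offering(StubClient([]), "example-plan")
print(result, empty_result)
```

Taking the first offering keeps the sketch short; a real caller would more likely compare fees, durations, and start times before committing.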

See also: AWS API Documentation

Request Syntax

client.search_training_plan_offerings(
    InstanceType='ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge',
    InstanceCount=123,
    StartTimeAfter=datetime(2015, 1, 1),
    EndTimeBefore=datetime(2015, 1, 1),
    DurationHours=123,
    TargetResources=[
        'training-job'|'hyperpod-cluster',
    ]
)
type InstanceType:

string

param InstanceType:

[REQUIRED]

The type of instance you want to search for in the available training plan offerings. This field allows you to filter the search results based on the specific compute resources you require for your SageMaker training jobs or SageMaker HyperPod clusters. When searching for training plan offerings, specifying the instance type helps you find Reserved Instances that match your computational needs.

type InstanceCount:

integer

param InstanceCount:

[REQUIRED]

The number of instances you want to reserve in the training plan offerings. This allows you to specify the quantity of compute resources needed for your SageMaker training jobs or SageMaker HyperPod clusters, helping you find reserved capacity offerings that match your requirements.

type StartTimeAfter:

datetime

param StartTimeAfter:

A filter to search for training plan offerings with a start time after a specified date.

type EndTimeBefore:

datetime

param EndTimeBefore:

A filter to search for reserved capacity offerings with an end time before a specified date.

type DurationHours:

integer

param DurationHours:

The desired duration in hours for the training plan offerings.

type TargetResources:

list

param TargetResources:

[REQUIRED]

The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod) to search for in the offerings.

Training plans are specific to their target resource.

  • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

  • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

  • (string) --
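
The required and optional parameters above can be validated locally before a request is sent. This is a sketch only: build_search_request is a hypothetical helper, and the allowed values are copied from the request syntax in this entry, so they would go stale if the service adds instance types later.

```python
from datetime import datetime, timezone

# Allowed values copied from the request syntax above.
INSTANCE_TYPES = {
    "ml.p4d.24xlarge",
    "ml.p5.48xlarge",
    "ml.p5e.48xlarge",
    "ml.p5en.48xlarge",
    "ml.trn1.32xlarge",
    "ml.trn2.48xlarge",
}
TARGET_RESOURCES = {"training-job", "hyperpod-cluster"}


def build_search_request(instance_type, instance_count, target_resource,
                         duration_hours=None, start_after=None, end_before=None):
    """Build keyword arguments for search_training_plan_offerings."""
    if instance_type not in INSTANCE_TYPES:
        raise ValueError(f"unsupported instance type: {instance_type}")
    if target_resource not in TARGET_RESOURCES:
        raise ValueError(f"unsupported target resource: {target_resource}")
    # InstanceType, InstanceCount, and TargetResources are marked [REQUIRED].
    request = {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "TargetResources": [target_resource],
    }
    # Optional filters are included only when a value was supplied.
    if duration_hours is not None:
        request["DurationHours"] = duration_hours
    if start_after is not None:
        request["StartTimeAfter"] = start_after
    if end_before is not None:
        request["EndTimeBefore"] = end_before
    return request


search_request = build_search_request(
    "ml.trn1.32xlarge",
    4,
    "hyperpod-cluster",
    duration_hours=48,
    start_after=datetime(2025, 2, 1, tzinfo=timezone.utc),
)
print(search_request)
```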

rtype:

dict

returns:

Response Syntax

{
    'TrainingPlanOfferings': [
        {
            'TrainingPlanOfferingId': 'string',
            'TargetResources': [
                'training-job'|'hyperpod-cluster',
            ],
            'RequestedStartTimeAfter': datetime(2015, 1, 1),
            'RequestedEndTimeBefore': datetime(2015, 1, 1),
            'DurationHours': 123,
            'DurationMinutes': 123,
            'UpfrontFee': 'string',
            'CurrencyCode': 'string',
            'ReservedCapacityOfferings': [
                {
                    'InstanceType': 'ml.p4d.24xlarge'|'ml.p5.48xlarge'|'ml.p5e.48xlarge'|'ml.p5en.48xlarge'|'ml.trn1.32xlarge'|'ml.trn2.48xlarge',
                    'InstanceCount': 123,
                    'AvailabilityZone': 'string',
                    'DurationHours': 123,
                    'DurationMinutes': 123,
                    'StartTime': datetime(2015, 1, 1),
                    'EndTime': datetime(2015, 1, 1)
                },
            ]
        },
    ]
}

Response Structure

  • (dict) --

    • TrainingPlanOfferings (list) --

      A list of training plan offerings that match the search criteria.

      • (dict) --

        Details about a training plan offering.

        For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

        • TrainingPlanOfferingId (string) --

          The unique identifier for this training plan offering.

        • TargetResources (list) --

          The target resources (e.g., SageMaker Training Jobs, SageMaker HyperPod) for this training plan offering.

          Training plans are specific to their target resource.

          • A training plan designed for SageMaker training jobs can only be used to schedule and run training jobs.

          • A training plan for HyperPod clusters can be used exclusively to provide compute resources to a cluster's instance group.

          • (string) --

        • RequestedStartTimeAfter (datetime) --

          The requested start time that the user specified when searching for the training plan offering.

        • RequestedEndTimeBefore (datetime) --

          The requested end time that the user specified when searching for the training plan offering.

        • DurationHours (integer) --

          The number of whole hours in the total duration for this training plan offering.

        • DurationMinutes (integer) --

          The additional minutes beyond whole hours in the total duration for this training plan offering.

        • UpfrontFee (string) --

          The upfront fee for this training plan offering.

        • CurrencyCode (string) --

          The currency code for the upfront fee (e.g., USD).

        • ReservedCapacityOfferings (list) --

          A list of reserved capacity offerings associated with this training plan offering.

          • (dict) --

            Details about a reserved capacity offering for a training plan offering.

            For more information about how to reserve GPU capacity for your SageMaker HyperPod clusters using Amazon SageMaker Training Plan, see ``CreateTrainingPlan``.

            • InstanceType (string) --

              The instance type for the reserved capacity offering.

            • InstanceCount (integer) --

              The number of instances in the reserved capacity offering.

            • AvailabilityZone (string) --

              The availability zone for the reserved capacity offering.

            • DurationHours (integer) --

              The number of whole hours in the total duration for this reserved capacity offering.

            • DurationMinutes (integer) --

              The additional minutes beyond whole hours in the total duration for this reserved capacity offering.

            • StartTime (datetime) --

              The start time of the reserved capacity offering.

            • EndTime (datetime) --

              The end time of the reserved capacity offering.
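
Once offerings are returned, a caller might want to compare them. The sketch below uses two hypothetical helpers, cheapest_offering and offering_instance_hours, on hand-written sample data. It assumes that UpfrontFee is a plain decimal string such as "950.50", which this entry does not specify, and it only compares offerings that share a currency code.

```python
from decimal import Decimal


def cheapest_offering(offerings, currency="USD"):
    """Return the offering with the lowest upfront fee in the given currency."""
    candidates = [o for o in offerings if o.get("CurrencyCode") == currency]
    if not candidates:
        return None
    # Assumption: UpfrontFee parses as a plain decimal number.
    return min(candidates, key=lambda o: Decimal(o["UpfrontFee"]))


def offering_instance_hours(offering):
    """Total instance-hours across all reserved capacity offerings."""
    total = Decimal(0)
    for capacity in offering.get("ReservedCapacityOfferings", []):
        # Whole hours plus the leftover minutes expressed as a fraction of an hour.
        hours = Decimal(capacity.get("DurationHours", 0)) + (
            Decimal(capacity.get("DurationMinutes", 0)) / Decimal(60)
        )
        total += hours * capacity.get("InstanceCount", 0)
    return total


# Hand-written sample shaped like the response syntax above (not real data).
sample_offerings = [
    {
        "TrainingPlanOfferingId": "offering-a",
        "UpfrontFee": "1200.00",
        "CurrencyCode": "USD",
        "ReservedCapacityOfferings": [
            {"InstanceCount": 2, "DurationHours": 24, "DurationMinutes": 30},
        ],
    },
    {
        "TrainingPlanOfferingId": "offering-b",
        "UpfrontFee": "950.50",
        "CurrencyCode": "USD",
        "ReservedCapacityOfferings": [
            {"InstanceCount": 2, "DurationHours": 24, "DurationMinutes": 0},
        ],
    },
]
best = cheapest_offering(sample_offerings)
hours_a = offering_instance_hours(sample_offerings[0])
print(best["TrainingPlanOfferingId"], hours_a)
```

Decimal is used instead of float so that fee comparisons are not affected by binary rounding.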