AWS Parallel Computing Service

2024/08/28 - AWS Parallel Computing Service - 18 new api methods

Changes  Introducing AWS Parallel Computing Service (AWS PCS), a new service that makes it easy to set up and manage high performance computing (HPC) clusters and to build scientific and engineering models at virtually any scale on AWS.

GetCluster (new)

Returns detailed information about a running cluster in your account. This API action provides networking information, endpoint information for communication with the scheduler, and provisioning status.

See also: AWS API Documentation

Request Syntax

client.get_cluster(
    clusterIdentifier='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster.

rtype

dict

returns

Response Syntax

{
    'cluster': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'scheduler': {
            'type': 'SLURM',
            'version': 'string'
        },
        'size': 'SMALL'|'MEDIUM'|'LARGE',
        'slurmConfiguration': {
            'scaleDownIdleTimeInSeconds': 123,
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ],
            'authKey': {
                'secretArn': 'string',
                'secretVersion': 'string'
            }
        },
        'networking': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        },
        'endpoints': [
            {
                'type': 'SLURMCTLD'|'SLURMDBD',
                'privateIpAddress': 'string',
                'publicIpAddress': 'string',
                'port': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • cluster (dict) --

      The cluster resource.

      • name (string) --

        The name that identifies the cluster.

      • id (string) --

        The generated unique ID of the cluster.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the cluster.

      • status (string) --

        The provisioning status of the cluster.

        Note

        The provisioning status doesn't indicate the overall health of the cluster.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • scheduler (dict) --

        The cluster management and job scheduling software associated with the cluster.

        • type (string) --

          The software Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

        • version (string) --

          The version of the specified scheduling software that Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

      • size (string) --

        The size of the cluster.

        • SMALL : 32 compute nodes and 256 jobs

        • MEDIUM : 512 compute nodes and 8192 jobs

        • LARGE : 2048 compute nodes and 16384 jobs

      • slurmConfiguration (dict) --

        Additional options related to the Slurm scheduler.

        • scaleDownIdleTimeInSeconds (integer) --

          The time before an idle node is scaled down.

        • slurmCustomSettings (list) --

          Additional Slurm-specific configuration that directly maps to Slurm settings.

          • (dict) --

            Additional settings that directly map to Slurm settings.

            • parameterName (string) --

              Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

            • parameterValue (string) --

              The values for the configured Slurm settings.

        • authKey (dict) --

          The shared Slurm key for authentication, also known as the cluster secret.

          • secretArn (string) --

            The Amazon Resource Name (ARN) of the shared Slurm key.

          • secretVersion (string) --

            The version of the shared Slurm key.

      • networking (dict) --

        The networking configuration for the cluster's control plane.

        • subnetIds (list) --

          The ID of the subnet where Amazon Web Services PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and Amazon Web Services PCS resources. The subnet must have an available IP address and cannot reside in AWS Outposts, AWS Wavelength, or an AWS Local Zone.

          Example: subnet-abcd1234

          • (string) --

        • securityGroupIds (list) --

          The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.

          The following rules are required:

          • Inbound rule 1

            • Protocol: All

            • Ports: All

            • Source: Self

          • Outbound rule 1

            • Protocol: All

            • Ports: All

            • Destination: 0.0.0.0/0 (IPv4)

          • Outbound rule 2

            • Protocol: All

            • Ports: All

            • Destination: Self

          • (string) --

      • endpoints (list) --

        The list of endpoints available for interaction with the scheduler.

        • (dict) --

          An endpoint available for interaction with the scheduler.

          • type (string) --

            Indicates the type of endpoint running at the specific IP address.

          • privateIpAddress (string) --

            The endpoint's private IP address.

            Example: 2.2.2.2

          • publicIpAddress (string) --

            The endpoint's public IP address.

            Example: 1.1.1.1

          • port (string) --

            The endpoint's connection port number.

            Example: 1234

      • errorInfo (list) --

        The list of errors that occurred during cluster provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
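
As a rough illustration of how the response above might be consumed, the following sketch pulls the provisioning status, the size limits, and the SLURMCTLD endpoint out of a get_cluster response. The helper name summarize_cluster and the SIZE_LIMITS table are assumptions made for this example (the limits are copied from the size description above); they are not part of the SDK.

```python
from typing import Any, Dict, Optional

# Limits per cluster size, copied from the size description above.
SIZE_LIMITS = {
    "SMALL": {"max_nodes": 32, "max_jobs": 256},
    "MEDIUM": {"max_nodes": 512, "max_jobs": 8192},
    "LARGE": {"max_nodes": 2048, "max_jobs": 16384},
}


def summarize_cluster(response: Dict[str, Any]) -> Dict[str, Any]:
    """Pull commonly needed fields out of a get_cluster response (hypothetical helper)."""
    cluster = response["cluster"]
    # Find the Slurm controller endpoint, if the cluster has published one.
    slurmctld: Optional[Dict[str, str]] = next(
        (ep for ep in cluster.get("endpoints", []) if ep["type"] == "SLURMCTLD"),
        None,
    )
    return {
        "name": cluster["name"],
        "status": cluster["status"],
        # ACTIVE describes provisioning only, not the overall health of the cluster.
        "is_active": cluster["status"] == "ACTIVE",
        "limits": SIZE_LIMITS.get(cluster.get("size")),
        "slurmctld_address": slurmctld["privateIpAddress"] if slurmctld else None,
        # The API returns the port as a string, so convert it explicitly.
        "slurmctld_port": int(slurmctld["port"]) if slurmctld else None,
        "errors": [e["message"] for e in cluster.get("errorInfo", [])],
    }


# Possible usage against a live account (requires boto3 and credentials;
# "my-cluster" is a placeholder name):
#   import boto3
#   response = boto3.client("pcs").get_cluster(clusterIdentifier="my-cluster")
#   print(summarize_cluster(response))
```

Because the status field reports provisioning only, a caller would probably still want a separate health check before submitting jobs.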

CreateComputeNodeGroup (new)

Creates a managed set of compute nodes. You associate a compute node group with a cluster through one or more Amazon Web Services PCS queues or as part of the login fleet. A compute node group includes the definition of the compute properties and lifecycle management. Amazon Web Services PCS uses the information you provide to this API action to launch compute nodes in your account. You can only specify subnets in the same Amazon VPC as your cluster. You receive billing charges for the compute nodes that Amazon Web Services PCS launches in your account. You must already have a launch template before you call this API action. For more information, see Launch an instance from a launch template in the Amazon Elastic Compute Cloud User Guide for Linux Instances.

See also: AWS API Documentation

Request Syntax

client.create_compute_node_group(
    clusterIdentifier='string',
    computeNodeGroupName='string',
    amiId='string',
    subnetIds=[
        'string',
    ],
    purchaseOption='ONDEMAND'|'SPOT',
    customLaunchTemplate={
        'id': 'string',
        'version': 'string'
    },
    iamInstanceProfileArn='string',
    scalingConfiguration={
        'minInstanceCount': 123,
        'maxInstanceCount': 123
    },
    instanceConfigs=[
        {
            'instanceType': 'string'
        },
    ],
    spotOptions={
        'allocationStrategy': 'lowest-price'|'capacity-optimized'|'price-capacity-optimized'
    },
    slurmConfiguration={
        'slurmCustomSettings': [
            {
                'parameterName': 'string',
                'parameterValue': 'string'
            },
        ]
    },
    clientToken='string',
    tags={
        'string': 'string'
    }
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster to create a compute node group in.

type computeNodeGroupName

string

param computeNodeGroupName

[REQUIRED]

A name to identify the compute node group. Example: MyComputeNodeGroup

type amiId

string

param amiId

The ID of the Amazon Machine Image (AMI) that Amazon Web Services PCS uses to launch compute nodes (Amazon EC2 instances). If you don't provide this value, Amazon Web Services PCS uses the AMI ID specified in the custom launch template.

type subnetIds

list

param subnetIds

[REQUIRED]

The list of subnet IDs where the compute node group launches instances. Subnets must be in the same VPC as the cluster.

  • (string) --

type purchaseOption

string

param purchaseOption

Specifies how EC2 instances are purchased on your behalf. Amazon Web Services PCS supports On-Demand and Spot instances. For more information, see Instance purchasing options in the Amazon Elastic Compute Cloud User Guide. If you don't provide this option, it defaults to On-Demand.

type customLaunchTemplate

dict

param customLaunchTemplate

[REQUIRED]

An Amazon EC2 launch template Amazon Web Services PCS uses to launch compute nodes.

  • id (string) -- [REQUIRED]

    The ID of the EC2 launch template to use to provision instances.

    Example: lt-xxxx

  • version (string) -- [REQUIRED]

    The version of the EC2 launch template to use to provision instances.

type iamInstanceProfileArn

string

param iamInstanceProfileArn

[REQUIRED]

The Amazon Resource Name (ARN) of the IAM instance profile used to pass an IAM role when launching EC2 instances. The role contained in your instance profile must have pcs:RegisterComputeNodeGroupInstance permissions attached in order to provision instances correctly. The resource identifier of the ARN must start with AWSPCS. For example, arn:aws:iam::123456789012:instance-profile/AWSPCSMyComputeNodeInstanceProfile.

type scalingConfiguration

dict

param scalingConfiguration

[REQUIRED]

Specifies the boundaries of the compute node group auto scaling.

  • minInstanceCount (integer) -- [REQUIRED]

    The lower bound of the number of instances allowed in the compute fleet.

  • maxInstanceCount (integer) -- [REQUIRED]

    The upper bound of the number of instances allowed in the compute fleet.

type instanceConfigs

list

param instanceConfigs

[REQUIRED]

A list of EC2 instance configurations that Amazon Web Services PCS can provision in the compute node group.

  • (dict) --

    An EC2 instance configuration Amazon Web Services PCS uses to launch compute nodes.

    • instanceType (string) --

      The EC2 instance type that Amazon Web Services PCS can provision in the compute node group.

      Example: t2.xlarge

type spotOptions

dict

param spotOptions

Additional configuration when you specify SPOT as the purchaseOption for the CreateComputeNodeGroup API action.

type slurmConfiguration

dict

param slurmConfiguration

Additional options related to the Slurm scheduler.

  • slurmCustomSettings (list) --

    Additional Slurm-specific configuration that directly maps to Slurm settings.

    • (dict) --

      Additional settings that directly map to Slurm settings.

      • parameterName (string) -- [REQUIRED]

        Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

      • parameterValue (string) -- [REQUIRED]

        The values for the configured Slurm settings.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

type tags

dict

param tags

One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.

  • (string) --

    • (string) --

rtype

dict

returns

Response Syntax

{
    'computeNodeGroup': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'DELETED',
        'amiId': 'string',
        'subnetIds': [
            'string',
        ],
        'purchaseOption': 'ONDEMAND'|'SPOT',
        'customLaunchTemplate': {
            'id': 'string',
            'version': 'string'
        },
        'iamInstanceProfileArn': 'string',
        'scalingConfiguration': {
            'minInstanceCount': 123,
            'maxInstanceCount': 123
        },
        'instanceConfigs': [
            {
                'instanceType': 'string'
            },
        ],
        'spotOptions': {
            'allocationStrategy': 'lowest-price'|'capacity-optimized'|'price-capacity-optimized'
        },
        'slurmConfiguration': {
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ]
        },
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • computeNodeGroup (dict) --

      A compute node group associated with a cluster.

      • name (string) --

        The name that identifies the compute node group.

      • id (string) --

        The generated unique ID of the compute node group.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the compute node group.

      • clusterId (string) --

        The ID of the cluster of the compute node group.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the compute node group.

        Note

        The provisioning status doesn't indicate the overall health of the compute node group.

      • amiId (string) --

        The ID of the Amazon Machine Image (AMI) that Amazon Web Services PCS uses to launch instances. If not provided, Amazon Web Services PCS uses the AMI ID specified in the custom launch template.

      • subnetIds (list) --

        The list of subnet IDs where instances are provisioned by the compute node group. The subnets must be in the same VPC as the cluster.

        • (string) --

      • purchaseOption (string) --

        Specifies how EC2 instances are purchased on your behalf. Amazon Web Services PCS supports On-Demand and Spot instances. For more information, see Instance purchasing options in the Amazon Elastic Compute Cloud User Guide. If you don't provide this option, it defaults to On-Demand.

      • customLaunchTemplate (dict) --

        An Amazon EC2 launch template Amazon Web Services PCS uses to launch compute nodes.

        • id (string) --

          The ID of the EC2 launch template to use to provision instances.

          Example: lt-xxxx

        • version (string) --

          The version of the EC2 launch template to use to provision instances.

      • iamInstanceProfileArn (string) --

        The Amazon Resource Name (ARN) of the IAM instance profile used to pass an IAM role when launching EC2 instances. The role contained in your instance profile must have pcs:RegisterComputeNodeGroupInstance permissions attached to provision instances correctly.

      • scalingConfiguration (dict) --

        Specifies the boundaries of the compute node group auto scaling.

        • minInstanceCount (integer) --

          The lower bound of the number of instances allowed in the compute fleet.

        • maxInstanceCount (integer) --

          The upper bound of the number of instances allowed in the compute fleet.

      • instanceConfigs (list) --

        A list of EC2 instance configurations that Amazon Web Services PCS can provision in the compute node group.

        • (dict) --

          An EC2 instance configuration Amazon Web Services PCS uses to launch compute nodes.

          • instanceType (string) --

            The EC2 instance type that Amazon Web Services PCS can provision in the compute node group.

            Example: t2.xlarge

      • spotOptions (dict) --

        Additional configuration when you specify SPOT as the purchaseOption for the CreateComputeNodeGroup API action.

      • slurmConfiguration (dict) --

        Additional options related to the Slurm scheduler.

        • slurmCustomSettings (list) --

          Additional Slurm-specific configuration that directly maps to Slurm settings.

          • (dict) --

            Additional settings that directly map to Slurm settings.

            • parameterName (string) --

              Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

            • parameterValue (string) --

              The values for the configured Slurm settings.

      • errorInfo (list) --

        The list of errors that occurred during compute node group provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
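
The request parameters described above could be assembled along the following lines. This is a minimal sketch, not the service's own validation: the function name build_compute_node_group_request is hypothetical, the bounds check is a local sanity check rather than a documented API constraint, and treating the text after the last "/" in the ARN as the resource identifier is an assumption about how the AWSPCS prefix rule is applied.

```python
from typing import Any, Dict, List, Optional


def build_compute_node_group_request(
    cluster: str,
    name: str,
    subnet_ids: List[str],
    launch_template_id: str,
    launch_template_version: str,
    instance_profile_arn: str,
    instance_types: List[str],
    min_count: int,
    max_count: int,
    spot_allocation_strategy: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for create_compute_node_group (hypothetical helper)."""
    # Local sanity check only; the API may enforce different bounds.
    if min_count < 0 or max_count < min_count:
        raise ValueError("expected 0 <= min_count <= max_count")
    # Assumption: the profile name is the part of the ARN after the last '/'.
    if not instance_profile_arn.rsplit("/", 1)[-1].startswith("AWSPCS"):
        raise ValueError("instance profile name must start with 'AWSPCS'")

    request: Dict[str, Any] = {
        "clusterIdentifier": cluster,
        "computeNodeGroupName": name,
        "subnetIds": list(subnet_ids),
        "customLaunchTemplate": {
            "id": launch_template_id,
            "version": launch_template_version,
        },
        "iamInstanceProfileArn": instance_profile_arn,
        "scalingConfiguration": {
            "minInstanceCount": min_count,
            "maxInstanceCount": max_count,
        },
        "instanceConfigs": [{"instanceType": t} for t in instance_types],
    }
    # Spot options only apply when SPOT is the purchase option; otherwise the
    # purchase option is omitted and the service default (On-Demand) applies.
    if spot_allocation_strategy is not None:
        request["purchaseOption"] = "SPOT"
        request["spotOptions"] = {"allocationStrategy": spot_allocation_strategy}
    return request


# Possible usage (requires boto3 and credentials; all values are placeholders):
#   import boto3
#   kwargs = build_compute_node_group_request(...)
#   boto3.client("pcs").create_compute_node_group(**kwargs)
```

Building the arguments separately from the call keeps the request easy to inspect or log before any instances, and their billing charges, are created.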

RegisterComputeNodeGroupInstance (new)

Warning

This API action isn't intended for you to use.

Amazon Web Services PCS uses this API action to register the compute nodes it launches in your account.

See also: AWS API Documentation

Request Syntax

client.register_compute_node_group_instance(
    clusterIdentifier='string',
    bootstrapId='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster to register the compute node group instance in.

type bootstrapId

string

param bootstrapId

[REQUIRED]

The client-generated token to allow for retries.

rtype

dict

returns

Response Syntax

{
    'nodeID': 'string',
    'sharedSecret': 'string',
    'endpoints': [
        {
            'type': 'SLURMCTLD'|'SLURMDBD',
            'privateIpAddress': 'string',
            'publicIpAddress': 'string',
            'port': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    • nodeID (string) --

      The scheduler node ID for this instance.

    • sharedSecret (string) --

      For the Slurm scheduler, this is the shared Munge key the scheduler uses to authenticate compute node group instances.

    • endpoints (list) --

      The list of endpoints available for interaction with the scheduler.

      • (dict) --

        An endpoint available for interaction with the scheduler.

        • type (string) --

          Indicates the type of endpoint running at the specific IP address.

        • privateIpAddress (string) --

          The endpoint's private IP address.

          Example: 2.2.2.2

        • publicIpAddress (string) --

          The endpoint's public IP address.

          Example: 1.1.1.1

        • port (string) --

          The endpoint's connection port number.

          Example: 1234

UpdateComputeNodeGroup (new)

Updates a compute node group. You can update many of the fields related to your compute node group including the configurations for networking, compute nodes, and settings specific to your scheduler (such as Slurm).

See also: AWS API Documentation

Request Syntax

client.update_compute_node_group(
    clusterIdentifier='string',
    computeNodeGroupIdentifier='string',
    amiId='string',
    subnetIds=[
        'string',
    ],
    customLaunchTemplate={
        'id': 'string',
        'version': 'string'
    },
    purchaseOption='ONDEMAND'|'SPOT',
    spotOptions={
        'allocationStrategy': 'lowest-price'|'capacity-optimized'|'price-capacity-optimized'
    },
    scalingConfiguration={
        'minInstanceCount': 123,
        'maxInstanceCount': 123
    },
    iamInstanceProfileArn='string',
    slurmConfiguration={
        'slurmCustomSettings': [
            {
                'parameterName': 'string',
                'parameterValue': 'string'
            },
        ]
    },
    clientToken='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster of the compute node group.

type computeNodeGroupIdentifier

string

param computeNodeGroupIdentifier

[REQUIRED]

The name or ID of the compute node group.

type amiId

string

param amiId

The ID of the Amazon Machine Image (AMI) that Amazon Web Services PCS uses to launch instances. If not provided, Amazon Web Services PCS uses the AMI ID specified in the custom launch template.

type subnetIds

list

param subnetIds

The list of subnet IDs where the compute node group provisions instances. The subnets must be in the same VPC as the cluster.

  • (string) --

type customLaunchTemplate

dict

param customLaunchTemplate

An Amazon EC2 launch template Amazon Web Services PCS uses to launch compute nodes.

  • id (string) -- [REQUIRED]

    The ID of the EC2 launch template to use to provision instances.

    Example: lt-xxxx

  • version (string) -- [REQUIRED]

    The version of the EC2 launch template to use to provision instances.

type purchaseOption

string

param purchaseOption

Specifies how EC2 instances are purchased on your behalf. Amazon Web Services PCS supports On-Demand and Spot instances. For more information, see Instance purchasing options in the Amazon Elastic Compute Cloud User Guide. If you don't provide this option, it defaults to On-Demand.

type spotOptions

dict

param spotOptions

Additional configuration when you specify SPOT as the purchaseOption for the CreateComputeNodeGroup API action.

type scalingConfiguration

dict

param scalingConfiguration

Specifies the boundaries of the compute node group auto scaling.

  • minInstanceCount (integer) -- [REQUIRED]

    The lower bound of the number of instances allowed in the compute fleet.

  • maxInstanceCount (integer) -- [REQUIRED]

    The upper bound of the number of instances allowed in the compute fleet.

type iamInstanceProfileArn

string

param iamInstanceProfileArn

The Amazon Resource Name (ARN) of the IAM instance profile used to pass an IAM role when launching EC2 instances. The role contained in your instance profile must have pcs:RegisterComputeNodeGroupInstance permissions attached to provision instances correctly.

type slurmConfiguration

dict

param slurmConfiguration

Additional options related to the Slurm scheduler.

  • slurmCustomSettings (list) --

    Additional Slurm-specific configuration that directly maps to Slurm settings.

    • (dict) --

      Additional settings that directly map to Slurm settings.

      • parameterName (string) -- [REQUIRED]

        Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

      • parameterValue (string) -- [REQUIRED]

        The values for the configured Slurm settings.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{
    'computeNodeGroup': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'DELETED',
        'amiId': 'string',
        'subnetIds': [
            'string',
        ],
        'purchaseOption': 'ONDEMAND'|'SPOT',
        'customLaunchTemplate': {
            'id': 'string',
            'version': 'string'
        },
        'iamInstanceProfileArn': 'string',
        'scalingConfiguration': {
            'minInstanceCount': 123,
            'maxInstanceCount': 123
        },
        'instanceConfigs': [
            {
                'instanceType': 'string'
            },
        ],
        'spotOptions': {
            'allocationStrategy': 'lowest-price'|'capacity-optimized'|'price-capacity-optimized'
        },
        'slurmConfiguration': {
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ]
        },
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • computeNodeGroup (dict) --

      A compute node group associated with a cluster.

      • name (string) --

        The name that identifies the compute node group.

      • id (string) --

        The generated unique ID of the compute node group.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the compute node group.

      • clusterId (string) --

        The ID of the cluster of the compute node group.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the compute node group.

        Note

        The provisioning status doesn't indicate the overall health of the compute node group.

      • amiId (string) --

        The ID of the Amazon Machine Image (AMI) that Amazon Web Services PCS uses to launch instances. If not provided, Amazon Web Services PCS uses the AMI ID specified in the custom launch template.

      • subnetIds (list) --

        The list of subnet IDs where instances are provisioned by the compute node group. The subnets must be in the same VPC as the cluster.

        • (string) --

      • purchaseOption (string) --

        Specifies how EC2 instances are purchased on your behalf. Amazon Web Services PCS supports On-Demand and Spot instances. For more information, see Instance purchasing options in the Amazon Elastic Compute Cloud User Guide. If you don't provide this option, it defaults to On-Demand.

      • customLaunchTemplate (dict) --

        An Amazon EC2 launch template Amazon Web Services PCS uses to launch compute nodes.

        • id (string) --

          The ID of the EC2 launch template to use to provision instances.

          Example: lt-xxxx

        • version (string) --

          The version of the EC2 launch template to use to provision instances.

      • iamInstanceProfileArn (string) --

        The Amazon Resource Name (ARN) of the IAM instance profile used to pass an IAM role when launching EC2 instances. The role contained in your instance profile must have pcs:RegisterComputeNodeGroupInstance permissions attached to provision instances correctly.

      • scalingConfiguration (dict) --

        Specifies the boundaries of the compute node group auto scaling.

        • minInstanceCount (integer) --

          The lower bound of the number of instances allowed in the compute fleet.

        • maxInstanceCount (integer) --

          The upper bound of the number of instances allowed in the compute fleet.

      • instanceConfigs (list) --

        A list of EC2 instance configurations that Amazon Web Services PCS can provision in the compute node group.

        • (dict) --

          An EC2 instance configuration Amazon Web Services PCS uses to launch compute nodes.

          • instanceType (string) --

            The EC2 instance type that Amazon Web Services PCS can provision in the compute node group.

            Example: t2.xlarge

      • spotOptions (dict) --

        Additional configuration when you specify SPOT as the purchaseOption for the CreateComputeNodeGroup API action.

      • slurmConfiguration (dict) --

        Additional options related to the Slurm scheduler.

        • slurmCustomSettings (list) --

          Additional Slurm-specific configuration that directly maps to Slurm settings.

          • (dict) --

            Additional settings that directly map to Slurm settings.

            • parameterName (string) --

              Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

            • parameterValue (string) --

              The values for the configured Slurm settings.

      • errorInfo (list) --

        The list of errors that occurred during compute node group provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
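
Since only the two identifiers are marked as required in the request above, an update can in principle carry just the fields being changed. The sketch below changes only the scaling bounds and returns the status reported in the response. The helper name update_scaling is hypothetical, and the sketch assumes that optional parameters left out of the request keep their current values, which the text above does not state explicitly.

```python
from typing import Any, Dict, Optional


def update_scaling(
    client: Any,
    cluster: str,
    group: str,
    min_count: int,
    max_count: int,
    client_token: Optional[str] = None,
) -> str:
    """Send only new scaling bounds and return the reported status (hypothetical helper)."""
    # Local sanity check; not a documented API constraint.
    if max_count < min_count:
        raise ValueError("max_count must be >= min_count")
    kwargs: Dict[str, Any] = {
        "clusterIdentifier": cluster,
        "computeNodeGroupIdentifier": group,
        # Both bounds are required whenever scalingConfiguration is sent.
        "scalingConfiguration": {
            "minInstanceCount": min_count,
            "maxInstanceCount": max_count,
        },
    }
    # Pass a token only when the caller wants to control idempotency;
    # otherwise the SDK generates one.
    if client_token is not None:
        kwargs["clientToken"] = client_token
    response = client.update_compute_node_group(**kwargs)
    return response["computeNodeGroup"]["status"]


# Possible usage (requires boto3 and credentials; names are placeholders):
#   import boto3
#   status = update_scaling(boto3.client("pcs"), "my-cluster", "my-group", 0, 8)
```

The returned status is a provisioning status, so a value such as UPDATING says that the change was accepted, not that the new capacity is already available.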

DeleteComputeNodeGroup (new)

Deletes a compute node group. You must delete all queues associated with the compute node group first.

See also: AWS API Documentation

Request Syntax

client.delete_compute_node_group(
    clusterIdentifier='string',
    computeNodeGroupIdentifier='string',
    clientToken='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster of the compute node group.

type computeNodeGroupIdentifier

string

param computeNodeGroupIdentifier

[REQUIRED]

The name or ID of the compute node group to delete.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --
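
To illustrate the idempotency behavior described for clientToken, the following sketch generates one token and reuses it on every retry, so a repeated delete should have no additional effect. The helper name and the retry count are assumptions for this example, and the built-in ConnectionError is only a stand-in for whatever transient failure a real client raises. The SDK also has its own retry handling, so a wrapper like this may be unnecessary in practice.

```python
import uuid
from typing import Any, Optional


def delete_compute_node_group(
    client: Any, cluster: str, group: str, attempts: int = 3
) -> str:
    """Delete a compute node group, reusing one client token across retries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    # One token for the whole operation, so retries are treated as the same request.
    token = str(uuid.uuid4())
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        try:
            client.delete_compute_node_group(
                clusterIdentifier=cluster,
                computeNodeGroupIdentifier=group,
                clientToken=token,
            )
            return token
        except ConnectionError as error:  # stand-in for a transient failure
            last_error = error
    assert last_error is not None
    raise last_error


# Possible usage (requires boto3 and credentials; names are placeholders).
# Queues associated with the group must be deleted first.
#   import boto3
#   delete_compute_node_group(boto3.client("pcs"), "my-cluster", "my-group")
```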

CreateQueue (new)

Creates a job queue. You must associate one or more compute node groups with the queue. You can associate one compute node group with multiple queues.

See also: AWS API Documentation

Request Syntax

client.create_queue(
    clusterIdentifier='string',
    queueName='string',
    computeNodeGroupConfigurations=[
        {
            'computeNodeGroupId': 'string'
        },
    ],
    clientToken='string',
    tags={
        'string': 'string'
    }
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster for which to create a queue.

type queueName

string

param queueName

[REQUIRED]

A name to identify the queue.

type computeNodeGroupConfigurations

list

param computeNodeGroupConfigurations

The list of compute node group configurations to associate with the queue. Queues assign jobs to associated compute node groups.

  • (dict) --

    The compute node group configuration for a queue.

    • computeNodeGroupId (string) --

      The compute node group ID for the compute node group configuration.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

type tags

dict

param tags

One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.

  • (string) --

    • (string) --

rtype

dict

returns

Response Syntax

{
    'queue': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'computeNodeGroupConfigurations': [
            {
                'computeNodeGroupId': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • queue (dict) --

      A queue resource.

      • name (string) --

        The name that identifies the queue.

      • id (string) --

        The generated unique ID of the queue.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the queue.

      • clusterId (string) --

        The ID of the cluster of the queue.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the queue.

        Note

        The provisioning status doesn't indicate the overall health of the queue.

      • computeNodeGroupConfigurations (list) --

        The list of compute node group configurations associated with the queue. Queues assign jobs to associated compute node groups.

        • (dict) --

          The compute node group configuration for a queue.

          • computeNodeGroupId (string) --

            The compute node group ID for the compute node group configuration.

      • errorInfo (list) --

        The list of errors that occurred during queue provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
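
As a minimal sketch of the CreateQueue request described above, the helper below builds the arguments and returns the queue from the response. It assumes `client` is a PCS client such as one returned by `boto3.client("pcs")`; the function name and its parameters are hypothetical and not part of the API.

```python
import uuid


def create_queue_idempotent(client, cluster, queue_name, node_group_ids, tags=None):
    """Create a queue associated with the given compute node groups.

    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    This helper and its argument names are illustrative only.
    """
    request = {
        "clusterIdentifier": cluster,
        "queueName": queue_name,
        "computeNodeGroupConfigurations": [
            {"computeNodeGroupId": group_id} for group_id in node_group_ids
        ],
        # Explicit idempotency token; reuse the same value when retrying.
        "clientToken": str(uuid.uuid4()),
    }
    if tags:
        request["tags"] = tags
    response = client.create_queue(**request)
    return response["queue"]
```

The returned queue normally starts in a transitional status, so callers would typically check `status` again later before submitting jobs.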

ListTagsForResource (new)

Returns a list of all tags on an Amazon Web Services PCS resource.

See also: AWS API Documentation

Request Syntax

client.list_tags_for_resource(
    resourceArn='string'
)
type resourceArn

string

param resourceArn

[REQUIRED]

The Amazon Resource Name (ARN) of the resource for which to list tags.

rtype

dict

returns

Response Syntax

{
    'tags': {
        'string': 'string'
    }
}

Response Structure

  • (dict) --

    • tags (dict) --

      One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.

      • (string) --

        • (string) --

ListComputeNodeGroups (new)

Returns a list of all compute node groups associated with a cluster.

See also: AWS API Documentation

Request Syntax

client.list_compute_node_groups(
    clusterIdentifier='string',
    nextToken='string',
    maxResults=123
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster to list compute node groups for.

type nextToken

string

param nextToken

The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.

type maxResults

integer

param maxResults

The maximum number of results that are returned per call. You can use nextToken to obtain further pages of results. The default is 10 results, and the maximum allowed page size is 100 results. A value of 0 uses the default.

rtype

dict

returns

Response Syntax

{
    'computeNodeGroups': [
        {
            'name': 'string',
            'id': 'string',
            'arn': 'string',
            'clusterId': 'string',
            'createdAt': datetime(2015, 1, 1),
            'modifiedAt': datetime(2015, 1, 1),
            'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'DELETED'
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • computeNodeGroups (list) --

      The list of compute node groups for the cluster.

      • (dict) --

        The object returned by the ListComputeNodeGroups API action.

        • name (string) --

          The name that identifies the compute node group.

        • id (string) --

          The generated unique ID of the compute node group.

        • arn (string) --

          The unique Amazon Resource Name (ARN) of the compute node group.

        • clusterId (string) --

          The ID of the cluster of the compute node group.

        • createdAt (datetime) --

          The date and time the resource was created.

        • modifiedAt (datetime) --

          The date and time the resource was modified.

        • status (string) --

          The provisioning status of the compute node group.

          Note

          The provisioning status doesn't indicate the overall health of the compute node group.

    • nextToken (string) --

      The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.
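
The nextToken handling described above can be sketched as a small generator. This is an illustration rather than an official helper: `client` is assumed to be a PCS client such as `boto3.client("pcs")`, and the function name is hypothetical.

```python
def iter_compute_node_groups(client, cluster, page_size=100):
    """Yield every compute node group summary for a cluster, page by page.

    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    """
    kwargs = {"clusterIdentifier": cluster, "maxResults": page_size}
    while True:
        page = client.list_compute_node_groups(**kwargs)
        yield from page.get("computeNodeGroups", [])
        token = page.get("nextToken")
        if not token:
            break
        # Keep all other arguments unchanged; only the token is added.
        kwargs["nextToken"] = token
```

The same loop shape applies to the other list actions that take nextToken and maxResults, with the method name and result key swapped.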

UntagResource (new)

Deletes tags from an Amazon Web Services PCS resource. To delete a tag, specify the tag key and the Amazon Resource Name (ARN) of the Amazon Web Services PCS resource.

See also: AWS API Documentation

Request Syntax

client.untag_resource(
    resourceArn='string',
    tagKeys=[
        'string',
    ]
)
type resourceArn

string

param resourceArn

[REQUIRED]

The Amazon Resource Name (ARN) of the resource.

type tagKeys

list

param tagKeys

[REQUIRED]

One or more tag keys to remove from the resource. Specify only tag keys and not tag values.

  • (string) --

returns

None
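
One possible way to combine ListTagsForResource with UntagResource is sketched below: it looks up the current tags and removes those whose key starts with a given prefix. The helper name and the prefix convention are hypothetical; `client` is assumed to be a PCS client such as `boto3.client("pcs")`.

```python
def remove_tags_with_prefix(client, resource_arn, prefix):
    """Remove every tag whose key starts with `prefix`; return the removed keys.

    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    """
    tags = client.list_tags_for_resource(resourceArn=resource_arn).get("tags", {})
    keys = sorted(key for key in tags if key.startswith(prefix))
    if keys:
        # Only tag keys are sent; tag values are not part of the request.
        client.untag_resource(resourceArn=resource_arn, tagKeys=keys)
    return keys
```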

DeleteCluster (new)

Deletes a cluster and all its linked resources. You must delete all queues and compute node groups associated with the cluster before you can delete the cluster.

See also: AWS API Documentation

Request Syntax

client.delete_cluster(
    clusterIdentifier='string',
    clientToken='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster to delete.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --
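
The deletion order required above (queues first, then compute node groups, then the cluster) can be sketched as follows. This is an illustration under assumptions, not an official procedure: it assumes `client` is a PCS client such as `boto3.client("pcs")`, that deletions complete asynchronously (as the DELETING status suggests), and that a resource counts as gone once it is no longer listed or is listed as DELETED. The helper names, polling interval, and poll limit are hypothetical.

```python
import time


def _remaining(client, method, key, cluster):
    """Return resources of one kind that are not yet reported as DELETED."""
    items, kwargs = [], {"clusterIdentifier": cluster}
    while True:
        page = getattr(client, method)(**kwargs)
        items += [r for r in page.get(key, []) if r.get("status") != "DELETED"]
        if not page.get("nextToken"):
            return items
        kwargs["nextToken"] = page["nextToken"]


def tear_down_cluster(client, cluster, poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Delete queues, then compute node groups, then the cluster itself."""
    steps = [
        ("list_queues", "queues", client.delete_queue, "queueIdentifier"),
        ("list_compute_node_groups", "computeNodeGroups",
         client.delete_compute_node_group, "computeNodeGroupIdentifier"),
    ]
    for method, key, delete, id_param in steps:
        for item in _remaining(client, method, key, cluster):
            if item.get("status") != "DELETING":
                delete(**{"clusterIdentifier": cluster, id_param: item["id"]})
        # Assumed asynchronous: poll until nothing of this kind is left.
        for _ in range(max_polls):
            if not _remaining(client, method, key, cluster):
                break
            sleep(poll_seconds)
        else:
            raise TimeoutError(f"{key} of cluster {cluster} were not deleted in time")
    client.delete_cluster(clusterIdentifier=cluster)
```

Injecting `sleep` keeps the sketch testable without real delays; a resource stuck in DELETE_FAILED would surface here as the TimeoutError.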

DeleteQueue (new)

Deletes a job queue. If the compute node group associated with this queue isn't associated with any other queues, Amazon Web Services PCS terminates all the compute nodes for this queue.

See also: AWS API Documentation

Request Syntax

client.delete_queue(
    clusterIdentifier='string',
    queueIdentifier='string',
    clientToken='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster of the queue.

type queueIdentifier

string

param queueIdentifier

[REQUIRED]

The name or ID of the queue to delete.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --

GetQueue (new)

Returns detailed information about a queue. The information includes the compute node groups that the queue uses to schedule jobs.

See also: AWS API Documentation

Request Syntax

client.get_queue(
    clusterIdentifier='string',
    queueIdentifier='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster of the queue.

type queueIdentifier

string

param queueIdentifier

[REQUIRED]

The name or ID of the queue.

rtype

dict

returns

Response Syntax

{
    'queue': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'computeNodeGroupConfigurations': [
            {
                'computeNodeGroupId': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • queue (dict) --

      A queue resource.

      • name (string) --

        The name that identifies the queue.

      • id (string) --

        The generated unique ID of the queue.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the queue.

      • clusterId (string) --

        The ID of the cluster of the queue.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the queue.

        Note

        The provisioning status doesn't indicate the overall health of the queue.

      • computeNodeGroupConfigurations (list) --

        The list of compute node group configurations associated with the queue. Queues assign jobs to associated compute node groups.

        • (dict) --

          The compute node group configuration for a queue.

          • computeNodeGroupId (string) --

            The compute node group ID for the compute node group configuration.

      • errorInfo (list) --

        The list of errors that occurred during queue provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
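
Because a queue's status moves through transitional values before it becomes ACTIVE, a caller may want to poll GetQueue. The sketch below shows one way to do that; it assumes `client` is a PCS client such as `boto3.client("pcs")`, and the function name, polling interval, and poll limit are hypothetical choices. It treats any status ending in `_FAILED` as terminal and reports the errorInfo entries.

```python
import time


def wait_for_queue(client, cluster, queue, poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Poll GetQueue until the queue is ACTIVE or reports a failed status.

    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    """
    for _ in range(max_polls):
        q = client.get_queue(clusterIdentifier=cluster, queueIdentifier=queue)["queue"]
        status = q["status"]
        if status == "ACTIVE":
            return q
        if status.endswith("_FAILED"):
            details = "; ".join(
                f"{err['code']}: {err['message']}" for err in q.get("errorInfo", [])
            )
            raise RuntimeError(f"queue {q['name']} is {status}: {details}")
        sleep(poll_seconds)
    raise TimeoutError(f"queue {queue} did not become ACTIVE in time")
```

As the note above says, ACTIVE describes provisioning only and does not indicate the overall health of the queue.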

UpdateQueue (new)

Updates the compute node group configuration of a queue. Use this API to change the compute node groups that the queue can send jobs to.

See also: AWS API Documentation

Request Syntax

client.update_queue(
    clusterIdentifier='string',
    queueIdentifier='string',
    computeNodeGroupConfigurations=[
        {
            'computeNodeGroupId': 'string'
        },
    ],
    clientToken='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster of the queue.

type queueIdentifier

string

param queueIdentifier

[REQUIRED]

The name or ID of the queue.

type computeNodeGroupConfigurations

list

param computeNodeGroupConfigurations

The list of compute node group configurations to associate with the queue. Queues assign jobs to associated compute node groups.

  • (dict) --

    The compute node group configuration for a queue.

    • computeNodeGroupId (string) --

      The compute node group ID for the compute node group configuration.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

rtype

dict

returns

Response Syntax

{
    'queue': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'computeNodeGroupConfigurations': [
            {
                'computeNodeGroupId': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • queue (dict) --

      A queue resource.

      • name (string) --

        The name that identifies the queue.

      • id (string) --

        The generated unique ID of the queue.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the queue.

      • clusterId (string) --

        The ID of the cluster of the queue.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the queue.

        Note

        The provisioning status doesn't indicate the overall health of the queue.

      • computeNodeGroupConfigurations (list) --

        The list of compute node group configurations associated with the queue. Queues assign jobs to associated compute node groups.

        • (dict) --

          The compute node group configuration for a queue.

          • computeNodeGroupId (string) --

            The compute node group ID for the compute node group configuration.

      • errorInfo (list) --

        The list of errors that occurred during queue provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
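
As an illustration of changing a queue's compute node groups, the sketch below reads the current configuration with GetQueue and sends an extended list to UpdateQueue. It assumes that UpdateQueue replaces the whole computeNodeGroupConfigurations list (the text above does not state this explicitly) and that `client` is a PCS client such as `boto3.client("pcs")`. The helper name is hypothetical.

```python
def attach_node_group(client, cluster, queue, group_id):
    """Add one compute node group to a queue, keeping the existing ones.

    Assumes update_queue replaces the full configuration list.
    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    """
    current = client.get_queue(clusterIdentifier=cluster, queueIdentifier=queue)["queue"]
    configs = list(current.get("computeNodeGroupConfigurations", []))
    if any(cfg["computeNodeGroupId"] == group_id for cfg in configs):
        return current  # already attached; no update needed
    configs.append({"computeNodeGroupId": group_id})
    response = client.update_queue(
        clusterIdentifier=cluster,
        queueIdentifier=queue,
        computeNodeGroupConfigurations=configs,
    )
    return response["queue"]
```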

ListQueues (new)

Returns a list of all queues associated with a cluster.

See also: AWS API Documentation

Request Syntax

client.list_queues(
    clusterIdentifier='string',
    nextToken='string',
    maxResults=123
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster to list queues for.

type nextToken

string

param nextToken

The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.

type maxResults

integer

param maxResults

The maximum number of results that are returned per call. You can use nextToken to obtain further pages of results. The default is 10 results, and the maximum allowed page size is 100 results. A value of 0 uses the default.

rtype

dict

returns

Response Syntax

{
    'queues': [
        {
            'name': 'string',
            'id': 'string',
            'arn': 'string',
            'clusterId': 'string',
            'createdAt': datetime(2015, 1, 1),
            'modifiedAt': datetime(2015, 1, 1),
            'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • queues (list) --

      The list of queues associated with the cluster.

      • (dict) --

        The object returned by the ListQueues API action.

        • name (string) --

          The name that identifies the queue.

        • id (string) --

          The generated unique ID of the queue.

        • arn (string) --

          The unique Amazon Resource Name (ARN) of the queue.

        • clusterId (string) --

          The ID of the cluster of the queue.

        • createdAt (datetime) --

          The date and time the resource was created.

        • modifiedAt (datetime) --

          The date and time the resource was modified.

        • status (string) --

          The provisioning status of the queue.

          Note

          The provisioning status doesn't indicate the overall health of the queue.

    • nextToken (string) --

      The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.

GetComputeNodeGroup (new)

Returns detailed information about a compute node group. This API action provides networking information, EC2 instance type, compute node group status, and scheduler (such as Slurm) configuration.

See also: AWS API Documentation

Request Syntax

client.get_compute_node_group(
    clusterIdentifier='string',
    computeNodeGroupIdentifier='string'
)
type clusterIdentifier

string

param clusterIdentifier

[REQUIRED]

The name or ID of the cluster.

type computeNodeGroupIdentifier

string

param computeNodeGroupIdentifier

[REQUIRED]

The name or ID of the compute node group.

rtype

dict

returns

Response Syntax

{
    'computeNodeGroup': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'clusterId': 'string',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'DELETED',
        'amiId': 'string',
        'subnetIds': [
            'string',
        ],
        'purchaseOption': 'ONDEMAND'|'SPOT',
        'customLaunchTemplate': {
            'id': 'string',
            'version': 'string'
        },
        'iamInstanceProfileArn': 'string',
        'scalingConfiguration': {
            'minInstanceCount': 123,
            'maxInstanceCount': 123
        },
        'instanceConfigs': [
            {
                'instanceType': 'string'
            },
        ],
        'spotOptions': {
            'allocationStrategy': 'lowest-price'|'capacity-optimized'|'price-capacity-optimized'
        },
        'slurmConfiguration': {
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ]
        },
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • computeNodeGroup (dict) --

      A compute node group associated with a cluster.

      • name (string) --

        The name that identifies the compute node group.

      • id (string) --

        The generated unique ID of the compute node group.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the compute node group.

      • clusterId (string) --

        The ID of the cluster of the compute node group.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • status (string) --

        The provisioning status of the compute node group.

        Note

        The provisioning status doesn't indicate the overall health of the compute node group.

      • amiId (string) --

        The ID of the Amazon Machine Image (AMI) that Amazon Web Services PCS uses to launch instances. If not provided, Amazon Web Services PCS uses the AMI ID specified in the custom launch template.

      • subnetIds (list) --

        The list of subnet IDs where instances are provisioned by the compute node group. The subnets must be in the same VPC as the cluster.

        • (string) --

      • purchaseOption (string) --

        Specifies how EC2 instances are purchased on your behalf. Amazon Web Services PCS supports On-Demand and Spot instances. For more information, see Instance purchasing options in the Amazon Elastic Compute Cloud User Guide. If you don't provide this option, it defaults to On-Demand.

      • customLaunchTemplate (dict) --

        An Amazon EC2 launch template that Amazon Web Services PCS uses to launch compute nodes.

        • id (string) --

          The ID of the EC2 launch template to use to provision instances.

          Example: lt-xxxx

        • version (string) --

          The version of the EC2 launch template to use to provision instances.

      • iamInstanceProfileArn (string) --

        The Amazon Resource Name (ARN) of the IAM instance profile used to pass an IAM role when launching EC2 instances. The role contained in your instance profile must have pcs:RegisterComputeNodeGroupInstance permissions attached to provision instances correctly.

      • scalingConfiguration (dict) --

        Specifies the boundaries of the compute node group auto scaling.

        • minInstanceCount (integer) --

          The lower bound of the number of instances allowed in the compute fleet.

        • maxInstanceCount (integer) --

          The upper bound of the number of instances allowed in the compute fleet.

      • instanceConfigs (list) --

        A list of EC2 instance configurations that Amazon Web Services PCS can provision in the compute node group.

        • (dict) --

          An EC2 instance configuration that Amazon Web Services PCS uses to launch compute nodes.

          • instanceType (string) --

            The EC2 instance type that Amazon Web Services PCS can provision in the compute node group.

            Example: t2.xlarge

      • spotOptions (dict) --

        Additional configuration when you specify SPOT as the purchaseOption for the CreateComputeNodeGroup API action.

      • slurmConfiguration (dict) --

        Additional options related to the Slurm scheduler.

        • slurmCustomSettings (list) --

          Additional Slurm-specific configuration that directly maps to Slurm settings.

          • (dict) --

            Additional settings that directly map to Slurm settings.

            • parameterName (string) --

              Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

            • parameterValue (string) --

              The values for the configured Slurm settings.

      • errorInfo (list) --

        The list of errors that occurred during compute node group provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
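
To show how the response structure above might be consumed, the sketch below flattens a GetComputeNodeGroup response into a short summary. The helper name and the shape of the summary are hypothetical; `client` is assumed to be a PCS client such as `boto3.client("pcs")`. Optional fields are read defensively because they may be absent from a response.

```python
def summarize_node_group(client, cluster, group):
    """Return a compact summary of a compute node group.

    `client` is assumed to be a PCS client, e.g. boto3.client("pcs").
    """
    g = client.get_compute_node_group(
        clusterIdentifier=cluster, computeNodeGroupIdentifier=group
    )["computeNodeGroup"]
    scaling = g.get("scalingConfiguration", {})
    return {
        "name": g["name"],
        "status": g["status"],
        # The purchase option defaults to On-Demand when not provided.
        "purchaseOption": g.get("purchaseOption", "ONDEMAND"),
        "instanceTypes": [cfg["instanceType"] for cfg in g.get("instanceConfigs", [])],
        "minInstances": scaling.get("minInstanceCount"),
        "maxInstances": scaling.get("maxInstanceCount"),
        "errors": [f"{err['code']}: {err['message']}" for err in g.get("errorInfo", [])],
    }
```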

ListClusters (new)

Returns a list of running clusters in your account.

See also: AWS API Documentation

Request Syntax

client.list_clusters(
    nextToken='string',
    maxResults=123
)
type nextToken

string

param nextToken

The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.

type maxResults

integer

param maxResults

The maximum number of results that are returned per call. You can use nextToken to obtain further pages of results. The default is 10 results, and the maximum allowed page size is 100 results. A value of 0 uses the default.

rtype

dict

returns

Response Syntax

{
    'clusters': [
        {
            'name': 'string',
            'id': 'string',
            'arn': 'string',
            'createdAt': datetime(2015, 1, 1),
            'modifiedAt': datetime(2015, 1, 1),
            'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • clusters (list) --

      The list of clusters.

      • (dict) --

        The object returned by the ListClusters API action.

        • name (string) --

          The name that identifies the cluster.

        • id (string) --

          The generated unique ID of the cluster.

        • arn (string) --

          The unique Amazon Resource Name (ARN) of the cluster.

        • createdAt (datetime) --

          The date and time the resource was created.

        • modifiedAt (datetime) --

          The date and time the resource was modified.

        • status (string) --

          The provisioning status of the cluster.

          Note

          The provisioning status doesn't indicate the overall health of the cluster.

    • nextToken (string) --

      The value of nextToken is a unique pagination token for each page of results returned. If nextToken is returned, there are more results available. Make the call again using the returned token to retrieve the next page. Keep all other arguments unchanged. Each pagination token expires after 24 hours. Using an expired pagination token returns an HTTP 400 InvalidToken error.

CreateCluster (new)

Creates a cluster in your account. Amazon Web Services PCS creates the cluster controller in a service-owned account. The cluster controller communicates with the cluster resources in your account. The subnets and security groups for the cluster must already exist before you use this API action.

Note

It takes time for Amazon Web Services PCS to create the cluster. The cluster is in a Creating state until it is ready to use. There can be only one cluster in a Creating state per Amazon Web Services Region per Amazon Web Services account. CreateCluster fails with a ServiceQuotaExceededException if there is already a cluster in a Creating state.

See also: AWS API Documentation

Request Syntax

client.create_cluster(
    clusterName='string',
    scheduler={
        'type': 'SLURM',
        'version': 'string'
    },
    size='SMALL'|'MEDIUM'|'LARGE',
    networking={
        'subnetIds': [
            'string',
        ],
        'securityGroupIds': [
            'string',
        ]
    },
    slurmConfiguration={
        'scaleDownIdleTimeInSeconds': 123,
        'slurmCustomSettings': [
            {
                'parameterName': 'string',
                'parameterValue': 'string'
            },
        ]
    },
    clientToken='string',
    tags={
        'string': 'string'
    }
)
type clusterName

string

param clusterName

[REQUIRED]

A name to identify the cluster. Example: MyCluster

type scheduler

dict

param scheduler

[REQUIRED]

The cluster management and job scheduling software associated with the cluster.

  • type (string) -- [REQUIRED]

    The software Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

  • version (string) -- [REQUIRED]

    The version of the specified scheduling software that Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

type size

string

param size

[REQUIRED]

A value that determines the maximum number of compute nodes in the cluster and the maximum number of jobs (active and queued).

  • SMALL : 32 compute nodes and 256 jobs

  • MEDIUM : 512 compute nodes and 8,192 jobs

  • LARGE : 2,048 compute nodes and 16,384 jobs
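
The size limits listed above can be expressed as a lookup that picks the smallest size covering an expected workload. This is only a sketch; the table and function names are hypothetical, and the numbers are copied from the list above.

```python
# (maximum compute nodes, maximum active and queued jobs) per cluster size,
# ordered from smallest to largest.
SIZE_LIMITS = {
    "SMALL": (32, 256),
    "MEDIUM": (512, 8192),
    "LARGE": (2048, 16384),
}


def smallest_size_for(nodes, jobs):
    """Return the smallest size whose limits cover `nodes` and `jobs`."""
    for size, (max_nodes, max_jobs) in SIZE_LIMITS.items():
        if nodes <= max_nodes and jobs <= max_jobs:
            return size
    raise ValueError(f"no cluster size supports {nodes} nodes and {jobs} jobs")
```

For example, `smallest_size_for(32, 257)` returns `"MEDIUM"`, because the job count exceeds the SMALL limit even though the node count does not.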

type networking

dict

param networking

[REQUIRED]

The networking configuration used to set up the cluster's control plane.

  • subnetIds (list) --

    The list of subnet IDs where Amazon Web Services PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and Amazon Web Services PCS resources. Subnet IDs have the form subnet-0123456789abcdef0.

    Subnets can't be in Outposts, Wavelength or an Amazon Web Services Local Zone.

    Note

    Amazon Web Services PCS currently supports only one subnet in this list.

    • (string) --

  • securityGroupIds (list) --

    A list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.

    • (string) --

type slurmConfiguration

dict

param slurmConfiguration

Additional options related to the Slurm scheduler.

  • scaleDownIdleTimeInSeconds (integer) --

    The time, in seconds, before an idle node is scaled down.

  • slurmCustomSettings (list) --

    Additional Slurm-specific configuration that directly maps to Slurm settings.

    • (dict) --

      Additional settings that directly map to Slurm settings.

      • parameterName (string) -- [REQUIRED]

        Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

      • parameterValue (string) -- [REQUIRED]

        The values for the configured Slurm settings.

type clientToken

string

param clientToken

A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.

This field is autopopulated if not provided.

type tags

dict

param tags

One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.

  • (string) --

    • (string) --
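
The request parameters above can be assembled as in the following sketch. It builds the keyword arguments only and does not call the service; the function name and its arguments are hypothetical, and the Slurm version string must be supplied by the caller because this section does not list valid values. The result is intended to be passed as `client.create_cluster(**request)` on a PCS client such as `boto3.client("pcs")`.

```python
import uuid

VALID_SIZES = ("SMALL", "MEDIUM", "LARGE")


def build_create_cluster_request(name, slurm_version, size, subnet_id,
                                 security_group_ids, idle_seconds=None, tags=None):
    """Assemble keyword arguments for client.create_cluster() (illustrative)."""
    if size not in VALID_SIZES:
        raise ValueError(f"size must be one of {VALID_SIZES}, got {size!r}")
    request = {
        "clusterName": name,
        "scheduler": {"type": "SLURM", "version": slurm_version},
        "size": size,
        "networking": {
            # Only one subnet is currently supported in this list.
            "subnetIds": [subnet_id],
            "securityGroupIds": list(security_group_ids),
        },
        # Explicit idempotency token; reuse the same value when retrying.
        "clientToken": str(uuid.uuid4()),
    }
    if idle_seconds is not None:
        request["slurmConfiguration"] = {"scaleDownIdleTimeInSeconds": idle_seconds}
    if tags:
        request["tags"] = dict(tags)
    return request
```

Taking a single `subnet_id` rather than a list mirrors the current one-subnet limit and avoids building a request the service would reject.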

rtype

dict

returns

Response Syntax

{
    'cluster': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'scheduler': {
            'type': 'SLURM',
            'version': 'string'
        },
        'size': 'SMALL'|'MEDIUM'|'LARGE',
        'slurmConfiguration': {
            'scaleDownIdleTimeInSeconds': 123,
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ],
            'authKey': {
                'secretArn': 'string',
                'secretVersion': 'string'
            }
        },
        'networking': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        },
        'endpoints': [
            {
                'type': 'SLURMCTLD'|'SLURMDBD',
                'privateIpAddress': 'string',
                'publicIpAddress': 'string',
                'port': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}

Response Structure

  • (dict) --

    • cluster (dict) --

      The cluster resource.

      • name (string) --

        The name that identifies the cluster.

      • id (string) --

        The generated unique ID of the cluster.

      • arn (string) --

        The unique Amazon Resource Name (ARN) of the cluster.

      • status (string) --

        The provisioning status of the cluster.

        Note

        The provisioning status doesn't indicate the overall health of the cluster.

      • createdAt (datetime) --

        The date and time the resource was created.

      • modifiedAt (datetime) --

        The date and time the resource was modified.

      • scheduler (dict) --

        The cluster management and job scheduling software associated with the cluster.

        • type (string) --

          The software Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

        • version (string) --

          The version of the specified scheduling software that Amazon Web Services PCS uses to manage cluster scaling and job scheduling.

      • size (string) --

        The size of the cluster.

        • SMALL : 32 compute nodes and 256 jobs

        • MEDIUM : 512 compute nodes and 8,192 jobs

        • LARGE : 2,048 compute nodes and 16,384 jobs

      • slurmConfiguration (dict) --

        Additional options related to the Slurm scheduler.

        • scaleDownIdleTimeInSeconds (integer) --

          The time, in seconds, before an idle node is scaled down.

        • slurmCustomSettings (list) --

          Additional Slurm-specific configuration that directly maps to Slurm settings.

          • (dict) --

            Additional settings that directly map to Slurm settings.

            • parameterName (string) --

              Amazon Web Services PCS supports configuration of the following Slurm parameters: Prolog, Epilog, and SelectTypeParameters.

            • parameterValue (string) --

              The values for the configured Slurm settings.

        • authKey (dict) --

          The shared Slurm key for authentication, also known as the cluster secret.

          • secretArn (string) --

            The Amazon Resource Name (ARN) of the shared Slurm key.

          • secretVersion (string) --

            The version of the shared Slurm key.

      • networking (dict) --

        The networking configuration for the cluster's control plane.

        • subnetIds (list) --

          The ID of the subnet where Amazon Web Services PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and Amazon Web Services PCS resources. The subnet must have an available IP address and cannot reside in AWS Outposts, AWS Wavelength, or an AWS Local Zone.

          Example: subnet-abcd1234

          • (string) --

        • securityGroupIds (list) --

          The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.

          The following rules are required:

          • Inbound rule 1

            • Protocol: All

            • Ports: All

            • Source: Self

          • Outbound rule 1

            • Protocol: All

            • Ports: All

            • Destination: 0.0.0.0/0 (IPv4)

          • Outbound rule 2

            • Protocol: All

            • Ports: All

            • Destination: Self

          • (string) --

      • endpoints (list) --

        The list of endpoints available for interaction with the scheduler.

        • (dict) --

          An endpoint available for interaction with the scheduler.

          • type (string) --

            Indicates the type of endpoint running at the specific IP address.

          • privateIpAddress (string) --

            The endpoint's private IP address.

            Example: 2.2.2.2

          • publicIpAddress (string) --

            The endpoint's public IP address.

            Example: 1.1.1.1

          • port (string) --

            The endpoint's connection port number.

            Example: 1234

      • errorInfo (list) --

        The list of errors that occurred during cluster provisioning.

        • (dict) --

          An error that occurred during resource creation.

          • code (string) --

            The short-form error code.

          • message (string) --

            The detailed error information.
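The response structure above can be read with ordinary dictionary access. The following sketch pulls out a scheduler endpoint and summarizes provisioning errors; the function names find_endpoint and provisioning_errors are hypothetical. Note that port is returned as a string, so the sketch converts it to an integer, and that endpoints and errorInfo are treated as possibly absent, which is an assumption made for safety.

```python
def find_endpoint(response, endpoint_type="SLURMCTLD"):
    """Return (private_ip, port) for the first matching endpoint, or None."""
    for endpoint in response["cluster"].get("endpoints", []):
        if endpoint["type"] == endpoint_type:
            # 'port' is a string in the response, so convert it for socket use.
            return endpoint["privateIpAddress"], int(endpoint["port"])
    return None


def provisioning_errors(response):
    """Return 'code: message' strings if the status is one of the *_FAILED values."""
    cluster = response["cluster"]
    if not cluster["status"].endswith("_FAILED"):
        return []
    return [
        f"{error['code']}: {error['message']}"
        for error in cluster.get("errorInfo", [])
    ]
```

As the note on status says, a status of ACTIVE reports provisioning state only, so an empty error list here does not imply that the cluster is healthy.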

TagResource (new)

Adds or edits tags on an Amazon Web Services PCS resource. Each tag consists of a tag key and a tag value. The tag key and tag value are case-sensitive strings. The tag value can be an empty (null) string. To add a tag, specify a new tag key and a tag value. To edit a tag, specify an existing tag key and a new tag value.

See also: AWS API Documentation

Request Syntax

client.tag_resource(
    resourceArn='string',
    tags={
        'string': 'string'
    }
)
type resourceArn

string

param resourceArn

[REQUIRED]

The Amazon Resource Name (ARN) of the resource.

type tags

dict

param tags

[REQUIRED]

One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.

  • (string) --

    • (string) --

returns

None
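A minimal sketch of calling this operation follows. The wrapper name tag_cluster is hypothetical, and the client argument is assumed to be any object that exposes tag_resource with the request syntax shown above. The string check is a client-side convenience added for illustration, since both tag keys and tag values are described as strings.

```python
def tag_cluster(client, resource_arn, tags):
    """Add or edit tags on a resource; keys and values must be strings."""
    for key, value in tags.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError(f"Tag {key!r} must map a string to a string")
    # Existing keys get the new value; new keys are added. An empty string
    # is an acceptable tag value.
    client.tag_resource(resourceArn=resource_arn, tags=dict(tags))
```

Because the operation returns None, the wrapper returns nothing either. Tag keys are case-sensitive, so "Team" and "team" would be stored as two separate tags.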