2026/03/24 - AWS Parallel Computing Service - 3 updated api methods
Changes: This release adds support for custom slurmdbd and cgroup configuration in AWS PCS. Customers can now specify slurmdbd and cgroup settings to configure database accounting and reporting for their HPC workloads, and control resource allocation and limits for compute jobs.
Request {'slurmConfiguration': {'cgroupCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}],
'slurmdbdCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}]}}
Response {'cluster': {'slurmConfiguration': {'cgroupCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}],
'slurmdbdCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}]}}}
Creates a cluster in your account. PCS creates the cluster controller in a service-owned account. The cluster controller communicates with the cluster resources in your account. The subnets and security groups for the cluster must already exist before you use this API action.
See also: AWS API Documentation
Request Syntax
client.create_cluster(
clusterName='string',
scheduler={
'type': 'SLURM',
'version': 'string'
},
size='SMALL'|'MEDIUM'|'LARGE',
networking={
'subnetIds': [
'string',
],
'securityGroupIds': [
'string',
],
'networkType': 'IPV4'|'IPV6'
},
slurmConfiguration={
'scaleDownIdleTimeInSeconds': 123,
'slurmCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'slurmdbdCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'cgroupCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'accounting': {
'defaultPurgeTimeInDays': 123,
'mode': 'STANDARD'|'NONE'
},
'slurmRest': {
'mode': 'STANDARD'|'NONE'
}
},
clientToken='string',
tags={
'string': 'string'
}
)
clusterName (string) -- [REQUIRED]
A name to identify the cluster. Example: MyCluster
scheduler (dict) -- [REQUIRED]
The cluster management and job scheduling software associated with the cluster.
type (string) -- [REQUIRED]
The software PCS uses to manage cluster scaling and job scheduling.
version (string) -- [REQUIRED]
The version of the specified scheduling software that PCS uses to manage cluster scaling and job scheduling. For more information, see Slurm versions in PCS in the PCS User Guide.
Valid Values: 24.11 | 25.05
size (string) -- [REQUIRED]
A value that determines the maximum number of compute nodes in the cluster and the maximum number of jobs (active and queued).
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8192 jobs
LARGE: 2048 compute nodes and 16384 jobs
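As a rough illustration of how the size limits above might be used when planning a cluster, the sketch below picks the smallest size that covers a given node and job count. The helper name `smallest_cluster_size` and the lookup table are hypothetical and are not part of the PCS API; only the three size names and their limits come from the list above.

```python
# Limits copied from the size list above: (max compute nodes, max jobs).
CLUSTER_SIZE_LIMITS = {
    "SMALL": (32, 256),
    "MEDIUM": (512, 8192),
    "LARGE": (2048, 16384),
}


def smallest_cluster_size(compute_nodes: int, jobs: int) -> str:
    """Return the smallest size whose limits cover both requirements."""
    for size in ("SMALL", "MEDIUM", "LARGE"):
        max_nodes, max_jobs = CLUSTER_SIZE_LIMITS[size]
        if compute_nodes <= max_nodes and jobs <= max_jobs:
            return size
    raise ValueError("requirements exceed the LARGE cluster limits")
```

The returned string can be passed as the size argument of create_cluster.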
networking (dict) -- [REQUIRED]
The networking configuration used to set up the cluster's control plane.
subnetIds (list) --
The list of subnet IDs where PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and PCS resources. Subnet IDs have the form subnet-0123456789abcdef0.
Subnets can't be in Outposts, Wavelength or an Amazon Web Services Local Zone.
(string) --
securityGroupIds (list) --
A list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
(string) --
networkType (string) --
The IP address version the cluster uses. The default is IPV4.
slurmConfiguration (dict) --
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) --
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) --
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) --
Additional settings that directly map to Slurm settings.
parameterName (string) -- [REQUIRED]
PCS supports custom Slurm settings for clusters, compute node groups, and queues. For more information, see Configuring custom Slurm settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured Slurm settings.
slurmdbdCustomSettings (list) --
Additional SlurmDBD-specific configuration that directly maps to SlurmDBD settings.
(dict) --
Additional settings that directly map to SlurmDBD settings.
parameterName (string) -- [REQUIRED]
PCS supports custom SlurmDBD settings for clusters. For more information, see Configuring custom SlurmDBD settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured SlurmDBD settings.
cgroupCustomSettings (list) --
Additional Cgroup-specific configuration that directly maps to Cgroup settings.
(dict) --
Additional settings that directly map to Cgroup settings.
parameterName (string) -- [REQUIRED]
PCS supports custom Cgroup settings for clusters. For more information, see Configuring custom Cgroup settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured Cgroup settings.
accounting (dict) --
The accounting configuration includes configurable settings for Slurm accounting.
defaultPurgeTimeInDays (integer) --
The default value for all purge settings for slurmdbd.conf. For more information, see the slurmdbd.conf documentation at SchedMD.
The default value for defaultPurgeTimeInDays is -1.
A value of -1 means there is no purge time and records persist as long as the cluster exists.
mode (string) -- [REQUIRED]
The default value for mode is NONE. A value of STANDARD means Slurm accounting is enabled.
slurmRest (dict) --
The Slurm REST API configuration for the cluster.
mode (string) -- [REQUIRED]
The default value for mode is NONE. A value of STANDARD means the Slurm REST API is enabled.
clientToken (string) --
A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. If the original request completes successfully, subsequent retries with the same client token return the result of the original request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.
This field is autopopulated if not provided.
tags (dict) --
One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.
(string) --
(string) --
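The request parameters above can be assembled as a plain dictionary before calling create_cluster. The following is a minimal sketch, not a reference implementation: the helper names `custom_settings` and `build_create_cluster_request` are hypothetical, the subnet and security group IDs are placeholders, and `CommitDelay` and `ConstrainRAMSpace` are parameter names taken from upstream Slurm's slurmdbd.conf and cgroup.conf. Whether PCS accepts those particular names is an assumption; the PCS User Guide is the place to confirm the supported list. Accounting is set to STANDARD on the assumption that SlurmDBD settings only matter when accounting is enabled.

```python
def custom_settings(params: dict) -> list:
    """Convert {name: value} into the parameterName/parameterValue list shape."""
    return [
        {"parameterName": name, "parameterValue": str(value)}
        for name, value in params.items()
    ]


def build_create_cluster_request(name, subnet_id, security_group_id):
    """Assemble keyword arguments for client.create_cluster (hypothetical helper)."""
    return {
        "clusterName": name,
        "scheduler": {"type": "SLURM", "version": "25.05"},
        "size": "SMALL",
        "networking": {
            "subnetIds": [subnet_id],
            "securityGroupIds": [security_group_id],
        },
        "slurmConfiguration": {
            "scaleDownIdleTimeInSeconds": 600,
            # Illustrative slurmdbd.conf name; PCS support is an assumption.
            "slurmdbdCustomSettings": custom_settings({"CommitDelay": 1}),
            # Illustrative cgroup.conf name; PCS support is an assumption.
            "cgroupCustomSettings": custom_settings({"ConstrainRAMSpace": "yes"}),
            "accounting": {"mode": "STANDARD", "defaultPurgeTimeInDays": -1},
        },
    }


request = build_create_cluster_request(
    "MyCluster", "subnet-0123456789abcdef0", "sg-0123456789abcdef0"
)
# With credentials configured, the call would be:
#   boto3.client("pcs").create_cluster(**request)
```

Keeping the request as a dictionary makes it easy to inspect or log before any resources are created.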
Return type: dict
Response Syntax
{
'cluster': {
'name': 'string',
'id': 'string',
'arn': 'string',
'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'SUSPENDING'|'SUSPENDED'|'RESUMING',
'createdAt': datetime(2015, 1, 1),
'modifiedAt': datetime(2015, 1, 1),
'scheduler': {
'type': 'SLURM',
'version': 'string'
},
'size': 'SMALL'|'MEDIUM'|'LARGE',
'slurmConfiguration': {
'scaleDownIdleTimeInSeconds': 123,
'slurmCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'slurmdbdCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'cgroupCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'authKey': {
'secretArn': 'string',
'secretVersion': 'string'
},
'jwtAuth': {
'jwtKey': {
'secretArn': 'string',
'secretVersion': 'string'
}
},
'accounting': {
'defaultPurgeTimeInDays': 123,
'mode': 'STANDARD'|'NONE'
},
'slurmRest': {
'mode': 'STANDARD'|'NONE'
}
},
'networking': {
'subnetIds': [
'string',
],
'securityGroupIds': [
'string',
],
'networkType': 'IPV4'|'IPV6'
},
'endpoints': [
{
'type': 'SLURMCTLD'|'SLURMDBD'|'SLURMRESTD',
'privateIpAddress': 'string',
'publicIpAddress': 'string',
'ipv6Address': 'string',
'port': 'string'
},
],
'errorInfo': [
{
'code': 'string',
'message': 'string'
},
]
}
}
Response Structure
(dict) --
cluster (dict) --
The cluster resource.
name (string) --
The name that identifies the cluster.
id (string) --
The generated unique ID of the cluster.
arn (string) --
The unique Amazon Resource Name (ARN) of the cluster.
status (string) --
The provisioning status of the cluster.
createdAt (datetime) --
The date and time the resource was created.
modifiedAt (datetime) --
The date and time the resource was modified.
scheduler (dict) --
The cluster management and job scheduling software associated with the cluster.
type (string) --
The software PCS uses to manage cluster scaling and job scheduling.
version (string) --
The version of the specified scheduling software that PCS uses to manage cluster scaling and job scheduling. For more information, see Slurm versions in PCS in the PCS User Guide.
Valid Values: 23.11 | 24.05 | 24.11 | 25.05
size (string) --
The size of the cluster.
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8192 jobs
LARGE: 2048 compute nodes and 16384 jobs
slurmConfiguration (dict) --
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) --
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) --
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) --
Additional settings that directly map to Slurm settings.
parameterName (string) --
PCS supports custom Slurm settings for clusters, compute node groups, and queues. For more information, see Configuring custom Slurm settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Slurm settings.
slurmdbdCustomSettings (list) --
Additional SlurmDBD-specific configuration that directly maps to SlurmDBD settings.
(dict) --
Additional settings that directly map to SlurmDBD settings.
parameterName (string) --
PCS supports custom SlurmDBD settings for clusters. For more information, see Configuring custom SlurmDBD settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured SlurmDBD settings.
cgroupCustomSettings (list) --
Additional Cgroup-specific configuration that directly maps to Cgroup settings.
(dict) --
Additional settings that directly map to Cgroup settings.
parameterName (string) --
PCS supports custom Cgroup settings for clusters. For more information, see Configuring custom Cgroup settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Cgroup settings.
authKey (dict) --
The shared Slurm key for authentication, also known as the cluster secret.
secretArn (string) --
The Amazon Resource Name (ARN) of the shared Slurm key.
secretVersion (string) --
The version of the shared Slurm key.
jwtAuth (dict) --
The JWT authentication configuration for Slurm REST API access.
jwtKey (dict) --
The JWT key for Slurm REST API authentication.
secretArn (string) --
The Amazon Resource Name (ARN) of the Amazon Web Services Secrets Manager secret containing the JWT key.
secretVersion (string) --
The version of the Amazon Web Services Secrets Manager secret containing the JWT key.
accounting (dict) --
The accounting configuration includes configurable settings for Slurm accounting.
defaultPurgeTimeInDays (integer) --
The default value for all purge settings for slurmdbd.conf. For more information, see the slurmdbd.conf documentation at SchedMD.
The default value for defaultPurgeTimeInDays is -1.
A value of -1 means there is no purge time and records persist as long as the cluster exists.
mode (string) --
The default value for mode is NONE. A value of STANDARD means Slurm accounting is enabled.
slurmRest (dict) --
The Slurm REST API configuration for the cluster.
mode (string) --
The default value for mode is NONE. A value of STANDARD means the Slurm REST API is enabled.
networking (dict) --
The networking configuration for the cluster's control plane.
subnetIds (list) --
The ID of the subnet where PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and PCS resources. The subnet must have an available IP address and cannot reside in Outposts, Wavelength, or an Amazon Web Services Local Zone.
Example: subnet-abcd1234
(string) --
securityGroupIds (list) --
The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
The following rules are required:
Inbound rule 1
Protocol: All
Ports: All
Source: Self
Outbound rule 1
Protocol: All
Ports: All
Destination: 0.0.0.0/0 (IPv4) or ::/0 (IPv6)
Outbound rule 2
Protocol: All
Ports: All
Destination: Self
(string) --
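The three required security group rules above can be written down in the IpPermissions shape that the EC2 API uses for security group rules. This is a sketch under assumptions: the helper name `required_rules` and the `inbound`/`outbound` keys are hypothetical, and the code only builds the rule dictionaries without calling any AWS API.

```python
def required_rules(security_group_id: str, ipv6: bool = False) -> dict:
    """Express the three required rules in the EC2 IpPermissions shape."""
    # IpProtocol "-1" means all protocols and all ports in EC2 rule syntax.
    self_rule = {
        "IpProtocol": "-1",
        "UserIdGroupPairs": [{"GroupId": security_group_id}],
    }
    if ipv6:
        anywhere = {"IpProtocol": "-1", "Ipv6Ranges": [{"CidrIpv6": "::/0"}]}
    else:
        anywhere = {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
    return {
        "inbound": [self_rule],             # Inbound rule 1: source Self
        "outbound": [anywhere, self_rule],  # Outbound rules 1 and 2
    }
```

The `inbound` list could be passed as IpPermissions to the EC2 client's authorize_security_group_ingress, and the `outbound` list to authorize_security_group_egress. A security group may already contain some of these rules, so checking the existing rules first is advisable.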
networkType (string) --
The IP address version the cluster uses. The default is IPV4.
endpoints (list) --
The list of endpoints available for interaction with the scheduler.
(dict) --
An endpoint available for interaction with the scheduler.
type (string) --
Indicates the type of endpoint running at the specific IP address.
privateIpAddress (string) --
For clusters that use IPv4, this is the endpoint's private IP address.
Example: 10.1.2.3
For clusters configured to use IPv6, this is an empty string.
publicIpAddress (string) --
The endpoint's public IP address.
Example: 192.0.2.1
ipv6Address (string) --
The endpoint's IPv6 address.
Example: 2001:db8::1
port (string) --
The endpoint's connection port number.
Example: 1234
errorInfo (list) --
The list of errors that occurred during cluster provisioning.
(dict) --
An error that occurred during resource creation.
code (string) --
The short-form error code.
message (string) --
The detailed error information.
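To show how the endpoints list described above might be consumed, the sketch below looks up the address and port for one endpoint type. The helper name `endpoint_address` is hypothetical. It relies on the documented behavior that privateIpAddress is an empty string for IPv6 clusters and falls back to ipv6Address in that case.

```python
def endpoint_address(cluster: dict, endpoint_type: str = "SLURMCTLD") -> tuple:
    """Return (address, port) for the first endpoint of the given type."""
    for endpoint in cluster.get("endpoints", []):
        if endpoint.get("type") != endpoint_type:
            continue
        # privateIpAddress is an empty string on IPv6 clusters, so fall back.
        address = endpoint.get("privateIpAddress") or endpoint.get("ipv6Address")
        return address, endpoint.get("port")
    raise LookupError(f"no {endpoint_type} endpoint in response")
```

Note that port is returned as a string, matching the response syntax.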
{'cluster': {'slurmConfiguration': {'cgroupCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}],
'slurmdbdCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}]}}}
Returns detailed information about a running cluster in your account. This API action provides networking information, endpoint information for communication with the scheduler, and provisioning status.
See also: AWS API Documentation
Request Syntax
client.get_cluster(
clusterIdentifier='string'
)
clusterIdentifier (string) -- [REQUIRED]
The name or ID of the cluster.
Return type: dict
Response Syntax
{
'cluster': {
'name': 'string',
'id': 'string',
'arn': 'string',
'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'SUSPENDING'|'SUSPENDED'|'RESUMING',
'createdAt': datetime(2015, 1, 1),
'modifiedAt': datetime(2015, 1, 1),
'scheduler': {
'type': 'SLURM',
'version': 'string'
},
'size': 'SMALL'|'MEDIUM'|'LARGE',
'slurmConfiguration': {
'scaleDownIdleTimeInSeconds': 123,
'slurmCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'slurmdbdCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'cgroupCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'authKey': {
'secretArn': 'string',
'secretVersion': 'string'
},
'jwtAuth': {
'jwtKey': {
'secretArn': 'string',
'secretVersion': 'string'
}
},
'accounting': {
'defaultPurgeTimeInDays': 123,
'mode': 'STANDARD'|'NONE'
},
'slurmRest': {
'mode': 'STANDARD'|'NONE'
}
},
'networking': {
'subnetIds': [
'string',
],
'securityGroupIds': [
'string',
],
'networkType': 'IPV4'|'IPV6'
},
'endpoints': [
{
'type': 'SLURMCTLD'|'SLURMDBD'|'SLURMRESTD',
'privateIpAddress': 'string',
'publicIpAddress': 'string',
'ipv6Address': 'string',
'port': 'string'
},
],
'errorInfo': [
{
'code': 'string',
'message': 'string'
},
]
}
}
Response Structure
(dict) --
cluster (dict) --
The cluster resource.
name (string) --
The name that identifies the cluster.
id (string) --
The generated unique ID of the cluster.
arn (string) --
The unique Amazon Resource Name (ARN) of the cluster.
status (string) --
The provisioning status of the cluster.
createdAt (datetime) --
The date and time the resource was created.
modifiedAt (datetime) --
The date and time the resource was modified.
scheduler (dict) --
The cluster management and job scheduling software associated with the cluster.
type (string) --
The software PCS uses to manage cluster scaling and job scheduling.
version (string) --
The version of the specified scheduling software that PCS uses to manage cluster scaling and job scheduling. For more information, see Slurm versions in PCS in the PCS User Guide.
Valid Values: 23.11 | 24.05 | 24.11 | 25.05
size (string) --
The size of the cluster.
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8192 jobs
LARGE: 2048 compute nodes and 16384 jobs
slurmConfiguration (dict) --
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) --
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) --
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) --
Additional settings that directly map to Slurm settings.
parameterName (string) --
PCS supports custom Slurm settings for clusters, compute node groups, and queues. For more information, see Configuring custom Slurm settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Slurm settings.
slurmdbdCustomSettings (list) --
Additional SlurmDBD-specific configuration that directly maps to SlurmDBD settings.
(dict) --
Additional settings that directly map to SlurmDBD settings.
parameterName (string) --
PCS supports custom SlurmDBD settings for clusters. For more information, see Configuring custom SlurmDBD settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured SlurmDBD settings.
cgroupCustomSettings (list) --
Additional Cgroup-specific configuration that directly maps to Cgroup settings.
(dict) --
Additional settings that directly map to Cgroup settings.
parameterName (string) --
PCS supports custom Cgroup settings for clusters. For more information, see Configuring custom Cgroup settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Cgroup settings.
authKey (dict) --
The shared Slurm key for authentication, also known as the cluster secret.
secretArn (string) --
The Amazon Resource Name (ARN) of the shared Slurm key.
secretVersion (string) --
The version of the shared Slurm key.
jwtAuth (dict) --
The JWT authentication configuration for Slurm REST API access.
jwtKey (dict) --
The JWT key for Slurm REST API authentication.
secretArn (string) --
The Amazon Resource Name (ARN) of the Amazon Web Services Secrets Manager secret containing the JWT key.
secretVersion (string) --
The version of the Amazon Web Services Secrets Manager secret containing the JWT key.
accounting (dict) --
The accounting configuration includes configurable settings for Slurm accounting.
defaultPurgeTimeInDays (integer) --
The default value for all purge settings for slurmdbd.conf. For more information, see the slurmdbd.conf documentation at SchedMD.
The default value for defaultPurgeTimeInDays is -1.
A value of -1 means there is no purge time and records persist as long as the cluster exists.
mode (string) --
The default value for mode is NONE. A value of STANDARD means Slurm accounting is enabled.
slurmRest (dict) --
The Slurm REST API configuration for the cluster.
mode (string) --
The default value for mode is NONE. A value of STANDARD means the Slurm REST API is enabled.
networking (dict) --
The networking configuration for the cluster's control plane.
subnetIds (list) --
The ID of the subnet where PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and PCS resources. The subnet must have an available IP address and cannot reside in Outposts, Wavelength, or an Amazon Web Services Local Zone.
Example: subnet-abcd1234
(string) --
securityGroupIds (list) --
The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
The following rules are required:
Inbound rule 1
Protocol: All
Ports: All
Source: Self
Outbound rule 1
Protocol: All
Ports: All
Destination: 0.0.0.0/0 (IPv4) or ::/0 (IPv6)
Outbound rule 2
Protocol: All
Ports: All
Destination: Self
(string) --
networkType (string) --
The IP address version the cluster uses. The default is IPV4.
endpoints (list) --
The list of endpoints available for interaction with the scheduler.
(dict) --
An endpoint available for interaction with the scheduler.
type (string) --
Indicates the type of endpoint running at the specific IP address.
privateIpAddress (string) --
For clusters that use IPv4, this is the endpoint's private IP address.
Example: 10.1.2.3
For clusters configured to use IPv6, this is an empty string.
publicIpAddress (string) --
The endpoint's public IP address.
Example: 192.0.2.1
ipv6Address (string) --
The endpoint's IPv6 address.
Example: 2001:db8::1
port (string) --
The endpoint's connection port number.
Example: 1234
errorInfo (list) --
The list of errors that occurred during cluster provisioning.
(dict) --
An error that occurred during resource creation.
code (string) --
The short-form error code.
message (string) --
The detailed error information.
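Because get_cluster reports the provisioning status, it can be polled after a create or update call until the cluster settles. The sketch below is one possible approach, not an official waiter: the helper name `wait_for_cluster`, the delay and attempt defaults, and the grouping of status values into an in-progress set are all assumptions. The `client` argument is expected to behave like `boto3.client("pcs")`.

```python
import time

# Assumption: these status values indicate that work is still in progress.
IN_PROGRESS = {"CREATING", "UPDATING", "DELETING", "SUSPENDING", "RESUMING"}


def wait_for_cluster(client, cluster_identifier, delay=15.0, max_attempts=80,
                     sleep=time.sleep):
    """Poll get_cluster until the status leaves the in-progress set."""
    for _ in range(max_attempts):
        cluster = client.get_cluster(clusterIdentifier=cluster_identifier)["cluster"]
        if cluster["status"] not in IN_PROGRESS:
            return cluster
        sleep(delay)
    raise TimeoutError(f"{cluster_identifier} still in progress after polling")
```

The `sleep` parameter exists so that the wait can be replaced in tests. The returned cluster may be in a failed state, so callers should still inspect status and errorInfo.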
Request {'slurmConfiguration': {'cgroupCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}],
'slurmdbdCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}]}}
Response {'cluster': {'slurmConfiguration': {'cgroupCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}],
'slurmdbdCustomSettings': [{'parameterName': 'string',
'parameterValue': 'string'}]}}}
Updates a cluster configuration. You can modify Slurm scheduler settings, accounting configuration, and security groups for an existing cluster.
See also: AWS API Documentation
Request Syntax
client.update_cluster(
clusterIdentifier='string',
clientToken='string',
slurmConfiguration={
'scaleDownIdleTimeInSeconds': 123,
'slurmCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'slurmdbdCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'cgroupCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'accounting': {
'defaultPurgeTimeInDays': 123,
'mode': 'STANDARD'|'NONE'
},
'slurmRest': {
'mode': 'STANDARD'|'NONE'
}
}
)
clusterIdentifier (string) -- [REQUIRED]
The name or ID of the cluster to update.
clientToken (string) --
A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. If the original request completes successfully, subsequent retries with the same client token return the result of the original request and have no additional effect. If you don't specify a client token, the CLI and SDK automatically generate one for you.
This field is autopopulated if not provided.
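The idempotency behavior described above depends on retries reusing the same client token. A minimal sketch of that pattern follows. The helper name `call_with_token` is hypothetical, and the built-in `ConnectionError` is only a stand-in for whatever transient failure the caller chooses to retry on. The SDK already has its own retry handling, so a wrapper like this is mainly useful for retries at the application level.

```python
import uuid


def call_with_token(operation, max_attempts=3, **kwargs):
    """Call operation with one clientToken that is reused on every retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    token = str(uuid.uuid4())  # generated once, shared by all attempts
    last_error = None
    for _ in range(max_attempts):
        try:
            return operation(clientToken=token, **kwargs)
        except ConnectionError as error:  # stand-in for a transient failure
            last_error = error
    raise last_error
```

For example, `call_with_token(client.update_cluster, clusterIdentifier="MyCluster", slurmConfiguration={...})` would send the same token on each attempt, where `client` is a PCS client.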
slurmConfiguration (dict) --
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) --
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) --
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) --
Additional settings that directly map to Slurm settings.
parameterName (string) -- [REQUIRED]
PCS supports custom Slurm settings for clusters, compute node groups, and queues. For more information, see Configuring custom Slurm settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured Slurm settings.
slurmdbdCustomSettings (list) --
Additional SlurmDBD-specific configuration that directly maps to SlurmDBD settings.
(dict) --
Additional settings that directly map to SlurmDBD settings.
parameterName (string) -- [REQUIRED]
PCS supports custom SlurmDBD settings for clusters. For more information, see Configuring custom SlurmDBD settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured SlurmDBD settings.
cgroupCustomSettings (list) --
Additional Cgroup-specific configuration that directly maps to Cgroup settings.
(dict) --
Additional settings that directly map to Cgroup settings.
parameterName (string) -- [REQUIRED]
PCS supports custom Cgroup settings for clusters. For more information, see Configuring custom Cgroup settings in PCS in the PCS User Guide.
parameterValue (string) -- [REQUIRED]
The values for the configured Cgroup settings.
accounting (dict) --
The accounting configuration includes configurable settings for Slurm accounting.
defaultPurgeTimeInDays (integer) --
The default value for all purge settings for slurmdbd.conf. For more information, see the slurmdbd.conf documentation at SchedMD.
The default value for defaultPurgeTimeInDays is -1.
A value of -1 means there is no purge time and records persist as long as the cluster exists.
mode (string) --
The default value for mode is NONE. A value of STANDARD means Slurm accounting is enabled.
slurmRest (dict) --
The Slurm REST API configuration for the cluster.
mode (string) --
The default value for mode is NONE. A value of STANDARD means the Slurm REST API is enabled.
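When changing a single custom setting on an existing cluster, one cautious approach is to start from the settings returned by get_cluster and send back the full list. The sketch below assumes, without confirmation from this page, that a submitted list replaces the stored list rather than being merged into it. The helper names `upsert_setting` and `build_update_request` are hypothetical, and the parameter names in the usage note are illustrative names from upstream cgroup.conf.

```python
def upsert_setting(settings, name, value):
    """Return a copy of a custom-settings list with name set to value."""
    updated = [dict(item) for item in settings if item["parameterName"] != name]
    updated.append({"parameterName": name, "parameterValue": str(value)})
    return updated


def build_update_request(cluster, name, value):
    """Build update_cluster kwargs that change one cgroup setting (hypothetical helper)."""
    current = cluster["slurmConfiguration"].get("cgroupCustomSettings", [])
    return {
        "clusterIdentifier": cluster["id"],
        "slurmConfiguration": {
            "cgroupCustomSettings": upsert_setting(current, name, value),
        },
    }
```

Given the `cluster` dictionary from a get_cluster response, `build_update_request(cluster, "ConstrainRAMSpace", "yes")` produces keyword arguments for update_cluster and leaves the original response untouched.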
Return type: dict
Response Syntax
{
'cluster': {
'name': 'string',
'id': 'string',
'arn': 'string',
'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED'|'SUSPENDING'|'SUSPENDED'|'RESUMING',
'createdAt': datetime(2015, 1, 1),
'modifiedAt': datetime(2015, 1, 1),
'scheduler': {
'type': 'SLURM',
'version': 'string'
},
'size': 'SMALL'|'MEDIUM'|'LARGE',
'slurmConfiguration': {
'scaleDownIdleTimeInSeconds': 123,
'slurmCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'slurmdbdCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'cgroupCustomSettings': [
{
'parameterName': 'string',
'parameterValue': 'string'
},
],
'authKey': {
'secretArn': 'string',
'secretVersion': 'string'
},
'jwtAuth': {
'jwtKey': {
'secretArn': 'string',
'secretVersion': 'string'
}
},
'accounting': {
'defaultPurgeTimeInDays': 123,
'mode': 'STANDARD'|'NONE'
},
'slurmRest': {
'mode': 'STANDARD'|'NONE'
}
},
'networking': {
'subnetIds': [
'string',
],
'securityGroupIds': [
'string',
],
'networkType': 'IPV4'|'IPV6'
},
'endpoints': [
{
'type': 'SLURMCTLD'|'SLURMDBD'|'SLURMRESTD',
'privateIpAddress': 'string',
'publicIpAddress': 'string',
'ipv6Address': 'string',
'port': 'string'
},
],
'errorInfo': [
{
'code': 'string',
'message': 'string'
},
]
}
}
Response Structure
(dict) --
cluster (dict) --
The cluster resource and configuration.
name (string) --
The name that identifies the cluster.
id (string) --
The generated unique ID of the cluster.
arn (string) --
The unique Amazon Resource Name (ARN) of the cluster.
status (string) --
The provisioning status of the cluster.
createdAt (datetime) --
The date and time the resource was created.
modifiedAt (datetime) --
The date and time the resource was modified.
scheduler (dict) --
The cluster management and job scheduling software associated with the cluster.
type (string) --
The software PCS uses to manage cluster scaling and job scheduling.
version (string) --
The version of the specified scheduling software that PCS uses to manage cluster scaling and job scheduling. For more information, see Slurm versions in PCS in the PCS User Guide.
Valid Values: 23.11 | 24.05 | 24.11 | 25.05
size (string) --
The size of the cluster.
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8192 jobs
LARGE: 2048 compute nodes and 16384 jobs
slurmConfiguration (dict) --
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) --
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) --
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) --
Additional settings that directly map to Slurm settings.
parameterName (string) --
PCS supports custom Slurm settings for clusters, compute node groups, and queues. For more information, see Configuring custom Slurm settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Slurm settings.
slurmdbdCustomSettings (list) --
Additional SlurmDBD-specific configuration that directly maps to SlurmDBD settings.
(dict) --
Additional settings that directly map to SlurmDBD settings.
parameterName (string) --
PCS supports custom SlurmDBD settings for clusters. For more information, see Configuring custom SlurmDBD settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured SlurmDBD settings.
cgroupCustomSettings (list) --
Additional Cgroup-specific configuration that directly maps to Cgroup settings.
(dict) --
Additional settings that directly map to Cgroup settings.
parameterName (string) --
PCS supports custom Cgroup settings for clusters. For more information, see Configuring custom Cgroup settings in PCS in the PCS User Guide.
parameterValue (string) --
The values for the configured Cgroup settings.
authKey (dict) --
The shared Slurm key for authentication, also known as the cluster secret.
secretArn (string) --
The Amazon Resource Name (ARN) of the shared Slurm key.
secretVersion (string) --
The version of the shared Slurm key.
jwtAuth (dict) --
The JWT authentication configuration for Slurm REST API access.
jwtKey (dict) --
The JWT key for Slurm REST API authentication.
secretArn (string) --
The Amazon Resource Name (ARN) of the Amazon Web Services Secrets Manager secret containing the JWT key.
secretVersion (string) --
The version of the Amazon Web Services Secrets Manager secret containing the JWT key.
accounting (dict) --
The accounting configuration includes configurable settings for Slurm accounting.
defaultPurgeTimeInDays (integer) --
The default value for all purge settings for slurmdbd.conf. For more information, see the slurmdbd.conf documentation at SchedMD.
The default value for defaultPurgeTimeInDays is -1.
A value of -1 means there is no purge time and records persist as long as the cluster exists.
mode (string) --
The default value for mode is NONE. A value of STANDARD means Slurm accounting is enabled.
slurmRest (dict) --
The Slurm REST API configuration for the cluster.
mode (string) --
The default value for mode is NONE. A value of STANDARD means the Slurm REST API is enabled.
networking (dict) --
The networking configuration for the cluster's control plane.
subnetIds (list) --
The ID of the subnet where PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and PCS resources. The subnet must have an available IP address and cannot reside in Outposts, Wavelength, or an Amazon Web Services Local Zone.
Example: subnet-abcd1234
(string) --
securityGroupIds (list) --
The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
The following rules are required:
Inbound rule 1
Protocol: All
Ports: All
Source: Self
Outbound rule 1
Protocol: All
Ports: All
Destination: 0.0.0.0/0 (IPv4) or ::/0 (IPv6)
Outbound rule 2
Protocol: All
Ports: All
Destination: Self
(string) --
networkType (string) --
The IP address version the cluster uses. The default is IPV4.
endpoints (list) --
The list of endpoints available for interaction with the scheduler.
(dict) --
An endpoint available for interaction with the scheduler.
type (string) --
Indicates the type of endpoint running at the specific IP address.
privateIpAddress (string) --
For clusters that use IPv4, this is the endpoint's private IP address.
Example: 10.1.2.3
For clusters configured to use IPv6, this is an empty string.
publicIpAddress (string) --
The endpoint's public IP address.
Example: 192.0.2.1
ipv6Address (string) --
The endpoint's IPv6 address.
Example: 2001:db8::1
port (string) --
The endpoint's connection port number.
Example: 1234
errorInfo (list) --
The list of errors that occurred during cluster provisioning.
(dict) --
An error that occurred during resource creation.
code (string) --
The short-form error code.
message (string) --
The detailed error information.
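The status and errorInfo fields described above can be combined into a single check after an update. This is a sketch only: the helper name `raise_on_failure` and the choice of `RuntimeError` are assumptions, and the set of failed statuses is taken from the status values listed in the response syntax.

```python
FAILED_STATUSES = {"CREATE_FAILED", "UPDATE_FAILED", "DELETE_FAILED"}


def raise_on_failure(cluster: dict) -> dict:
    """Raise RuntimeError with the errorInfo details if the cluster failed."""
    if cluster.get("status") in FAILED_STATUSES:
        details = "; ".join(
            f"{err.get('code')}: {err.get('message')}"
            for err in cluster.get("errorInfo", [])
        )
        raise RuntimeError(f"{cluster.get('name')} is {cluster['status']}: {details}")
    return cluster
```

A cluster that is not in a failed state is returned unchanged, so the helper can be chained after any call that returns the cluster dictionary.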