Amazon EMR Containers

2023/06/07 - Amazon EMR Containers - 6 updated api methods

Changes: EMR on EKS adds support to the StartJobRun API for log rotation of Spark container logs, available from EMR-6.11.0 onwards.

CreateManagedEndpoint (updated)
Changes (request)
{'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                              'rotationSize': 'string'}}}}

Creates a managed endpoint. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.create_managed_endpoint(
    name='string',
    virtualClusterId='string',
    type='string',
    releaseLabel='string',
    executionRoleArn='string',
    certificateArn='string',
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    clientToken='string',
    tags={
        'string': 'string'
    }
)
type name

string

param name

[REQUIRED]

The name of the managed endpoint.

type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The ID of the virtual cluster for which a managed endpoint is created.

type type

string

param type

[REQUIRED]

The type of the managed endpoint.

type releaseLabel

string

param releaseLabel

[REQUIRED]

The Amazon EMR release version.

type executionRoleArn

string

param executionRoleArn

[REQUIRED]

The ARN of the execution role.

type certificateArn

string

param certificateArn

The certificate ARN provided by users for the managed endpoint. This field is being deprecated and will be removed in a future release.

type configurationOverrides

dict

param configurationOverrides

The configuration settings that will be used to override existing configurations.

  • applicationConfiguration (list) --

    The configurations for the application run by the job run.

    • (dict) --

      A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

      • classification (string) -- [REQUIRED]

        The classification within a configuration.

      • properties (dict) --

        A set of properties specified within a configuration classification.

        • (string) --

          • (string) --

      • configurations (list) --

        A list of additional configurations to apply within a configuration object.

  • monitoringConfiguration (dict) --

    The configurations for monitoring.

    • persistentAppUI (string) --

      Monitoring configurations for the persistent application UI.

    • cloudWatchMonitoringConfiguration (dict) --

      Monitoring configurations for CloudWatch.

      • logGroupName (string) -- [REQUIRED]

        The name of the log group for log publishing.

      • logStreamNamePrefix (string) --

        The specified name prefix for log streams.

    • s3MonitoringConfiguration (dict) --

      Amazon S3 configuration for monitoring log publishing.

      • logUri (string) -- [REQUIRED]

        Amazon S3 destination URI for log publishing.

    • containerLogRotationConfiguration (dict) --

      Enable or disable container log rotation.

      • rotationSize (string) -- [REQUIRED]

        The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

      • maxFilesToKeep (integer) -- [REQUIRED]

        The number of files to keep in the container after rotation.
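
As a minimal sketch, the containerLogRotationConfiguration dict described above could be built and checked client-side before it is sent. The helper names are hypothetical. The size syntax (a number followed by KB, MB, or GB), the 1 KB = 1024 bytes convention, and the requirement that maxFilesToKeep be at least 1 are assumptions; the source only states the 2KB and 2GB bounds.

```python
import re

# Assumption: rotationSize looks like "10MB" and 1 KB = 1024 bytes.
# The source only documents the bounds (minimum 2KB, maximum 2GB).
_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*([KMG])B$", re.IGNORECASE)
_UNIT_BYTES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
_MIN_BYTES = 2 * 1024
_MAX_BYTES = 2 * 1024 ** 3


def rotation_size_to_bytes(rotation_size: str) -> float:
    """Convert a size string such as '10MB' to a byte count."""
    match = _SIZE_RE.match(rotation_size.strip())
    if match is None:
        raise ValueError(f"unrecognised rotationSize: {rotation_size!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_BYTES[unit.upper()]


def build_log_rotation_config(rotation_size: str, max_files_to_keep: int) -> dict:
    """Return a containerLogRotationConfiguration dict after basic checks."""
    size = rotation_size_to_bytes(rotation_size)
    if not _MIN_BYTES <= size <= _MAX_BYTES:
        raise ValueError("rotationSize must be between 2KB and 2GB")
    # Assumption: at least one file must be kept.
    if max_files_to_keep < 1:
        raise ValueError("maxFilesToKeep must be a positive integer")
    return {"rotationSize": rotation_size, "maxFilesToKeep": max_files_to_keep}
```

The returned dict is meant to be placed under configurationOverrides['monitoringConfiguration']['containerLogRotationConfiguration'] in the request.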

type clientToken

string

param clientToken

[REQUIRED]

The client idempotency token for this create call.

This field is autopopulated if not provided.
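
The sketch below shows one way to assemble the required parameters with an explicit clientToken, so that the same token can be reused if the call is retried. The function name is hypothetical, and the idea that reusing a token makes a retry safe is an assumption drawn from the "idempotency token" wording above.

```python
import uuid


def build_create_endpoint_request(name, virtual_cluster_id, endpoint_type,
                                  release_label, execution_role_arn,
                                  client_token=None):
    """Return keyword arguments for create_managed_endpoint.

    A random UUID is generated when no token is supplied; pass the same
    token again when retrying the same logical request.
    """
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "type": endpoint_type,
        "releaseLabel": release_label,
        "executionRoleArn": execution_role_arn,
        "clientToken": client_token or str(uuid.uuid4()),
    }


# Usage (not executed here; requires AWS credentials and boto3):
#   import boto3
#   client = boto3.client("emr-containers")
#   response = client.create_managed_endpoint(**request)
```

All argument values a caller passes in (names, IDs, ARNs) are the caller's own; none are supplied by this sketch.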

type tags

dict

param tags

The tags of the managed endpoint.

  • (string) --

    • (string) --

rtype

dict

returns

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      The output contains the ID of the managed endpoint.

    • name (string) --

      The output contains the name of the managed endpoint.

    • arn (string) --

      The output contains the ARN of the managed endpoint.

    • virtualClusterId (string) --

      The output contains the ID of the virtual cluster.

DescribeJobRun (updated)
Changes (response)
{'jobRun': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                         'rotationSize': 'string'}}}}}

Displays detailed information about a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.describe_job_run(
    id='string',
    virtualClusterId='string'
)
type id

string

param id

[REQUIRED]

The ID of the job run request.

type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The ID of the virtual cluster for which the job run is submitted.

rtype

dict

returns

Response Syntax

{
    'jobRun': {
        'id': 'string',
        'name': 'string',
        'virtualClusterId': 'string',
        'arn': 'string',
        'state': 'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
        'clientToken': 'string',
        'executionRoleArn': 'string',
        'releaseLabel': 'string',
        'configurationOverrides': {
            'applicationConfiguration': [
                {
                    'classification': 'string',
                    'properties': {
                        'string': 'string'
                    },
                    'configurations': {'... recursive ...'}
                },
            ],
            'monitoringConfiguration': {
                'persistentAppUI': 'ENABLED'|'DISABLED',
                'cloudWatchMonitoringConfiguration': {
                    'logGroupName': 'string',
                    'logStreamNamePrefix': 'string'
                },
                's3MonitoringConfiguration': {
                    'logUri': 'string'
                },
                'containerLogRotationConfiguration': {
                    'rotationSize': 'string',
                    'maxFilesToKeep': 123
                }
            }
        },
        'jobDriver': {
            'sparkSubmitJobDriver': {
                'entryPoint': 'string',
                'entryPointArguments': [
                    'string',
                ],
                'sparkSubmitParameters': 'string'
            },
            'sparkSqlJobDriver': {
                'entryPoint': 'string',
                'sparkSqlParameters': 'string'
            }
        },
        'createdAt': datetime(2015, 1, 1),
        'createdBy': 'string',
        'finishedAt': datetime(2015, 1, 1),
        'stateDetails': 'string',
        'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
        'tags': {
            'string': 'string'
        },
        'retryPolicyConfiguration': {
            'maxAttempts': 123
        },
        'retryPolicyExecution': {
            'currentAttemptCount': 123
        }
    }
}

Response Structure

  • (dict) --

    • jobRun (dict) --

      The output displays information about a job run.

      • id (string) --

        The ID of the job run.

      • name (string) --

        The name of the job run.

      • virtualClusterId (string) --

        The ID of the job run's virtual cluster.

      • arn (string) --

        The ARN of the job run.

      • state (string) --

        The state of the job run.

      • clientToken (string) --

        The client token used to start a job run.

      • executionRoleArn (string) --

        The execution role ARN of the job run.

      • releaseLabel (string) --

        The release version of Amazon EMR.

      • configurationOverrides (dict) --

        The configuration settings that are used to override default configuration.

        • applicationConfiguration (list) --

          The configurations for the application run by the job run.

          • (dict) --

            A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

            • classification (string) --

              The classification within a configuration.

            • properties (dict) --

              A set of properties specified within a configuration classification.

              • (string) --

                • (string) --

            • configurations (list) --

              A list of additional configurations to apply within a configuration object.

        • monitoringConfiguration (dict) --

          The configurations for monitoring.

          • persistentAppUI (string) --

            Monitoring configurations for the persistent application UI.

          • cloudWatchMonitoringConfiguration (dict) --

            Monitoring configurations for CloudWatch.

            • logGroupName (string) --

              The name of the log group for log publishing.

            • logStreamNamePrefix (string) --

              The specified name prefix for log streams.

          • s3MonitoringConfiguration (dict) --

            Amazon S3 configuration for monitoring log publishing.

            • logUri (string) --

              Amazon S3 destination URI for log publishing.

          • containerLogRotationConfiguration (dict) --

            Enable or disable container log rotation.

            • rotationSize (string) --

              The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

            • maxFilesToKeep (integer) --

              The number of files to keep in the container after rotation.

      • jobDriver (dict) --

        Parameters of job driver for the job run.

        • sparkSubmitJobDriver (dict) --

          The job driver parameters specified for Spark submit.

          • entryPoint (string) --

            The entry point of the job application.

          • entryPointArguments (list) --

            The arguments for the job application.

            • (string) --

          • sparkSubmitParameters (string) --

            The Spark submit parameters that are used for job runs.

        • sparkSqlJobDriver (dict) --

          The job driver parameters specified for Spark SQL.

          • entryPoint (string) --

            The SQL file to be executed.

          • sparkSqlParameters (string) --

            The Spark parameters to be included in the Spark SQL command.

      • createdAt (datetime) --

        The date and time when the job run was created.

      • createdBy (string) --

        The user who created the job run.

      • finishedAt (datetime) --

        The date and time when the job run finished.

      • stateDetails (string) --

        Additional details of the job run state.

      • failureReason (string) --

        The reasons why the job run has failed.

      • tags (dict) --

        The assigned tags of the job run.

        • (string) --

          • (string) --

      • retryPolicyConfiguration (dict) --

        The configuration of the retry policy that the job runs on.

        • maxAttempts (integer) --

          The maximum number of attempts on the job's driver.

      • retryPolicyExecution (dict) --

        The current status of the retry policy executed on the job.

        • currentAttemptCount (integer) --

          The current number of attempts made on the driver of the job.
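
A short sketch of reading the fields above from a describe_job_run response follows. The function name is hypothetical, and treating FAILED, CANCELLED, and COMPLETED as the terminal states is an assumption; the source lists the possible states but does not say which are final.

```python
# Assumption: these states are final; the source does not say so explicitly.
TERMINAL_STATES = {"FAILED", "CANCELLED", "COMPLETED"}


def summarize_job_run(response: dict) -> dict:
    """Pull the state and log rotation settings out of a DescribeJobRun response."""
    job_run = response.get("jobRun", {})
    monitoring = (
        job_run.get("configurationOverrides", {})
        .get("monitoringConfiguration", {})
    )
    return {
        "id": job_run.get("id"),
        "state": job_run.get("state"),
        "is_terminal": job_run.get("state") in TERMINAL_STATES,
        # None when no rotation configuration was set for the job run.
        "log_rotation": monitoring.get("containerLogRotationConfiguration"),
    }
```

Using .get() throughout keeps the helper tolerant of job runs that were started without configurationOverrides.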

DescribeManagedEndpoint (updated)
Changes (response)
{'endpoint': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                           'rotationSize': 'string'}}}}}

Displays detailed information about a managed endpoint. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.describe_managed_endpoint(
    id='string',
    virtualClusterId='string'
)
type id

string

param id

[REQUIRED]

The ID of the managed endpoint.

type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The ID of the endpoint's virtual cluster.

rtype

dict

returns

Response Syntax

{
    'endpoint': {
        'id': 'string',
        'name': 'string',
        'arn': 'string',
        'virtualClusterId': 'string',
        'type': 'string',
        'state': 'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
        'releaseLabel': 'string',
        'executionRoleArn': 'string',
        'certificateArn': 'string',
        'certificateAuthority': {
            'certificateArn': 'string',
            'certificateData': 'string'
        },
        'configurationOverrides': {
            'applicationConfiguration': [
                {
                    'classification': 'string',
                    'properties': {
                        'string': 'string'
                    },
                    'configurations': {'... recursive ...'}
                },
            ],
            'monitoringConfiguration': {
                'persistentAppUI': 'ENABLED'|'DISABLED',
                'cloudWatchMonitoringConfiguration': {
                    'logGroupName': 'string',
                    'logStreamNamePrefix': 'string'
                },
                's3MonitoringConfiguration': {
                    'logUri': 'string'
                },
                'containerLogRotationConfiguration': {
                    'rotationSize': 'string',
                    'maxFilesToKeep': 123
                }
            }
        },
        'serverUrl': 'string',
        'createdAt': datetime(2015, 1, 1),
        'securityGroup': 'string',
        'subnetIds': [
            'string',
        ],
        'stateDetails': 'string',
        'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
        'tags': {
            'string': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • endpoint (dict) --

      This output displays information about a managed endpoint.

      • id (string) --

        The ID of the endpoint.

      • name (string) --

        The name of the endpoint.

      • arn (string) --

        The ARN of the endpoint.

      • virtualClusterId (string) --

        The ID of the endpoint's virtual cluster.

      • type (string) --

        The type of the endpoint.

      • state (string) --

        The state of the endpoint.

      • releaseLabel (string) --

        The EMR release version to be used for the endpoint.

      • executionRoleArn (string) --

        The execution role ARN of the endpoint.

      • certificateArn (string) --

        The certificate ARN of the endpoint. This field is being deprecated and will be removed in a future release.

      • certificateAuthority (dict) --

        The certificate generated by the Amazon EMR control plane on the customer's behalf to secure the managed endpoint.

        • certificateArn (string) --

          The ARN of the certificate generated for the managed endpoint.

        • certificateData (string) --

          The base64-encoded PEM certificate data generated for the managed endpoint.

      • configurationOverrides (dict) --

        The configuration settings that are used to override existing configurations for endpoints.

        • applicationConfiguration (list) --

          The configurations for the application run by the job run.

          • (dict) --

            A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

            • classification (string) --

              The classification within a configuration.

            • properties (dict) --

              A set of properties specified within a configuration classification.

              • (string) --

                • (string) --

            • configurations (list) --

              A list of additional configurations to apply within a configuration object.

        • monitoringConfiguration (dict) --

          The configurations for monitoring.

          • persistentAppUI (string) --

            Monitoring configurations for the persistent application UI.

          • cloudWatchMonitoringConfiguration (dict) --

            Monitoring configurations for CloudWatch.

            • logGroupName (string) --

              The name of the log group for log publishing.

            • logStreamNamePrefix (string) --

              The specified name prefix for log streams.

          • s3MonitoringConfiguration (dict) --

            Amazon S3 configuration for monitoring log publishing.

            • logUri (string) --

              Amazon S3 destination URI for log publishing.

          • containerLogRotationConfiguration (dict) --

            Enable or disable container log rotation.

            • rotationSize (string) --

              The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

            • maxFilesToKeep (integer) --

              The number of files to keep in the container after rotation.

      • serverUrl (string) --

        The server URL of the endpoint.

      • createdAt (datetime) --

        The date and time when the endpoint was created.

      • securityGroup (string) --

        The security group configuration of the endpoint.

      • subnetIds (list) --

        The subnet IDs of the endpoint.

        • (string) --

      • stateDetails (string) --

        Additional details of the endpoint state.

      • failureReason (string) --

        The reasons why the endpoint has failed.

      • tags (dict) --

        The tags of the endpoint.

        • (string) --

          • (string) --
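
Since certificateData is described above as base64-encoded PEM data, a caller might decode it before writing it to a trust store. This is a minimal sketch with a hypothetical function name; it assumes the decoded bytes are UTF-8 PEM text.

```python
import base64


def decode_certificate_pem(endpoint: dict):
    """Return the PEM text from an endpoint dict, or None if it is absent.

    `endpoint` is the value of the 'endpoint' key in a
    describe_managed_endpoint response.
    """
    authority = endpoint.get("certificateAuthority") or {}
    data = authority.get("certificateData")
    if not data:
        return None
    # Assumption: the decoded payload is PEM text encoded as UTF-8.
    return base64.b64decode(data).decode("utf-8")
```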

ListJobRuns (updated)
Changes (response)
{'jobRuns': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                          'rotationSize': 'string'}}}}}

Lists job runs based on a set of parameters. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.list_job_runs(
    virtualClusterId='string',
    createdBefore=datetime(2015, 1, 1),
    createdAfter=datetime(2015, 1, 1),
    name='string',
    states=[
        'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
    ],
    maxResults=123,
    nextToken='string'
)
type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The ID of the virtual cluster for which to list the job runs.

type createdBefore

datetime

param createdBefore

The date and time before which the job runs were submitted.

type createdAfter

datetime

param createdAfter

The date and time after which the job runs were submitted.

type name

string

param name

The name of the job run.

type states

list

param states

The states of the job run.

  • (string) --

type maxResults

integer

param maxResults

The maximum number of job runs that can be listed.

type nextToken

string

param nextToken

The token for the next set of job runs to return.

rtype

dict

returns

Response Syntax

{
    'jobRuns': [
        {
            'id': 'string',
            'name': 'string',
            'virtualClusterId': 'string',
            'arn': 'string',
            'state': 'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
            'clientToken': 'string',
            'executionRoleArn': 'string',
            'releaseLabel': 'string',
            'configurationOverrides': {
                'applicationConfiguration': [
                    {
                        'classification': 'string',
                        'properties': {
                            'string': 'string'
                        },
                        'configurations': {'... recursive ...'}
                    },
                ],
                'monitoringConfiguration': {
                    'persistentAppUI': 'ENABLED'|'DISABLED',
                    'cloudWatchMonitoringConfiguration': {
                        'logGroupName': 'string',
                        'logStreamNamePrefix': 'string'
                    },
                    's3MonitoringConfiguration': {
                        'logUri': 'string'
                    },
                    'containerLogRotationConfiguration': {
                        'rotationSize': 'string',
                        'maxFilesToKeep': 123
                    }
                }
            },
            'jobDriver': {
                'sparkSubmitJobDriver': {
                    'entryPoint': 'string',
                    'entryPointArguments': [
                        'string',
                    ],
                    'sparkSubmitParameters': 'string'
                },
                'sparkSqlJobDriver': {
                    'entryPoint': 'string',
                    'sparkSqlParameters': 'string'
                }
            },
            'createdAt': datetime(2015, 1, 1),
            'createdBy': 'string',
            'finishedAt': datetime(2015, 1, 1),
            'stateDetails': 'string',
            'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
            'tags': {
                'string': 'string'
            },
            'retryPolicyConfiguration': {
                'maxAttempts': 123
            },
            'retryPolicyExecution': {
                'currentAttemptCount': 123
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • jobRuns (list) --

      This output lists information about the specified job runs.

      • (dict) --

        This entity describes a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

        • id (string) --

          The ID of the job run.

        • name (string) --

          The name of the job run.

        • virtualClusterId (string) --

          The ID of the job run's virtual cluster.

        • arn (string) --

          The ARN of the job run.

        • state (string) --

          The state of the job run.

        • clientToken (string) --

          The client token used to start a job run.

        • executionRoleArn (string) --

          The execution role ARN of the job run.

        • releaseLabel (string) --

          The release version of Amazon EMR.

        • configurationOverrides (dict) --

          The configuration settings that are used to override default configuration.

          • applicationConfiguration (list) --

            The configurations for the application run by the job run.

            • (dict) --

              A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

              • classification (string) --

                The classification within a configuration.

              • properties (dict) --

                A set of properties specified within a configuration classification.

                • (string) --

                  • (string) --

              • configurations (list) --

                A list of additional configurations to apply within a configuration object.

          • monitoringConfiguration (dict) --

            The configurations for monitoring.

            • persistentAppUI (string) --

              Monitoring configurations for the persistent application UI.

            • cloudWatchMonitoringConfiguration (dict) --

              Monitoring configurations for CloudWatch.

              • logGroupName (string) --

                The name of the log group for log publishing.

              • logStreamNamePrefix (string) --

                The specified name prefix for log streams.

            • s3MonitoringConfiguration (dict) --

              Amazon S3 configuration for monitoring log publishing.

              • logUri (string) --

                Amazon S3 destination URI for log publishing.

            • containerLogRotationConfiguration (dict) --

              Enable or disable container log rotation.

              • rotationSize (string) --

                The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

              • maxFilesToKeep (integer) --

                The number of files to keep in the container after rotation.

        • jobDriver (dict) --

          Parameters of job driver for the job run.

          • sparkSubmitJobDriver (dict) --

            The job driver parameters specified for Spark submit.

            • entryPoint (string) --

              The entry point of the job application.

            • entryPointArguments (list) --

              The arguments for the job application.

              • (string) --

            • sparkSubmitParameters (string) --

              The Spark submit parameters that are used for job runs.

          • sparkSqlJobDriver (dict) --

            The job driver parameters specified for Spark SQL.

            • entryPoint (string) --

              The SQL file to be executed.

            • sparkSqlParameters (string) --

              The Spark parameters to be included in the Spark SQL command.

        • createdAt (datetime) --

          The date and time when the job run was created.

        • createdBy (string) --

          The user who created the job run.

        • finishedAt (datetime) --

          The date and time when the job run finished.

        • stateDetails (string) --

          Additional details of the job run state.

        • failureReason (string) --

          The reasons why the job run has failed.

        • tags (dict) --

          The assigned tags of the job run.

          • (string) --

            • (string) --

        • retryPolicyConfiguration (dict) --

          The configuration of the retry policy that the job runs on.

          • maxAttempts (integer) --

            The maximum number of attempts on the job's driver.

        • retryPolicyExecution (dict) --

          The current status of the retry policy executed on the job.

          • currentAttemptCount (integer) --

            The current number of attempts made on the driver of the job.

    • nextToken (string) --

      This output displays the token for the next set of job runs.
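
The nextToken field above suggests the usual token-based paging loop. The sketch below is one way to walk all pages; the function name is hypothetical, and `client` is assumed to be any object exposing list_job_runs with the request syntax shown earlier (for example, a boto3 "emr-containers" client).

```python
def iter_job_runs(client, virtual_cluster_id, **filters):
    """Yield every job run in a virtual cluster, following nextToken.

    Extra keyword arguments (name, states, createdAfter, ...) are passed
    through to list_job_runs unchanged.
    """
    kwargs = {"virtualClusterId": virtual_cluster_id, **filters}
    while True:
        page = client.list_job_runs(**kwargs)
        yield from page.get("jobRuns", [])
        token = page.get("nextToken")
        if not token:
            break
        # Ask for the next page with the token from the previous response.
        kwargs["nextToken"] = token
```

Because the function is a generator, callers can stop iterating early without fetching the remaining pages.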

ListManagedEndpoints (updated)
Changes (response)
{'endpoints': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                            'rotationSize': 'string'}}}}}

Lists managed endpoints based on a set of parameters. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.list_managed_endpoints(
    virtualClusterId='string',
    createdBefore=datetime(2015, 1, 1),
    createdAfter=datetime(2015, 1, 1),
    types=[
        'string',
    ],
    states=[
        'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
    ],
    maxResults=123,
    nextToken='string'
)
type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The ID of the virtual cluster.

type createdBefore

datetime

param createdBefore

The date and time before which the endpoints were created.

type createdAfter

datetime

param createdAfter

The date and time after which the endpoints were created.

type types

list

param types

The types of the managed endpoints.

  • (string) --

type states

list

param states

The states of the managed endpoints.

  • (string) --

type maxResults

integer

param maxResults

The maximum number of managed endpoints that can be listed.

type nextToken

string

param nextToken

The token for the next set of managed endpoints to return.
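
As an illustration of combining the optional filters above, the following sketch builds the keyword arguments for list_managed_endpoints and leaves out any filter that was not requested. The function and its `created_within` and `now` parameters are hypothetical conveniences, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def build_endpoint_filters(virtual_cluster_id, created_within=None,
                           states=None, max_results=None, now=None):
    """Return kwargs for list_managed_endpoints, omitting unset filters.

    created_within: a timedelta; translated to createdAfter = now - delta.
    now: override for the current time (useful in tests); defaults to UTC now.
    """
    kwargs = {"virtualClusterId": virtual_cluster_id}
    if created_within is not None:
        reference = now or datetime.now(timezone.utc)
        kwargs["createdAfter"] = reference - created_within
    if states:
        kwargs["states"] = list(states)
    if max_results is not None:
        kwargs["maxResults"] = max_results
    return kwargs
```

Omitting unset keys, rather than passing None, keeps the request limited to parameters the caller actually chose.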

rtype

dict

returns

Response Syntax

{
    'endpoints': [
        {
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'virtualClusterId': 'string',
            'type': 'string',
            'state': 'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
            'releaseLabel': 'string',
            'executionRoleArn': 'string',
            'certificateArn': 'string',
            'certificateAuthority': {
                'certificateArn': 'string',
                'certificateData': 'string'
            },
            'configurationOverrides': {
                'applicationConfiguration': [
                    {
                        'classification': 'string',
                        'properties': {
                            'string': 'string'
                        },
                        'configurations': {'... recursive ...'}
                    },
                ],
                'monitoringConfiguration': {
                    'persistentAppUI': 'ENABLED'|'DISABLED',
                    'cloudWatchMonitoringConfiguration': {
                        'logGroupName': 'string',
                        'logStreamNamePrefix': 'string'
                    },
                    's3MonitoringConfiguration': {
                        'logUri': 'string'
                    },
                    'containerLogRotationConfiguration': {
                        'rotationSize': 'string',
                        'maxFilesToKeep': 123
                    }
                }
            },
            'serverUrl': 'string',
            'createdAt': datetime(2015, 1, 1),
            'securityGroup': 'string',
            'subnetIds': [
                'string',
            ],
            'stateDetails': 'string',
            'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
            'tags': {
                'string': 'string'
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • endpoints (list) --

      The managed endpoints to be listed.

      • (dict) --

        This entity represents the endpoint that is managed by Amazon EMR on EKS.

        • id (string) --

          The ID of the endpoint.

        • name (string) --

          The name of the endpoint.

        • arn (string) --

          The ARN of the endpoint.

        • virtualClusterId (string) --

          The ID of the endpoint's virtual cluster.

        • type (string) --

          The type of the endpoint.

        • state (string) --

          The state of the endpoint.

        • releaseLabel (string) --

          The EMR release version to be used for the endpoint.

        • executionRoleArn (string) --

          The execution role ARN of the endpoint.

        • certificateArn (string) --

          The certificate ARN of the endpoint. This field is being deprecated and will be removed in a future release.

        • certificateAuthority (dict) --

          The certificate generated by the Amazon EMR control plane on the customer's behalf to secure the managed endpoint.

          • certificateArn (string) --

            The ARN of the certificate generated for the managed endpoint.

          • certificateData (string) --

            The base64-encoded PEM certificate data generated for the managed endpoint.

        • configurationOverrides (dict) --

          The configuration settings that are used to override existing configurations for endpoints.

          • applicationConfiguration (list) --

            The configurations for the application run by the job run.

            • (dict) --

              A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

              • classification (string) --

                The classification within a configuration.

              • properties (dict) --

                A set of properties specified within a configuration classification.

                • (string) --

                  • (string) --

              • configurations (list) --

                A list of additional configurations to apply within a configuration object.

          • monitoringConfiguration (dict) --

            The configurations for monitoring.

            • persistentAppUI (string) --

              Monitoring configurations for the persistent application UI.

            • cloudWatchMonitoringConfiguration (dict) --

              Monitoring configurations for CloudWatch.

              • logGroupName (string) --

                The name of the log group for log publishing.

              • logStreamNamePrefix (string) --

                The specified name prefix for log streams.

            • s3MonitoringConfiguration (dict) --

              Amazon S3 configuration for monitoring log publishing.

              • logUri (string) --

                Amazon S3 destination URI for log publishing.

            • containerLogRotationConfiguration (dict) --

              The settings for container log rotation.

              • rotationSize (string) --

                The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

              • maxFilesToKeep (integer) --

                The number of files to keep in the container after rotation.

        • serverUrl (string) --

          The server URL of the endpoint.

        • createdAt (datetime) --

          The date and time when the endpoint was created.

        • securityGroup (string) --

          The security group configuration of the endpoint.

        • subnetIds (list) --

          The subnet IDs of the endpoint.

          • (string) --

        • stateDetails (string) --

          Additional details of the endpoint state.

        • failureReason (string) --

          The reasons why the endpoint has failed.

        • tags (dict) --

          The tags of the endpoint.

          • (string) --

            • (string) --

    • nextToken (string) --

      The token for the next set of endpoints to return.
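
The nextToken field above implies a simple paging loop: call the operation, consume the endpoints list, and repeat with the returned token until none comes back. A minimal sketch follows. The helper name iter_managed_endpoints and its page_size argument are hypothetical, and client is assumed to be a boto3 client created with boto3.client('emr-containers').

```python
def iter_managed_endpoints(client, virtual_cluster_id, states=None, page_size=None):
    """Yield every managed endpoint dict, following nextToken across pages.

    `client` is assumed to expose list_managed_endpoints(**kwargs), as the
    boto3 'emr-containers' client does.
    """
    kwargs = {"virtualClusterId": virtual_cluster_id}
    if states:
        # For example ["ACTIVE"]; valid values are listed under 'state' above.
        kwargs["states"] = list(states)
    if page_size is not None:
        kwargs["maxResults"] = page_size

    while True:
        page = client.list_managed_endpoints(**kwargs)
        yield from page.get("endpoints", [])
        token = page.get("nextToken")
        if not token:
            break  # no further pages
        kwargs["nextToken"] = token


# Usage (requires AWS credentials; the cluster ID is a placeholder):
#   import boto3
#   client = boto3.client("emr-containers")
#   for endpoint in iter_managed_endpoints(client, "<virtual-cluster-id>", states=["ACTIVE"]):
#       print(endpoint["id"], endpoint["state"])
```

Writing the loop as a generator keeps memory use flat when a virtual cluster has many endpoints; callers that want a list can wrap the call in list().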

StartJobRun (updated)
Changes (request)
{'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                              'rotationSize': 'string'}}}}

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)
type name

string

param name

The name of the job run.

type virtualClusterId

string

param virtualClusterId

[REQUIRED]

The virtual cluster ID for which the job run request is submitted.

type clientToken

string

param clientToken

[REQUIRED]

The client idempotency token of the job run request.

This field is autopopulated if not provided.

type executionRoleArn

string

param executionRoleArn

The execution role ARN for the job run.

type releaseLabel

string

param releaseLabel

The Amazon EMR release version to use for the job run.

type jobDriver

dict

param jobDriver

The job driver for the job run.

  • sparkSubmitJobDriver (dict) --

    The job driver parameters specified for Spark submit.

    • entryPoint (string) -- [REQUIRED]

      The entry point of the job application.

    • entryPointArguments (list) --

      The arguments for the job application.

      • (string) --

    • sparkSubmitParameters (string) --

      The Spark submit parameters that are used for job runs.

  • sparkSqlJobDriver (dict) --

    The job driver parameters specified for Spark SQL.

    • entryPoint (string) --

      The SQL file to be executed.

    • sparkSqlParameters (string) --

      The Spark parameters to be included in the Spark SQL command.
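
To make the two jobDriver shapes concrete, here is a minimal sketch of helpers that build each variant as a plain dict. The helper names, the S3 paths, and the Spark settings in the usage lines are hypothetical placeholders; only the dict keys come from the request syntax above.

```python
def spark_submit_driver(entry_point, arguments=None, spark_submit_parameters=None):
    """Build a jobDriver dict for a Spark jar or PySpark script."""
    if not entry_point:
        # entryPoint is marked [REQUIRED] for sparkSubmitJobDriver.
        raise ValueError("entryPoint is required for sparkSubmitJobDriver")
    driver = {"entryPoint": entry_point}
    if arguments:
        driver["entryPointArguments"] = [str(a) for a in arguments]
    if spark_submit_parameters:
        driver["sparkSubmitParameters"] = spark_submit_parameters
    return {"sparkSubmitJobDriver": driver}


def spark_sql_driver(entry_point, spark_sql_parameters=None):
    """Build a jobDriver dict for a SparkSQL query file."""
    driver = {"entryPoint": entry_point}
    if spark_sql_parameters:
        driver["sparkSqlParameters"] = spark_sql_parameters
    return {"sparkSqlJobDriver": driver}


# Placeholder values for illustration only.
submit_driver = spark_submit_driver(
    "s3://amzn-s3-demo-bucket/scripts/job.py",
    arguments=["--date", "2023-06-07"],
    spark_submit_parameters="--conf spark.executor.instances=2",
)
sql_driver = spark_sql_driver("s3://amzn-s3-demo-bucket/queries/report.sql")
```

Either result can be passed as the jobDriver argument of start_job_run. Optional keys are omitted when unset, so the request never carries empty values.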

type configurationOverrides

dict

param configurationOverrides

The configuration overrides for the job run.

  • applicationConfiguration (list) --

    The configurations for the application run by the job run.

    • (dict) --

      A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

      • classification (string) -- [REQUIRED]

        The classification within a configuration.

      • properties (dict) --

        A set of properties specified within a configuration classification.

        • (string) --

          • (string) --

      • configurations (list) --

        A list of additional configurations to apply within a configuration object.

  • monitoringConfiguration (dict) --

    The configurations for monitoring.

    • persistentAppUI (string) --

      Monitoring configurations for the persistent application UI.

    • cloudWatchMonitoringConfiguration (dict) --

      Monitoring configurations for CloudWatch.

      • logGroupName (string) -- [REQUIRED]

        The name of the log group for log publishing.

      • logStreamNamePrefix (string) --

        The specified name prefix for log streams.

    • s3MonitoringConfiguration (dict) --

      Amazon S3 configuration for monitoring log publishing.

      • logUri (string) -- [REQUIRED]

        Amazon S3 destination URI for log publishing.

    • containerLogRotationConfiguration (dict) --

      The settings for container log rotation.

      • rotationSize (string) -- [REQUIRED]

        The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

      • maxFilesToKeep (integer) -- [REQUIRED]

        The number of files to keep in the container after rotation.
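
Because rotationSize is a string with documented bounds (2KB to 2GB), it can be worth checking the value locally before sending the request. The sketch below is one way to do that. It rests on assumptions that are not stated in this document: that sizes are written as a number followed by KB, MB, or GB (for example '10MB'), that units are binary (1 KB = 1024 bytes), and that maxFilesToKeep must be at least 1. The helper name container_log_rotation is hypothetical.

```python
import re

# Assumption: binary units. The service may interpret units differently.
_UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
_MIN_BYTES = 2 * 1024        # documented minimum: 2KB
_MAX_BYTES = 2 * 1024 ** 3   # documented maximum: 2GB


def container_log_rotation(rotation_size, max_files_to_keep):
    """Return a containerLogRotationConfiguration dict after basic local checks."""
    # Assumption: sizes look like "<number><KB|MB|GB>", e.g. "10MB".
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(KB|MB|GB)", rotation_size, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized rotationSize: {rotation_size!r}")
    size_bytes = float(match.group(1)) * _UNIT_BYTES[match.group(2).upper()]
    if not _MIN_BYTES <= size_bytes <= _MAX_BYTES:
        raise ValueError("rotationSize must be between 2KB and 2GB")
    # Assumption: at least one file must be kept.
    if not isinstance(max_files_to_keep, int) or max_files_to_keep < 1:
        raise ValueError("maxFilesToKeep must be a positive integer")
    return {"rotationSize": rotation_size, "maxFilesToKeep": max_files_to_keep}


# Both fields are [REQUIRED] once the block is present.
monitoring_configuration = {
    "containerLogRotationConfiguration": container_log_rotation("10MB", 5),
}
```

The service remains the authority on what it accepts; a local check like this only catches obvious mistakes earlier.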

type tags

dict

param tags

The tags assigned to job runs.

  • (string) --

    • (string) --

type jobTemplateId

string

param jobTemplateId

The job template ID to be used to start the job run.

type jobTemplateParameters

dict

param jobTemplateParameters

The values of job template parameters to start a job run.

  • (string) --

    • (string) --

type retryPolicyConfiguration

dict

param retryPolicyConfiguration

The retry policy configuration for the job run.

  • maxAttempts (integer) -- [REQUIRED]

    The maximum number of attempts on the job's driver.

rtype

dict

returns

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      This output displays the started job run ID.

    • name (string) --

      This output displays the name of the started job run.

    • arn (string) --

      This output displays the ARN of the job run.

    • virtualClusterId (string) --

      This output displays the virtual cluster ID for which the job run was submitted.
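
Putting the request and response together, the following is a minimal sketch of assembling a StartJobRun request and reading the returned identifiers. The function names are hypothetical, and client is assumed to be a boto3 client created with boto3.client('emr-containers'). Generating the clientToken explicitly, rather than relying on autopopulation, lets a caller reuse the same token when retrying the same logical request.

```python
import uuid


def build_start_job_run_request(name, virtual_cluster_id, execution_role_arn,
                                release_label, job_driver,
                                monitoring_configuration=None,
                                max_attempts=None, client_token=None):
    """Assemble keyword arguments for start_job_run, omitting unset options."""
    request = {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        # Reuse this token when retrying the same logical request.
        "clientToken": client_token or str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": job_driver,
    }
    if monitoring_configuration:
        request["configurationOverrides"] = {
            "monitoringConfiguration": monitoring_configuration
        }
    if max_attempts is not None:
        request["retryPolicyConfiguration"] = {"maxAttempts": max_attempts}
    return request


def submit_job_run(client, request):
    """Call start_job_run and return the (id, arn) pair from the response."""
    response = client.start_job_run(**request)
    return response["id"], response["arn"]


# Usage (requires AWS credentials; angle-bracket values are placeholders):
#   import boto3
#   client = boto3.client("emr-containers")
#   request = build_start_job_run_request(
#       name="example-job",
#       virtual_cluster_id="<virtual-cluster-id>",
#       execution_role_arn="<execution-role-arn>",
#       release_label="emr-6.11.0-latest",
#       job_driver={"sparkSubmitJobDriver": {"entryPoint": "<s3-uri-of-script>"}},
#       monitoring_configuration={
#           "containerLogRotationConfiguration": {
#               "rotationSize": "10MB", "maxFilesToKeep": 5}},
#       max_attempts=3,
#   )
#   job_run_id, job_run_arn = submit_job_run(client, request)
```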