Amazon EMR Containers

2023/06/07 - Amazon EMR Containers - 6 updated api methods

Changes  EMR on EKS adds support for log rotation of Spark container logs to the StartJobRun API, for EMR-6.11.0 onwards.

CreateManagedEndpoint (updated)
Changes (request)
{'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                              'rotationSize': 'string'}}}}

Creates a managed endpoint. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.create_managed_endpoint(
    name='string',
    virtualClusterId='string',
    type='string',
    releaseLabel='string',
    executionRoleArn='string',
    certificateArn='string',
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    clientToken='string',
    tags={
        'string': 'string'
    }
)
type name:

string

param name:

[REQUIRED]

The name of the managed endpoint.

type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The ID of the virtual cluster for which a managed endpoint is created.

type type:

string

param type:

[REQUIRED]

The type of the managed endpoint.

type releaseLabel:

string

param releaseLabel:

[REQUIRED]

The Amazon EMR release version.

type executionRoleArn:

string

param executionRoleArn:

[REQUIRED]

The ARN of the execution role.

type certificateArn:

string

param certificateArn:

The certificate ARN provided by users for the managed endpoint. This field is being deprecated and will be removed in a future release.

type configurationOverrides:

dict

param configurationOverrides:

The configuration settings that will be used to override existing configurations.

  • applicationConfiguration (list) --

    The configurations for the application run by the job run.

    • (dict) --

      A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

      • classification (string) -- [REQUIRED]

        The classification within a configuration.

      • properties (dict) --

        A set of properties specified within a configuration classification.

        • (string) --

          • (string) --

      • configurations (list) --

        A list of additional configurations to apply within a configuration object.

  • monitoringConfiguration (dict) --

    The configurations for monitoring.

    • persistentAppUI (string) --

      Monitoring configurations for the persistent application UI.

    • cloudWatchMonitoringConfiguration (dict) --

      Monitoring configurations for CloudWatch.

      • logGroupName (string) -- [REQUIRED]

        The name of the log group for log publishing.

      • logStreamNamePrefix (string) --

        The specified name prefix for log streams.

    • s3MonitoringConfiguration (dict) --

      Amazon S3 configuration for monitoring log publishing.

      • logUri (string) -- [REQUIRED]

        Amazon S3 destination URI for log publishing.

    • containerLogRotationConfiguration (dict) --

      Enable or disable container log rotation.

      • rotationSize (string) -- [REQUIRED]

        The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

      • maxFilesToKeep (integer) -- [REQUIRED]

        The number of files to keep in the container after rotation.
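
As an illustration of the two rotation fields above, here is a minimal sketch of a helper that builds a containerLogRotationConfiguration dict and checks it against the documented 2KB to 2GB range before a request is sent. The helper name rotation_config is hypothetical. The accepted size format (a number followed by K, M, or G with an optional B), the use of 1024-byte kilobytes, and the requirement that maxFilesToKeep be at least 1 are assumptions, not something this page states.

```python
import re

# Assumed size format: a number followed by a K/M/G unit, e.g. "10MB".
# The source only states the bounds (2KB minimum, 2GB maximum).
_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)([KMG])B?$", re.IGNORECASE)
# Assumption: binary units (1KB = 1024 bytes).
_UNIT_BYTES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def rotation_config(rotation_size: str, max_files_to_keep: int) -> dict:
    """Build a containerLogRotationConfiguration dict after basic checks."""
    match = _SIZE_RE.match(rotation_size)
    if match is None:
        raise ValueError(f"unrecognised rotationSize: {rotation_size!r}")
    size_bytes = float(match.group(1)) * _UNIT_BYTES[match.group(2).upper()]
    if not 2 * 1024 <= size_bytes <= 2 * 1024 ** 3:
        raise ValueError("rotationSize must be between 2KB and 2GB")
    # Assumption: at least one rotated file must be kept.
    if max_files_to_keep < 1:
        raise ValueError("maxFilesToKeep must be a positive integer")
    return {"rotationSize": rotation_size, "maxFilesToKeep": max_files_to_keep}
```

The returned dict can be placed under configurationOverrides['monitoringConfiguration']['containerLogRotationConfiguration']. The service performs its own validation, so a local check like this only catches obvious mistakes early.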

type clientToken:

string

param clientToken:

[REQUIRED]

The client idempotency token for this create call.

This field is autopopulated if not provided.

type tags:

dict

param tags:

The tags of the managed endpoint.

  • (string) --

    • (string) --

rtype:

dict

returns:

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      The output contains the ID of the managed endpoint.

    • name (string) --

      The output contains the name of the managed endpoint.

    • arn (string) --

      The output contains the ARN of the managed endpoint.

    • virtualClusterId (string) --

      The output contains the ID of the virtual cluster.
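
The request syntax above can be assembled as in the following sketch, which fills in the required parameters and the new containerLogRotationConfiguration block. The function name managed_endpoint_request and every value passed to it (endpoint name, virtual cluster ID, role ARN, release label, and the default endpoint type) are hypothetical placeholders. clientToken is left out because the page says it is autopopulated. Whether the service applies log rotation to endpoint containers is not stated here; the sketch only shows where the fields go.

```python
def managed_endpoint_request(name, virtual_cluster_id, release_label,
                             execution_role_arn, rotation_size,
                             max_files_to_keep,
                             endpoint_type="JUPYTER_ENTERPRISE_GATEWAY"):
    """Assemble keyword arguments for create_managed_endpoint.

    The default endpoint_type is an assumed example value.
    """
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "type": endpoint_type,
        "releaseLabel": release_label,
        "executionRoleArn": execution_role_arn,
        "configurationOverrides": {
            "monitoringConfiguration": {
                "containerLogRotationConfiguration": {
                    "rotationSize": rotation_size,
                    "maxFilesToKeep": max_files_to_keep,
                }
            }
        },
    }


# Placeholder values for illustration only.
request = managed_endpoint_request(
    name="example-endpoint",
    virtual_cluster_id="example-virtual-cluster-id",
    release_label="emr-6.11.0-latest",
    execution_role_arn="arn:aws:iam::123456789012:role/example-role",
    rotation_size="10MB",
    max_files_to_keep=5,
)

# With AWS credentials configured, the request could then be sent as:
#   import boto3
#   client = boto3.client("emr-containers")
#   response = client.create_managed_endpoint(**request)
#   print(response["id"], response["arn"])
```

Keeping the request construction separate from the network call makes the payload easy to inspect or log before anything is created.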

DescribeJobRun (updated)
Changes (response)
{'jobRun': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                         'rotationSize': 'string'}}}}}

Displays detailed information about a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.describe_job_run(
    id='string',
    virtualClusterId='string'
)
type id:

string

param id:

[REQUIRED]

The ID of the job run request.

type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The ID of the virtual cluster for which the job run is submitted.

rtype:

dict

returns:

Response Syntax

{
    'jobRun': {
        'id': 'string',
        'name': 'string',
        'virtualClusterId': 'string',
        'arn': 'string',
        'state': 'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
        'clientToken': 'string',
        'executionRoleArn': 'string',
        'releaseLabel': 'string',
        'configurationOverrides': {
            'applicationConfiguration': [
                {
                    'classification': 'string',
                    'properties': {
                        'string': 'string'
                    },
                    'configurations': {'... recursive ...'}
                },
            ],
            'monitoringConfiguration': {
                'persistentAppUI': 'ENABLED'|'DISABLED',
                'cloudWatchMonitoringConfiguration': {
                    'logGroupName': 'string',
                    'logStreamNamePrefix': 'string'
                },
                's3MonitoringConfiguration': {
                    'logUri': 'string'
                },
                'containerLogRotationConfiguration': {
                    'rotationSize': 'string',
                    'maxFilesToKeep': 123
                }
            }
        },
        'jobDriver': {
            'sparkSubmitJobDriver': {
                'entryPoint': 'string',
                'entryPointArguments': [
                    'string',
                ],
                'sparkSubmitParameters': 'string'
            },
            'sparkSqlJobDriver': {
                'entryPoint': 'string',
                'sparkSqlParameters': 'string'
            }
        },
        'createdAt': datetime(2015, 1, 1),
        'createdBy': 'string',
        'finishedAt': datetime(2015, 1, 1),
        'stateDetails': 'string',
        'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
        'tags': {
            'string': 'string'
        },
        'retryPolicyConfiguration': {
            'maxAttempts': 123
        },
        'retryPolicyExecution': {
            'currentAttemptCount': 123
        }
    }
}

Response Structure

  • (dict) --

    • jobRun (dict) --

      The output displays information about a job run.

      • id (string) --

        The ID of the job run.

      • name (string) --

        The name of the job run.

      • virtualClusterId (string) --

        The ID of the job run's virtual cluster.

      • arn (string) --

        The ARN of the job run.

      • state (string) --

        The state of the job run.

      • clientToken (string) --

        The client token used to start a job run.

      • executionRoleArn (string) --

        The execution role ARN of the job run.

      • releaseLabel (string) --

        The release version of Amazon EMR.

      • configurationOverrides (dict) --

        The configuration settings that are used to override default configuration.

        • applicationConfiguration (list) --

          The configurations for the application run by the job run.

          • (dict) --

            A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

            • classification (string) --

              The classification within a configuration.

            • properties (dict) --

              A set of properties specified within a configuration classification.

              • (string) --

                • (string) --

            • configurations (list) --

              A list of additional configurations to apply within a configuration object.

        • monitoringConfiguration (dict) --

          The configurations for monitoring.

          • persistentAppUI (string) --

            Monitoring configurations for the persistent application UI.

          • cloudWatchMonitoringConfiguration (dict) --

            Monitoring configurations for CloudWatch.

            • logGroupName (string) --

              The name of the log group for log publishing.

            • logStreamNamePrefix (string) --

              The specified name prefix for log streams.

          • s3MonitoringConfiguration (dict) --

            Amazon S3 configuration for monitoring log publishing.

            • logUri (string) --

              Amazon S3 destination URI for log publishing.

          • containerLogRotationConfiguration (dict) --

            Enable or disable container log rotation.

            • rotationSize (string) --

              The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

            • maxFilesToKeep (integer) --

              The number of files to keep in the container after rotation.

      • jobDriver (dict) --

        Parameters of job driver for the job run.

        • sparkSubmitJobDriver (dict) --

          The job driver parameters specified for spark submit.

          • entryPoint (string) --

            The entry point of the job application.

          • entryPointArguments (list) --

            The arguments for the job application.

            • (string) --

          • sparkSubmitParameters (string) --

            The Spark submit parameters that are used for job runs.

        • sparkSqlJobDriver (dict) --

          The job driver parameters specified for Spark SQL.

          • entryPoint (string) --

            The SQL file to be executed.

          • sparkSqlParameters (string) --

            The Spark parameters to be included in the Spark SQL command.

      • createdAt (datetime) --

        The date and time when the job run was created.

      • createdBy (string) --

        The user who created the job run.

      • finishedAt (datetime) --

        The date and time when the job run has finished.

      • stateDetails (string) --

        Additional details of the job run state.

      • failureReason (string) --

        The reasons why the job run has failed.

      • tags (dict) --

        The assigned tags of the job run.

        • (string) --

          • (string) --

      • retryPolicyConfiguration (dict) --

        The configuration of the retry policy that the job runs on.

        • maxAttempts (integer) --

          The maximum number of attempts on the job's driver.

      • retryPolicyExecution (dict) --

        The current status of the retry policy executed on the job.

        • currentAttemptCount (integer) --

          The current number of attempts made on the driver of the job.
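
The nested response structure above can be read defensively, as in this minimal sketch that pulls the log rotation settings out of a describe_job_run response. The function name log_rotation_settings is hypothetical, and the sketch assumes that keys are simply absent from the response when a job run was started without rotation settings.

```python
def log_rotation_settings(describe_job_run_response: dict):
    """Return (rotationSize, maxFilesToKeep), or None if rotation is not set."""
    overrides = (
        describe_job_run_response
        .get("jobRun", {})
        .get("configurationOverrides", {})
    )
    rotation = (
        overrides
        .get("monitoringConfiguration", {})
        .get("containerLogRotationConfiguration")
    )
    if rotation is None:
        return None
    return rotation["rotationSize"], rotation["maxFilesToKeep"]
```

A caller would pass the dict returned by client.describe_job_run(id=..., virtualClusterId=...) directly to this function.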

DescribeManagedEndpoint (updated)
Changes (response)
{'endpoint': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                           'rotationSize': 'string'}}}}}

Displays detailed information about a managed endpoint. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.describe_managed_endpoint(
    id='string',
    virtualClusterId='string'
)
type id:

string

param id:

[REQUIRED]

The ID of the managed endpoint.

type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The ID of the endpoint's virtual cluster.

rtype:

dict

returns:

Response Syntax

{
    'endpoint': {
        'id': 'string',
        'name': 'string',
        'arn': 'string',
        'virtualClusterId': 'string',
        'type': 'string',
        'state': 'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
        'releaseLabel': 'string',
        'executionRoleArn': 'string',
        'certificateArn': 'string',
        'certificateAuthority': {
            'certificateArn': 'string',
            'certificateData': 'string'
        },
        'configurationOverrides': {
            'applicationConfiguration': [
                {
                    'classification': 'string',
                    'properties': {
                        'string': 'string'
                    },
                    'configurations': {'... recursive ...'}
                },
            ],
            'monitoringConfiguration': {
                'persistentAppUI': 'ENABLED'|'DISABLED',
                'cloudWatchMonitoringConfiguration': {
                    'logGroupName': 'string',
                    'logStreamNamePrefix': 'string'
                },
                's3MonitoringConfiguration': {
                    'logUri': 'string'
                },
                'containerLogRotationConfiguration': {
                    'rotationSize': 'string',
                    'maxFilesToKeep': 123
                }
            }
        },
        'serverUrl': 'string',
        'createdAt': datetime(2015, 1, 1),
        'securityGroup': 'string',
        'subnetIds': [
            'string',
        ],
        'stateDetails': 'string',
        'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
        'tags': {
            'string': 'string'
        }
    }
}

Response Structure

  • (dict) --

    • endpoint (dict) --

      This output displays information about a managed endpoint.

      • id (string) --

        The ID of the endpoint.

      • name (string) --

        The name of the endpoint.

      • arn (string) --

        The ARN of the endpoint.

      • virtualClusterId (string) --

        The ID of the endpoint's virtual cluster.

      • type (string) --

        The type of the endpoint.

      • state (string) --

        The state of the endpoint.

      • releaseLabel (string) --

        The EMR release version to be used for the endpoint.

      • executionRoleArn (string) --

        The execution role ARN of the endpoint.

      • certificateArn (string) --

        The certificate ARN of the endpoint. This field is being deprecated and will be removed in a future release.

      • certificateAuthority (dict) --

        The certificate generated by the EMR control plane on the customer's behalf to secure the managed endpoint.

        • certificateArn (string) --

          The ARN of the certificate generated for the managed endpoint.

        • certificateData (string) --

          The base64-encoded PEM certificate data generated for the managed endpoint.

      • configurationOverrides (dict) --

        The configuration settings that are used to override existing configurations for endpoints.

        • applicationConfiguration (list) --

          The configurations for the application run by the job run.

          • (dict) --

            A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

            • classification (string) --

              The classification within a configuration.

            • properties (dict) --

              A set of properties specified within a configuration classification.

              • (string) --

                • (string) --

            • configurations (list) --

              A list of additional configurations to apply within a configuration object.

        • monitoringConfiguration (dict) --

          The configurations for monitoring.

          • persistentAppUI (string) --

            Monitoring configurations for the persistent application UI.

          • cloudWatchMonitoringConfiguration (dict) --

            Monitoring configurations for CloudWatch.

            • logGroupName (string) --

              The name of the log group for log publishing.

            • logStreamNamePrefix (string) --

              The specified name prefix for log streams.

          • s3MonitoringConfiguration (dict) --

            Amazon S3 configuration for monitoring log publishing.

            • logUri (string) --

              Amazon S3 destination URI for log publishing.

          • containerLogRotationConfiguration (dict) --

            Enable or disable container log rotation.

            • rotationSize (string) --

              The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

            • maxFilesToKeep (integer) --

              The number of files to keep in the container after rotation.

      • serverUrl (string) --

        The server URL of the endpoint.

      • createdAt (datetime) --

        The date and time when the endpoint was created.

      • securityGroup (string) --

        The security group configuration of the endpoint.

      • subnetIds (list) --

        The subnet IDs of the endpoint.

        • (string) --

      • stateDetails (string) --

        Additional details of the endpoint state.

      • failureReason (string) --

        The reasons why the endpoint has failed.

      • tags (dict) --

        The tags of the endpoint.

        • (string) --

          • (string) --
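
Because a newly created endpoint reports the CREATING state before it becomes ACTIVE, callers often poll describe_managed_endpoint until the state settles. The following is a minimal sketch of such a loop. The function name wait_for_endpoint, the polling interval, and the attempt limit are hypothetical choices, and treating ACTIVE, TERMINATED, and TERMINATED_WITH_ERRORS as the states at which to stop is an assumption based on the state names listed above.

```python
import time

# States at which polling stops (assumed from the state names in the response).
SETTLED_STATES = {"ACTIVE", "TERMINATED", "TERMINATED_WITH_ERRORS"}


def wait_for_endpoint(client, endpoint_id, virtual_cluster_id,
                      poll_seconds=15.0, max_polls=40):
    """Poll describe_managed_endpoint until the endpoint state settles.

    `client` is expected to behave like boto3.client("emr-containers").
    """
    for _ in range(max_polls):
        endpoint = client.describe_managed_endpoint(
            id=endpoint_id, virtualClusterId=virtual_cluster_id
        )["endpoint"]
        if endpoint["state"] in SETTLED_STATES:
            return endpoint
        time.sleep(poll_seconds)
    raise TimeoutError(f"endpoint {endpoint_id} did not settle in time")
```

The returned dict is the endpoint structure described above, so fields such as serverUrl, stateDetails, and failureReason can be inspected once polling ends.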

ListJobRuns (updated)
Changes (response)
{'jobRuns': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                          'rotationSize': 'string'}}}}}

Lists job runs based on a set of parameters. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.list_job_runs(
    virtualClusterId='string',
    createdBefore=datetime(2015, 1, 1),
    createdAfter=datetime(2015, 1, 1),
    name='string',
    states=[
        'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
    ],
    maxResults=123,
    nextToken='string'
)
type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The ID of the virtual cluster for which to list the job runs.

type createdBefore:

datetime

param createdBefore:

The date and time before which the job runs were submitted.

type createdAfter:

datetime

param createdAfter:

The date and time after which the job runs were submitted.

type name:

string

param name:

The name of the job run.

type states:

list

param states:

The states of the job run.

  • (string) --

type maxResults:

integer

param maxResults:

The maximum number of job runs that can be listed.

type nextToken:

string

param nextToken:

The token for the next set of job runs to return.

rtype:

dict

returns:

Response Syntax

{
    'jobRuns': [
        {
            'id': 'string',
            'name': 'string',
            'virtualClusterId': 'string',
            'arn': 'string',
            'state': 'PENDING'|'SUBMITTED'|'RUNNING'|'FAILED'|'CANCELLED'|'CANCEL_PENDING'|'COMPLETED',
            'clientToken': 'string',
            'executionRoleArn': 'string',
            'releaseLabel': 'string',
            'configurationOverrides': {
                'applicationConfiguration': [
                    {
                        'classification': 'string',
                        'properties': {
                            'string': 'string'
                        },
                        'configurations': {'... recursive ...'}
                    },
                ],
                'monitoringConfiguration': {
                    'persistentAppUI': 'ENABLED'|'DISABLED',
                    'cloudWatchMonitoringConfiguration': {
                        'logGroupName': 'string',
                        'logStreamNamePrefix': 'string'
                    },
                    's3MonitoringConfiguration': {
                        'logUri': 'string'
                    },
                    'containerLogRotationConfiguration': {
                        'rotationSize': 'string',
                        'maxFilesToKeep': 123
                    }
                }
            },
            'jobDriver': {
                'sparkSubmitJobDriver': {
                    'entryPoint': 'string',
                    'entryPointArguments': [
                        'string',
                    ],
                    'sparkSubmitParameters': 'string'
                },
                'sparkSqlJobDriver': {
                    'entryPoint': 'string',
                    'sparkSqlParameters': 'string'
                }
            },
            'createdAt': datetime(2015, 1, 1),
            'createdBy': 'string',
            'finishedAt': datetime(2015, 1, 1),
            'stateDetails': 'string',
            'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
            'tags': {
                'string': 'string'
            },
            'retryPolicyConfiguration': {
                'maxAttempts': 123
            },
            'retryPolicyExecution': {
                'currentAttemptCount': 123
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • jobRuns (list) --

      This output lists information about the specified job runs.

      • (dict) --

        This entity describes a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

        • id (string) --

          The ID of the job run.

        • name (string) --

          The name of the job run.

        • virtualClusterId (string) --

          The ID of the job run's virtual cluster.

        • arn (string) --

          The ARN of the job run.

        • state (string) --

          The state of the job run.

        • clientToken (string) --

          The client token used to start a job run.

        • executionRoleArn (string) --

          The execution role ARN of the job run.

        • releaseLabel (string) --

          The release version of Amazon EMR.

        • configurationOverrides (dict) --

          The configuration settings that are used to override default configuration.

          • applicationConfiguration (list) --

            The configurations for the application run by the job run.

            • (dict) --

              A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

              • classification (string) --

                The classification within a configuration.

              • properties (dict) --

                A set of properties specified within a configuration classification.

                • (string) --

                  • (string) --

              • configurations (list) --

                A list of additional configurations to apply within a configuration object.

          • monitoringConfiguration (dict) --

            The configurations for monitoring.

            • persistentAppUI (string) --

              Monitoring configurations for the persistent application UI.

            • cloudWatchMonitoringConfiguration (dict) --

              Monitoring configurations for CloudWatch.

              • logGroupName (string) --

                The name of the log group for log publishing.

              • logStreamNamePrefix (string) --

                The specified name prefix for log streams.

            • s3MonitoringConfiguration (dict) --

              Amazon S3 configuration for monitoring log publishing.

              • logUri (string) --

                Amazon S3 destination URI for log publishing.

            • containerLogRotationConfiguration (dict) --

              Enable or disable container log rotation.

              • rotationSize (string) --

                The file size at which to rotate logs. Minimum of 2KB, Maximum of 2GB.

              • maxFilesToKeep (integer) --

                The number of files to keep in the container after rotation.

        • jobDriver (dict) --

          Parameters of job driver for the job run.

          • sparkSubmitJobDriver (dict) --

            The job driver parameters specified for spark submit.

            • entryPoint (string) --

              The entry point of the job application.

            • entryPointArguments (list) --

              The arguments for the job application.

              • (string) --

            • sparkSubmitParameters (string) --

              The Spark submit parameters that are used for job runs.

          • sparkSqlJobDriver (dict) --

            The job driver parameters specified for Spark SQL.

            • entryPoint (string) --

              The SQL file to be executed.

            • sparkSqlParameters (string) --

              The Spark parameters to be included in the Spark SQL command.

        • createdAt (datetime) --

          The date and time when the job run was created.

        • createdBy (string) --

          The user who created the job run.

        • finishedAt (datetime) --

          The date and time when the job run has finished.

        • stateDetails (string) --

          Additional details of the job run state.

        • failureReason (string) --

          The reasons why the job run has failed.

        • tags (dict) --

          The assigned tags of the job run.

          • (string) --

            • (string) --

        • retryPolicyConfiguration (dict) --

          The configuration of the retry policy that the job runs on.

          • maxAttempts (integer) --

            The maximum number of attempts on the job's driver.

        • retryPolicyExecution (dict) --

          The current status of the retry policy executed on the job.

          • currentAttemptCount (integer) --

            The current number of attempts made on the driver of the job.

    • nextToken (string) --

      This output displays the token for the next set of job runs.
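
Since list_job_runs returns results in pages linked by nextToken, a caller has to keep requesting pages until no token comes back. Here is a minimal sketch of that loop as a generator. The function name iter_job_runs is hypothetical, and the sketch assumes that the last page omits nextToken or returns it empty.

```python
def iter_job_runs(client, virtual_cluster_id, **filters):
    """Yield every job run in a virtual cluster, following nextToken.

    `client` is expected to behave like boto3.client("emr-containers").
    Extra keyword arguments (for example states or createdAfter) are
    passed through to list_job_runs unchanged.
    """
    kwargs = {"virtualClusterId": virtual_cluster_id, **filters}
    while True:
        page = client.list_job_runs(**kwargs)
        yield from page.get("jobRuns", [])
        token = page.get("nextToken")
        if not token:
            break
        # Ask for the page that follows the one just received.
        kwargs["nextToken"] = token
```

For example, iter_job_runs(client, virtual_cluster_id, states=['RUNNING']) would yield only running job runs, each shaped like the jobRuns entries described above.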

ListManagedEndpoints (updated)
Changes (response)
{'endpoints': {'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                                            'rotationSize': 'string'}}}}}

Lists managed endpoints based on a set of parameters. A managed endpoint is a gateway that connects Amazon EMR Studio to Amazon EMR on EKS so that Amazon EMR Studio can communicate with your virtual cluster.

See also: AWS API Documentation

Request Syntax

client.list_managed_endpoints(
    virtualClusterId='string',
    createdBefore=datetime(2015, 1, 1),
    createdAfter=datetime(2015, 1, 1),
    types=[
        'string',
    ],
    states=[
        'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
    ],
    maxResults=123,
    nextToken='string'
)
type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The ID of the virtual cluster.

type createdBefore:

datetime

param createdBefore:

The date and time before which the endpoints were created.

type createdAfter:

datetime

param createdAfter:

The date and time after which the endpoints were created.

type types:

list

param types:

The types of the managed endpoints.

  • (string) --

type states:

list

param states:

The states of the managed endpoints.

  • (string) --

type maxResults:

integer

param maxResults:

The maximum number of managed endpoints that can be listed.

type nextToken:

string

param nextToken:

The token for the next set of managed endpoints to return.

rtype:

dict

returns:

Response Syntax

{
    'endpoints': [
        {
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'virtualClusterId': 'string',
            'type': 'string',
            'state': 'CREATING'|'ACTIVE'|'TERMINATING'|'TERMINATED'|'TERMINATED_WITH_ERRORS',
            'releaseLabel': 'string',
            'executionRoleArn': 'string',
            'certificateArn': 'string',
            'certificateAuthority': {
                'certificateArn': 'string',
                'certificateData': 'string'
            },
            'configurationOverrides': {
                'applicationConfiguration': [
                    {
                        'classification': 'string',
                        'properties': {
                            'string': 'string'
                        },
                        'configurations': {'... recursive ...'}
                    },
                ],
                'monitoringConfiguration': {
                    'persistentAppUI': 'ENABLED'|'DISABLED',
                    'cloudWatchMonitoringConfiguration': {
                        'logGroupName': 'string',
                        'logStreamNamePrefix': 'string'
                    },
                    's3MonitoringConfiguration': {
                        'logUri': 'string'
                    },
                    'containerLogRotationConfiguration': {
                        'rotationSize': 'string',
                        'maxFilesToKeep': 123
                    }
                }
            },
            'serverUrl': 'string',
            'createdAt': datetime(2015, 1, 1),
            'securityGroup': 'string',
            'subnetIds': [
                'string',
            ],
            'stateDetails': 'string',
            'failureReason': 'INTERNAL_ERROR'|'USER_ERROR'|'VALIDATION_ERROR'|'CLUSTER_UNAVAILABLE',
            'tags': {
                'string': 'string'
            }
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • endpoints (list) --

      The managed endpoints to be listed.

      • (dict) --

        This entity represents the endpoint that is managed by Amazon EMR on EKS.

        • id (string) --

          The ID of the endpoint.

        • name (string) --

          The name of the endpoint.

        • arn (string) --

          The ARN of the endpoint.

        • virtualClusterId (string) --

          The ID of the endpoint's virtual cluster.

        • type (string) --

          The type of the endpoint.

        • state (string) --

          The state of the endpoint.

        • releaseLabel (string) --

          The EMR release version to be used for the endpoint.

        • executionRoleArn (string) --

          The execution role ARN of the endpoint.

        • certificateArn (string) --

          The certificate ARN of the endpoint. This field is being deprecated and will be removed in the future.

        • certificateAuthority (dict) --

          The certificate generated by the Amazon EMR control plane on the customer's behalf to secure the managed endpoint.

          • certificateArn (string) --

            The ARN of the certificate generated for the managed endpoint.

          • certificateData (string) --

            The base64-encoded PEM certificate data generated for the managed endpoint.

        • configurationOverrides (dict) --

          The configuration settings that are used to override existing configurations for endpoints.

          • applicationConfiguration (list) --

            The configurations for the application run by the job run.

            • (dict) --

              A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

              • classification (string) --

                The classification within a configuration.

              • properties (dict) --

                A set of properties specified within a configuration classification.

                • (string) --

                  • (string) --

              • configurations (list) --

                A list of additional configurations to apply within a configuration object.

          • monitoringConfiguration (dict) --

            The configurations for monitoring.

            • persistentAppUI (string) --

              Monitoring configurations for the persistent application UI.

            • cloudWatchMonitoringConfiguration (dict) --

              Monitoring configurations for CloudWatch.

              • logGroupName (string) --

                The name of the log group for log publishing.

              • logStreamNamePrefix (string) --

                The specified name prefix for log streams.

            • s3MonitoringConfiguration (dict) --

              Amazon S3 configuration for monitoring log publishing.

              • logUri (string) --

                Amazon S3 destination URI for log publishing.

            • containerLogRotationConfiguration (dict) --

              Enable or disable container log rotation.

              • rotationSize (string) --

                The file size at which to rotate logs. Minimum of 2KB, maximum of 2GB.

              • maxFilesToKeep (integer) --

                The number of files to keep in the container after rotation.

        • serverUrl (string) --

          The server URL of the endpoint.

        • createdAt (datetime) --

          The date and time when the endpoint was created.

        • securityGroup (string) --

          The security group configuration of the endpoint.

        • subnetIds (list) --

          The subnet IDs of the endpoint.

          • (string) --

        • stateDetails (string) --

          Additional details of the endpoint state.

        • failureReason (string) --

          The reasons why the endpoint has failed.

        • tags (dict) --

          The tags of the endpoint.

          • (string) --

            • (string) --

    • nextToken (string) --

      The token for the next set of endpoints to return.
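
The nextToken field above supports paging through results. The following is a minimal sketch of that loop, assuming the operation documented here is exposed as client.list_managed_endpoints and accepts virtualClusterId alongside the filters listed above. The helper name iter_managed_endpoints is hypothetical, and any object with a matching method can stand in for the client.

```python
from typing import Any, Dict, Iterator


def iter_managed_endpoints(
    client: Any, virtual_cluster_id: str, **filters: Any
) -> Iterator[Dict[str, Any]]:
    """Yield every endpoint dict, following nextToken until it is absent.

    `client` is assumed to expose list_managed_endpoints(**kwargs) and to
    return a dict shaped like the Response Syntax above.
    """
    kwargs: Dict[str, Any] = {"virtualClusterId": virtual_cluster_id, **filters}
    while True:
        page = client.list_managed_endpoints(**kwargs)
        # 'endpoints' may be missing or empty on a page, so default to [].
        yield from page.get("endpoints", [])
        token = page.get("nextToken")
        if not token:
            break
        # Pass the token back unchanged to request the next page.
        kwargs["nextToken"] = token
```

A caller might filter on the fields shown in the response, for example collecting the id of every endpoint whose state is 'ACTIVE'.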

StartJobRun (updated)
Changes (request)
{'configurationOverrides': {'monitoringConfiguration': {'containerLogRotationConfiguration': {'maxFilesToKeep': 'integer',
                                                                                              'rotationSize': 'string'}}}}

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)
type name:

string

param name:

The name of the job run.

type virtualClusterId:

string

param virtualClusterId:

[REQUIRED]

The virtual cluster ID for which the job run request is submitted.

type clientToken:

string

param clientToken:

[REQUIRED]

The client idempotency token of the job run request.

This field is autopopulated if not provided.

type executionRoleArn:

string

param executionRoleArn:

The execution role ARN for the job run.

type releaseLabel:

string

param releaseLabel:

The Amazon EMR release version to use for the job run.

type jobDriver:

dict

param jobDriver:

The job driver for the job run.

  • sparkSubmitJobDriver (dict) --

    The job driver parameters specified for Spark submit.

    • entryPoint (string) -- [REQUIRED]

      The entry point of the job application.

    • entryPointArguments (list) --

      The arguments for the job application.

      • (string) --

    • sparkSubmitParameters (string) --

      The Spark submit parameters that are used for job runs.

  • sparkSqlJobDriver (dict) --

    The job driver parameters specified for Spark SQL.

    • entryPoint (string) --

      The SQL file to be executed.

    • sparkSqlParameters (string) --

      The Spark parameters to be included in the Spark SQL command.
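
The two jobDriver shapes described above can be built with small helpers. This is a sketch only: the helper names spark_submit_driver and spark_sql_driver are hypothetical, and optional keys are simply omitted when no value is given.

```python
from typing import Any, Dict, List, Optional


def spark_submit_driver(
    entry_point: str,
    arguments: Optional[List[str]] = None,
    submit_parameters: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a jobDriver dict for a Spark submit job (entryPoint is required)."""
    driver: Dict[str, Any] = {"entryPoint": entry_point}
    if arguments:
        driver["entryPointArguments"] = list(arguments)
    if submit_parameters:
        driver["sparkSubmitParameters"] = submit_parameters
    return {"sparkSubmitJobDriver": driver}


def spark_sql_driver(
    entry_point: str, sql_parameters: Optional[str] = None
) -> Dict[str, Any]:
    """Build a jobDriver dict for a Spark SQL job; entry_point is the SQL file."""
    driver: Dict[str, Any] = {"entryPoint": entry_point}
    if sql_parameters:
        driver["sparkSqlParameters"] = sql_parameters
    return {"sparkSqlJobDriver": driver}
```

Either result can be passed as the jobDriver argument, for example spark_submit_driver("s3://example-bucket/app.py", ["--date", "2023-06-07"]), where the S3 path is a placeholder.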

type configurationOverrides:

dict

param configurationOverrides:

The configuration overrides for the job run.

  • applicationConfiguration (list) --

    The configurations for the application run by the job run.

    • (dict) --

      A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

      • classification (string) -- [REQUIRED]

        The classification within a configuration.

      • properties (dict) --

        A set of properties specified within a configuration classification.

        • (string) --

          • (string) --

      • configurations (list) --

        A list of additional configurations to apply within a configuration object.

  • monitoringConfiguration (dict) --

    The configurations for monitoring.

    • persistentAppUI (string) --

      Monitoring configurations for the persistent application UI.

    • cloudWatchMonitoringConfiguration (dict) --

      Monitoring configurations for CloudWatch.

      • logGroupName (string) -- [REQUIRED]

        The name of the log group for log publishing.

      • logStreamNamePrefix (string) --

        The specified name prefix for log streams.

    • s3MonitoringConfiguration (dict) --

      Amazon S3 configuration for monitoring log publishing.

      • logUri (string) -- [REQUIRED]

        Amazon S3 destination URI for log publishing.

    • containerLogRotationConfiguration (dict) --

      Enable or disable container log rotation.

      • rotationSize (string) -- [REQUIRED]

        The file size at which to rotate logs. Minimum of 2KB, maximum of 2GB.

      • maxFilesToKeep (integer) -- [REQUIRED]

        The number of files to keep in the container after rotation.
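
A client-side check of the log rotation settings can catch out-of-range values before the request is sent. The sketch below rests on assumptions the text above does not confirm: that rotationSize is written as a number followed by KB, MB, or GB, that those units are binary multiples of 1024, and that maxFilesToKeep must be at least 1. Only the 2KB and 2GB bounds come from the description. The helper names are hypothetical.

```python
import re
from typing import Any, Dict

# Assumption: binary units (1 KB = 1024 bytes).
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
# Assumption: sizes look like "10MB", "2KB", or "1.5GB".
_SIZE_RE = re.compile(r"^(\d+(?:\.\d+)?)([KMG])B$", re.IGNORECASE)


def rotation_size_bytes(size: str) -> float:
    """Convert a size string such as '10MB' to a byte count."""
    match = _SIZE_RE.match(size.strip())
    if match is None:
        raise ValueError(f"unrecognized rotation size: {size!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit.upper()]


def log_rotation_config(rotation_size: str, max_files_to_keep: int) -> Dict[str, Any]:
    """Return a monitoringConfiguration fragment with both required keys."""
    size = rotation_size_bytes(rotation_size)
    # Bounds taken from the rotationSize description: 2KB to 2GB.
    if not 2 * _UNITS["K"] <= size <= 2 * _UNITS["G"]:
        raise ValueError("rotationSize must be between 2KB and 2GB")
    # Assumption: keeping fewer than one file is not meaningful.
    if max_files_to_keep < 1:
        raise ValueError("maxFilesToKeep must be at least 1")
    return {
        "containerLogRotationConfiguration": {
            "rotationSize": rotation_size,
            "maxFilesToKeep": max_files_to_keep,
        }
    }
```

The returned dict can be merged into monitoringConfiguration next to the CloudWatch or Amazon S3 settings.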

type tags:

dict

param tags:

The tags assigned to job runs.

  • (string) --

    • (string) --

type jobTemplateId:

string

param jobTemplateId:

The job template ID to be used to start the job run.

type jobTemplateParameters:

dict

param jobTemplateParameters:

The values of job template parameters to start a job run.

  • (string) --

    • (string) --

type retryPolicyConfiguration:

dict

param retryPolicyConfiguration:

The retry policy configuration for the job run.

  • maxAttempts (integer) -- [REQUIRED]

    The maximum number of attempts on the job's driver.

rtype:

dict

returns:

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      This output displays the started job run ID.

    • name (string) --

      This output displays the name of the started job run.

    • arn (string) --

      This output lists the ARN of the job run.

    • virtualClusterId (string) --

      This output displays the virtual cluster ID for which the job run was submitted.
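
Putting the request and response together, a job run with log rotation might be submitted as sketched below. The helper names build_start_job_run_request and start_and_describe are hypothetical, and client is any object exposing start_job_run with the keyword arguments shown in the Request Syntax. With boto3 installed, such a client is typically created with boto3.client('emr-containers').

```python
import uuid
from typing import Any, Dict, Optional


def build_start_job_run_request(
    virtual_cluster_id: str,
    name: str,
    execution_role_arn: str,
    release_label: str,
    job_driver: Dict[str, Any],
    monitoring: Optional[Dict[str, Any]] = None,
    max_attempts: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for start_job_run."""
    request: Dict[str, Any] = {
        "virtualClusterId": virtual_cluster_id,
        "name": name,
        # An explicit idempotency token; reuse the same request dict when
        # retrying the same submission.
        "clientToken": str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": job_driver,
    }
    if monitoring:
        request["configurationOverrides"] = {"monitoringConfiguration": monitoring}
    if max_attempts is not None:
        request["retryPolicyConfiguration"] = {"maxAttempts": max_attempts}
    return request


def start_and_describe(client: Any, request: Dict[str, Any]) -> str:
    """Submit the job run and summarize the fields of the response."""
    response = client.start_job_run(**request)
    return f"{response['name']} ({response['id']}) on {response['virtualClusterId']}"
```

Building the request separately from sending it keeps the clientToken stable across retries of the same submission.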