2025/05/30 - EMR Serverless - 1 updated api methods
Changes: This release adds the capability for users to specify an optional execution IAM policy in the StartJobRun action. The resulting permissions assumed by the job run are the intersection of the permissions in the execution role and the specified execution IAM policy.
{'executionIamPolicy': {'policy': 'string', 'policyArns': ['string']}}
Starts a job run.
See also: AWS API Documentation
Request Syntax
client.start_job_run(
    applicationId='string',
    clientToken='string',
    executionRoleArn='string',
    executionIamPolicy={
        'policy': 'string',
        'policyArns': [
            'string',
        ]
    },
    jobDriver={
        'sparkSubmit': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'hive': {
            'query': 'string',
            'initQueryFile': 'string',
            'parameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            's3MonitoringConfiguration': {
                'logUri': 'string',
                'encryptionKeyArn': 'string'
            },
            'managedPersistenceMonitoringConfiguration': {
                'enabled': True|False,
                'encryptionKeyArn': 'string'
            },
            'cloudWatchLoggingConfiguration': {
                'enabled': True|False,
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string',
                'encryptionKeyArn': 'string',
                'logTypes': {
                    'string': [
                        'string',
                    ]
                }
            },
            'prometheusMonitoringConfiguration': {
                'remoteWriteUrl': 'string'
            }
        }
    },
    tags={
        'string': 'string'
    },
    executionTimeoutMinutes=123,
    name='string',
    mode='BATCH'|'STREAMING',
    retryPolicy={
        'maxAttempts': 123,
        'maxFailedAttemptsPerHour': 123
    }
)
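As a minimal sketch, assuming an existing application, execution role, and PySpark script in S3 (every identifier below is a placeholder, not a real resource):

import boto3

client = boto3.client('emr-serverless', region_name='us-east-1')

# Placeholder application ID, role ARN, and script location.
response = client.start_job_run(
    applicationId='00f1abcdexample',
    executionRoleArn='arn:aws:iam::111122223333:role/EMRServerlessJobRole',
    jobDriver={
        'sparkSubmit': {
            'entryPoint': 's3://example-bucket/scripts/job.py'
        }
    },
    name='example-spark-job'
)
print(response['jobRunId'])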
string
[REQUIRED]
The ID of the application on which to run the job.
string
[REQUIRED]
The client idempotency token of the job run to start. Its value must be unique for each request.
This field is autopopulated if not provided.
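If you do supply the token yourself, generating one value per logical submission and reusing it on retries keeps the request idempotent; a small sketch:

import uuid

# Reuse the same token when retrying a failed request so a second job run is not started.
client_token = str(uuid.uuid4())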
string
[REQUIRED]
The execution role ARN for the job run.
dict
You can pass an optional IAM policy. The resulting job IAM role permissions will be an intersection of this policy and the policy associated with your job execution role.
policy (string) --
An IAM inline policy to use as an execution IAM policy.
policyArns (list) --
A list of Amazon Resource Names (ARNs) to use as an execution IAM policy.
(string) --
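A minimal sketch of this argument, assuming placeholder resources (the bucket, account ID, and policy ARN below are not real); the inline policy is passed as a JSON string and managed policies as ARNs:

import json

# Hypothetical job-scoped policy: the bucket and the policy ARN are placeholders.
execution_iam_policy = {
    'policy': json.dumps({
        'Version': '2012-10-17',
        'Statement': [{
            'Effect': 'Allow',
            'Action': ['s3:GetObject'],
            'Resource': 'arn:aws:s3:::example-bucket/input/*'
        }]
    }),
    'policyArns': [
        'arn:aws:iam::111122223333:policy/ExampleJobScopedPolicy'
    ]
}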
dict
The job driver for the job run.
sparkSubmit (dict) --
The job driver parameters specified for Spark.
entryPoint (string) -- [REQUIRED]
The entry point for the Spark submit job run.
entryPointArguments (list) --
The arguments for the Spark submit job run.
(string) --
sparkSubmitParameters (string) --
The parameters for the Spark submit job run.
hive (dict) --
The job driver parameters specified for Hive.
query (string) -- [REQUIRED]
The query for the Hive job run.
initQueryFile (string) --
The query file for the Hive job run.
parameters (string) --
The parameters for the Hive job run.
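Two sketches of the jobDriver argument, one for Spark and one for Hive; the S3 locations, arguments, and configuration values are placeholders chosen for illustration:

# Spark job driver: script location, arguments, and spark-submit options are illustrative.
spark_job_driver = {
    'sparkSubmit': {
        'entryPoint': 's3://example-bucket/scripts/etl.py',
        'entryPointArguments': ['--input', 's3://example-bucket/input/',
                                '--output', 's3://example-bucket/output/'],
        'sparkSubmitParameters': '--conf spark.executor.memory=4g --conf spark.executor.cores=2'
    }
}

# Hive job driver: query file, init file, and parameters are illustrative.
hive_job_driver = {
    'hive': {
        'query': 's3://example-bucket/queries/daily_report.sql',
        'initQueryFile': 's3://example-bucket/queries/init.sql',
        'parameters': '--hiveconf hive.exec.dynamic.partition.mode=nonstrict'
    }
}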
dict
The configuration overrides for the job run.
applicationConfiguration (list) --
The override configurations for the application.
(dict) --
A configuration specification to be used when provisioning an application. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.
classification (string) -- [REQUIRED]
The classification within a configuration.
properties (dict) --
A set of properties specified within a configuration classification.
(string) --
(string) --
configurations (list) --
A list of additional configurations to apply within a configuration object.
monitoringConfiguration (dict) --
The override configurations for monitoring.
s3MonitoringConfiguration (dict) --
The Amazon S3 configuration for monitoring log publishing.
logUri (string) --
The Amazon S3 destination URI for log publishing.
encryptionKeyArn (string) --
The KMS key ARN to encrypt the logs published to the given Amazon S3 destination.
managedPersistenceMonitoringConfiguration (dict) --
The managed log persistence configuration for a job run.
enabled (boolean) --
Enables managed logging and defaults to true. If set to false, managed logging will be turned off.
encryptionKeyArn (string) --
The KMS key ARN to encrypt the logs stored in managed log persistence.
cloudWatchLoggingConfiguration (dict) --
The Amazon CloudWatch configuration for monitoring logs. You can configure your jobs to send log information to CloudWatch.
enabled (boolean) -- [REQUIRED]
Enables CloudWatch logging.
logGroupName (string) --
The name of the log group in Amazon CloudWatch Logs where you want to publish your logs.
logStreamNamePrefix (string) --
Prefix for the CloudWatch log stream name.
encryptionKeyArn (string) --
The Key Management Service (KMS) key ARN to encrypt the logs that you store in CloudWatch Logs.
logTypes (dict) --
The types of logs that you want to publish to CloudWatch. If you don't specify any log types, driver STDOUT and STDERR logs will be published to CloudWatch Logs by default. For more information, including the supported worker types for Hive and Spark, see Logging for EMR Serverless with CloudWatch.
Key Valid Values: SPARK_DRIVER, SPARK_EXECUTOR, HIVE_DRIVER, TEZ_TASK
Array Members Valid Values: STDOUT, STDERR, HIVE_LOG, TEZ_AM, SYSTEM_LOGS
(string) --
Worker type for an analytics framework.
(list) --
(string) --
Log type for a Spark/Hive job-run.
prometheusMonitoringConfiguration (dict) --
The monitoring configuration object you can configure to send metrics to Amazon Managed Service for Prometheus for a job run.
remoteWriteUrl (string) --
The remote write URL in the Amazon Managed Service for Prometheus workspace to send metrics to.
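A sketch of a configurationOverrides value that combines an application configuration with the monitoring destinations described above; the classification name, bucket, log group, KMS key ARN, and Prometheus remote-write URL are placeholders, not recommendations:

configuration_overrides = {
    'applicationConfiguration': [
        {
            # 'spark-defaults' is used here purely as an illustrative classification.
            'classification': 'spark-defaults',
            'properties': {
                'spark.sql.shuffle.partitions': '200'
            }
        }
    ],
    'monitoringConfiguration': {
        's3MonitoringConfiguration': {
            'logUri': 's3://example-bucket/emr-serverless/logs/',
            'encryptionKeyArn': 'arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID'
        },
        'managedPersistenceMonitoringConfiguration': {
            'enabled': True
        },
        'cloudWatchLoggingConfiguration': {
            'enabled': True,
            'logGroupName': '/emr-serverless/example-jobs',
            'logStreamNamePrefix': 'etl',
            # Map of worker type to log types, using the valid values listed above.
            'logTypes': {
                'SPARK_DRIVER': ['STDOUT', 'STDERR']
            }
        },
        'prometheusMonitoringConfiguration': {
            'remoteWriteUrl': 'https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-EXAMPLE/api/v1/remote_write'
        }
    }
}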
dict
The tags assigned to the job run.
(string) --
(string) --
integer
The maximum duration for the job run to run. If the job run runs beyond this duration, it will be automatically cancelled.
string
The optional job run name. This doesn't have to be unique.
string
The mode of the job run when it starts.
dict
The retry policy to apply when the job run starts.
maxAttempts (integer) --
Maximum number of attempts for the job run. This parameter is only applicable for BATCH mode.
maxFailedAttemptsPerHour (integer) --
Maximum number of failed attempts per hour. This parameter is only applicable for STREAMING mode.
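Two sketches of the retryPolicy argument, one per mode; the numbers are arbitrary examples, not defaults:

# BATCH mode: cap the total number of attempts for the job run.
batch_retry_policy = {'maxAttempts': 3}

# STREAMING mode: limit how many failed attempts are allowed per hour.
streaming_retry_policy = {'maxFailedAttemptsPerHour': 5}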
dict
Response Syntax
{
    'applicationId': 'string',
    'jobRunId': 'string',
    'arn': 'string'
}
Response Structure
(dict) --
applicationId (string) --
This output displays the application ID on which the job run was submitted.
jobRunId (string) --
The output contains the ID of the started job run.
arn (string) --
This output displays the ARN of the job run.
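A sketch of reading these fields and waiting for the run to finish by polling get_job_run; the application ID, role ARN, and script location are placeholders, and the polling interval is arbitrary:

import time
import boto3

client = boto3.client('emr-serverless', region_name='us-east-1')

# Placeholder application ID, role ARN, and script location.
job = client.start_job_run(
    applicationId='00f1abcdexample',
    executionRoleArn='arn:aws:iam::111122223333:role/EMRServerlessJobRole',
    jobDriver={'sparkSubmit': {'entryPoint': 's3://example-bucket/scripts/job.py'}}
)

# Poll get_job_run until the run reaches a terminal state.
while True:
    state = client.get_job_run(
        applicationId=job['applicationId'],
        jobRunId=job['jobRunId']
    )['jobRun']['state']
    if state in ('SUCCESS', 'FAILED', 'CANCELLED'):
        break
    time.sleep(30)

print(state, job['arn'])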