AWS Data Pipeline

2014/11/25 - AWS Data Pipeline - 5 updated api methods

ActivatePipeline (updated)
Changes (request)
{'parameterValues': [{'id': 'string', 'stringValue': 'string'}]}

Validates a pipeline and initiates processing. If the pipeline does not pass validation, activation fails. You cannot perform this operation on FINISHED pipelines; attempting to do so returns an InvalidRequestException.

Call this action to start processing pipeline tasks of a pipeline you've created using the CreatePipeline and PutPipelineDefinition actions. A pipeline cannot be modified after it has been successfully activated.

Request Syntax

client.activate_pipeline(
    pipelineId='string',
    parameterValues=[
        {
            'id': 'string',
            'stringValue': 'string'
        },
    ]
)
type pipelineId

string

param pipelineId

[REQUIRED]

The identifier of the pipeline to activate.

type parameterValues

list

param parameterValues

A list of parameter values to pass to the pipeline at activation.

  • (dict) --

    A value or list of parameter values.

    • id (string) -- [REQUIRED]

      Identifier of the parameter value.

    • stringValue (string) -- [REQUIRED]

      The field value, expressed as a String.

rtype

dict

returns

Response Syntax

{}

Response Structure

  • (dict) --

    Contains the output from the ActivatePipeline action.
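As an illustration of the request shape above, the following is a minimal sketch of activating a pipeline with parameter values. The helper names (`to_parameter_values`, `activate`) and the example identifiers are hypothetical; `client` is assumed to be a Data Pipeline client such as the one returned by `boto3.client("datapipeline")`.

```python
def to_parameter_values(values):
    """Convert a plain mapping into the parameterValues list shape.

    Both 'id' and 'stringValue' are required strings, so values are
    converted with str().
    """
    return [{"id": str(k), "stringValue": str(v)} for k, v in values.items()]


def activate(client, pipeline_id, values=None):
    """Call activate_pipeline, sending parameterValues only when provided.

    `client` is assumed to expose activate_pipeline(**kwargs), as the
    Data Pipeline client does.
    """
    kwargs = {"pipelineId": pipeline_id}
    if values:
        kwargs["parameterValues"] = to_parameter_values(values)
    return client.activate_pipeline(**kwargs)
```

A call might look like `activate(client, "df-EXAMPLE", {"myInputPath": "s3://example-bucket/in"})`, where the pipeline identifier and parameter name are placeholders. The response is an empty dict on success, so failures surface as exceptions rather than as a return value.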

GetPipelineDefinition (updated)
Changes (response)
{'parameterObjects': [{'attributes': [{'key': 'string',
                                       'stringValue': 'string'}],
                       'id': 'string'}],
 'parameterValues': [{'id': 'string', 'stringValue': 'string'}]}

Returns the definition of the specified pipeline. You can call GetPipelineDefinition to retrieve the pipeline definition you provided using PutPipelineDefinition.

Request Syntax

client.get_pipeline_definition(
    pipelineId='string',
    version='string'
)
type pipelineId

string

param pipelineId

[REQUIRED]

The identifier of the pipeline.

type version

string

param version

The version of the pipeline definition to retrieve. This parameter accepts the values latest (default) and active, where latest indicates the last definition saved to the pipeline and active indicates the last definition of the pipeline that was activated.

rtype

dict

returns

Response Syntax

{
    'pipelineObjects': [
        {
            'id': 'string',
            'name': 'string',
            'fields': [
                {
                    'key': 'string',
                    'stringValue': 'string',
                    'refValue': 'string'
                },
            ]
        },
    ],
    'parameterObjects': [
        {
            'id': 'string',
            'attributes': [
                {
                    'key': 'string',
                    'stringValue': 'string'
                },
            ]
        },
    ],
    'parameterValues': [
        {
            'id': 'string',
            'stringValue': 'string'
        },
    ]
}

Response Structure

  • (dict) --

    Contains the output from the GetPipelineDefinition action.

    • pipelineObjects (list) --

      An array of objects defined in the pipeline.

      • (dict) --

        Contains information about a pipeline object. This can be a logical, physical, or physical attempt pipeline object. The complete set of components of a pipeline defines the pipeline.

        • id (string) --

          Identifier of the object.

        • name (string) --

          Name of the object.

        • fields (list) --

          Key-value pairs that define the properties of the object.

          • (dict) --

            A key-value pair that describes a property of a pipeline object. The value is specified as either a string value (stringValue) or a reference to another object (refValue), but not both.

            • key (string) --

              The field identifier.

            • stringValue (string) --

              The field value, expressed as a String.

            • refValue (string) --

              The field value, expressed as the identifier of another object.

    • parameterObjects (list) --

      Returns a list of parameter objects used in the pipeline definition.

      • (dict) --

        Contains information about a parameter object.

        • id (string) --

          Identifier of the parameter object.

        • attributes (list) --

          The attributes of the parameter object.

          • (dict) --

            The attributes allowed or specified with a parameter object.

            • key (string) --

              The field identifier.

            • stringValue (string) --

              The field value, expressed as a String.

    • parameterValues (list) --

      Returns a list of parameter values used in the pipeline definition.

      • (dict) --

        A value or list of parameter values.

        • id (string) --

          Identifier of the parameter value.

        • stringValue (string) --

          The field value, expressed as a String.
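The nested lists in this response can be awkward to inspect directly. The sketch below, which assumes only the response shape documented above, collapses `pipelineObjects` and `parameterValues` into dictionaries. The function name `summarize_definition` and the `{"ref": ...}` wrapper for references are illustrative choices, not part of the API.

```python
def summarize_definition(response):
    """Collapse a GetPipelineDefinition response into two dicts.

    Returns (objects, values):
      objects maps object id -> {field key: [values]}, where a reference
        is represented as {"ref": <id>} to keep it distinct from a string.
      values maps parameter id -> [string values].
    Values are collected in lists because a key or id may repeat.
    """
    objects = {}
    for obj in response.get("pipelineObjects", []):
        props = {}
        for field in obj.get("fields", []):
            if "refValue" in field:
                value = {"ref": field["refValue"]}
            else:
                value = field.get("stringValue")
            props.setdefault(field["key"], []).append(value)
        objects[obj["id"]] = props

    values = {}
    for item in response.get("parameterValues", []):
        values.setdefault(item["id"], []).append(item["stringValue"])
    return objects, values
```

The response passed in would come from a call such as `client.get_pipeline_definition(pipelineId=..., version="active")`.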

PutPipelineDefinition (updated)
Changes (request)
{'parameterObjects': [{'attributes': [{'key': 'string',
                                       'stringValue': 'string'}],
                       'id': 'string'}],
 'parameterValues': [{'id': 'string', 'stringValue': 'string'}]}

Adds tasks, schedules, and preconditions that control the behavior of the pipeline. You can use PutPipelineDefinition to populate a new pipeline.

PutPipelineDefinition also validates the configuration as it adds it to the pipeline. Changes to the pipeline are saved unless one of the following validation errors exists in the pipeline.

  • An object is missing a name or identifier field.

  • A string or reference field is empty.

  • The number of objects in the pipeline exceeds the maximum allowed objects.

  • The pipeline is in a FINISHED state.

Pipeline object definitions are passed to the PutPipelineDefinition action and returned by the GetPipelineDefinition action.

Request Syntax

client.put_pipeline_definition(
    pipelineId='string',
    pipelineObjects=[
        {
            'id': 'string',
            'name': 'string',
            'fields': [
                {
                    'key': 'string',
                    'stringValue': 'string',
                    'refValue': 'string'
                },
            ]
        },
    ],
    parameterObjects=[
        {
            'id': 'string',
            'attributes': [
                {
                    'key': 'string',
                    'stringValue': 'string'
                },
            ]
        },
    ],
    parameterValues=[
        {
            'id': 'string',
            'stringValue': 'string'
        },
    ]
)
type pipelineId

string

param pipelineId

[REQUIRED]

The identifier of the pipeline to be configured.

type pipelineObjects

list

param pipelineObjects

[REQUIRED]

The objects that define the pipeline. These will overwrite the existing pipeline definition.

  • (dict) --

    Contains information about a pipeline object. This can be a logical, physical, or physical attempt pipeline object. The complete set of components of a pipeline defines the pipeline.

    • id (string) -- [REQUIRED]

      Identifier of the object.

    • name (string) -- [REQUIRED]

      Name of the object.

    • fields (list) -- [REQUIRED]

      Key-value pairs that define the properties of the object.

      • (dict) --

        A key-value pair that describes a property of a pipeline object. The value is specified as either a string value (stringValue) or a reference to another object (refValue), but not both.

        • key (string) -- [REQUIRED]

          The field identifier.

        • stringValue (string) --

          The field value, expressed as a String.

        • refValue (string) --

          The field value, expressed as the identifier of another object.

type parameterObjects

list

param parameterObjects

A list of parameter objects used with the pipeline.

  • (dict) --

    Contains information about a parameter object.

    • id (string) -- [REQUIRED]

      Identifier of the parameter object.

    • attributes (list) -- [REQUIRED]

      The attributes of the parameter object.

      • (dict) --

        The attributes allowed or specified with a parameter object.

        • key (string) -- [REQUIRED]

          The field identifier.

        • stringValue (string) -- [REQUIRED]

          The field value, expressed as a String.

type parameterValues

list

param parameterValues

A list of parameter values used with the pipeline.

  • (dict) --

    A value or list of parameter values.

    • id (string) -- [REQUIRED]

      Identifier of the parameter value.

    • stringValue (string) -- [REQUIRED]

      The field value, expressed as a String.

rtype

dict

returns

Response Syntax

{
    'validationErrors': [
        {
            'id': 'string',
            'errors': [
                'string',
            ]
        },
    ],
    'validationWarnings': [
        {
            'id': 'string',
            'warnings': [
                'string',
            ]
        },
    ],
    'errored': True|False
}

Response Structure

  • (dict) --

    Contains the output of the PutPipelineDefinition action.

    • validationErrors (list) --

      A list of the validation errors that are associated with the objects defined in pipelineObjects.

      • (dict) --

        Defines a validation error returned by PutPipelineDefinition or ValidatePipelineDefinition. Validation errors prevent pipeline activation. The set of validation errors that can be returned are defined by AWS Data Pipeline.

        • id (string) --

          The identifier of the object that contains the validation error.

        • errors (list) --

          A description of the validation error.

          • (string) --

    • validationWarnings (list) --

      A list of the validation warnings that are associated with the objects defined in pipelineObjects.

      • (dict) --

        Defines a validation warning returned by PutPipelineDefinition or ValidatePipelineDefinition. Validation warnings do not prevent pipeline activation. The set of validation warnings that can be returned are defined by AWS Data Pipeline.

        • id (string) --

          The identifier of the object that contains the validation warning.

        • warnings (list) --

          A description of the validation warning.

          • (string) --

    • errored (boolean) --

      If True, there were validation errors. If errored is True, the pipeline definition is stored but cannot be activated until you correct the pipeline and call PutPipelineDefinition to commit the corrected pipeline.
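To make the request and response shapes concrete, here is a hedged sketch that builds a pipeline object from plain mappings and raises when the response reports `errored`. The helpers `make_object` and `put_definition` are hypothetical, and `client` is assumed to be a Data Pipeline client such as `boto3.client("datapipeline")`.

```python
def make_object(obj_id, name, strings=None, refs=None):
    """Build one pipelineObjects entry.

    `strings` become stringValue fields and `refs` become refValue
    fields, so each field carries exactly one of the two.
    """
    fields = [{"key": k, "stringValue": v} for k, v in (strings or {}).items()]
    fields += [{"key": k, "refValue": v} for k, v in (refs or {}).items()]
    return {"id": obj_id, "name": name, "fields": fields}


def put_definition(client, pipeline_id, objects):
    """Call put_pipeline_definition and raise if validation errors exist."""
    response = client.put_pipeline_definition(
        pipelineId=pipeline_id,
        pipelineObjects=objects,
    )
    if response.get("errored"):
        messages = [
            f"{entry['id']}: {message}"
            for entry in response.get("validationErrors", [])
            for message in entry["errors"]
        ]
        raise ValueError("pipeline definition has errors: " + "; ".join(messages))
    return response
```

Raising on `errored` is a design choice for this sketch: the service still stores the definition, so a caller that prefers to inspect the warnings and errors can use the returned dict directly instead.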

ReportTaskProgress (updated)
Changes (request)
{'fields': [{'key': 'string', 'refValue': 'string', 'stringValue': 'string'}]}

Updates the AWS Data Pipeline service on the progress of the calling task runner. When the task runner is assigned a task, it should call ReportTaskProgress within 2 minutes to acknowledge that it has the task. If the web service does not receive this acknowledgement within the 2-minute window, it will assign the task in a subsequent PollForTask call. After this initial acknowledgement, the task runner only needs to report progress every 15 minutes to maintain its ownership of the task. You can change this reporting time from 15 minutes by specifying a reportProgressTimeout field in your pipeline. If a task runner does not report its status after 5 minutes, AWS Data Pipeline will assume that the task runner is unable to process the task and will reassign the task in a subsequent response to PollForTask. Task runners should call ReportTaskProgress every 60 seconds.

Request Syntax

client.report_task_progress(
    taskId='string',
    fields=[
        {
            'key': 'string',
            'stringValue': 'string',
            'refValue': 'string'
        },
    ]
)
type taskId

string

param taskId

[REQUIRED]

Identifier of the task assigned to the task runner. This value is provided in the TaskObject that the service returns with the response for the PollForTask action.

type fields

list

param fields

Key-value pairs that define the properties of the ReportTaskProgressInput object.

  • (dict) --

    A key-value pair that describes a property of a pipeline object. The value is specified as either a string value (stringValue) or a reference to another object (refValue), but not both.

    • key (string) -- [REQUIRED]

      The field identifier.

    • stringValue (string) --

      The field value, expressed as a String.

    • refValue (string) --

      The field value, expressed as the identifier of another object.

rtype

dict

returns

Response Syntax

{
    'canceled': True|False
}

Response Structure

  • (dict) --

    Contains the output from the ReportTaskProgress action.

    • canceled (boolean) --

      If True, the calling task runner should cancel processing of the task. The task runner does not need to call SetTaskStatus for canceled tasks.
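The reporting behavior described for this action can be sketched as a simple heartbeat loop. This is an illustration under stated assumptions, not the task runner's actual implementation: `report_until_done`, `is_done`, and the injectable `sleep` are hypothetical names, and the 60-second default follows the interval recommended in the description.

```python
import time


def report_until_done(client, task_id, is_done, interval=60, sleep=time.sleep):
    """Report progress until the work finishes or the service cancels it.

    `client` is assumed to expose report_task_progress(taskId=...).
    `is_done` is a caller-supplied callable that returns True once the
    task's work has completed.
    Returns True if the service asked for cancellation, otherwise False.
    """
    while not is_done():
        response = client.report_task_progress(taskId=task_id)
        if response.get("canceled"):
            # The service wants the task stopped; the caller should abort.
            return True
        sleep(interval)
    return False
```

In practice this loop would run alongside the actual work, for example in a background thread, with `is_done` backed by a `threading.Event`. The `fields` argument is omitted here because it is optional.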

ValidatePipelineDefinition (updated)
Changes (request)
{'parameterObjects': [{'attributes': [{'key': 'string',
                                       'stringValue': 'string'}],
                       'id': 'string'}],
 'parameterValues': [{'id': 'string', 'stringValue': 'string'}]}

Tests the pipeline definition with a set of validation checks to ensure that it is well formed and can run without error.

Request Syntax

client.validate_pipeline_definition(
    pipelineId='string',
    pipelineObjects=[
        {
            'id': 'string',
            'name': 'string',
            'fields': [
                {
                    'key': 'string',
                    'stringValue': 'string',
                    'refValue': 'string'
                },
            ]
        },
    ],
    parameterObjects=[
        {
            'id': 'string',
            'attributes': [
                {
                    'key': 'string',
                    'stringValue': 'string'
                },
            ]
        },
    ],
    parameterValues=[
        {
            'id': 'string',
            'stringValue': 'string'
        },
    ]
)
type pipelineId

string

param pipelineId

[REQUIRED]

Identifies the pipeline whose definition is to be validated.

type pipelineObjects

list

param pipelineObjects

[REQUIRED]

A list of objects that define the pipeline changes to validate against the pipeline.

  • (dict) --

    Contains information about a pipeline object. This can be a logical, physical, or physical attempt pipeline object. The complete set of components of a pipeline defines the pipeline.

    • id (string) -- [REQUIRED]

      Identifier of the object.

    • name (string) -- [REQUIRED]

      Name of the object.

    • fields (list) -- [REQUIRED]

      Key-value pairs that define the properties of the object.

      • (dict) --

        A key-value pair that describes a property of a pipeline object. The value is specified as either a string value (stringValue) or a reference to another object (refValue), but not both.

        • key (string) -- [REQUIRED]

          The field identifier.

        • stringValue (string) --

          The field value, expressed as a String.

        • refValue (string) --

          The field value, expressed as the identifier of another object.

type parameterObjects

list

param parameterObjects

A list of parameter objects used with the pipeline.

  • (dict) --

    Contains information about a parameter object.

    • id (string) -- [REQUIRED]

      Identifier of the parameter object.

    • attributes (list) -- [REQUIRED]

      The attributes of the parameter object.

      • (dict) --

        The attributes allowed or specified with a parameter object.

        • key (string) -- [REQUIRED]

          The field identifier.

        • stringValue (string) -- [REQUIRED]

          The field value, expressed as a String.

type parameterValues

list

param parameterValues

A list of parameter values used with the pipeline.

  • (dict) --

    A value or list of parameter values.

    • id (string) -- [REQUIRED]

      Identifier of the parameter value.

    • stringValue (string) -- [REQUIRED]

      The field value, expressed as a String.

rtype

dict

returns

Response Syntax

{
    'validationErrors': [
        {
            'id': 'string',
            'errors': [
                'string',
            ]
        },
    ],
    'validationWarnings': [
        {
            'id': 'string',
            'warnings': [
                'string',
            ]
        },
    ],
    'errored': True|False
}

Response Structure

  • (dict) --

    Contains the output from the ValidatePipelineDefinition action.

    • validationErrors (list) --

      Lists the validation errors that were found by ValidatePipelineDefinition.

      • (dict) --

        Defines a validation error returned by PutPipelineDefinition or ValidatePipelineDefinition. Validation errors prevent pipeline activation. The set of validation errors that can be returned are defined by AWS Data Pipeline.

        • id (string) --

          The identifier of the object that contains the validation error.

        • errors (list) --

          A description of the validation error.

          • (string) --

    • validationWarnings (list) --

      Lists the validation warnings that were found by ValidatePipelineDefinition.

      • (dict) --

        Defines a validation warning returned by PutPipelineDefinition or ValidatePipelineDefinition. Validation warnings do not prevent pipeline activation. The set of validation warnings that can be returned are defined by AWS Data Pipeline.

        • id (string) --

          The identifier of the object that contains the validation warning.

        • warnings (list) --

          A description of the validation warning.

          • (string) --

    • errored (boolean) --

      If True, there were validation errors.
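Because errors block activation and warnings do not, it can help to separate the two when reading this response. The following sketch assumes only the response shape documented above; `collect_messages` and `validate` are hypothetical helper names, and `client` is assumed to be a Data Pipeline client such as `boto3.client("datapipeline")`.

```python
def collect_messages(response):
    """Flatten the validation lists into (errors, warnings).

    Each result is a list of (object id, message) tuples.
    """
    errors = [
        (entry["id"], message)
        for entry in response.get("validationErrors", [])
        for message in entry["errors"]
    ]
    warnings = [
        (entry["id"], message)
        for entry in response.get("validationWarnings", [])
        for message in entry["warnings"]
    ]
    return errors, warnings


def validate(client, pipeline_id, objects):
    """Validate a definition without saving it.

    Returns (ok, errors, warnings), where ok is False when the service
    reports errored=True.
    """
    response = client.validate_pipeline_definition(
        pipelineId=pipeline_id,
        pipelineObjects=objects,
    )
    errors, warnings = collect_messages(response)
    return not response.get("errored", False), errors, warnings
```

A caller could run `validate` first and only proceed to PutPipelineDefinition when `ok` is True, printing any warnings along the way.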