AWS DataSync

2019/08/05 - AWS DataSync - 3 updated api methods

Changes: Support for VPC endpoints.

CreateAgent (updated)
Changes (request)
{'SecurityGroupArns': ['string'],
 'SubnetArns': ['string'],
 'VpcEndpointId': 'string'}

Activates an AWS DataSync agent that you have deployed on your host. The activation process associates your agent with your account. In the activation process, you specify information such as the AWS Region that you want to activate the agent in. You activate the agent in the AWS Region where your target locations (in Amazon S3 or Amazon EFS) reside. Your tasks are created in this AWS Region.

You can activate the agent in a VPC (Virtual Private Cloud) or provide the agent access to a VPC endpoint so that you can run tasks without going over the public internet.

You can use an agent for more than one location. If a task uses multiple agents, for example several agents for one source location, all of them must have status AVAILABLE for the task to run.

Agents are automatically updated by AWS on a regular basis, using a mechanism that ensures minimal interruption to your tasks.

See also: AWS API Documentation

Request Syntax

client.create_agent(
    ActivationKey='string',
    AgentName='string',
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ],
    VpcEndpointId='string',
    SubnetArns=[
        'string',
    ],
    SecurityGroupArns=[
        'string',
    ]
)
type ActivationKey

string

param ActivationKey

[REQUIRED]

Your agent activation key. You can get the activation key either by sending an HTTP GET request, with redirects enabled, to the agent IP address (port 80), or from the AWS DataSync console.

The redirect URL returned in the response provides the activation key for your agent in the query string parameter activationKey. It might also include other activation-related parameters; however, these are merely defaults. The arguments you pass to this API call determine the actual configuration of your agent.

For more information, see Activating an Agent in the AWS DataSync User Guide.
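
As a minimal sketch of the step described above, the following reads the activationKey query string parameter out of a redirect URL using only the Python standard library. The function name extract_activation_key and the example URL are hypothetical and used only for illustration; how you obtain the redirect URL from your agent is not shown.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_activation_key(redirect_url: str) -> Optional[str]:
    """Return the activationKey query parameter from a redirect URL, or None."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get("activationKey")
    return values[0] if values else None


# Hypothetical redirect URL, for illustration only.
example_url = "https://example.com/activate?other=default&activationKey=AAAAA-BBBBB-CCCCC"
print(extract_activation_key(example_url))
```

Any other parameters in the URL are ignored here, in line with the note that they are only defaults.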

type AgentName

string

param AgentName

The name you configured for your agent. This value is a text reference that is used to identify the agent in the console.

type Tags

list

param Tags

The key-value pair that represents the tag that you want to associate with the agent. The value can be an empty string. This value helps you manage, filter, and search for your agents.

Note

Valid characters for key and value are letters, spaces, and numbers representable in UTF-8 format, and the following special characters: + - = . _ : / @.

  • (dict) --

    Represents a single entry in a list of AWS resource tags. TagListEntry returns an array that contains a list of tags when the ListTagsForResource operation is called.

    • Key (string) -- [REQUIRED]

      The key for an AWS resource tag.

    • Value (string) --

      The value for an AWS resource tag.
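
The character rule in the note above can be approximated client-side before calling the API. This is a rough sketch, not the service's own validation: the regular expression, the helper names is_valid_tag_text and build_tags, and the rejection of empty keys are all assumptions made for illustration.

```python
import re

# \w covers Unicode letters, digits, and the underscore for str patterns;
# the rest of the class is a literal space plus the listed special characters.
_TAG_CHARS = re.compile(r"[\w +\-=.:/@]*")


def is_valid_tag_text(text: str) -> bool:
    """Check that every character is in the allowed set (approximation)."""
    return _TAG_CHARS.fullmatch(text) is not None


def build_tags(pairs: dict) -> list:
    """Turn {key: value} into the Tags list shape shown in the request syntax."""
    tags = []
    for key, value in pairs.items():
        # Assumption: an empty key is rejected here because Key is required.
        if not key or not is_valid_tag_text(key) or not is_valid_tag_text(value):
            raise ValueError(f"invalid tag: {key!r}={value!r}")
        tags.append({"Key": key, "Value": value})
    return tags


print(build_tags({"team": "storage", "env": ""}))
```

The resulting list can be passed as the Tags argument; an empty string is accepted as a value, as described above.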

type VpcEndpointId

string

param VpcEndpointId

The ID of the VPC (Virtual Private Cloud) endpoint that the agent has access to. This is the client-side VPC endpoint, also called a PrivateLink. If you don't have a PrivateLink VPC endpoint, see Creating a VPC Endpoint Service Configuration in the AWS VPC User Guide.

A VPC endpoint ID looks like this: vpce-01234d5aff67890e1.

type SubnetArns

list

param SubnetArns

The Amazon Resource Names (ARNs) of the subnets in which DataSync will create Elastic Network Interfaces (ENIs) for each data transfer task. The agent that runs a task must be private. When you start a task that is associated with an agent created in a VPC, or one that has access to an IP address in a VPC, then the task is also private. In this case, DataSync creates four ENIs for each task in your subnet. For a data transfer to work, the agent must be able to route to all four of these ENIs.

  • (string) --

type SecurityGroupArns

list

param SecurityGroupArns

The ARNs of the security groups used to protect your data transfer task subnets. See the SubnetArns parameter.

  • (string) --

rtype

dict

returns

Response Syntax

{
    'AgentArn': 'string'
}

Response Structure

  • (dict) --

    CreateAgentResponse

    • AgentArn (string) --

      The Amazon Resource Name (ARN) of the agent. Use the ListAgents operation to return a list of agents for your account and AWS Region.
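
To tie the request parameters together, here is a hedged sketch of assembling a create_agent call for an agent that uses a VPC endpoint. The helper names build_create_agent_request and create_agent are hypothetical, and every ARN, ID, and key below is a placeholder. The client argument is assumed to be a DataSync client such as the one returned by boto3.client("datasync"); the sketch itself does not import boto3 or contact AWS.

```python
def build_create_agent_request(activation_key, agent_name=None, tags=None,
                               vpc_endpoint_id=None, subnet_arns=None,
                               security_group_arns=None):
    """Build keyword arguments for create_agent, omitting unset optional fields."""
    request = {"ActivationKey": activation_key}  # the only required parameter
    optional = {
        "AgentName": agent_name,
        "Tags": tags,
        "VpcEndpointId": vpc_endpoint_id,
        "SubnetArns": subnet_arns,
        "SecurityGroupArns": security_group_arns,
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    return request


def create_agent(client, **kwargs):
    """Call create_agent on the given client and return the new AgentArn."""
    response = client.create_agent(**build_create_agent_request(**kwargs))
    return response["AgentArn"]


# Placeholder values, for illustration only.
request = build_create_agent_request(
    activation_key="AAAAA-BBBBB-CCCCC",
    agent_name="example-agent",
    vpc_endpoint_id="vpce-01234d5aff67890e1",
    subnet_arns=["arn:aws:ec2:us-east-1:111122223333:subnet/subnet-0123456789abcdef0"],
    security_group_arns=["arn:aws:ec2:us-east-1:111122223333:security-group/sg-0123456789abcdef0"],
)
print(sorted(request))
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real account.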

DescribeAgent (updated)
Changes (response)
{'EndpointType': 'PUBLIC | PRIVATE_LINK',
 'PrivateLinkConfig': {'VpcEndpointId': 'string'}}

Returns metadata such as the name, the network interfaces, and the status (that is, whether the agent is running or not) for an agent. To specify which agent to describe, use the Amazon Resource Name (ARN) of the agent in your request.

See also: AWS API Documentation

Request Syntax

client.describe_agent(
    AgentArn='string'
)
type AgentArn

string

param AgentArn

[REQUIRED]

The Amazon Resource Name (ARN) of the agent to describe.

rtype

dict

returns

Response Syntax

{
    'AgentArn': 'string',
    'Name': 'string',
    'Status': 'ONLINE'|'OFFLINE',
    'LastConnectionTime': datetime(2015, 1, 1),
    'CreationTime': datetime(2015, 1, 1),
    'EndpointType': 'PUBLIC'|'PRIVATE_LINK',
    'PrivateLinkConfig': {
        'VpcEndpointId': 'string',
        'PrivateLinkEndpoint': 'string',
        'SubnetArns': [
            'string',
        ],
        'SecurityGroupArns': [
            'string',
        ]
    }
}

Response Structure

  • (dict) --

    DescribeAgentResponse

    • AgentArn (string) --

      The Amazon Resource Name (ARN) of the agent.

    • Name (string) --

      The name of the agent.

    • Status (string) --

      The status of the agent. If the status is ONLINE, then the agent is configured properly and is available to use. ONLINE is the normal running status for an agent. If the status is OFFLINE, the agent's VM is turned off or the agent is in an unhealthy state. When the issue that caused the unhealthy state is resolved, the agent returns to ONLINE status.

    • LastConnectionTime (datetime) --

      The time that the agent last connected to DataSync.

    • CreationTime (datetime) --

      The time that the agent was activated (that is, created in your account).

    • EndpointType (string) --

      The type of endpoint that your agent is connected to. If the endpoint is a VPC endpoint, the agent is not accessible over the public Internet.

    • PrivateLinkConfig (dict) --

      The subnet and the security group that DataSync used to access a VPC endpoint.

      • VpcEndpointId (string) --

        The ID of the VPC endpoint that is configured for an agent. An agent that is configured with a VPC endpoint will not be accessible over the public Internet.

      • PrivateLinkEndpoint (string) --

        The private endpoint that is configured for an agent that has access to IP addresses in a PrivateLink. An agent that is configured with this endpoint will not be accessible over the public Internet.

      • SubnetArns (list) --

        The Amazon Resource Names (ARNs) of the subnets that are configured for an agent activated in a VPC or an agent that has access to a VPC endpoint.

        • (string) --

      • SecurityGroupArns (list) --

        The Amazon Resource Names (ARNs) of the security groups that are configured for the EC2 resource that hosts an agent activated in a VPC or an agent that has access to a VPC endpoint.

        • (string) --
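
The new response fields can be read as in the sketch below, which summarizes a describe_agent response and lists agents that are not ONLINE. The helper names summarize_agent and offline_agents are hypothetical. The client is assumed to be a DataSync client (for example from boto3.client("datasync")), and treating the Status value ONLINE as the signal that an agent is usable is an assumption based on the response syntax above.

```python
def summarize_agent(response: dict) -> dict:
    """Pick out the endpoint-related fields from a describe_agent response."""
    # PrivateLinkConfig may be absent, for example for a PUBLIC endpoint.
    private_link = response.get("PrivateLinkConfig") or {}
    return {
        "name": response.get("Name"),
        "online": response.get("Status") == "ONLINE",
        "private": response.get("EndpointType") == "PRIVATE_LINK",
        "vpc_endpoint_id": private_link.get("VpcEndpointId"),
        "subnet_arns": list(private_link.get("SubnetArns", [])),
    }


def offline_agents(client, agent_arns):
    """Return the ARNs whose describe_agent Status is anything but ONLINE."""
    return [
        arn for arn in agent_arns
        if client.describe_agent(AgentArn=arn).get("Status") != "ONLINE"
    ]


# Placeholder response, shaped like the response syntax above.
sample = {
    "Name": "example-agent",
    "Status": "ONLINE",
    "EndpointType": "PRIVATE_LINK",
    "PrivateLinkConfig": {"VpcEndpointId": "vpce-01234d5aff67890e1"},
}
print(summarize_agent(sample))
```

An empty result from offline_agents is one way to check, before starting a task, that every agent it depends on is reachable.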

DescribeTask (updated)
Changes (response)
{'DestinationNetworkInterfaceArns': ['string'],
 'SourceNetworkInterfaceArns': ['string']}

Returns metadata about a task.

See also: AWS API Documentation

Request Syntax

client.describe_task(
    TaskArn='string'
)
type TaskArn

string

param TaskArn

[REQUIRED]

The Amazon Resource Name (ARN) of the task to describe.

rtype

dict

returns

Response Syntax

{
    'TaskArn': 'string',
    'Status': 'AVAILABLE'|'CREATING'|'RUNNING'|'UNAVAILABLE',
    'Name': 'string',
    'CurrentTaskExecutionArn': 'string',
    'SourceLocationArn': 'string',
    'DestinationLocationArn': 'string',
    'CloudWatchLogGroupArn': 'string',
    'SourceNetworkInterfaceArns': [
        'string',
    ],
    'DestinationNetworkInterfaceArns': [
        'string',
    ],
    'Options': {
        'VerifyMode': 'POINT_IN_TIME_CONSISTENT'|'NONE',
        'Atime': 'NONE'|'BEST_EFFORT',
        'Mtime': 'NONE'|'PRESERVE',
        'Uid': 'NONE'|'INT_VALUE'|'NAME'|'BOTH',
        'Gid': 'NONE'|'INT_VALUE'|'NAME'|'BOTH',
        'PreserveDeletedFiles': 'PRESERVE'|'REMOVE',
        'PreserveDevices': 'NONE'|'PRESERVE',
        'PosixPermissions': 'NONE'|'BEST_EFFORT'|'PRESERVE',
        'BytesPerSecond': 123
    },
    'Excludes': [
        {
            'FilterType': 'SIMPLE_PATTERN',
            'Value': 'string'
        },
    ],
    'ErrorCode': 'string',
    'ErrorDetail': 'string',
    'CreationTime': datetime(2015, 1, 1)
}

Response Structure

  • (dict) --

    DescribeTaskResponse

    • TaskArn (string) --

      The Amazon Resource Name (ARN) of the task that was described.

    • Status (string) --

      The status of the task that was described.

      For detailed information about task execution statuses, see Understanding Task Statuses in the AWS DataSync User Guide.

    • Name (string) --

      The name of the task that was described.

    • CurrentTaskExecutionArn (string) --

      The Amazon Resource Name (ARN) of the task execution that is syncing files.

    • SourceLocationArn (string) --

      The Amazon Resource Name (ARN) of the source file system's location.

    • DestinationLocationArn (string) --

      The Amazon Resource Name (ARN) of the AWS storage resource's location.

    • CloudWatchLogGroupArn (string) --

      The Amazon Resource Name (ARN) of the Amazon CloudWatch log group that was used to monitor and log events in the task.

      For more information on these groups, see Working with Log Groups and Log Streams in the Amazon CloudWatch User Guide.

    • SourceNetworkInterfaceArns (list) --

      The Amazon Resource Names (ARNs) of the source Elastic Network Interfaces (ENIs) that were created for your subnet.

      • (string) --

    • DestinationNetworkInterfaceArns (list) --

      The Amazon Resource Names (ARNs) of the destination Elastic Network Interfaces (ENIs) that were created for your subnet.

      • (string) --

    • Options (dict) --

      The set of configuration options that control the behavior of a single execution of the task that occurs when you call StartTaskExecution. You can configure these options to preserve metadata such as user ID (UID) and group ID (GID), file permissions, data integrity verification, and so on.

      For each individual task execution, you can override these options by specifying an OverrideOptions value in the StartTaskExecution operation.

      • VerifyMode (string) --

        A value that determines whether a data integrity verification should be performed at the end of a task execution after all data and metadata have been transferred.

        Default value: POINT_IN_TIME_CONSISTENT.

        POINT_IN_TIME_CONSISTENT: Perform verification (recommended).

        NONE: Skip verification.

      • Atime (string) --

        A file metadata value that shows the last time a file was accessed (that is, when the file was read or written to). If you set Atime to BEST_EFFORT, DataSync attempts to preserve the original Atime attribute on all source files (that is, the version before the PREPARING phase). However, Atime's behavior is not fully standard across platforms, so AWS DataSync can only do this on a best-effort basis.

        Default value: BEST_EFFORT.

        BEST_EFFORT: Attempt to preserve the per-file Atime value (recommended).

        NONE: Ignore Atime.

        Note

        If Atime is set to BEST_EFFORT, Mtime must be set to PRESERVE.

        If Atime is set to NONE, Mtime must also be NONE.

      • Mtime (string) --

        A value that indicates the last time that a file was modified (that is, a file was written to) before the PREPARING phase.

        Default value: PRESERVE.

        PRESERVE: Preserve original Mtime (recommended).

        NONE: Ignore Mtime.

        Note

        If Mtime is set to PRESERVE, Atime must be set to BEST_EFFORT.

        If Mtime is set to NONE, Atime must also be set to NONE.

      • Uid (string) --

        The user ID (UID) of the file's owner.

        Default value: INT_VALUE. This preserves the integer value of the ID.

        INT_VALUE: Preserve the integer value of UID and group ID (GID) (recommended).

        NONE: Ignore UID and GID.

      • Gid (string) --

        The group ID (GID) of the file's owners.

        Default value: INT_VALUE. This preserves the integer value of the ID.

        INT_VALUE: Preserve the integer value of user ID (UID) and GID (recommended).

        NONE: Ignore UID and GID.

      • PreserveDeletedFiles (string) --

        A value that specifies whether files in the destination that don't exist in the source file system should be preserved.

        Default value: PRESERVE.

        PRESERVE: Ignore such destination files (recommended).

        REMOVE: Delete destination files that aren’t present in the source.

      • PreserveDevices (string) --

        A value that determines whether AWS DataSync should preserve the metadata of block and character devices in the source file system, and recreate the files with that device name and metadata on the destination.

        Note

        AWS DataSync can't sync the actual contents of such devices, because they are nonterminal and don't return an end-of-file (EOF) marker.

        Default value: NONE.

        NONE: Ignore special devices (recommended).

        PRESERVE: Preserve character and block device metadata. This option isn't currently supported for Amazon EFS.

      • PosixPermissions (string) --

        A value that determines which users or groups can access a file for a specific purpose such as reading, writing, or execution of the file.

        Default value: PRESERVE.

        PRESERVE: Preserve POSIX-style permissions (recommended).

        NONE: Ignore permissions.

        Note

        AWS DataSync can preserve extant permissions of a source location.

      • BytesPerSecond (integer) --

        A value that limits the bandwidth used by AWS DataSync. For example, if you want AWS DataSync to use a maximum of 1 MB per second, set this value to 1048576 (=1024*1024).

    • Excludes (list) --

      A list of filter rules that determines which files to exclude from a task. The list should contain a single filter string that consists of the patterns to exclude. The patterns are delimited by "|" (that is, a pipe), for example: "/folder1|/folder2"

      • (dict) --

        Specifies which files, folders and objects to include or exclude when transferring files from source to destination.

        • FilterType (string) --

          The type of filter rule to apply. AWS DataSync only supports the SIMPLE_PATTERN rule type.

        • Value (string) --

          A single filter string that consists of the patterns to include or exclude. The patterns are delimited by "|" (that is, a pipe), for example: /folder1|/folder2

    • ErrorCode (string) --

      Errors that AWS DataSync encountered during execution of the task. You can use this error code to help troubleshoot issues.

    • ErrorDetail (string) --

      Detailed description of an error that was encountered during the task execution. You can use this information to help troubleshoot issues.

    • CreationTime (datetime) --

      The time that the task was created.
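
Several of the rules in the response structure above can be checked or unpacked with a few small helpers. This is only a sketch: the function names are hypothetical, the defaults come from the option descriptions above, and the input is assumed to be a dictionary shaped like the describe_task response syntax.

```python
MIB = 1024 * 1024  # 1048576, the BytesPerSecond value for 1 MB per second


def check_time_options(options: dict) -> list:
    """Return problems with the Atime/Mtime pairing; an empty list means consistent."""
    atime = options.get("Atime", "BEST_EFFORT")  # documented default
    mtime = options.get("Mtime", "PRESERVE")     # documented default
    problems = []
    if atime == "BEST_EFFORT" and mtime != "PRESERVE":
        problems.append("Atime BEST_EFFORT requires Mtime PRESERVE")
    if atime == "NONE" and mtime != "NONE":
        problems.append("Atime NONE requires Mtime NONE")
    return problems


def exclude_patterns(excludes: list) -> list:
    """Split the pipe-delimited Value of each SIMPLE_PATTERN rule into patterns."""
    patterns = []
    for rule in excludes:
        if rule.get("FilterType") == "SIMPLE_PATTERN":
            patterns.extend(p for p in rule.get("Value", "").split("|") if p)
    return patterns


def task_network_interfaces(response: dict) -> dict:
    """Collect the source and destination ENI ARNs from a describe_task response."""
    return {
        "source": list(response.get("SourceNetworkInterfaceArns", [])),
        "destination": list(response.get("DestinationNetworkInterfaceArns", [])),
    }


# Placeholder response fragment, for illustration only.
sample_task = {
    "Options": {"Atime": "BEST_EFFORT", "Mtime": "PRESERVE", "BytesPerSecond": MIB},
    "Excludes": [{"FilterType": "SIMPLE_PATTERN", "Value": "/folder1|/folder2"}],
}
print(check_time_options(sample_task["Options"]))
print(exclude_patterns(sample_task["Excludes"]))
print(task_network_interfaces(sample_task))
```

For a task whose agent is not private, the two ENI lists may simply be empty, which the last helper handles by returning empty lists.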