AWS DataSync

2022/12/16 - AWS DataSync - 3 updated api methods

Changes  AWS DataSync now supports the use of tags with task executions. With this new feature, you can apply tags each time you execute a task, giving you greater control and management over your task executions.

CreateLocationS3 (updated)
Changes (request)
{'S3StorageClass': {'GLACIER_INSTANT_RETRIEVAL'}}

Creates an endpoint for an Amazon S3 bucket that DataSync can access for a transfer.

For more information, see Create an Amazon S3 location in the DataSync User Guide.

See also: AWS API Documentation

Request Syntax

client.create_location_s3(
    Subdirectory='string',
    S3BucketArn='string',
    S3StorageClass='STANDARD'|'STANDARD_IA'|'ONEZONE_IA'|'INTELLIGENT_TIERING'|'GLACIER'|'DEEP_ARCHIVE'|'OUTPOSTS'|'GLACIER_INSTANT_RETRIEVAL',
    S3Config={
        'BucketAccessRoleArn': 'string'
    },
    AgentArns=[
        'string',
    ],
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
type Subdirectory

string

param Subdirectory

A subdirectory in the Amazon S3 bucket. This subdirectory in Amazon S3 is used to read data from the S3 source location or write data to the S3 destination.

type S3BucketArn

string

param S3BucketArn

[REQUIRED]

The ARN of the Amazon S3 bucket. If the bucket is on an Amazon Web Services Outpost, this must be an access point ARN.

type S3StorageClass

string

param S3StorageClass

The Amazon S3 storage class that you want to store your files in when this location is used as a task destination. For buckets in Amazon Web Services Regions, the storage class defaults to Standard. For buckets on Outposts, the storage class defaults to Amazon Web Services S3 Outposts.

For more information about S3 storage classes, see Amazon S3 Storage Classes. Some storage classes have behaviors that can affect your S3 storage cost. For detailed information, see Considerations when working with S3 storage classes in DataSync.

type S3Config

dict

param S3Config

[REQUIRED]

The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role used to access an Amazon S3 bucket.

For detailed information about using such a role, see Creating a Location for Amazon S3 in the DataSync User Guide.

  • BucketAccessRoleArn (string) -- [REQUIRED]

    The ARN of the IAM role for accessing the S3 bucket.

type AgentArns

list

param AgentArns

If you're using DataSync on an Amazon Web Services Outpost, specify the Amazon Resource Names (ARNs) of the DataSync agents deployed on your Outpost. For more information about launching a DataSync agent on an Amazon Web Services Outpost, see Deploy your DataSync agent on Outposts.

  • (string) --

type Tags

list

param Tags

The key-value pair that represents the tag that you want to add to the location. The value can be an empty string. We recommend using tags to name your resources.

  • (dict) --

    A key-value pair representing a single tag that's been applied to an Amazon Web Services resource.

    • Key (string) -- [REQUIRED]

      The key for an Amazon Web Services resource tag.

    • Value (string) --

      The value for an Amazon Web Services resource tag.

rtype

dict

returns

Response Syntax

{
    'LocationArn': 'string'
}

Response Structure

  • (dict) --

    CreateLocationS3Response

    • LocationArn (string) --

      The Amazon Resource Name (ARN) of the source Amazon S3 bucket location that is created.
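
As a minimal sketch of how the request above might be assembled, the helper below builds the keyword arguments for `create_location_s3`, including the newly accepted `GLACIER_INSTANT_RETRIEVAL` storage class. The helper name `build_create_location_s3_request` and the ARNs in the usage note are hypothetical and only illustrate the shape of the call.

```python
from typing import Any, Dict, Optional

# Storage class values copied from the request syntax above.
S3_STORAGE_CLASSES = {
    "STANDARD", "STANDARD_IA", "ONEZONE_IA", "INTELLIGENT_TIERING",
    "GLACIER", "DEEP_ARCHIVE", "OUTPOSTS", "GLACIER_INSTANT_RETRIEVAL",
}


def build_create_location_s3_request(
    bucket_arn: str,
    role_arn: str,
    storage_class: Optional[str] = None,
    subdirectory: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for client.create_location_s3 (hypothetical helper)."""
    if storage_class is not None and storage_class not in S3_STORAGE_CLASSES:
        raise ValueError(f"unknown S3StorageClass: {storage_class!r}")
    # S3BucketArn and S3Config.BucketAccessRoleArn are the required fields.
    request: Dict[str, Any] = {
        "S3BucketArn": bucket_arn,
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }
    # Optional fields are sent only when the caller supplied them.
    if storage_class is not None:
        request["S3StorageClass"] = storage_class
    if subdirectory is not None:
        request["Subdirectory"] = subdirectory
    if tags:
        request["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return request
```

Assuming boto3 is installed and credentials are configured, the result could be passed as `boto3.client("datasync").create_location_s3(**request)`, and the new location's ARN read from `response["LocationArn"]`.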

DescribeLocationS3 (updated)
Changes (response)
{'S3StorageClass': {'GLACIER_INSTANT_RETRIEVAL'}}

Returns metadata, such as bucket name, about an Amazon S3 bucket location.

See also: AWS API Documentation

Request Syntax

client.describe_location_s3(
    LocationArn='string'
)
type LocationArn

string

param LocationArn

[REQUIRED]

The Amazon Resource Name (ARN) of the Amazon S3 bucket location to describe.

rtype

dict

returns

Response Syntax

{
    'LocationArn': 'string',
    'LocationUri': 'string',
    'S3StorageClass': 'STANDARD'|'STANDARD_IA'|'ONEZONE_IA'|'INTELLIGENT_TIERING'|'GLACIER'|'DEEP_ARCHIVE'|'OUTPOSTS'|'GLACIER_INSTANT_RETRIEVAL',
    'S3Config': {
        'BucketAccessRoleArn': 'string'
    },
    'AgentArns': [
        'string',
    ],
    'CreationTime': datetime(2015, 1, 1)
}

Response Structure

  • (dict) --

    DescribeLocationS3Response

    • LocationArn (string) --

      The Amazon Resource Name (ARN) of the Amazon S3 bucket or access point.

    • LocationUri (string) --

      The URL of the Amazon S3 location that was described.

    • S3StorageClass (string) --

      The Amazon S3 storage class that you chose to store your files in when this location is used as a task destination. For more information about S3 storage classes, see Amazon S3 Storage Classes. Some storage classes have behaviors that can affect your S3 storage cost. For detailed information, see Considerations when working with S3 storage classes in DataSync.

    • S3Config (dict) --

      The Amazon Resource Name (ARN) of the Identity and Access Management (IAM) role used to access an Amazon S3 bucket.

      For detailed information about using such a role, see Creating a Location for Amazon S3 in the DataSync User Guide.

      • BucketAccessRoleArn (string) --

        The ARN of the IAM role for accessing the S3 bucket.

    • AgentArns (list) --

      If you are using DataSync on an Amazon Web Services Outpost, the Amazon Resource Names (ARNs) of the EC2 agents deployed on your Outpost. For more information about launching a DataSync agent on an Amazon Web Services Outpost, see Deploy your DataSync agent on Outposts.

      • (string) --

    • CreationTime (datetime) --

      The time that the Amazon S3 bucket location was created.
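
The response structure above can be read along these lines. This is a sketch only: `summarize_s3_location` is a hypothetical helper, and `client` is assumed to be any object exposing `describe_location_s3` with the signature shown in the request syntax (for example `boto3.client("datasync")`).

```python
from typing import Any, Dict


def summarize_s3_location(client: Any, location_arn: str) -> Dict[str, Any]:
    """Fetch an S3 location and flatten the fields listed in the response structure."""
    response = client.describe_location_s3(LocationArn=location_arn)
    return {
        "uri": response.get("LocationUri"),
        # May now be 'GLACIER_INSTANT_RETRIEVAL' in addition to the older values.
        "storage_class": response.get("S3StorageClass"),
        "role_arn": response.get("S3Config", {}).get("BucketAccessRoleArn"),
        # AgentArns is only populated for locations on an Outpost.
        "uses_agents": bool(response.get("AgentArns")),
        "created": response.get("CreationTime"),
    }
```

Using `.get` keeps the helper tolerant of fields that a given location does not return, such as `AgentArns` for buckets in a Region.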

StartTaskExecution (updated)
Changes (request)
{'Tags': [{'Key': 'string', 'Value': 'string'}]}

Starts a DataSync task. For each task, you can run only one task execution at a time.

There are several phases to a task execution. For more information, see Task execution statuses.

See also: AWS API Documentation

Request Syntax

client.start_task_execution(
    TaskArn='string',
    OverrideOptions={
        'VerifyMode': 'POINT_IN_TIME_CONSISTENT'|'ONLY_FILES_TRANSFERRED'|'NONE',
        'OverwriteMode': 'ALWAYS'|'NEVER',
        'Atime': 'NONE'|'BEST_EFFORT',
        'Mtime': 'NONE'|'PRESERVE',
        'Uid': 'NONE'|'INT_VALUE'|'NAME'|'BOTH',
        'Gid': 'NONE'|'INT_VALUE'|'NAME'|'BOTH',
        'PreserveDeletedFiles': 'PRESERVE'|'REMOVE',
        'PreserveDevices': 'NONE'|'PRESERVE',
        'PosixPermissions': 'NONE'|'PRESERVE',
        'BytesPerSecond': 123,
        'TaskQueueing': 'ENABLED'|'DISABLED',
        'LogLevel': 'OFF'|'BASIC'|'TRANSFER',
        'TransferMode': 'CHANGED'|'ALL',
        'SecurityDescriptorCopyFlags': 'NONE'|'OWNER_DACL'|'OWNER_DACL_SACL',
        'ObjectTags': 'PRESERVE'|'NONE'
    },
    Includes=[
        {
            'FilterType': 'SIMPLE_PATTERN',
            'Value': 'string'
        },
    ],
    Excludes=[
        {
            'FilterType': 'SIMPLE_PATTERN',
            'Value': 'string'
        },
    ],
    Tags=[
        {
            'Key': 'string',
            'Value': 'string'
        },
    ]
)
type TaskArn

string

param TaskArn

[REQUIRED]

Specifies the Amazon Resource Name (ARN) of the task that you want to start.

type OverrideOptions

dict

param OverrideOptions

Configures your DataSync task settings. These options include how DataSync handles files, objects, and their associated metadata. You can also specify how DataSync verifies data integrity and set bandwidth limits for your task, among other options.

Each task setting has a default value. Unless you need to, you don't have to configure any of these options before starting your task.

  • VerifyMode (string) --

    Specifies how and when DataSync checks the integrity of your data during a transfer.

    Default value: POINT_IN_TIME_CONSISTENT

    ONLY_FILES_TRANSFERRED (recommended): DataSync calculates the checksum of transferred files and metadata at the source location. At the end of the transfer, DataSync then compares this checksum to the checksum calculated on those files at the destination.

    We recommend this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see Storage class considerations with Amazon S3 locations.

    POINT_IN_TIME_CONSISTENT : At the end of the transfer, DataSync scans the entire source and destination to verify that both locations are fully synchronized.

    You can't use this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see Storage class considerations with Amazon S3 locations.

    NONE : DataSync doesn't run additional verification at the end of the transfer. All data transmissions are still integrity-checked with checksum verification during the transfer.

  • OverwriteMode (string) --

    Specifies whether data at the destination location should be overwritten or preserved. If set to NEVER, a destination file is not replaced by a source file, even if the destination file differs from the source file. If you modify files in the destination and you sync the files, you can use this value to protect against overwriting those changes.

    Some storage classes have specific behaviors that can affect your Amazon S3 storage cost. For detailed information, see Considerations when working with Amazon S3 storage classes in DataSync.

  • Atime (string) --

    Specifies whether to preserve metadata indicating the last time a file was read or written to. If you set Atime to BEST_EFFORT, DataSync attempts to preserve the original Atime attribute on all source files (that is, the version before the PREPARING phase of the task execution).

    Note

    The behavior of Atime isn't fully standard across platforms, so DataSync can only do this on a best-effort basis.

    Default value: BEST_EFFORT

    BEST_EFFORT : Attempt to preserve the per-file Atime value (recommended).

    NONE : Ignore Atime .

    Note

    If Atime is set to BEST_EFFORT , Mtime must be set to PRESERVE .

    If Atime is set to NONE , Mtime must also be NONE .

  • Mtime (string) --

    Specifies whether to preserve metadata indicating the last time that a file was written to before the PREPARING phase of your task execution. This option is required when you need to run a task more than once.

    Default Value: PRESERVE

    PRESERVE : Preserve original Mtime (recommended)

    NONE : Ignore Mtime .

    Note

    If Mtime is set to PRESERVE , Atime must be set to BEST_EFFORT .

    If Mtime is set to NONE , Atime must also be set to NONE .

  • Uid (string) --

    Specifies the POSIX user ID (UID) of the file's owner.

    For more information, see Metadata copied by DataSync.

    Default value: INT_VALUE . This preserves the integer value of the ID.

    INT_VALUE : Preserve the integer value of UID and group ID (GID) (recommended).

    NONE : Ignore UID and GID.

  • Gid (string) --

    Specifies the POSIX group ID (GID) of the file's owners.

    For more information, see Metadata copied by DataSync.

    Default value: INT_VALUE . This preserves the integer value of the ID.

    INT_VALUE : Preserve the integer value of user ID (UID) and GID (recommended).

    NONE : Ignore UID and GID.

  • PreserveDeletedFiles (string) --

    Specifies whether files in the destination location that don't exist in the source should be preserved. This option can affect your Amazon S3 storage cost. If your task deletes objects, you might incur minimum storage duration charges for certain storage classes. For detailed information, see Considerations when working with Amazon S3 storage classes in DataSync.

    Default value: PRESERVE

    PRESERVE : Ignore such destination files (recommended).

    REMOVE : Delete destination files that aren’t present in the source.

  • PreserveDevices (string) --

    Specifies whether DataSync should preserve the metadata of block and character devices in the source location and recreate the files with that device name and metadata on the destination. DataSync copies only the name and metadata of such devices.

    Note

    DataSync can't copy the actual contents of these devices because they're nonterminal and don't return an end-of-file (EOF) marker.

    Default value: NONE

    NONE : Ignore special devices (recommended).

    PRESERVE : Preserve character and block device metadata. This option currently isn't supported for Amazon EFS.

  • PosixPermissions (string) --

    Specifies which users or groups can access a file for a specific purpose such as reading, writing, or execution of the file.

    For more information, see Metadata copied by DataSync.

    Default value: PRESERVE

    PRESERVE : Preserve POSIX-style permissions (recommended).

    NONE : Ignore permissions.

    Note

    DataSync can preserve extant permissions of a source location.

  • BytesPerSecond (integer) --

    Limits the bandwidth used by a DataSync task. For example, if you want DataSync to use a maximum of 1 MB per second, set this value to 1048576 (=1024*1024).

  • TaskQueueing (string) --

    Specifies whether tasks should be queued before they run. The default is ENABLED, which means the tasks will be queued.

    If you use the same agent to run multiple tasks, you can enable the tasks to run in series. For more information, see Queueing task executions.

  • LogLevel (string) --

    Specifies the type of logs that DataSync publishes to an Amazon CloudWatch Logs log group. To specify the log group, see CloudWatchLogGroupArn.

    If you set LogLevel to OFF, no logs are published. BASIC publishes logs on errors for individual files transferred. TRANSFER publishes logs for every file or object that is transferred and integrity checked.

  • TransferMode (string) --

    Determines whether DataSync transfers only the data and metadata that differ between the source and the destination location or transfers all the content from the source (without comparing what's in the destination).

    CHANGED : DataSync copies only the data or metadata that is new or that differs between the source location and the destination location.

    ALL : DataSync copies all source location content to the destination (without comparing what's in the destination).

  • SecurityDescriptorCopyFlags (string) --

    Specifies which components of the SMB security descriptor are copied from source to destination objects.

    This value is only used for transfers between SMB and Amazon FSx for Windows File Server locations or between two FSx for Windows File Server locations. For more information, see how DataSync handles metadata.

    Default value: OWNER_DACL

    OWNER_DACL : For each copied object, DataSync copies the following metadata:

    • The object owner.

    • NTFS discretionary access control lists (DACLs), which determine whether to grant access to an object. DataSync won't copy NTFS system access control lists (SACLs) with this option.

    OWNER_DACL_SACL : For each copied object, DataSync copies the following metadata:

    • The object owner.

    • NTFS discretionary access control lists (DACLs), which determine whether to grant access to an object.

    • SACLs, which are used by administrators to log attempts to access a secured object. Copying SACLs requires granting additional permissions to the Windows user that DataSync uses to access your SMB location. For information about choosing a user that ensures sufficient permissions to files, folders, and metadata, see user.

    NONE : None of the SMB security descriptor components are copied. Destination objects are owned by the user that was provided for accessing the destination location. DACLs and SACLs are set based on the destination server’s configuration.

  • ObjectTags (string) --

    Specifies whether object tags are preserved when transferring between object storage systems. If you want your DataSync task to ignore object tags, specify the NONE value.

    Default Value: PRESERVE
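
Two of the settings above lend themselves to a quick local check before a request is sent: the Atime/Mtime pairing rule and the `BytesPerSecond` arithmetic. The sketch below is illustrative only; `check_time_options` and `mib_to_bytes_per_second` are hypothetical helper names, and the fallback to the listed defaults is an assumption, since a task's stored options may differ.

```python
from typing import Dict

# Pairings that the Atime and Mtime notes above allow.
ALLOWED_TIME_PAIRS = {("BEST_EFFORT", "PRESERVE"), ("NONE", "NONE")}


def mib_to_bytes_per_second(mib: int) -> int:
    """Convert a MiB-per-second limit into bytes per second (1 MiB = 1024*1024 bytes)."""
    return mib * 1024 * 1024


def check_time_options(options: Dict[str, object]) -> None:
    """Raise ValueError when Atime and Mtime are not a documented pairing.

    Assumption: a missing key falls back to the default listed above
    (Atime=BEST_EFFORT, Mtime=PRESERVE). Treat this as a local sanity
    check, not as the service's own validation.
    """
    pair = (options.get("Atime", "BEST_EFFORT"), options.get("Mtime", "PRESERVE"))
    if pair not in ALLOWED_TIME_PAIRS:
        raise ValueError(f"unsupported Atime/Mtime combination: {pair}")


# Example OverrideOptions: ignore both timestamps and cap bandwidth at 1 MiB/s.
override_options = {
    "Atime": "NONE",
    "Mtime": "NONE",
    "BytesPerSecond": mib_to_bytes_per_second(1),
}
check_time_options(override_options)
```

A dictionary that passes the check can be supplied as the `OverrideOptions` argument of `start_task_execution`.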

type Includes

list

param Includes

Specifies a list of filter rules that determines which files to include when running a task. The list contains a single filter string that consists of the patterns to include. The patterns are delimited by "|" (that is, a pipe), for example, "/folder1|/folder2".

  • (dict) --

    Specifies which files, folders, and objects to include or exclude when transferring files from source to destination.

    • FilterType (string) --

      The type of filter rule to apply. DataSync only supports the SIMPLE_PATTERN rule type.

    • Value (string) --

      A single filter string that consists of the patterns to include or exclude. The patterns are delimited by "|" (that is, a pipe), for example: /folder1|/folder2

type Excludes

list

param Excludes

Specifies a list of filter rules that determines which files to exclude from a task. The list contains a single filter string that consists of the patterns to exclude. The patterns are delimited by "|" (that is, a pipe), for example, "/folder1|/folder2".

  • (dict) --

    Specifies which files, folders, and objects to include or exclude when transferring files from source to destination.

    • FilterType (string) --

      The type of filter rule to apply. DataSync only supports the SIMPLE_PATTERN rule type.

    • Value (string) --

      A single filter string that consists of the patterns to include or exclude. The patterns are delimited by "|" (that is, a pipe), for example: /folder1|/folder2
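
Because `Includes` and `Excludes` each carry one pipe-delimited string rather than one entry per pattern, a small helper can do the joining. This is a sketch under that reading of the parameter descriptions above; `build_filter` is a hypothetical name, and rejecting patterns that already contain a pipe is a local precaution rather than documented service behavior.

```python
from typing import Dict, Iterable, List


def build_filter(patterns: Iterable[str]) -> List[Dict[str, str]]:
    """Join patterns into the single SIMPLE_PATTERN rule the API expects."""
    cleaned = [p.strip() for p in patterns if p and p.strip()]
    for pattern in cleaned:
        # A pipe inside a pattern would be read as a delimiter.
        if "|" in pattern:
            raise ValueError(f"pattern must not contain '|': {pattern!r}")
    if not cleaned:
        return []
    return [{"FilterType": "SIMPLE_PATTERN", "Value": "|".join(cleaned)}]


includes = build_filter(["/folder1", "/folder2"])
excludes = build_filter(["/folder1/tmp"])
```

Here `includes` holds one rule whose `Value` is `/folder1|/folder2`, matching the example in the parameter description.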

type Tags

list

param Tags

Specifies the tags that you want to apply to the Amazon Resource Name (ARN) representing the task execution.

Tags are key-value pairs that help you manage, filter, and search for your DataSync resources.

  • (dict) --

    A key-value pair representing a single tag that's been applied to an Amazon Web Services resource.

    • Key (string) -- [REQUIRED]

      The key for an Amazon Web Services resource tag.

    • Value (string) --

      The value for an Amazon Web Services resource tag.

rtype

dict

returns

Response Syntax

{
    'TaskExecutionArn': 'string'
}

Response Structure

  • (dict) --

    StartTaskExecutionResponse

    • TaskExecutionArn (string) --

      The ARN of the running task execution.
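
To illustrate the new `Tags` request parameter, the sketch below starts a task execution with tags attached and returns the execution ARN. `start_tagged_execution` is a hypothetical wrapper, and `client` is assumed to be any object exposing `start_task_execution` as shown in the request syntax (for example `boto3.client("datasync")` from an SDK release that includes this update).

```python
from typing import Any, Dict, Optional


def start_tagged_execution(
    client: Any,
    task_arn: str,
    tags: Dict[str, str],
    override_options: Optional[Dict[str, Any]] = None,
) -> str:
    """Start a task execution with tags and return its TaskExecutionArn."""
    request: Dict[str, Any] = {
        "TaskArn": task_arn,
        # Each tag is a Key/Value dict; Key is required, Value is optional.
        "Tags": [{"Key": key, "Value": value} for key, value in tags.items()],
    }
    if override_options:
        request["OverrideOptions"] = override_options
    response = client.start_task_execution(**request)
    return response["TaskExecutionArn"]
```

Tagging each execution this way makes it possible to filter and search executions later, for instance by the job or ticket that triggered them.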