AWS API Changes

2025/12/01 - AWS Clean Rooms ML - 2 updated API methods

Changes: AWS Clean Rooms ML now supports privacy-enhancing synthetic dataset generation for custom ML training.

GetCollaborationMLInputChannel (updated)

Changes (response)

{'syntheticDataConfiguration': {'syntheticDataEvaluationScores': {'dataPrivacyScores': {'membershipInferenceAttackScores': [{'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                                                                                                                             'score': 'double'}]}},
                                'syntheticDataParameters': {'columnClassification': {'columnMapping': [{'columnName': 'string',
                                                                                                        'columnType': 'CATEGORICAL | NUMERICAL',
                                                                                                        'isPredictiveValue': 'boolean'}]},
                                                            'epsilon': 'double',
                                                            'maxMembershipInferenceAttackScore': 'double'}}}

Returns information about a specific ML input channel in a collaboration.

See also: AWS API Documentation

Request Syntax

client.get_collaboration_ml_input_channel(
    mlInputChannelArn='string',
    collaborationIdentifier='string'
)

Parameters

- mlInputChannelArn (string) --
  
  [REQUIRED]
  
  The Amazon Resource Name (ARN) of the ML input channel that you want to get.
- collaborationIdentifier (string) --
  
  [REQUIRED]
  
  The collaboration ID of the collaboration that contains the ML input channel that you want to get.
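
As a minimal sketch of the call above, the helper below passes both required parameters and returns only the syntheticDataConfiguration part of the response. The helper name and its arguments are hypothetical, and it assumes the key is simply absent when the channel has no synthetic data configuration.

```python
from typing import Any, Optional


def get_synthetic_data_configuration(
    client: Any,
    ml_input_channel_arn: str,
    collaboration_identifier: str,
) -> Optional[dict]:
    """Return the syntheticDataConfiguration block, or None if it is absent."""
    response = client.get_collaboration_ml_input_channel(
        mlInputChannelArn=ml_input_channel_arn,
        collaborationIdentifier=collaboration_identifier,
    )
    # Assumption: the key is missing (not null) when no synthetic data
    # configuration exists, so .get() is used instead of indexing.
    return response.get("syntheticDataConfiguration")
```

The client argument is expected to be a Clean Rooms ML client, for example one created with boto3.client("cleanroomsml"); any object that exposes the same method works, which keeps the helper easy to exercise without network access.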

Return type: dict

Returns:

Response Syntax

{
    'membershipIdentifier': 'string',
    'collaborationIdentifier': 'string',
    'mlInputChannelArn': 'string',
    'name': 'string',
    'configuredModelAlgorithmAssociations': [
        'string',
    ],
    'status': 'CREATE_PENDING'|'CREATE_IN_PROGRESS'|'CREATE_FAILED'|'ACTIVE'|'DELETE_PENDING'|'DELETE_IN_PROGRESS'|'DELETE_FAILED'|'INACTIVE',
    'statusDetails': {
        'statusCode': 'string',
        'message': 'string'
    },
    'retentionInDays': 123,
    'numberOfRecords': 123,
    'privacyBudgets': {
        'accessBudgets': [
            {
                'resourceArn': 'string',
                'details': [
                    {
                        'startTime': datetime(2015, 1, 1),
                        'endTime': datetime(2015, 1, 1),
                        'remainingBudget': 123,
                        'budget': 123,
                        'budgetType': 'CALENDAR_DAY'|'CALENDAR_MONTH'|'CALENDAR_WEEK'|'LIFETIME',
                        'autoRefresh': 'ENABLED'|'DISABLED'
                    },
                ],
                'aggregateRemainingBudget': 123
            },
        ]
    },
    'description': 'string',
    'syntheticDataConfiguration': {
        'syntheticDataParameters': {
            'epsilon': 123.0,
            'maxMembershipInferenceAttackScore': 123.0,
            'columnClassification': {
                'columnMapping': [
                    {
                        'columnName': 'string',
                        'columnType': 'CATEGORICAL'|'NUMERICAL',
                        'isPredictiveValue': True|False
                    },
                ]
            }
        },
        'syntheticDataEvaluationScores': {
            'dataPrivacyScores': {
                'membershipInferenceAttackScores': [
                    {
                        'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                        'score': 123.0
                    },
                ]
            }
        }
    },
    'createTime': datetime(2015, 1, 1),
    'updateTime': datetime(2015, 1, 1),
    'creatorAccountId': 'string'
}
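
The nested shape of syntheticDataConfiguration can be walked with plain dictionary access. The sketch below collects the membership inference attack scores by attackVersion and, as a sanity check, lists any version whose score is above maxMembershipInferenceAttackScore. Both function names are hypothetical, and treating every nested key as optional is an assumption rather than documented behavior.

```python
from typing import Dict, List


def membership_inference_scores(response: dict) -> Dict[str, float]:
    """Map each attackVersion to its score; empty if no scores are present."""
    config = response.get("syntheticDataConfiguration") or {}
    scores = (
        config.get("syntheticDataEvaluationScores", {})
        .get("dataPrivacyScores", {})
        .get("membershipInferenceAttackScores", [])
    )
    return {item["attackVersion"]: item["score"] for item in scores}


def scores_over_threshold(response: dict) -> List[str]:
    """List attack versions whose score exceeds the configured maximum."""
    config = response.get("syntheticDataConfiguration") or {}
    threshold = config.get("syntheticDataParameters", {}).get(
        "maxMembershipInferenceAttackScore"
    )
    if threshold is None:
        # No threshold configured (assumed possible), so nothing to compare.
        return []
    return [
        version
        for version, score in membership_inference_scores(response).items()
        if score > threshold
    ]
```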

Response Structure

(dict) --
- membershipIdentifier (string) --
  
  The membership ID of the membership that contains the ML input channel.
- collaborationIdentifier (string) --
  
  The collaboration ID of the collaboration that contains the ML input channel.
- mlInputChannelArn (string) --
  
  The Amazon Resource Name (ARN) of the ML input channel.
- name (string) --
  
  The name of the ML input channel.
- configuredModelAlgorithmAssociations (list) --
  
  The configured model algorithm associations that were used to create the ML input channel.
  - (string) --
- status (string) --
  
  The status of the ML input channel.
- statusDetails (dict) --
  
  Details about the status of a resource.
  - statusCode (string) --
    
    The status code that was returned. The status code is intended for programmatic error handling. Clean Rooms ML will not change the status code for existing error conditions.
  - message (string) --
    
    The error message that was returned. The message is intended for human consumption and can change at any time. Use the statusCode for programmatic error handling.
- retentionInDays (integer) --
  
  The number of days to retain the data for the ML input channel.
- numberOfRecords (integer) --
  
  The number of records in the ML input channel.
- privacyBudgets (dict) --
  
  Returns the privacy budgets that control access to this Clean Rooms ML input channel. Use these budgets to monitor and limit resource consumption over specified time periods.
  Note
  
  This is a Tagged Union structure. Only one of the following top level keys will be set: accessBudgets. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
```
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
```
  - accessBudgets (list) --
    
    A list of access budgets that apply to resources associated with this Clean Rooms ML input channel.
    - (dict) --
      
      An access budget that defines consumption limits for a specific resource within defined time periods.
      - resourceArn (string) --
        
        The Amazon Resource Name (ARN) of the resource that this access budget applies to.
      - details (list) --
        
        A list of budget details for this resource. Contains active budget periods that apply to the resource.
        
        - (dict) --
          
          The detailed information for a specific budget period, including time boundaries and budget amounts.
          - startTime (datetime) --
            
            The start time of this budget period.
          - endTime (datetime) --
            
            The end time of this budget period. If not specified, the budget period continues indefinitely.
          - remainingBudget (integer) --
            
            The amount of budget remaining in this period.
          - budget (integer) --
            
            The total budget amount allocated for this period.
          - budgetType (string) --
            
            The type of budget period. Calendar-based types reset automatically at regular intervals, while LIFETIME budgets never reset.
          - autoRefresh (string) --
            
            Specifies whether this budget automatically refreshes when the current period ends.
      - aggregateRemainingBudget (integer) --
        
        The total remaining budget across all active budget periods for this resource.
- description (string) --
  
  The description of the ML input channel.
- syntheticDataConfiguration (dict) --
  
  The synthetic data configuration for this ML input channel, including parameters for generating privacy-preserving synthetic data and evaluation scores for measuring the privacy of the generated data.
  - syntheticDataParameters (dict) --
    
    The parameters that control how synthetic data is generated, including privacy settings, column classifications, and other configuration options that affect the data synthesis process.
    - epsilon (float) --
      
      The epsilon value for differential privacy, which controls the privacy-utility tradeoff in synthetic data generation. Lower values provide stronger privacy guarantees but may reduce data utility.
    - maxMembershipInferenceAttackScore (float) --
      
      The maximum acceptable score for membership inference attack vulnerability. Synthetic data generation fails if the score for the resulting data exceeds this threshold.
    - columnClassification (dict) --
      
      Classification details for data columns that specify how each column should be treated during synthetic data generation.
      - columnMapping (list) --
        
        A mapping that defines the classification of data columns for synthetic data generation and specifies how each column should be handled during the privacy-preserving data synthesis process.
        
        - (dict) --
          
          Properties that define how a specific data column should be handled during synthetic data generation, including its name, type, and role in predictive modeling.
          - columnName (string) --
            
            The name of the data column as it appears in the dataset.
          - columnType (string) --
            
            The data type of the column, which determines how the synthetic data generation algorithm processes and synthesizes values for this column.
          - isPredictiveValue (boolean) --
            
            Indicates if this column contains predictive values that should be treated as target variables in machine learning models. This affects how the synthetic data generation preserves statistical relationships.
  - syntheticDataEvaluationScores (dict) --
    
    Evaluation scores that assess the quality and privacy characteristics of the generated synthetic data, providing metrics on data utility and privacy preservation.
    - dataPrivacyScores (dict) --
      
      Privacy-specific evaluation scores that measure how well the synthetic data protects individual privacy, including assessments of potential privacy risks such as membership inference attacks.
      - membershipInferenceAttackScores (list) --
        
        Scores that evaluate the vulnerability of the synthetic data to membership inference attacks, which attempt to determine whether a specific individual was a member of the original dataset.
        
        - (dict) --
          
          A score that measures the vulnerability of synthetic data to membership inference attacks and provides both the numerical score and the version of the attack methodology used for evaluation.
          - attackVersion (string) --
            
            The version of the membership inference attack, which consists of the attack type and its version number, used to generate this privacy score.
          - score (float) --
            
            The numerical score representing the vulnerability to membership inference attacks.
- createTime (datetime) --
  
  The time at which the ML input channel was created.
- updateTime (datetime) --
  
  The most recent time at which the ML input channel was updated.
- creatorAccountId (string) --
  
  The account ID of the member who created the ML input channel.
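
Because privacyBudgets is a tagged union, a reader of the response should expect either accessBudgets or SDK_UNKNOWN_MEMBER as the top-level key. The sketch below shows one way to handle both cases and to summarize aggregateRemainingBudget per resource. The function name is hypothetical, and raising ValueError on an unknown member is a design choice made for this example, not SDK behavior.

```python
from typing import Dict


def remaining_access_budgets(response: dict) -> Dict[str, int]:
    """Map each resourceArn to its aggregateRemainingBudget."""
    budgets = response.get("privacyBudgets") or {}
    if "SDK_UNKNOWN_MEMBER" in budgets:
        # The SDK did not recognize the union member; surface its name
        # instead of silently reporting that there are no budgets.
        name = budgets["SDK_UNKNOWN_MEMBER"].get("name", "unknown")
        raise ValueError(f"Unsupported privacyBudgets member: {name}")
    return {
        budget["resourceArn"]: budget["aggregateRemainingBudget"]
        for budget in budgets.get("accessBudgets", [])
    }
```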

GetMLInputChannel (updated)

Changes (response)

{'syntheticDataConfiguration': {'syntheticDataEvaluationScores': {'dataPrivacyScores': {'membershipInferenceAttackScores': [{'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                                                                                                                             'score': 'double'}]}},
                                'syntheticDataParameters': {'columnClassification': {'columnMapping': [{'columnName': 'string',
                                                                                                        'columnType': 'CATEGORICAL | NUMERICAL',
                                                                                                        'isPredictiveValue': 'boolean'}]},
                                                            'epsilon': 'double',
                                                            'maxMembershipInferenceAttackScore': 'double'}}}

Returns information about an ML input channel.

See also: AWS API Documentation

Request Syntax

client.get_ml_input_channel(
    mlInputChannelArn='string',
    membershipIdentifier='string'
)

Parameters

- mlInputChannelArn (string) --
  
  [REQUIRED]
  
  The Amazon Resource Name (ARN) of the ML input channel that you want to get.
- membershipIdentifier (string) --
  
  [REQUIRED]
  
  The membership ID of the membership that contains the ML input channel that you want to get.
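
One possible way to use this call is to fetch the channel and fail early when it is not usable. The sketch below raises a hypothetical exception that carries status and statusDetails.statusCode, which the reference describes as the field intended for programmatic error handling. Treating only ACTIVE as usable is an assumption made for illustration.

```python
from typing import Any, Optional


class MLInputChannelNotActive(Exception):
    """Hypothetical error for a channel whose status is not ACTIVE."""

    def __init__(self, status: str, status_code: Optional[str], message: Optional[str]):
        super().__init__(f"status={status} statusCode={status_code} message={message}")
        self.status = status
        self.status_code = status_code


def get_active_ml_input_channel(
    client: Any, ml_input_channel_arn: str, membership_identifier: str
) -> dict:
    """Return the full response, or raise if the channel is not ACTIVE."""
    response = client.get_ml_input_channel(
        mlInputChannelArn=ml_input_channel_arn,
        membershipIdentifier=membership_identifier,
    )
    status = response["status"]
    if status != "ACTIVE":
        # statusDetails is read defensively; it is assumed to be optional.
        details = response.get("statusDetails") or {}
        raise MLInputChannelNotActive(
            status, details.get("statusCode"), details.get("message")
        )
    return response
```

Callers that need to branch on the failure reason can inspect the status_code attribute of the exception rather than parsing the message, which the reference says can change at any time.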

Return type: dict

Returns:

Response Syntax

{
    'membershipIdentifier': 'string',
    'collaborationIdentifier': 'string',
    'mlInputChannelArn': 'string',
    'name': 'string',
    'configuredModelAlgorithmAssociations': [
        'string',
    ],
    'status': 'CREATE_PENDING'|'CREATE_IN_PROGRESS'|'CREATE_FAILED'|'ACTIVE'|'DELETE_PENDING'|'DELETE_IN_PROGRESS'|'DELETE_FAILED'|'INACTIVE',
    'statusDetails': {
        'statusCode': 'string',
        'message': 'string'
    },
    'retentionInDays': 123,
    'numberOfRecords': 123,
    'privacyBudgets': {
        'accessBudgets': [
            {
                'resourceArn': 'string',
                'details': [
                    {
                        'startTime': datetime(2015, 1, 1),
                        'endTime': datetime(2015, 1, 1),
                        'remainingBudget': 123,
                        'budget': 123,
                        'budgetType': 'CALENDAR_DAY'|'CALENDAR_MONTH'|'CALENDAR_WEEK'|'LIFETIME',
                        'autoRefresh': 'ENABLED'|'DISABLED'
                    },
                ],
                'aggregateRemainingBudget': 123
            },
        ]
    },
    'description': 'string',
    'syntheticDataConfiguration': {
        'syntheticDataParameters': {
            'epsilon': 123.0,
            'maxMembershipInferenceAttackScore': 123.0,
            'columnClassification': {
                'columnMapping': [
                    {
                        'columnName': 'string',
                        'columnType': 'CATEGORICAL'|'NUMERICAL',
                        'isPredictiveValue': True|False
                    },
                ]
            }
        },
        'syntheticDataEvaluationScores': {
            'dataPrivacyScores': {
                'membershipInferenceAttackScores': [
                    {
                        'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                        'score': 123.0
                    },
                ]
            }
        }
    },
    'createTime': datetime(2015, 1, 1),
    'updateTime': datetime(2015, 1, 1),
    'inputChannel': {
        'dataSource': {
            'protectedQueryInputParameters': {
                'sqlParameters': {
                    'queryString': 'string',
                    'analysisTemplateArn': 'string',
                    'parameters': {
                        'string': 'string'
                    }
                },
                'computeConfiguration': {
                    'worker': {
                        'type': 'CR.1X'|'CR.4X',
                        'number': 123
                    }
                },
                'resultFormat': 'CSV'|'PARQUET'
            }
        },
        'roleArn': 'string'
    },
    'protectedQueryIdentifier': 'string',
    'numberOfFiles': 123.0,
    'sizeInGb': 123.0,
    'kmsKeyArn': 'string',
    'tags': {
        'string': 'string'
    }
}
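
The columnMapping list in the response above can be summarized by column type and predictive role. This is an illustrative sketch only: the function name is hypothetical, and it assumes that isPredictiveValue may be omitted, in which case the column is treated as not predictive.

```python
from typing import Dict, List, Tuple


def summarize_column_mapping(response: dict) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (column names grouped by columnType, predictive column names)."""
    mapping = (
        (response.get("syntheticDataConfiguration") or {})
        .get("syntheticDataParameters", {})
        .get("columnClassification", {})
        .get("columnMapping", [])
    )
    by_type: Dict[str, List[str]] = {}
    predictive: List[str] = []
    for column in mapping:
        # Group by the reported type (CATEGORICAL or NUMERICAL).
        by_type.setdefault(column["columnType"], []).append(column["columnName"])
        if column.get("isPredictiveValue", False):
            predictive.append(column["columnName"])
    return by_type, predictive
```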

Response Structure

(dict) --
- membershipIdentifier (string) --
  
  The membership ID of the membership that contains the ML input channel.
- collaborationIdentifier (string) --
  
  The collaboration ID of the collaboration that contains the ML input channel.
- mlInputChannelArn (string) --
  
  The Amazon Resource Name (ARN) of the ML input channel.
- name (string) --
  
  The name of the ML input channel.
- configuredModelAlgorithmAssociations (list) --
  
  The configured model algorithm associations that were used to create the ML input channel.
  - (string) --
- status (string) --
  
  The status of the ML input channel.
- statusDetails (dict) --
  
  Details about the status of a resource.
  - statusCode (string) --
    
    The status code that was returned. The status code is intended for programmatic error handling. Clean Rooms ML will not change the status code for existing error conditions.
  - message (string) --
    
    The error message that was returned. The message is intended for human consumption and can change at any time. Use the statusCode for programmatic error handling.
- retentionInDays (integer) --
  
  The number of days to keep the data in the ML input channel.
- numberOfRecords (integer) --
  
  The number of records in the ML input channel.
- privacyBudgets (dict) --
  
  Returns the privacy budgets that control access to this Clean Rooms ML input channel. Use these budgets to monitor and limit resource consumption over specified time periods.
  Note
  
  This is a Tagged Union structure. Only one of the following top level keys will be set: accessBudgets. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
```
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
```
  - accessBudgets (list) --
    
    A list of access budgets that apply to resources associated with this Clean Rooms ML input channel.
    - (dict) --
      
      An access budget that defines consumption limits for a specific resource within defined time periods.
      - resourceArn (string) --
        
        The Amazon Resource Name (ARN) of the resource that this access budget applies to.
      - details (list) --
        
        A list of budget details for this resource. Contains active budget periods that apply to the resource.
        
        - (dict) --
          
          The detailed information for a specific budget period, including time boundaries and budget amounts.
          - startTime (datetime) --
            
            The start time of this budget period.
          - endTime (datetime) --
            
            The end time of this budget period. If not specified, the budget period continues indefinitely.
          - remainingBudget (integer) --
            
            The amount of budget remaining in this period.
          - budget (integer) --
            
            The total budget amount allocated for this period.
          - budgetType (string) --
            
            The type of budget period. Calendar-based types reset automatically at regular intervals, while LIFETIME budgets never reset.
          - autoRefresh (string) --
            
            Specifies whether this budget automatically refreshes when the current period ends.
      - aggregateRemainingBudget (integer) --
        
        The total remaining budget across all active budget periods for this resource.
- description (string) --
  
  The description of the ML input channel.
- syntheticDataConfiguration (dict) --
  
  The synthetic data configuration for this ML input channel, including parameters for generating privacy-preserving synthetic data and evaluation scores for measuring the privacy of the generated data.
  - syntheticDataParameters (dict) --
    
    The parameters that control how synthetic data is generated, including privacy settings, column classifications, and other configuration options that affect the data synthesis process.
    - epsilon (float) --
      
      The epsilon value for differential privacy, which controls the privacy-utility tradeoff in synthetic data generation. Lower values provide stronger privacy guarantees but may reduce data utility.
    - maxMembershipInferenceAttackScore (float) --
      
      The maximum acceptable score for membership inference attack vulnerability. Synthetic data generation fails if the score for the resulting data exceeds this threshold.
    - columnClassification (dict) --
      
      Classification details for data columns that specify how each column should be treated during synthetic data generation.
      - columnMapping (list) --
        
        A mapping that defines the classification of data columns for synthetic data generation and specifies how each column should be handled during the privacy-preserving data synthesis process.
        
        - (dict) --
          
          Properties that define how a specific data column should be handled during synthetic data generation, including its name, type, and role in predictive modeling.
          - columnName (string) --
            
            The name of the data column as it appears in the dataset.
          - columnType (string) --
            
            The data type of the column, which determines how the synthetic data generation algorithm processes and synthesizes values for this column.
          - isPredictiveValue (boolean) --
            
            Indicates if this column contains predictive values that should be treated as target variables in machine learning models. This affects how the synthetic data generation preserves statistical relationships.
  - syntheticDataEvaluationScores (dict) --
    
    Evaluation scores that assess the quality and privacy characteristics of the generated synthetic data, providing metrics on data utility and privacy preservation.
    - dataPrivacyScores (dict) --
      
      Privacy-specific evaluation scores that measure how well the synthetic data protects individual privacy, including assessments of potential privacy risks such as membership inference attacks.
      - membershipInferenceAttackScores (list) --
        
        Scores that evaluate the vulnerability of the synthetic data to membership inference attacks, which attempt to determine whether a specific individual was a member of the original dataset.
        
        - (dict) --
          
          A score that measures the vulnerability of synthetic data to membership inference attacks and provides both the numerical score and the version of the attack methodology used for evaluation.
          - attackVersion (string) --
            
            The version of the membership inference attack, which consists of the attack type and its version number, used to generate this privacy score.
          - score (float) --
            
            The numerical score representing the vulnerability to membership inference attacks.
- createTime (datetime) --
  
  The time at which the ML input channel was created.
- updateTime (datetime) --
  
  The most recent time at which the ML input channel was updated.
- inputChannel (dict) --
  
  The input channel that was used to create the ML input channel.
  - dataSource (dict) --
    
    The data source that is used to create the ML input channel.
    Note
    
    This is a Tagged Union structure. Only one of the following top level keys will be set: protectedQueryInputParameters. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
```
'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
```
    - protectedQueryInputParameters (dict) --
      
      Provides information necessary to perform the protected query.
      - sqlParameters (dict) --
        
        The parameters for the SQL type Protected Query.
        
        - queryString (string) --
          
          The query string to be submitted.
        - analysisTemplateArn (string) --
          
          The Amazon Resource Name (ARN) associated with the analysis template within a collaboration.
        - parameters (dict) --
          
          The protected query SQL parameters.
          - (string) --
            - (string) --
      - computeConfiguration (dict) --
        
        Provides configuration information for the workers that will perform the protected query.
        
        Note
        
        This is a Tagged Union structure. Only one of the following top level keys will be set: worker. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
        
        'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
        
        - worker (dict) --
          
          The worker instances that will perform the compute work.
          - type (string) --
            
            The instance type of the compute workers that are used.
          - number (integer) --
            
            The number of compute workers that are used.
      - resultFormat (string) --
        
        The format in which the query results should be returned. If not specified, defaults to CSV.
  - roleArn (string) --
    
    The Amazon Resource Name (ARN) of the role used to run the query specified in the dataSource field of the input channel.
    
    Passing a role across AWS accounts is not allowed. If you pass a role that isn't in your account, you get an AccessDeniedException error.
- protectedQueryIdentifier (string) --
  
  The ID of the protected query that was used to create the ML input channel.
- numberOfFiles (float) --
  
  The number of files in the ML input channel.
- sizeInGb (float) --
  
  The size, in GB, of the ML input channel.
- kmsKeyArn (string) --
  
  The Amazon Resource Name (ARN) of the KMS key that was used to create the ML input channel.
- tags (dict) --
  
  The optional metadata that you applied to the resource to help you categorize and organize it. Each tag consists of a key and an optional value, both of which you define.
  
  The following basic restrictions apply to tags:
  - Maximum number of tags per resource - 50.
  - For each resource, each tag key must be unique, and each tag key can have only one value.
  - Maximum key length - 128 Unicode characters in UTF-8.
  - Maximum value length - 256 Unicode characters in UTF-8.
  - If your tagging schema is used across multiple services and resources, remember that other services may have restrictions on allowed characters. Generally allowed characters are: letters, numbers, and spaces representable in UTF-8, and the following characters: + - = . _ : / @.
  - Tag keys and values are case sensitive.
  - Do not use aws:, AWS:, or any uppercase or lowercase combination of these as a prefix for keys, because this prefix is reserved for AWS use. You cannot edit or delete tag keys with this prefix. Values can have this prefix. If a tag value has aws as its prefix but the key does not, Clean Rooms ML considers it a user tag and counts it against the limit of 50 tags. Tags with only the key prefix of aws do not count against your tags-per-resource limit.
  - (string) --
    - (string) --
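
The basic tag restrictions listed above can be checked on the client side before tags are applied. The sketch below covers only the count, length, and reserved-prefix rules; the function name is hypothetical, lengths are measured with len() on the assumption that the limits are counted in Unicode characters, and the allowed-character rule is not checked.

```python
from typing import Dict, List

MAX_TAGS_PER_RESOURCE = 50
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256
RESERVED_KEY_PREFIX = "aws:"


def find_tag_problems(tags: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    # A dict already guarantees unique keys with one value each.
    if len(tags) > MAX_TAGS_PER_RESOURCE:
        problems.append(
            f"{len(tags)} tags exceeds the limit of {MAX_TAGS_PER_RESOURCE}"
        )
    for key, value in tags.items():
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key {key!r} is longer than {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(
                f"value for key {key!r} is longer than {MAX_VALUE_LENGTH} characters"
            )
        # The prefix is reserved in any mix of upper and lower case.
        if key.lower().startswith(RESERVED_KEY_PREFIX):
            problems.append(f"key {key!r} uses the reserved aws: prefix")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every violation in a single pass.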