2025/12/01 - AWS Clean Rooms ML - 2 updated api methods
Changes: AWS Clean Rooms ML now supports privacy-enhancing synthetic dataset generation for custom ML training.
{'syntheticDataConfiguration': {
    'syntheticDataEvaluationScores': {
        'dataPrivacyScores': {
            'membershipInferenceAttackScores': [
                {'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                 'score': 'double'}]}},
    'syntheticDataParameters': {
        'columnClassification': {
            'columnMapping': [
                {'columnName': 'string',
                 'columnType': 'CATEGORICAL | NUMERICAL',
                 'isPredictiveValue': 'boolean'}]},
        'epsilon': 'double',
        'maxMembershipInferenceAttackScore': 'double'}}}
Returns information about a specific ML input channel in a collaboration.
Request Syntax
client.get_collaboration_ml_input_channel(
    mlInputChannelArn='string',
    collaborationIdentifier='string'
)
mlInputChannelArn (string) --
[REQUIRED]
The Amazon Resource Name (ARN) of the ML input channel that you want to get.
collaborationIdentifier (string) --
[REQUIRED]
The collaboration ID of the collaboration that contains the ML input channel that you want to get.
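The request above can be sketched as a thin wrapper. This is a minimal illustration, not an official sample: the helper name `fetch_collaboration_channel` is hypothetical, and the client is passed in so the helper can be exercised without AWS credentials. With boto3 installed, a client would typically be created with `boto3.client("cleanroomsml")`.

```python
from typing import Any, Dict


def fetch_collaboration_channel(
    client: Any, channel_arn: str, collaboration_id: str
) -> Dict[str, Any]:
    """Fetch an ML input channel as seen from a collaboration.

    `client` is expected to be a Clean Rooms ML client. Both keyword
    arguments are marked [REQUIRED] in the request syntax.
    """
    return client.get_collaboration_ml_input_channel(
        mlInputChannelArn=channel_arn,
        collaborationIdentifier=collaboration_id,
    )


# Typical usage (requires boto3 and AWS credentials; values are placeholders):
#
#   import boto3
#   client = boto3.client("cleanroomsml")
#   response = fetch_collaboration_channel(
#       client, "<ml-input-channel-arn>", "<collaboration-id>"
#   )
#   print(response["name"], response["status"])
```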
Return type: dict
Response Syntax
{
    'membershipIdentifier': 'string',
    'collaborationIdentifier': 'string',
    'mlInputChannelArn': 'string',
    'name': 'string',
    'configuredModelAlgorithmAssociations': [
        'string',
    ],
    'status': 'CREATE_PENDING'|'CREATE_IN_PROGRESS'|'CREATE_FAILED'|'ACTIVE'|'DELETE_PENDING'|'DELETE_IN_PROGRESS'|'DELETE_FAILED'|'INACTIVE',
    'statusDetails': {
        'statusCode': 'string',
        'message': 'string'
    },
    'retentionInDays': 123,
    'numberOfRecords': 123,
    'privacyBudgets': {
        'accessBudgets': [
            {
                'resourceArn': 'string',
                'details': [
                    {
                        'startTime': datetime(2015, 1, 1),
                        'endTime': datetime(2015, 1, 1),
                        'remainingBudget': 123,
                        'budget': 123,
                        'budgetType': 'CALENDAR_DAY'|'CALENDAR_MONTH'|'CALENDAR_WEEK'|'LIFETIME',
                        'autoRefresh': 'ENABLED'|'DISABLED'
                    },
                ],
                'aggregateRemainingBudget': 123
            },
        ]
    },
    'description': 'string',
    'syntheticDataConfiguration': {
        'syntheticDataParameters': {
            'epsilon': 123.0,
            'maxMembershipInferenceAttackScore': 123.0,
            'columnClassification': {
                'columnMapping': [
                    {
                        'columnName': 'string',
                        'columnType': 'CATEGORICAL'|'NUMERICAL',
                        'isPredictiveValue': True|False
                    },
                ]
            }
        },
        'syntheticDataEvaluationScores': {
            'dataPrivacyScores': {
                'membershipInferenceAttackScores': [
                    {
                        'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                        'score': 123.0
                    },
                ]
            }
        }
    },
    'createTime': datetime(2015, 1, 1),
    'updateTime': datetime(2015, 1, 1),
    'creatorAccountId': 'string'
}
Response Structure
(dict) --
membershipIdentifier (string) --
The membership ID of the membership that contains the ML input channel.
collaborationIdentifier (string) --
The collaboration ID of the collaboration that contains the ML input channel.
mlInputChannelArn (string) --
The Amazon Resource Name (ARN) of the ML input channel.
name (string) --
The name of the ML input channel.
configuredModelAlgorithmAssociations (list) --
The configured model algorithm associations that were used to create the ML input channel.
(string) --
status (string) --
The status of the ML input channel.
statusDetails (dict) --
Details about the status of a resource.
statusCode (string) --
The status code that was returned. The status code is intended for programmatic error handling. Clean Rooms ML will not change the status code for existing error conditions.
message (string) --
The error message that was returned. The message is intended for human consumption and can change at any time. Use the statusCode for programmatic error handling.
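Because statusCode is described as stable and message as subject to change, error handling is best keyed on the code. The sketch below shows one way to do that on a response dict. The grouping of status values into "failed" and "in flight", the labels returned, and the `ChannelFailedError` class are assumptions made for illustration; they are not part of the API.

```python
from typing import Any, Dict, Optional

# Assumed groupings of the documented status values.
FAILED_STATUSES = {"CREATE_FAILED", "DELETE_FAILED"}
IN_FLIGHT_STATUSES = {
    "CREATE_PENDING",
    "CREATE_IN_PROGRESS",
    "DELETE_PENDING",
    "DELETE_IN_PROGRESS",
}


class ChannelFailedError(RuntimeError):
    """Hypothetical error type carrying the stable statusCode."""

    def __init__(self, status: str, status_code: Optional[str], message: Optional[str]):
        super().__init__(f"{status} ({status_code}): {message}")
        self.status = status
        self.status_code = status_code  # safe to branch on
        self.message = message          # for humans only; may change


def classify_channel(response: Dict[str, Any]) -> str:
    """Return 'ready', 'waiting', 'inactive' or 'unknown'; raise on failure."""
    status = response["status"]
    if status in FAILED_STATUSES:
        details = response.get("statusDetails", {})
        raise ChannelFailedError(
            status, details.get("statusCode"), details.get("message")
        )
    if status in IN_FLIGHT_STATUSES:
        return "waiting"
    if status == "ACTIVE":
        return "ready"
    if status == "INACTIVE":
        return "inactive"
    return "unknown"
```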
retentionInDays (integer) --
The number of days to retain the data for the ML input channel.
numberOfRecords (integer) --
The number of records in the ML input channel.
privacyBudgets (dict) --
Returns the privacy budgets that control access to this Clean Rooms ML input channel. Use these budgets to monitor and limit resource consumption over specified time periods.
accessBudgets (list) --
A list of access budgets that apply to resources associated with this Clean Rooms ML input channel.
(dict) --
An access budget that defines consumption limits for a specific resource within defined time periods.
resourceArn (string) --
The Amazon Resource Name (ARN) of the resource that this access budget applies to.
details (list) --
A list of budget details for this resource. Contains active budget periods that apply to the resource.
(dict) --
The detailed information for a specific budget period, including time boundaries and budget amounts.
startTime (datetime) --
The start time of this budget period.
endTime (datetime) --
The end time of this budget period. If not specified, the budget period continues indefinitely.
remainingBudget (integer) --
The amount of budget remaining in this period.
budget (integer) --
The total budget amount allocated for this period.
budgetType (string) --
The type of budget period. Calendar-based types reset automatically at regular intervals, while LIFETIME budgets never reset.
autoRefresh (string) --
Specifies whether this budget automatically refreshes when the current period ends.
aggregateRemainingBudget (integer) --
The total remaining budget across all active budget periods for this resource.
description (string) --
The description of the ML input channel.
syntheticDataConfiguration (dict) --
The synthetic data configuration for this ML input channel, including parameters for generating privacy-preserving synthetic data and evaluation scores for measuring the privacy of the generated data.
syntheticDataParameters (dict) --
The parameters that control how synthetic data is generated, including privacy settings, column classifications, and other configuration options that affect the data synthesis process.
epsilon (float) --
The epsilon value for differential privacy, which controls the privacy-utility tradeoff in synthetic data generation. Lower values provide stronger privacy guarantees but may reduce data utility.
maxMembershipInferenceAttackScore (float) --
The maximum acceptable score for membership inference attack vulnerability. Synthetic data generation fails if the score for the resulting data exceeds this threshold.
columnClassification (dict) --
Classification details for data columns that specify how each column should be treated during synthetic data generation.
columnMapping (list) --
A mapping that defines the classification of data columns for synthetic data generation and specifies how each column should be handled during the privacy-preserving data synthesis process.
(dict) --
Properties that define how a specific data column should be handled during synthetic data generation, including its name, type, and role in predictive modeling.
columnName (string) --
The name of the data column as it appears in the dataset.
columnType (string) --
The data type of the column, which determines how the synthetic data generation algorithm processes and synthesizes values for this column.
isPredictiveValue (boolean) --
Indicates if this column contains predictive values that should be treated as target variables in machine learning models. This affects how the synthetic data generation preserves statistical relationships.
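As an illustration of how the syntheticDataParameters structure might be consumed, the following sketch groups column names by columnType and collects the columns flagged with isPredictiveValue. The helper name `summarize_columns` and the shape of its return value are hypothetical. Missing keys are tolerated on the assumption that syntheticDataConfiguration may be absent when synthetic data generation is not configured.

```python
from typing import Any, Dict, List


def summarize_columns(response: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the column mapping of a get-ML-input-channel response."""
    params = response.get("syntheticDataConfiguration", {}).get(
        "syntheticDataParameters", {}
    )
    mapping = params.get("columnClassification", {}).get("columnMapping", [])

    by_type: Dict[str, List[str]] = {}
    predictive: List[str] = []
    for column in mapping:
        # columnType is 'CATEGORICAL' or 'NUMERICAL' per the response syntax.
        by_type.setdefault(column["columnType"], []).append(column["columnName"])
        if column.get("isPredictiveValue"):
            predictive.append(column["columnName"])

    return {
        "epsilon": params.get("epsilon"),
        "by_type": by_type,
        "predictive": predictive,
    }
```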
syntheticDataEvaluationScores (dict) --
Evaluation scores that assess the quality and privacy characteristics of the generated synthetic data, providing metrics on data utility and privacy preservation.
dataPrivacyScores (dict) --
Privacy-specific evaluation scores that measure how well the synthetic data protects individual privacy, including assessments of potential privacy risks such as membership inference attacks.
membershipInferenceAttackScores (list) --
Scores that evaluate the vulnerability of the synthetic data to membership inference attacks, which attempt to determine whether a specific individual was a member of the original dataset.
(dict) --
A score that measures the vulnerability of synthetic data to membership inference attacks and provides both the numerical score and the version of the attack methodology used for evaluation.
attackVersion (string) --
The version of the membership inference attack, which consists of the attack type and its version number, used to generate this privacy score.
score (float) --
The numerical score representing the vulnerability to membership inference attacks.
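The reported scores can be compared with the configured maxMembershipInferenceAttackScore. The sketch below takes the highest reported score as the worst case, which follows from the statement that generation fails when a score exceeds the threshold. The helper name `privacy_check` and its return shape are hypothetical.

```python
from typing import Any, Dict


def privacy_check(response: Dict[str, Any]) -> Dict[str, Any]:
    """Compare reported membership inference scores with the threshold."""
    config = response.get("syntheticDataConfiguration", {})
    threshold = config.get("syntheticDataParameters", {}).get(
        "maxMembershipInferenceAttackScore"
    )
    scores = (
        config.get("syntheticDataEvaluationScores", {})
        .get("dataPrivacyScores", {})
        .get("membershipInferenceAttackScores", [])
    )

    # Highest score is treated as the worst case (assumption: higher = more vulnerable).
    worst = max((entry["score"] for entry in scores), default=None)

    # Only decide when both a score and a threshold are present.
    within = None
    if worst is not None and threshold is not None:
        within = worst <= threshold

    return {
        "worst_score": worst,
        "threshold": threshold,
        "within_threshold": within,
        "attack_versions": [entry["attackVersion"] for entry in scores],
    }
```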
createTime (datetime) --
The time at which the ML input channel was created.
updateTime (datetime) --
The most recent time at which the ML input channel was updated.
creatorAccountId (string) --
The account ID of the member who created the ML input channel.
{'syntheticDataConfiguration': {
    'syntheticDataEvaluationScores': {
        'dataPrivacyScores': {
            'membershipInferenceAttackScores': [
                {'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                 'score': 'double'}]}},
    'syntheticDataParameters': {
        'columnClassification': {
            'columnMapping': [
                {'columnName': 'string',
                 'columnType': 'CATEGORICAL | NUMERICAL',
                 'isPredictiveValue': 'boolean'}]},
        'epsilon': 'double',
        'maxMembershipInferenceAttackScore': 'double'}}}
Returns information about an ML input channel.
Request Syntax
client.get_ml_input_channel(
    mlInputChannelArn='string',
    membershipIdentifier='string'
)
mlInputChannelArn (string) --
[REQUIRED]
The Amazon Resource Name (ARN) of the ML input channel that you want to get.
membershipIdentifier (string) --
[REQUIRED]
The membership ID of the membership that contains the ML input channel that you want to get.
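Since the response carries a status field with pending and in-progress values, a caller may want to poll get_ml_input_channel until creation settles. The following is a hand-rolled sketch under stated assumptions: the helper name `wait_until_created`, the default delay, and the attempt limit are all hypothetical, and the `sleep` parameter exists only so the loop can be tested without waiting.

```python
import time
from typing import Any, Callable, Dict

# Status values that mean creation has not finished yet.
CREATING = {"CREATE_PENDING", "CREATE_IN_PROGRESS"}


def wait_until_created(
    client: Any,
    channel_arn: str,
    membership_id: str,
    delay_seconds: float = 30.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll get_ml_input_channel until the status leaves the creating states.

    Returns the first response whose status is not in CREATING; the caller
    still has to check whether that status is ACTIVE or CREATE_FAILED.
    """
    for attempt in range(max_attempts):
        response = client.get_ml_input_channel(
            mlInputChannelArn=channel_arn,
            membershipIdentifier=membership_id,
        )
        if response["status"] not in CREATING:
            return response
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # no sleep after the final attempt
    raise TimeoutError(
        f"ML input channel still creating after {max_attempts} attempts"
    )
```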
Return type: dict
Response Syntax
{
    'membershipIdentifier': 'string',
    'collaborationIdentifier': 'string',
    'mlInputChannelArn': 'string',
    'name': 'string',
    'configuredModelAlgorithmAssociations': [
        'string',
    ],
    'status': 'CREATE_PENDING'|'CREATE_IN_PROGRESS'|'CREATE_FAILED'|'ACTIVE'|'DELETE_PENDING'|'DELETE_IN_PROGRESS'|'DELETE_FAILED'|'INACTIVE',
    'statusDetails': {
        'statusCode': 'string',
        'message': 'string'
    },
    'retentionInDays': 123,
    'numberOfRecords': 123,
    'privacyBudgets': {
        'accessBudgets': [
            {
                'resourceArn': 'string',
                'details': [
                    {
                        'startTime': datetime(2015, 1, 1),
                        'endTime': datetime(2015, 1, 1),
                        'remainingBudget': 123,
                        'budget': 123,
                        'budgetType': 'CALENDAR_DAY'|'CALENDAR_MONTH'|'CALENDAR_WEEK'|'LIFETIME',
                        'autoRefresh': 'ENABLED'|'DISABLED'
                    },
                ],
                'aggregateRemainingBudget': 123
            },
        ]
    },
    'description': 'string',
    'syntheticDataConfiguration': {
        'syntheticDataParameters': {
            'epsilon': 123.0,
            'maxMembershipInferenceAttackScore': 123.0,
            'columnClassification': {
                'columnMapping': [
                    {
                        'columnName': 'string',
                        'columnType': 'CATEGORICAL'|'NUMERICAL',
                        'isPredictiveValue': True|False
                    },
                ]
            }
        },
        'syntheticDataEvaluationScores': {
            'dataPrivacyScores': {
                'membershipInferenceAttackScores': [
                    {
                        'attackVersion': 'DISTANCE_TO_CLOSEST_RECORD_V1',
                        'score': 123.0
                    },
                ]
            }
        }
    },
    'createTime': datetime(2015, 1, 1),
    'updateTime': datetime(2015, 1, 1),
    'inputChannel': {
        'dataSource': {
            'protectedQueryInputParameters': {
                'sqlParameters': {
                    'queryString': 'string',
                    'analysisTemplateArn': 'string',
                    'parameters': {
                        'string': 'string'
                    }
                },
                'computeConfiguration': {
                    'worker': {
                        'type': 'CR.1X'|'CR.4X',
                        'number': 123
                    }
                },
                'resultFormat': 'CSV'|'PARQUET'
            }
        },
        'roleArn': 'string'
    },
    'protectedQueryIdentifier': 'string',
    'numberOfFiles': 123.0,
    'sizeInGb': 123.0,
    'kmsKeyArn': 'string',
    'tags': {
        'string': 'string'
    }
}
Response Structure
(dict) --
membershipIdentifier (string) --
The membership ID of the membership that contains the ML input channel.
collaborationIdentifier (string) --
The collaboration ID of the collaboration that contains the ML input channel.
mlInputChannelArn (string) --
The Amazon Resource Name (ARN) of the ML input channel.
name (string) --
The name of the ML input channel.
configuredModelAlgorithmAssociations (list) --
The configured model algorithm associations that were used to create the ML input channel.
(string) --
status (string) --
The status of the ML input channel.
statusDetails (dict) --
Details about the status of a resource.
statusCode (string) --
The status code that was returned. The status code is intended for programmatic error handling. Clean Rooms ML will not change the status code for existing error conditions.
message (string) --
The error message that was returned. The message is intended for human consumption and can change at any time. Use the statusCode for programmatic error handling.
retentionInDays (integer) --
The number of days to keep the data in the ML input channel.
numberOfRecords (integer) --
The number of records in the ML input channel.
privacyBudgets (dict) --
Returns the privacy budgets that control access to this Clean Rooms ML input channel. Use these budgets to monitor and limit resource consumption over specified time periods.
accessBudgets (list) --
A list of access budgets that apply to resources associated with this Clean Rooms ML input channel.
(dict) --
An access budget that defines consumption limits for a specific resource within defined time periods.
resourceArn (string) --
The Amazon Resource Name (ARN) of the resource that this access budget applies to.
details (list) --
A list of budget details for this resource. Contains active budget periods that apply to the resource.
(dict) --
The detailed information for a specific budget period, including time boundaries and budget amounts.
startTime (datetime) --
The start time of this budget period.
endTime (datetime) --
The end time of this budget period. If not specified, the budget period continues indefinitely.
remainingBudget (integer) --
The amount of budget remaining in this period.
budget (integer) --
The total budget amount allocated for this period.
budgetType (string) --
The type of budget period. Calendar-based types reset automatically at regular intervals, while LIFETIME budgets never reset.
autoRefresh (string) --
Specifies whether this budget automatically refreshes when the current period ends.
aggregateRemainingBudget (integer) --
The total remaining budget across all active budget periods for this resource.
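To monitor the budgets described above, a caller could flag resources whose remaining budget has dropped below some level. This sketch is illustrative only: the helper name `low_budget_resources` and the `min_remaining` parameter are hypothetical, and the fallback that sums remainingBudget over the detail entries when aggregateRemainingBudget is absent is an assumption based on the field description.

```python
from typing import Any, Dict, List, Tuple


def low_budget_resources(
    response: Dict[str, Any], min_remaining: int
) -> List[Tuple[str, int]]:
    """Return (resourceArn, remaining) pairs that fall below min_remaining."""
    access_budgets = response.get("privacyBudgets", {}).get("accessBudgets", [])
    flagged: List[Tuple[str, int]] = []
    for budget in access_budgets:
        remaining = budget.get("aggregateRemainingBudget")
        if remaining is None:
            # Assumed fallback: add up the remaining budget of each period.
            remaining = sum(
                detail["remainingBudget"] for detail in budget.get("details", [])
            )
        if remaining < min_remaining:
            flagged.append((budget["resourceArn"], remaining))
    return flagged
```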
description (string) --
The description of the ML input channel.
syntheticDataConfiguration (dict) --
The synthetic data configuration for this ML input channel, including parameters for generating privacy-preserving synthetic data and evaluation scores for measuring the privacy of the generated data.
syntheticDataParameters (dict) --
The parameters that control how synthetic data is generated, including privacy settings, column classifications, and other configuration options that affect the data synthesis process.
epsilon (float) --
The epsilon value for differential privacy, which controls the privacy-utility tradeoff in synthetic data generation. Lower values provide stronger privacy guarantees but may reduce data utility.
maxMembershipInferenceAttackScore (float) --
The maximum acceptable score for membership inference attack vulnerability. Synthetic data generation fails if the score for the resulting data exceeds this threshold.
columnClassification (dict) --
Classification details for data columns that specify how each column should be treated during synthetic data generation.
columnMapping (list) --
A mapping that defines the classification of data columns for synthetic data generation and specifies how each column should be handled during the privacy-preserving data synthesis process.
(dict) --
Properties that define how a specific data column should be handled during synthetic data generation, including its name, type, and role in predictive modeling.
columnName (string) --
The name of the data column as it appears in the dataset.
columnType (string) --
The data type of the column, which determines how the synthetic data generation algorithm processes and synthesizes values for this column.
isPredictiveValue (boolean) --
Indicates if this column contains predictive values that should be treated as target variables in machine learning models. This affects how the synthetic data generation preserves statistical relationships.
syntheticDataEvaluationScores (dict) --
Evaluation scores that assess the quality and privacy characteristics of the generated synthetic data, providing metrics on data utility and privacy preservation.
dataPrivacyScores (dict) --
Privacy-specific evaluation scores that measure how well the synthetic data protects individual privacy, including assessments of potential privacy risks such as membership inference attacks.
membershipInferenceAttackScores (list) --
Scores that evaluate the vulnerability of the synthetic data to membership inference attacks, which attempt to determine whether a specific individual was a member of the original dataset.
(dict) --
A score that measures the vulnerability of synthetic data to membership inference attacks and provides both the numerical score and the version of the attack methodology used for evaluation.
attackVersion (string) --
The version of the membership inference attack, which consists of the attack type and its version number, used to generate this privacy score.
score (float) --
The numerical score representing the vulnerability to membership inference attacks.
createTime (datetime) --
The time at which the ML input channel was created.
updateTime (datetime) --
The most recent time at which the ML input channel was updated.
inputChannel (dict) --
The input channel that was used to create the ML input channel.
dataSource (dict) --
The data source that is used to create the ML input channel.
protectedQueryInputParameters (dict) --
Provides information necessary to perform the protected query.
sqlParameters (dict) --
The parameters for the SQL type Protected Query.
queryString (string) --
The query string to be submitted.
analysisTemplateArn (string) --
The Amazon Resource Name (ARN) associated with the analysis template within a collaboration.
parameters (dict) --
The protected query SQL parameters.
(string) --
(string) --
computeConfiguration (dict) --
Provides configuration information for the workers that will perform the protected query.
worker (dict) --
The worker instances that will perform the compute work.
type (string) --
The instance type of the compute workers that are used.
number (integer) --
The number of compute workers that are used.
resultFormat (string) --
The format in which the query results should be returned. If not specified, defaults to CSV.
roleArn (string) --
The Amazon Resource Name (ARN) of the role used to run the query specified in the dataSource field of the input channel.
Passing a role across AWS accounts is not allowed. If you pass a role that isn't in your account, you get an AccessDeniedException error.
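The inputChannel structure can be flattened into a short summary of how the channel's data was produced. The sketch below is one possible reading of the fields: the helper name `describe_query_source` is hypothetical, the idea that a channel is driven by either an analysis template or a raw query string is an assumption, and the CSV fallback mirrors the documented default for resultFormat.

```python
from typing import Any, Dict


def describe_query_source(response: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize the protected query behind an ML input channel."""
    input_channel = response.get("inputChannel", {})
    params = input_channel.get("dataSource", {}).get(
        "protectedQueryInputParameters", {}
    )
    sql = params.get("sqlParameters", {})
    worker = params.get("computeConfiguration", {}).get("worker", {})

    # Assumption: prefer the analysis template when one is present.
    source = "analysis_template" if sql.get("analysisTemplateArn") else "query_string"

    return {
        "source": source,
        "result_format": params.get("resultFormat", "CSV"),  # documented default
        "worker_type": worker.get("type"),      # 'CR.1X' or 'CR.4X'
        "worker_count": worker.get("number"),
        "role_arn": input_channel.get("roleArn"),
    }
```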
protectedQueryIdentifier (string) --
The ID of the protected query that was used to create the ML input channel.
numberOfFiles (float) --
The number of files in the ML input channel.
sizeInGb (float) --
The size, in GB, of the ML input channel.
kmsKeyArn (string) --
The Amazon Resource Name (ARN) of the KMS key that was used to create the ML input channel.
tags (dict) --
The optional metadata that you applied to the resource to help you categorize and organize it. Each tag consists of a key and an optional value, both of which you define.
The following basic restrictions apply to tags:
Maximum number of tags per resource - 50.
For each resource, each tag key must be unique, and each tag key can have only one value.
Maximum key length - 128 Unicode characters in UTF-8.
Maximum value length - 256 Unicode characters in UTF-8.
If your tagging schema is used across multiple services and resources, remember that other services may have restrictions on allowed characters. Generally allowed characters are: letters, numbers, and spaces representable in UTF-8, and the following characters: + - = . _ : / @.
Tag keys and values are case sensitive.
Do not use aws:, AWS:, or any uppercase or lowercase combination of it as a prefix for keys, because it is reserved for AWS use. You cannot edit or delete tag keys with this prefix. Values can have this prefix. If a tag value has aws as its prefix but the key does not, Clean Rooms ML considers it to be a user tag and counts it against the limit of 50 tags. Tags with only the key prefix of aws do not count against your tags-per-resource limit.
(string) --
(string) --
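The tag restrictions listed above lend themselves to a client-side check before tags are submitted. The following is a minimal sketch, not the service's own validation: the helper name `validate_tags` is hypothetical, lengths are measured with `len()` (Unicode code points), which is an assumption about how the limits are counted, and the allowed-character guidance is not enforced.

```python
from typing import Dict, List

MAX_USER_TAGS = 50
MAX_KEY_LENGTH = 128
MAX_VALUE_LENGTH = 256
RESERVED_PREFIX = "aws:"  # matched case-insensitively on keys only


def validate_tags(tags: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems: List[str] = []
    user_tag_count = 0

    for key, value in tags.items():
        if key.lower().startswith(RESERVED_PREFIX):
            problems.append(f"key {key!r} uses the reserved aws: prefix")
            continue  # reserved-prefix keys do not count against the limit
        user_tag_count += 1
        if len(key) > MAX_KEY_LENGTH:
            problems.append(f"key {key[:20]!r}... exceeds {MAX_KEY_LENGTH} characters")
        if len(value) > MAX_VALUE_LENGTH:
            problems.append(f"value for key {key[:20]!r} exceeds {MAX_VALUE_LENGTH} characters")

    if user_tag_count > MAX_USER_TAGS:
        problems.append(f"{user_tag_count} user tags exceed the limit of {MAX_USER_TAGS}")
    return problems
```

A dict already guarantees that each key is unique and maps to a single value, so that restriction needs no separate check here.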