Bachelor Thesis
Differential Privacy as a Defense Against Membership Inference in Machine Learning
Machine learning models trained on sensitive data, such as medical records, financial transactions, or personal communications, can leak information about their training sets in ways that are not immediately apparent. Membership inference attacks exploit the fact that a model often behaves measurably differently on records it has seen during training than on unseen data. An adversary with only black-box access to a trained model can then determine, with better-than-chance accuracy, whether a target record was used during training. Beyond membership inference, model inversion and attribute inference attacks allow adversaries to reconstruct sensitive features of individuals from model outputs alone.
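To make the black-box setting concrete, the following is a minimal sketch of one simple membership inference attack: the adversary guesses "member" whenever the model's confidence in the true label exceeds a fixed threshold. The function names, the threshold value, and the Beta-distributed confidences are illustrative assumptions; they stand in for real model outputs and are not results from any experiment.

```python
import numpy as np


def threshold_attack(confidences, threshold):
    """Guess 'member' (True) when confidence in the true label exceeds the threshold."""
    return np.asarray(confidences) > threshold


def membership_advantage(member_conf, nonmember_conf, threshold):
    """True-positive rate minus false-positive rate of the threshold attack."""
    tpr = threshold_attack(member_conf, threshold).mean()
    fpr = threshold_attack(nonmember_conf, threshold).mean()
    return tpr - fpr


# Synthetic stand-in for black-box model outputs (assumption, not real data):
# an overfitted model tends to be more confident on records it was trained on.
rng = np.random.default_rng(0)
member_conf = rng.beta(8, 2, size=1000)     # confidences skewed toward 1
nonmember_conf = rng.beta(4, 4, size=1000)  # confidences centered around 0.5

adv = membership_advantage(member_conf, nonmember_conf, threshold=0.7)
print(f"membership advantage: {adv:.2f}")
```

An advantage of 0 corresponds to random guessing and 1 to a perfect attack. Stronger attacks from the literature, for example those based on shadow models, replace the fixed threshold with a learned decision rule.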
Differential privacy (DP) offers a mathematically rigorous defense: it guarantees that the output of a randomized computation cannot be used to reliably distinguish between two datasets that differ in a single record. In a machine learning pipeline, DP can be enforced at several intervention points: during data collection through local DP mechanisms, at the data release stage through differentially private synthetic data generation, or directly during model optimization through DP-SGD. Each intervention point carries a distinct privacy budget $\varepsilon$, a different impact on model utility, and a different residual vulnerability to inference attacks. Which intervention is most effective for a given threat model and deployment context remains an open empirical question.
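As an illustration of the third intervention point, the sketch below shows the core of a single DP-SGD update in NumPy: each per-example gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping norm is added, and the result is averaged. The function name `dp_sgd_step` and all parameter values are assumptions chosen for illustration. Translating the noise multiplier into a concrete $\varepsilon$ requires a privacy accountant, which is not shown here.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per-example gradients, sum, add Gaussian noise, average."""
    grads = np.asarray(per_example_grads, dtype=float)  # shape (batch_size, dim)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale down gradients whose L2 norm exceeds clip_norm; smaller ones are unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise with standard deviation noise_multiplier * clip_norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / grads.shape[0]
    return params - lr * noisy_mean


# Toy usage with hypothetical values: two examples, two parameters.
rng = np.random.default_rng(0)
params = np.zeros(2)
grads = np.array([[3.0, 4.0], [0.3, 0.4]])  # L2 norms 5.0 and 0.5
new_params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
print(new_params)
```

Clipping bounds the influence of any single record on the update, and the noise masks what remains of that influence. In practice, an established library such as Opacus or TensorFlow Privacy would typically handle per-example gradients and privacy accounting.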
The thesis should begin with a survey of privacy attacks against machine learning models, focusing on membership inference and treating model inversion as a secondary threat. The student should implement representative attack algorithms and a standard non-private baseline, then apply DP at each of the three pipeline stages independently. The empirical evaluation should characterize how the privacy budget $\varepsilon$, the stage of DP intervention, and the characteristics of the dataset jointly determine both the residual attack success and the degradation in model utility. The thesis should conclude with a comparative analysis that advises practitioners on which DP strategy is most appropriate under different assumptions about the adversary and the sensitivity of the data.