Aadyaa Maddi

Why Do We Need Tighter Error Bounds?

An active area of research in privacy-preserving machine learning is finding the tightest possible bound on the empirical risk obtained at the end of the learning process. What is the practical benefit of doing this?

I find this a little unintuitive, so I wanted to write down the answer to this question. Suppose we have a desired value \(\epsilon_{1}\) for the privacy parameter. Using some theoretical argument, we reason about the maximum error possible at \(\epsilon_{1}\); call this bound \(err_{1}\). The error of a differentially private algorithm typically scales inversely with \(\epsilon\), since stronger privacy requires more noise. In practical deployments of privacy-preserving machine learning algorithms, we want the error to be as low as possible. Suppose \(err_{1}\) is not low enough. We would then have to use a higher value of the privacy parameter \(\epsilon_{2} > \epsilon_{1}\) to reduce the error to \(err_{2} < err_{1}\). Doing this means we end up with a weaker privacy guarantee.
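
To make the "error scales inversely with \(\epsilon\)" claim concrete, here is a minimal sketch (not from the original post) using the Laplace mechanism on a sensitivity-1 counting query. The noise scale is \(1/\epsilon\), and the expected absolute error of Laplace noise equals its scale, so halving \(\epsilon\) doubles the expected error. The function name and trial count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mean_abs_error(sensitivity, epsilon, n_trials=100_000):
    """Empirical mean absolute error of the Laplace mechanism:
    noise drawn from Laplace(scale = sensitivity / epsilon)."""
    scale = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=n_trials)
    return np.mean(np.abs(noise))

# A counting query has sensitivity 1. Error shrinks as epsilon grows:
for eps in [0.1, 0.5, 1.0, 2.0]:
    print(f"epsilon = {eps:>4}: mean |error| ~ {laplace_mean_abs_error(1.0, eps):.3f}")
# The mean absolute error is exactly the scale, 1/epsilon, so a stronger
# privacy guarantee (smaller epsilon) directly costs us more error.
```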

Instead, suppose we could reason about the maximum error possible at \(\epsilon_{1}\) using a better theoretical argument, one that ideally gives the lowest possible maximum error. We could then keep the stronger privacy guarantee and get lower error at the same time.
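
A well-known instance of this phenomenon, drawn from privacy accounting and used here purely as an illustration, is composition: running \(k\) steps that are each \(\epsilon_{0}\)-DP is \(k\epsilon_{0}\)-DP by basic composition, but only about \(\epsilon_{0}\sqrt{2k\ln(1/\delta)}\)-DP by the advanced composition theorem of Dwork, Rothblum, and Vadhan. The sketch below compares the two bounds; the values of `k`, `eps0`, and `delta` are hypothetical.

```python
import math

def basic_total(eps0, k):
    """Basic composition: k pure eps0-DP steps are (k * eps0)-DP."""
    return k * eps0

def advanced_total(eps0, k, delta=1e-5):
    """Advanced composition (Dwork-Rothblum-Vadhan): k pure eps0-DP steps
    are (eps', delta)-DP, where eps' is the value returned below."""
    return eps0 * math.sqrt(2 * k * math.log(1 / delta)) + k * eps0 * (math.exp(eps0) - 1)

k = 1000      # hypothetical number of training iterations
eps0 = 0.01   # hypothetical per-step privacy parameter

print(f"basic composition:    total eps = {basic_total(eps0, k):.2f}")     # 10.00
print(f"advanced composition: total eps = {advanced_total(eps0, k):.2f}")  # ~1.62
# The same algorithm, analyzed more carefully, satisfies a much stronger
# privacy guarantee.
```

Read the other way around: for a fixed overall budget \(\epsilon_{1}\), the tighter analysis lets each step use a larger \(\epsilon_{0}\) and hence add less noise, which is exactly the "lower error at the same \(\epsilon\)" benefit described above.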

#Differential-Privacy