Why this matters — a model is a leaky diary that talks
Think of a trained model as a diary that learned to talk. You never handed it a single record at query time, yet it can tell a stranger whether your medical row was in the training set, sketch a recognisable face from a classifier, or paste a customer's Aadhaar number into a completion. The data is gone, but its shadow is baked into the weights. Privacy-preserving ML is the discipline of shrinking that shadow.
Interviewers probe this because it is where AI meets law. Under India's DPDP Act and the EU's GDPR, a model that regurgitates PII can amount to a reportable personal-data breach, not a curiosity, and the EU AI Act adds its own incident-reporting duties for high-risk systems. Panels want engineers who can name the attack, quantify the leak, and pick the right control (differential privacy, federated learning, encryption, or plain data minimisation) without guessing.
Sneha is in round two for an AI security role. The panel lead asks: "We fine-tuned an LLM on support tickets. A user got another customer's PAN in a reply. What happened, and how do you stop it?" She freezes — she knows it is "leakage" but cannot name the mechanism or a fix.
The fix is a mental model: training data leaves a fingerprint in the weights, and there are named attacks and named defences for it. Learn the chain — attack, measurement, control — and these questions become a script you can recite under pressure.
1. Privacy Attacks on Models
Privacy attacks treat the model as the leak. You query it, watch its confidence, or read its gradients — and recover facts about people who were never meant to be exposed. The thread tying them together is overfitting: the more a model memorises individuals, the louder it leaks.
Q1 What is a membership inference attack (MIA)? [L1]
A membership inference attack answers one question: was this exact record in the model's training set? The attacker feeds a candidate sample to the model and reads a signal — usually confidence or loss. Training members tend to get higher confidence and lower loss because the model has seen them. A classic recipe trains shadow models on data drawn from the same distribution, then trains an attack classifier to separate "member" from "non-member" by output behaviour.
It sounds abstract until the dataset is sensitive. Confirming that someone's record was in an HIV-diagnosis or salary dataset is itself the privacy breach — membership is the secret.
Q2 Why do larger or overfit models leak more, and how do you measure MIA risk? [L2]
Leakage scales with memorisation. An overfit model fits individual training points instead of the general pattern, so members behave measurably differently from non-members — exactly the gap an attacker exploits. The bigger the train/test accuracy gap, the easier the attack. High-capacity models on small or duplicated data are the worst offenders.
Measure it like a binary classifier. Run the attack and report AUC and, more honestly, true-positive rate at a low false-positive rate (TPR @ 0.1% FPR) — average accuracy hides the few records that leak badly. Priya at a Bangalore AI startup ran ART's MIA module against her churn model and found AUC 0.78; she traced it to a duplicated-row leak in the training data.
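The two metrics named above can be computed directly from attack scores. The following is a minimal sketch, assuming a simple loss-threshold attack and synthetic per-example losses (the loss distributions, sample sizes, and function names are illustrative assumptions, not output from ART or from any real model):

```python
import random

def attack_auc(member_scores, nonmember_scores):
    """AUC = probability that a random member outscores a random non-member."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """TPR at the loosest threshold whose FPR stays at or below target_fpr
    (assumes no tied scores)."""
    ranked = sorted(nonmember_scores, reverse=True)
    allowed_fp = int(target_fpr * len(ranked))  # non-members we may misflag
    threshold = ranked[allowed_fp]              # scores strictly above are flagged
    flagged = sum(1 for m in member_scores if m > threshold)
    return flagged / len(member_scores)

rng = random.Random(0)
# Synthetic losses (assumption): members have lower loss on average.
member_loss = [rng.expovariate(1 / 0.2) for _ in range(1000)]
nonmember_loss = [rng.expovariate(1 / 0.6) for _ in range(1000)]
# Attack score: higher means "more likely a member", so negate the loss.
member_scores = [-x for x in member_loss]
nonmember_scores = [-x for x in nonmember_loss]

auc = attack_auc(member_scores, nonmember_scores)
tpr = tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.001)
print(f"AUC={auc:.3f}  TPR@0.1%FPR={tpr:.3f}")
```

On real models the scores would come from the target model's per-example loss or confidence on known members and held-out non-members; the metric code stays the same.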
Q3 What is model inversion, and how does attribute inference differ from membership inference? [L2]
Model inversion reconstructs representative input features for a class or individual by optimising an input to maximise the model's confidence — the canonical demo recovers a blurry but recognisable face from a face-recognition model given only a name/label and API access.
Attribute inference is narrower: the attacker already knows most of a record and uses the model to fill in a missing sensitive field (e.g., infer salary band from the rest). The distinction interviewers want: membership inference asks were you in the data; attribute/inversion asks what is your sensitive value. Inversion reconstructs; attribute inference predicts a specific column.
Q4 Explain training-data extraction and memorisation in LLMs. [L2]
Large language models can memorise chunks of their training data verbatim, especially rare, high-entropy, or duplicated strings: exactly the shape of secrets like API keys, Aadhaar/PAN numbers, and email signatures. Extraction is the attack: prompt the model so it completes from memory, then filter for high-confidence, low-perplexity outputs that look like real records. Carlini et al. showed this on GPT-2-scale models, and it gets worse with scale and duplication.
Two facts to state: memorisation rises with model size and with how often a string was duplicated in training. So de-duplication and PII scrubbing of the corpus are first-line defences, not afterthoughts.
Q5 Your fine-tuned support LLM returned another customer's PAN. Walk through diagnosis and fixes. [L3]
First, name it: training-data memorisation surfacing as verbatim regurgitation — OWASP LLM02 (Sensitive Information Disclosure). Diagnose: was the PAN in the fine-tune corpus? Grep the dataset; reproduce with the triggering prompt; check if it is reachable by paraphrase.
Fixes, layered: (1) scrub the corpus with PII detection (Presidio) and de-duplicate before retraining — root cause. (2) Add an output filter / DLP regex + Presidio on responses to catch PAN/Aadhaar patterns. (3) Consider DP fine-tuning (DP-SGD) to bound per-record influence. (4) Log the incident; under DPDP this is likely reportable. Order matters: remediate the data, then layer guardrails — guardrails alone do not un-memorise.
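The regex half of fix (2) can be sketched with the standard library alone. This is a minimal illustration, assuming the common PAN shape (five letters, four digits, one letter) and a 12-digit Aadhaar that may be grouped in fours and does not start with 0 or 1; the pattern names and placeholder strings are hypothetical, and a production filter would pair this with Presidio and checksum validation rather than rely on patterns alone:

```python
import re

# Assumed formats: PAN = AAAAA9999A; Aadhaar = 12 digits, optionally grouped 4-4-4.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
AADHAAR_RE = re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")

def redact(text):
    """Replace PAN-like and Aadhaar-like substrings before a reply is returned."""
    text = PAN_RE.sub("[PAN_REDACTED]", text)
    text = AADHAAR_RE.sub("[AADHAAR_REDACTED]", text)
    return text

reply = "Your PAN ABCDE1234F is linked to 2345 6789 0123."
print(redact(reply))
```

A filter like this only catches the leak on the way out; the memorised value is still in the weights, which is why the corpus scrub comes first.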
Q6 What is a reconstruction attack on gradients, and when is it feasible? [L3]
A gradient reconstruction attack (e.g., Deep Leakage from Gradients) recovers the actual training inputs — pixels, tokens — from the gradient updates a client shares, by optimising dummy data until its gradient matches the observed one. It is the reason "we only share gradients, not data" is a false sense of safety in federated learning.
Feasibility is highest when each update comes from few samples and few local steps, with no aggregation: a single client's raw gradient is the danger. It weakens fast with larger batches, aggregation over many clients, more local epochs, gradient clipping/noise, and secure aggregation that hides any one client's contribution. Aman at a Mumbai bank cited this to justify mandating secure aggregation before any federated rollout.
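The single-sample worst case can be shown in closed form. The sketch below is not Deep Leakage from Gradients itself (which runs an optimisation loop); it is the simpler special case of a linear model with a bias and squared loss, where the weight gradient is the error times the input and the bias gradient is the error, so dividing one by the other returns the private input. All values are made up for illustration:

```python
def gradients(w, b, x, target):
    """Gradients of 0.5 * (w.x + b - target)^2 for a single example."""
    err = sum(wi * xi for wi, xi in zip(w, x)) + b - target
    grad_w = [err * xi for xi in x]  # dL/dw = err * x
    grad_b = err                     # dL/db = err
    return grad_w, grad_b

def reconstruct_input(grad_w, grad_b):
    """Server side: recover x from the shared gradient alone (needs grad_b != 0)."""
    return [g / grad_b for g in grad_w]

w, b = [0.5, -1.0, 2.0], 0.1
private_x = [3.0, 7.0, 1.5]  # never sent to the server
grad_w, grad_b = gradients(w, b, private_x, target=1.0)
recovered = reconstruct_input(grad_w, grad_b)
print(recovered)
```

With a batch of several examples the shared gradient is a sum of such terms, which is one reason larger batches blunt the attack.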
Q7 Interviewer: "Is anonymising the training data enough to stop these attacks?" How do you answer? [L3]
No — and saying so cleanly is the point. Removing direct identifiers stops naive lookups but not statistical attacks. A model trained on "anonymised" data can still leak membership (was this row present) and support attribute inference, because the model learns correlations, not just the dropped name column. Re-identification via quasi-identifiers (PIN code + age + gender) is a separate, well-documented risk.
The honest framing: anonymisation reduces the input sensitivity but gives no formal bound on what the trained model reveals. If you need a provable guarantee against these attacks, you reach for differential privacy, which bounds any single record's influence regardless of side knowledge.
Flip these before your interview
Membership inference: the attacker asks whether this exact record was in training. High model confidence on it leaks the answer. So what: tells you the model memorised individuals.
Model inversion: reconstructs a representative training input (a face, a record) from model outputs. So what: “no raw data shared” is not the same as private.
Model extraction: flood the API with queries to clone the model or steal its IP. So what: defend with auth, rate limits, and query budgets.
Epsilon (ε): the privacy budget in differential privacy; smaller means stronger. ε≈1 strong, ε≈8 typical. So what: always quote ε and δ together.
Federated learning: each silo trains locally and shares only updates, never raw data. So what: gradients still leak, so add secure aggregation plus DP.
HE vs TEE: homomorphic encryption computes on ciphertext (slow); a TEE runs plaintext inside a sealed enclave (fast, trust the chip). So what: speed vs trust trade-off.
Karthik at a Wipro project ships a health-risk API that returns full softmax confidences. A researcher shows he can tell, with high accuracy, whether a named patient's record was in the training set. Predict the cause and the single best control, and how to verify it works.
2. Differential Privacy
Differential privacy (DP) is the only widely-accepted formal privacy guarantee. It does not promise the output is useless to attackers — it promises that any one person's data barely changes the output, so an attacker cannot tell if you opted in. Everything here hinges on two numbers: epsilon and delta.
Q8 Define differential privacy in plain terms. [L1]
Differential privacy guarantees that the result of an analysis is almost the same whether or not any single individual is in the dataset. Formally, for two datasets differing by one record, the probability of any output changes by at most a factor of e^epsilon (plus a small slack delta). You add carefully calibrated random noise to achieve this.
The plain-English promise: nothing an attacker can learn about you from the output could not also be learned if your record had never been included. That is a guarantee about the mechanism, holding even against attackers with unlimited side knowledge — which is why it beats anonymisation.
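Written out, the guarantee described above is the standard (epsilon, delta) inequality. Here the symbol names are notation introduced for this block: M is the randomised mechanism, D and D' are any two datasets differing in one record, and S is any set of possible outputs.

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
\qquad \text{for all neighbouring } D, D' \text{ and all } S
```

Setting delta to zero gives pure epsilon-DP; because the inequality must hold in both directions (swap D and D'), the presence or absence of one record can shift any output probability by at most the factor e^epsilon, up to the slack delta.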
Q9 What does epsilon actually mean, and what is a 'good' value? [L2]
epsilon is the privacy-loss budget — the maximum factor by which any output's probability can shift because of one person. Smaller epsilon = more noise = stronger privacy but lower utility. It is exponential, so epsilon 1 and epsilon 10 are worlds apart, not 10x apart.
Rules of thumb interviewers like: epsilon ≤ 1 is strong, 1–3 is reasonable for many ML tasks, and double-digit epsilon offers weak practical guarantees (some industrial deployments quietly run epsilon in the tens). Always quote epsilon with delta and the unit — per-query, per-user, or per-training-run — because epsilon without scope is meaningless.
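The "exponential, not linear" point is easy to see by printing the worst-case probability ratio e^epsilon for a few budgets. A small sketch (the helper name `max_ratio` is illustrative):

```python
import math

def max_ratio(eps):
    """Worst-case factor by which one record can shift any output probability."""
    return math.exp(eps)

for eps in (0.1, 1, 3, 8, 10):
    print(f"epsilon={eps:<4} e^epsilon = {max_ratio(eps):,.2f}")
```

Going from epsilon 1 to epsilon 10 multiplies the allowed ratio by e^9, which is several thousand, not ten.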
Q10 What is delta, and why is (epsilon, delta)-DP weaker than pure DP? [L2]
delta is the probability that the clean epsilon bound fails entirely — a small chance the mechanism leaks more than promised. Pure DP is (epsilon, 0); approximate DP is (epsilon, delta) with delta tiny.
The catch: if delta is too large, the mechanism could, with probability delta, output someone's full record and still satisfy the inequality. So set delta less than 1/N (smaller than the inverse of the dataset size), often 1e-5 or 1e-6. (epsilon, delta)-DP is weaker but practically necessary — it is what lets the Gaussian mechanism and DP-SGD's composition math work cleanly.
Q11 Explain DP-SGD: how do you make training differentially private? [L2]
DP-SGD adds two operations to every step of ordinary SGD. First, per-example gradient clipping: compute each sample's gradient and clip its L2 norm to a bound C, so no single record can dominate the update. Second, add Gaussian noise with standard deviation sigma × C (sigma is the noise multiplier) to the summed gradients before the optimiser step.
Clipping bounds sensitivity; the noise provides the privacy. A privacy accountant (RDP / the moments accountant, or PLD) tracks cumulative epsilon across all steps. Libraries: TensorFlow Privacy and Opacus. The cost is real — per-example gradients are memory-hungry and you trade accuracy for a lower epsilon.
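The clip-then-noise update can be sketched in a few lines of plain Python. This is a toy illustration of the mechanics only, assuming per-example gradients are already available as lists; it has no privacy accountant and is not how Opacus or TensorFlow Privacy implement it internally. Function and variable names are hypothetical:

```python
import math
import random

def clip(grad, max_norm):
    """Scale one example's gradient so its L2 norm is at most max_norm (C)."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]

def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip per example, sum, add Gaussian noise, average, step."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(col) for col in zip(*clipped)]
    sigma = noise_multiplier * max_norm  # noise std is tied to the clip bound C
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch = len(per_example_grads)
    return [p - lr * n / batch for p, n in zip(params, noisy)]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # first example has norm 5 and gets clipped to 1
new_params = dp_sgd_step([0.0, 0.0], grads, lr=1.0, max_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
print(new_params)
```

Because the noise is added once to the sum, a larger batch spreads the same noise over more examples, which is the signal-to-noise lever the utility discussion relies on.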
Q12 What is a privacy budget, and how does composition affect it across queries or epochs? [L3]
The privacy budget is your total allowable epsilon. Each DP query or training step spends some of it, and DP composes: privacy loss accumulates. Naive (basic) composition just adds epsilons — run the same epsilon-1 query 100 times and you are at epsilon 100. Advanced composition and tight accountants (RDP, PLD) grow the budget closer to sqrt(k) instead of k, which is why DP-SGD over thousands of steps stays usable.
Operationally: fix a target epsilon up front, let the accountant tell you how many steps/queries you can afford, and once the budget is spent you stop releasing — you cannot keep answering for free. Karthik at a Hyderabad SOC enforced this as a hard gate in their analytics pipeline.
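The gap between basic and advanced composition can be checked numerically. The sketch below uses the standard advanced composition bound (Dwork, Rothblum and Vadhan) for k runs of an epsilon-DP mechanism; the function names and the chosen numbers are illustrative, and real pipelines would use an RDP or PLD accountant, which is tighter still:

```python
import math

def basic_composition(eps, k):
    """Basic composition: privacy losses simply add up."""
    return k * eps

def advanced_composition(eps, k, delta_slack):
    """Total epsilon for k runs of an eps-DP mechanism, at the price of
    an extra delta_slack added to the overall delta."""
    return (eps * math.sqrt(2 * k * math.log(1 / delta_slack))
            + k * eps * (math.exp(eps) - 1))

eps, k = 0.1, 100
print("basic   :", basic_composition(eps, k))
print("advanced:", round(advanced_composition(eps, k, delta_slack=1e-5), 2))
```

The sqrt(k) advantage only shows up when the per-step epsilon is small: for 100 runs at epsilon 1 this bound is looser than simply adding, which is one reason practical accountants track the loss distribution instead of applying a single formula.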
Q13 How do you reason about the privacy/utility trade-off when accuracy tanks under DP? [L3]
Lower epsilon means more noise means lower accuracy; that is the law, not a bug. Levers to recover utility at fixed epsilon: more data (noise is fixed per step, so signal-to-noise improves with N), larger batches (more samples per step amortise the noise), tuning the clip norm C (too small loses signal, too large needs more noise), and starting from a public pre-trained model so DP only covers the sensitive fine-tune.
The senior move is to frame it as a budget negotiation: "At epsilon 3 we lose ~4 points of accuracy; is that acceptable to legal for this dataset?" — make the trade explicit rather than silently choosing.
Q14 When is DP NOT the right tool? [L3]
DP protects against per-individual leakage, so it is wrong when the thing you must protect is not an individual record. It does not stop model theft, prompt injection, or a correct-but-harmful inference about a whole group. It hurts badly on small datasets or rare classes, where the required noise destroys utility. And it does nothing for data-in-transit/at-rest confidentiality — that is encryption's job.
It is also the wrong frame when you simply should not have collected the data: the answer there is data minimisation, not noisy training. State the boundary clearly — DP bounds what a released model/statistic reveals about one person, nothing more.
Q15 Contrast central DP vs local DP: who adds the noise, what is the trust model, and where do you deploy each? [L2]
The split is who you have to trust. In central DP, raw records reach a trusted curator/aggregator who computes the statistic or trains the model and adds the calibrated noise once, centrally. Less total noise for the same epsilon, so better utility — but everyone must trust that the curator never leaks the raw data. DP-SGD on a server you control is central DP. In local DP (LDP), each user perturbs their own value on their device before it ever leaves, so the collector only ever sees noised data and there is no trusted curator. The canonical primitive is randomised response ("flip a coin, sometimes lie"), generalised by RAPPOR (Google's Chrome telemetry) and Apple's count-mean-sketch for emoji/typing stats.
The cost of removing the trusted party is brutal noise: LDP error on a mean scales roughly as 1/(epsilon·sqrt(N)), against roughly 1/(epsilon·N) under central DP, so it only works at population scale for aggregate telemetry, not for training a high-fidelity model on a few thousand rows. The interview soundbite: central DP = better utility, must trust the curator; local DP = no trusted party, far worse utility, needs millions of users. The modern middle ground is the shuffle model (anonymise/shuffle LDP reports to amplify privacy), which buys back much of the utility; distributed DP inside secure aggregation in FL is essentially the same idea.
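Randomised response, the local-DP primitive named above, fits in a few lines. This is a minimal sketch for a single yes/no attribute, assuming each user tells the truth with probability e^epsilon / (1 + e^epsilon); the population size and true rate are invented for the demo, and this is not RAPPOR or Apple's sketch-based scheme:

```python
import math
import random

def randomise(truth, eps, rng):
    """Report the true bit with probability e^eps / (1 + e^eps), else the flipped bit."""
    p_truth = math.exp(eps) / (1 + math.exp(eps))
    return truth if rng.random() < p_truth else 1 - truth

def estimate_rate(reports, eps):
    """Debias the observed share of 1s to estimate the true population rate."""
    p_truth = math.exp(eps) / (1 + math.exp(eps))
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)

rng = random.Random(0)
true_bits = [1] * 30_000 + [0] * 70_000  # true rate is 30%
reports = [randomise(b, eps=1.0, rng=rng) for b in true_bits]
estimate = estimate_rate(reports, eps=1.0)
print(f"estimated rate: {estimate:.3f}")
```

The collector never sees a trustworthy individual answer, yet the aggregate is recoverable; shrinking the population makes the estimate noticeably noisier, which is the utility cost described above.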
Q16 DP fine-tuning a 7B LLM with LoRA/PEFT: how does it interact with DP-SGD per-example clipping and the accountant, and what is the utility hit at epsilon ≈ 8? [L3]
The reason this pairing works is that LoRA/PEFT shrinks the thing DP-SGD has to noise. DP-SGD's noise is added to the gradient and scaled to the clip norm C; full fine-tuning of 7B parameters means an enormous, high-dimensional gradient and correspondingly destructive noise. LoRA freezes the base weights and trains only small low-rank adapter matrices (often <1% of parameters), so you clip and noise a tiny gradient — much better signal-to-noise at the same epsilon. Concretely you still do per-example gradient clipping, but only over the adapter parameters, then add Gaussian noise scaled to C, then step; the RDP/PLD accountant tracks epsilon over steps exactly as before. The painful part is engineering: per-example gradients defeat the usual batched backprop, so you need per-sample gradients (Opacus hooks, functorch/vmap, or ghost clipping to avoid materialising them) — memory, not math, is the bottleneck.
On utility: starting from a strong public pre-trained base and privatising only the fine-tune is what makes DP affordable — the public prior does the heavy lifting and DP only protects the sensitive adapter data. At epsilon ≈ 8, delta = 1e-5, DP-LoRA fine-tunes typically land within a few points of non-private accuracy on classification/generation tasks; the hit grows sharply as you push toward epsilon ≈ 1. Gotchas to name: don't set epsilon needlessly small (it tanks accuracy without meaningfully lowering real attack risk), tune C and noise multiplier together, prefer large batch sizes (more samples amortise the noise), and remember the base model's pre-training data carries no DP guarantee — DP-LoRA only protects the fine-tune set.
Walkthrough: a leaking support LLM gets fixed (Vikram at a Mumbai bank)
Vikram fine-tunes a support model on real tickets, finds it leaking a customer's PAN, and locks privacy down in six stages.
(1) Fine-tune: 2.4M real customer tickets to auto-draft replies.
(2) Leak found: the model returns a PAN verbatim, i.e. memorisation.
(3) Scrub: Presidio to strip IN_PAN, IN_AADHAAR, PERSON before training.
(4) DP-SGD retrain: max_grad_norm=1.0, noise_multiplier=1.1, target ε≈8.
(5) Membership-inference re-test: ≈50%, near chance, no telling who was in.
(6) Governance: NIST AI RMF before the model ships.
Sneha trains a credit-default model at a Pune fintech using DP-SGD with TensorFlow Privacy. To get a stronger privacy guarantee for customers, her lead asks her to lower the privacy budget. Her panel asks what changing epsilon from 8 to 1 actually does.
Answer: epsilon is the privacy-loss bound, so a smaller value means a tighter guarantee, more added noise, and less information leaked about any single record, typically at some accuracy cost (NIST AI RMF MEASURE; OpenDP / TensorFlow Privacy). Why the others are wrong: (a) inverts the meaning; smaller epsilon is stronger, not weaker. (c) The noise multiplier and clip norm, not the learning rate, drive the privacy spend. (d) DP adds calibrated noise to gradients; it does not encrypt them.
Aman at a Chennai ITES reports his DP-trained model started at epsilon 2, but after months of weekly retrains on the same users the team is unsure what privacy it now offers. Predict the cause and the fix, and how to verify the real guarantee.
Answer: epsilon is now far higher than 2. Differential privacy composes: running the mechanism many times over the same individuals adds up, and an untracked budget silently erodes the guarantee (NIST AI RMF GOVERN/MEASURE). Fix: track a cumulative privacy budget with a proper accountant (the RDP / moments accountant in TensorFlow Privacy or OpenDP), set a hard total-epsilon ceiling across all retrains, and stop or re-scope once it is reached. Verify by reading the accountant's reported total epsilon after the latest run and confirming it is at or below the agreed ceiling before release.
3. Federated Learning & Secure Aggregation
Federated learning (FL) flips the pipeline: instead of pulling everyone's data to a central server, you send the model to the data, train locally, and ship back only updates. It reduces raw-data movement — but updates leak too, so FL is a starting point, not a finished privacy control.
Q17 What is federated learning, in one breath? [L1]
Federated learning trains a shared model across many clients (phones, hospitals, banks) without centralising their raw data. The server sends the current model; each client trains on its local data; clients send back only model updates (gradients or weight deltas); the server aggregates them (typically FedAvg — a weighted average) into a new global model. Repeat for many rounds.
The privacy pitch is data minimisation: the raw data never leaves the device. The catch — which every good answer adds — is that the updates themselves can leak the data, so FL needs extra protection layered on.
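The aggregation step of FedAvg is just a weighted average. A minimal sketch, assuming each client returns its weight vector together with its local example count (the two clients and their numbers are made up):

```python
def fedavg(client_updates):
    """Weighted average of client weight vectors; weights are local example counts."""
    total = sum(count for _, count in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * count for weights, count in client_updates) / total
        for i in range(dim)
    ]

# (weight vector, number of local training examples) per client
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300)]
global_weights = fedavg(updates)
print(global_weights)
```

Note that the server here sees every client's vector in the clear, which is exactly the exposure that secure aggregation is meant to remove.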
Q18 What is gradient leakage and why does 'we only share gradients' fail? [L2]
Gradient leakage is the fact that a client's update is a function of its private data, so an attacker (often the server) can invert it to reconstruct that data. Deep Leakage from Gradients optimises synthetic inputs until their gradient matches the shared one, recovering near-exact images and text in some settings.
So "we only share gradients" fails because gradients are not anonymous summaries — they are a lossy but rich encoding of the very samples used. The exposure is worst for a single client's raw, unaggregated, small update. Defences: aggregate across many clients, clip and add noise, and hide individual contributions with secure aggregation.
Q19 What is secure aggregation and what does it protect? [L2]
Secure aggregation lets the server learn only the sum of client updates, never any individual one. Clients mask their updates with pairwise random masks that cancel out when summed (Bonawitz et al.'s protocol), often with Shamir secret-sharing so the sum still works if some clients drop out. The server sees a meaningful aggregate; each contribution stays hidden.
It directly neutralises an honest-but-curious server trying gradient leakage on one client — there is no single gradient to invert. What it does not give you is a privacy bound on the aggregate itself; that still requires DP noise on top.
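The cancelling-masks idea can be demonstrated with integers. This is a toy sketch of the core trick only, assuming updates are already quantised to integers modulo 2^32 and that every pair of clients somehow shares a random mask; it omits the key agreement and the Shamir-based dropout recovery of the real protocol:

```python
import random

MODULUS = 2**32  # assumption: updates are quantised to integers mod 2^32

def mask_updates(updates, rng):
    """Each pair (i, j) with i < j shares a random mask: i adds it, j subtracts it."""
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            pair_mask = [rng.randrange(MODULUS) for _ in range(dim)]
            for d in range(dim):
                masked[i][d] = (masked[i][d] + pair_mask[d]) % MODULUS
                masked[j][d] = (masked[j][d] - pair_mask[d]) % MODULUS
    return masked

def server_sum(masked):
    """The server adds what it receives; the pairwise masks cancel in the sum."""
    return [sum(col) % MODULUS for col in zip(*masked)]

rng = random.Random(0)
updates = [[5, 10], [7, 1], [2, 2]]
masked = mask_updates(updates, rng)
print("what the server sees:", masked)
print("sum it can compute  :", server_sum(masked))
```

Each masked vector looks random on its own, but the column sums match the true totals, so the server learns the aggregate and nothing per client.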
Q20 Describe the federated threat model: honest-but-curious server vs malicious clients. [L3]
Two distinct adversaries. The honest-but-curious server follows the protocol but tries to learn from what it sees — its weapon is gradient leakage; its counter is secure aggregation plus DP. The malicious client attacks the model's integrity, not its confidentiality — data/model poisoning and backdoor attacks by sending crafted updates.
The trap interviewers set: secure aggregation helps against the curious server but hurts poisoning defence, because hiding individual updates also hides the malicious ones you would want to flag. So you need poisoning-resistant aggregation (Krum, trimmed-mean, norm clipping) and anomaly checks that work on protected updates. Naming this tension scores well.
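A coordinate-wise trimmed mean shows why robust aggregation blunts a single poisoned client. The sketch below uses invented updates and operates on plaintext per-client vectors, which is precisely what secure aggregation hides, so it also illustrates the tension just described:

```python
def trimmed_mean(updates, trim):
    """Coordinate-wise mean after dropping the `trim` largest and smallest values."""
    result = []
    for column in zip(*updates):
        kept = sorted(column)[trim:len(column) - trim]
        result.append(sum(kept) / len(kept))
    return result

honest = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.0]]
poisoned = honest + [[100.0, -100.0]]  # one malicious client

plain = [sum(col) / len(col) for col in zip(*poisoned)]
robust = trimmed_mean(poisoned, trim=1)
print("plain mean  :", plain)
print("trimmed mean:", robust)
```

The plain mean is dragged far from the honest consensus by one client, while the trimmed mean stays close to it.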
Q21 How do you combine FL, secure aggregation, and DP, and what's user-level vs example-level DP? [L3]
Stack them: federated training keeps raw data local, secure aggregation hides each client's update from the server, and differential privacy bounds what the released global model reveals. The right DP unit in FL is usually user-level (client-level) DP — clip and noise so that any one client's entire dataset barely affects the model — which is stronger and more meaningful than example-level DP when one user owns many records.
Place the DP noise carefully: distributed DP adds noise inside secure aggregation so no party sees a clean per-client value, avoiding the trust problem of a central server adding it. This trio is what production FL (e.g., on-device keyboards) actually uses.
Q22 A Mumbai bank wants FL across branches for fraud detection. What do you flag? [L3]
Flag four things. (1) Updates leak — mandate secure aggregation; raw gradients to a central server are a no-go. (2) Non-IID data — branches differ, so FedAvg may converge poorly; expect tuning and possibly per-branch personalisation. (3) Poisoning — a compromised branch can backdoor the model; add poisoning-resistant aggregation and update-norm anomaly checks. (4) Regulatory framing — FL reduces data movement (good for DPDP/RBI data-localisation) but the model still encodes customer signals, so add client-level DP and a model-leakage assessment (MIA test) before release.
Closing line: FL is a data-minimisation win, not a privacy guarantee — secure aggregation and DP are what make it defensible.
Q23 Name a concrete FL framework and wire up secure aggregation, client-level DP, dropout-resilience and Byzantine-robust aggregation. [L3]
Name one and show you've actually used it: Flower (framework-agnostic, the common production choice), TensorFlow Federated (TFF, research/simulation), NVIDIA FLARE (enterprise/healthcare), or PySyft from OpenMined. Sketch the wiring in Flower: a custom Strategy orchestrates rounds; clients implement fit()/evaluate() on local data. Layer the controls: (1) secure aggregation via Flower's SecAgg/SecAgg+ (the Salvia implementation of the Bonawitz protocol) so the server only ever sees the summed update and never any one client's; it is built to survive client dropout through Shamir secret-sharing of the masks. (2) Client-level DP — clip each client's whole update to a norm and add Gaussian noise (Flower's DP wrappers / Opacus on-device), so one client's entire dataset barely moves the global model. (3) Byzantine-robust aggregation — swap plain FedAvg for Krum, multi-Krum, trimmed-mean, or median to blunt poisoning from malicious clients.
The senior point is the tension you must resolve: secure aggregation hides individual updates, but robust aggregation (Krum/trimmed-mean) needs to inspect them to reject outliers — you can't naively do both. Reconcile it with norm-bounding/clipping before masking, robustness checks computed inside the secure-aggregation protocol, or distributed DP that adds noise within the aggregate. Close with: pick user/client-level DP for the unit, SecAgg+ for dropout-resilient confidentiality, and a robust aggregator for integrity — that trio is what real on-device deployments (Gboard-style) actually run.
Rahul builds cross-hospital federated learning for a Bangalore AI startup so patient data never leaves each site. A reviewer points out the central server still sees each hospital's raw model update every round. His panel asks for the precise risk and the right control.
4. Crypto for ML
Cryptographic ML lets you compute on data you are not allowed to see in the clear. Three families dominate: homomorphic encryption (math on ciphertext), secure multiparty computation (split trust across parties), and trusted execution environments (hardware enclaves). They differ sharply on threat model, latency, and cost.
Q24 What is homomorphic encryption (HE) and what makes it special for ML? [L1]
Homomorphic encryption lets you compute directly on encrypted data and get an encrypted result that, once decrypted, equals the computation on the plaintext. The server never sees the inputs, the outputs, or the intermediate values in the clear. For ML this means a client can send an encrypted query, the server runs inference on it, and only the client can decrypt the answer.
Schemes to name: CKKS (approximate, ideal for real-valued ML), BFV/BGV (exact integers). Libraries: Microsoft SEAL, OpenFHE, TenSEAL. The catch is cost — covered next.
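To make "compute on ciphertext" concrete without any external library, here is a toy Paillier cryptosystem. It is a swapped-in illustration, not CKKS or BFV: Paillier is only additively homomorphic, and the tiny hard-coded primes make it completely insecure. It requires Python 3.8+ for the modular inverse via `pow(x, -1, n)`; all names are illustrative:

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier keys from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    x = pow(g, lam, n * n)
    mu = pow((x - 1) // n, -1, n)  # inverse of L(g^lam mod n^2) mod n
    return (n, g), (lam, mu)

def encrypt(pub, message, rng):
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, message, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, ciphertext):
    n, _ = pub
    lam, mu = priv
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)

pub, priv = keygen()
rng = random.Random(7)
c1, c2 = encrypt(pub, 120, rng), encrypt(pub, 75, rng)
c_sum = add_encrypted(pub, c1, c2)  # server never sees 120 or 75
print(decrypt(pub, priv, c_sum))
```

Schemes such as CKKS extend this idea to both addition and multiplication on packed real-valued vectors, which is what encrypted inference needs and where the cost comes from.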
Q25 Why is fully homomorphic encryption so expensive, and what's the practical workaround? [L2]
FHE is slow because every operation works on large lattice-based ciphertexts, and multiplications grow noise that must be periodically reset by an expensive bootstrapping step. Overheads of 1,000x to 1,000,000x in compute and big ciphertext expansion are normal. Non-linear functions (ReLU, softmax) are especially painful — you approximate them with low-degree polynomials.
Practical workarounds: use leveled HE (allow a fixed multiplicative depth, skip bootstrapping) for shallow models; restrict HE to inference, not training; design polynomial-friendly networks; or switch to SMPC/TEEs when latency matters. HE shines for low-throughput, high-sensitivity tasks — encrypted scoring of a few records — not real-time, high-QPS serving.
Q26 What is secure multiparty computation (SMPC) and when do you use it for ML? [L2]
Secure multiparty computation lets several parties jointly compute a function over their combined inputs while each keeps its input private. Inputs are secret-shared across parties; they exchange messages to compute (using shares + protocols like Yao's garbled circuits or additive sharing) and only the agreed output is revealed.
Use it for split-trust settings: two hospitals training a joint model without sharing patient rows, or splitting an inference between two non-colluding servers so neither sees the input. The cost is communication — many rounds of network traffic — so it is bandwidth- and latency-bound, not compute-bound like HE. Security usually holds only if parties do not collude.
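Additive secret sharing, the simplest of the building blocks mentioned above, can be sketched as follows. The scenario (two banks, two non-colluding compute servers, a joint sum) and every value are hypothetical; real protocols add multiplication, authentication, and networking on top:

```python
import random

PRIME = 2**61 - 1  # field modulus (a Mersenne prime); inputs must be smaller

def share(secret, rng):
    """Split a secret into two additive shares that sum to it mod PRIME."""
    first = rng.randrange(PRIME)
    return first, (secret - first) % PRIME

rng = random.Random(42)
bank_a_score, bank_b_score = 420, 275

# Each bank sends one share to each compute server.
a_for_s0, a_for_s1 = share(bank_a_score, rng)
b_for_s0, b_for_s1 = share(bank_b_score, rng)

# Each server adds only the shares it holds; either partial alone looks random.
partial_s0 = (a_for_s0 + b_for_s0) % PRIME
partial_s1 = (a_for_s1 + b_for_s1) % PRIME

# Only combining both partials reveals the agreed output.
joint = (partial_s0 + partial_s1) % PRIME
print(joint)
```

The privacy rests on the two servers not pooling their shares, which is the non-collusion assumption stated above.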
Q27 What is a trusted execution environment (TEE) / confidential computing for ML? [L2]
A TEE is a hardware-isolated enclave (Intel SGX/TDX, AMD SEV-SNP, NVIDIA Confidential Computing on H100) where code and data run encrypted in memory, shielded even from the OS, hypervisor, and cloud operator. You load the model and data into the enclave, run normal (plaintext-speed) computation inside, and use remote attestation to prove to a client which code is running before sending secrets.
For ML this gives near-native speed for both training and inference on sensitive data — far faster than HE/SMPC. The trade-off is the trust root shifts to the hardware vendor, and TEEs have a real history of side-channel attacks (Spectre-class, page-fault leaks). It is hardware-enforced isolation, not a math proof.
Q28 Compare HE vs SMPC vs TEE on threat model, latency, and cost. [L3]
Threat model: HE and SMPC give cryptographic guarantees (HE trusts no one with the data; SMPC trusts that parties don't collude). TEE gives hardware-rooted guarantees — you must trust the chip vendor and accept side-channel exposure. Latency/throughput: TEE ≈ near-native (best); SMPC is bottlenecked by network rounds; HE is the slowest by far. Cost: HE = huge compute + ciphertext blow-up; SMPC = bandwidth + multiple non-colluding parties to operate; TEE = special hardware + attestation plumbing.
The senior call: pick TEE for performance-critical workloads where hardware trust is acceptable; SMPC for two/three-party split-trust without special hardware; HE for the highest-assurance, low-throughput case where you can tolerate the cost. Often they combine — e.g., a TEE running an SMPC party.
Q29 A Chennai ITES wants to run inference on a vendor's model over PHI without exposing either. What do you propose? [L3]
State the dual constraint: the client's PHI must stay private from the vendor, and the vendor's model weights must stay private from the client. That rules out simply shipping data or the model.
Pragmatic answer: run inference inside a TEE (e.g., SEV-SNP or H100 confidential computing). The model loads into the enclave, the client sends encrypted PHI that is only decrypted inside, and remote attestation proves to both sides that the unmodified, agreed code is running before any secret is released. This gives near-native latency, which HE could not. If hardware trust is unacceptable to the regulator, fall back to two-party SMPC between client and vendor — slower, but no hardware root of trust. Layer DLP/Presidio redaction and audit logging regardless.
Q30 Can RAG embeddings stored in a vector DB leak the underlying text (embedding inversion / vec2text), and how do you mitigate it? [L3]
Yes — and treat the embedding as sensitive data, not an opaque hash. Embedding inversion reconstructs the original text from its vector: vec2text iteratively refines a guess until its embedding matches the target and can recover a large fraction of short inputs (reported ~90%+ of a 32-token passage from some text-embedding models), and newer attacks (e.g. ALGEN) do it few-shot from a single leaked vector or even zero-shot across embedding spaces. So a leaked or over-shared vector store can expose the very documents you indexed — medical notes, contracts, support tickets. OWASP now lists vector/embedding weaknesses among the 2025 LLM risks, and the threat compounds with vector poisoning (malicious entries steering retrieval).
Mitigate in layers. Access control first: treat the vector DB like the source corpus — encrypt at rest, lock down API/IAM, never expose raw vectors to clients (the attack needs the vectors). Minimise at ingest: run Presidio/DLP to strip PAN/Aadhaar/PII before chunks are embedded, so even a recovered chunk carries no identifiers. Perturb the embeddings: DP-style noise on vectors (generalised Laplace / directional Purkayastha mechanisms) or transforms like STEER cut reconstruction quality sharply while keeping most retrieval accuracy. Detect anomalous bulk reads and poisoned inserts; audit who can query/export the index. The framing that scores: "embeddings are lossily reversible, so the vector store inherits the corpus's classification and the same minimise-encrypt-monitor controls."
Priya at a Mumbai bank wants two banks to jointly compute a shared fraud score without either revealing its customer features to the other. Her panel asks which cryptographic technique fits — and what its main cost is.
5. Data Minimisation & PII
The cheapest privacy control is the data you never collect, keep, or log. This section is the un-glamorous, interview-favourite ground: anonymisation versus pseudonymisation, synthetic data's limits, PII detection with Presidio, retention, and how it all maps to GDPR and India's DPDP Act.
Q31 Anonymisation vs pseudonymisation — what's the difference and why does it matter legally?L1
Pseudonymisation replaces identifiers with a token or key (e.g., name → USR_8842) but the mapping back exists somewhere — so it is reversible and still personal data under GDPR/DPDP. Anonymisation irreversibly strips identifiability so a person cannot be re-identified by reasonable means — true anonymous data falls outside data-protection law.
Why it matters: teams call data "anonymised" when it is merely pseudonymised, and assume the regulation no longer applies. It does. The bar for genuine anonymisation is high because of re-identification via quasi-identifiers — covered next.
Q32 What is re-identification risk and how do k-anonymity / l-diversity address it?L2
Re-identification is recovering an individual from supposedly anonymous data by combining quasi-identifiers (PIN code, age, gender, date) with outside data. The classic result is Latanya Sweeney's estimate that about 87% of the US population is unique on just ZIP code, gender, and date of birth.
k-anonymity requires every record to be indistinguishable from at least k-1 others on the quasi-identifiers (via generalisation/suppression), so you can't narrow below a group of k. Its weaknesses spawned l-diversity (each group must have at least l distinct sensitive values, to stop attribute disclosure) and t-closeness. All are syntactic and brittle against side knowledge — which is the argument for DP when you need a real guarantee.
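A small sketch of how the two properties can be measured on a table. The rows, column names, and generalised values below are hypothetical; the point is that a table can satisfy k-anonymity and still fail l-diversity.

```python
from collections import defaultdict

def group_by_quasi_identifiers(rows, quasi_ids):
    """Bucket rows into equivalence classes keyed by their quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_ids)
        groups[key].append(row)
    return groups

def k_anonymity(rows, quasi_ids):
    """Size of the smallest equivalence class, i.e. the k the table achieves."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return min(len(g) for g in groups.values())

def l_diversity(rows, quasi_ids, sensitive):
    """Fewest distinct sensitive values in any class, i.e. the l the table achieves."""
    groups = group_by_quasi_identifiers(rows, quasi_ids)
    return min(len({r[sensitive] for r in g}) for g in groups.values())

# Hypothetical table, already generalised: PIN prefix and age band are quasi-identifiers.
rows = [
    {"pin": "4000**", "age": "30-39", "diagnosis": "diabetes"},
    {"pin": "4000**", "age": "30-39", "diagnosis": "diabetes"},
    {"pin": "4000**", "age": "30-39", "diagnosis": "diabetes"},
    {"pin": "5600**", "age": "40-49", "diagnosis": "asthma"},
    {"pin": "5600**", "age": "40-49", "diagnosis": "flu"},
    {"pin": "5600**", "age": "40-49", "diagnosis": "diabetes"},
]

k = k_anonymity(rows, ["pin", "age"])
l = l_diversity(rows, ["pin", "age"], "diagnosis")
print(k, l)  # 3 1
```

The table is 3-anonymous, yet the first group has only one diagnosis, so knowing someone falls in that group reveals their condition. That attribute disclosure is the gap l-diversity was designed to close.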
Q33 Does synthetic data solve privacy? What are its leakage limits?L2
Not automatically. Synthetic data is generated to mimic the statistics of real data, and it helps — there is no direct one-to-one record. But a generative model trained on real data can memorise and regurgitate real records, or preserve enough structure that membership and attribute inference still succeed. "Synthetic" is not a synonym for "private".
The defensible version is DP-trained synthetic data: train the generator with DP-SGD so the output carries a formal epsilon bound. Then test it — run a membership inference attack against the generator and a nearest-neighbour distance check to confirm it isn't copying training rows. Generate-then-trust is a mistake; generate-then-measure is the standard.
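One way the nearest-neighbour distance check might look, as a sketch. It assumes numeric rows, Euclidean distance, a held-out set the generator never saw, and an arbitrary ratio threshold of 0.1; all data and the `copy_suspects` helper are hypothetical. A synthetic row that sits far closer to a training row than to any held-out row is a candidate copy.

```python
import math

def nearest_distance(point, reference):
    """Euclidean distance from `point` to its closest row in `reference`."""
    return min(math.dist(point, r) for r in reference)

def copy_suspects(synthetic, train, holdout, ratio=0.1):
    """Return indices of synthetic rows much closer to training data than to held-out data.

    A row is flagged when its nearest-training distance is at most `ratio`
    times its nearest-holdout distance (threshold chosen for illustration).
    """
    suspects = []
    for i, s in enumerate(synthetic):
        d_train = nearest_distance(s, train)
        d_hold = nearest_distance(s, holdout)
        if d_train <= ratio * d_hold:
            suspects.append(i)
    return suspects

# Hypothetical two-feature rows.
train = [(1.0, 2.0), (5.0, 5.0), (9.0, 1.0)]
holdout = [(2.0, 3.0), (6.0, 4.0), (8.0, 2.0)]
synthetic = [(1.0, 2.0), (4.0, 4.0)]  # the first row is an exact copy of a training row

print(copy_suspects(synthetic, train, holdout))  # [0]
```

This check complements, rather than replaces, a membership inference test against the generator: it catches near-copies but says nothing about subtler statistical leakage.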
Q34 How does Presidio detect and redact PII in prompts and logs?L2
Microsoft Presidio is an open-source PII tool with two parts: an Analyzer that finds entities and an Anonymizer that transforms them. The Analyzer combines regex patterns, named-entity recognition (spaCy/transformer models), checksum validators (e.g., credit-card Luhn), and context words to score detections, returning entity type, span, and confidence.
You add custom recognizers for India-specific PII — Aadhaar, PAN, GST, IFSC. The Anonymizer then redacts/masks/replaces/hashes/encrypts each span. In an LLM pipeline you run it on inbound prompts (strip PII before it reaches the model or logs) and on outbound responses/logs (catch regurgitated PII), reducing OWASP LLM02 exposure.
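To make the detect-then-transform flow concrete, here is a standard-library sketch of the same idea. It is not Presidio's API: `analyze`, `anonymize`, the scores, and the context words are hypothetical stand-ins, and the PAN pattern assumes the five-letters, four-digits, one-letter format. It shows a regex pattern, a Luhn checksum validator, a context-word score boost, and span replacement.

```python
import re

# Assumed formats: PAN as 5 letters + 4 digits + 1 letter; cards as 13-19 digits
# with optional single spaces or hyphens between digits.
PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
CONTEXT_WORDS = {"PAN": ("pan", "income tax"), "CREDIT_CARD": ("card", "credit")}

def luhn_valid(number):
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def analyze(text):
    """Return detections with entity type, span, and an illustrative confidence score."""
    results = []
    for m in PAN_RE.finditer(text):
        results.append({"type": "PAN", "start": m.start(), "end": m.end(), "score": 0.5})
    for m in CARD_RE.finditer(text):
        if luhn_valid(m.group()):  # the checksum filters out random digit runs
            results.append({"type": "CREDIT_CARD", "start": m.start(), "end": m.end(), "score": 0.75})
    lowered = text.lower()
    for r in results:
        # Boost the score when a supporting context word appears anywhere in the text.
        if any(word in lowered for word in CONTEXT_WORDS[r["type"]]):
            r["score"] = min(1.0, r["score"] + 0.25)
    return results

def anonymize(text, results):
    """Replace each span with a <TYPE> placeholder, right to left so offsets stay valid."""
    for r in sorted(results, key=lambda r: r["start"], reverse=True):
        text = text[:r["start"]] + "<" + r["type"] + ">" + text[r["end"]:]
    return text

prompt = "My PAN is ABCDE1234F and my card 4111 1111 1111 1111 was charged twice."
redacted = anonymize(prompt, analyze(prompt))
print(redacted)  # My PAN is <PAN> and my card <CREDIT_CARD> was charged twice.
```

In a real pipeline the same two-stage call would sit on both the inbound prompt path and the outbound response/log path, with Presidio's own recognizers and operators replacing these stand-ins.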
Q35 Design a data-minimisation + retention strategy for an LLM customer-support product.L3
Work the data lifecycle. Collect: only fields needed to answer — purpose limitation; don't log full payloads by default. Ingest: run Presidio on prompts to redact Aadhaar/PAN/card numbers before they hit logs, vector stores, or fine-tune sets. Store: separate operational data from training data; pseudonymise user IDs; encrypt at rest. Train: de-duplicate and scrub the corpus; consider DP fine-tuning. Retain: set explicit TTLs (e.g., transcripts 30–90 days) with automated deletion, and honour DPDP erasure/consent-withdrawal requests. Serve: output-side DLP to catch leaks.
Closing principle for the panel: data you never store cannot leak, be subpoenaed, or breach — minimisation is the control with the best ROI, and it is a DPDP/GDPR obligation, not a nice-to-have.
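The Retain step of that lifecycle can be sketched as a scheduled purge. The record kinds, TTL values, and field names below are hypothetical; the idea is that expiry and erasure requests are enforced by code rather than by policy documents.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention policy: days to keep each class of record.
RETENTION_DAYS = {"transcript": 90, "debug_log": 30}

def is_expired(record, now):
    """True when the record is older than the TTL for its kind."""
    ttl = timedelta(days=RETENTION_DAYS[record["kind"]])
    return now - record["created_at"] > ttl

def purge(records, now, erasure_requests=frozenset()):
    """Keep only records inside their TTL whose owner has not requested erasure."""
    return [
        r for r in records
        if not is_expired(r, now) and r["user_id"] not in erasure_requests
    ]

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "kind": "transcript", "user_id": "USR_1", "created_at": now - timedelta(days=10)},
    {"id": 2, "kind": "transcript", "user_id": "USR_2", "created_at": now - timedelta(days=120)},
    {"id": 3, "kind": "debug_log", "user_id": "USR_3", "created_at": now - timedelta(days=5)},
    {"id": 4, "kind": "debug_log", "user_id": "USR_1", "created_at": now - timedelta(days=45)},
]

kept = purge(records, now, erasure_requests={"USR_3"})
print([r["id"] for r in kept])  # [1]
```

Records 2 and 4 fall out on TTL, and record 3 falls out because its owner asked for erasure. The same job would also need to reach derived stores such as vector indexes and fine-tune sets.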
Q36 How do these techniques map to GDPR and India's DPDP Act?L3
Tie controls to legal principles. Data minimisation and purpose limitation are direct obligations under both GDPR (Art. 5) and the DPDP Act 2023. Pseudonymisation is an explicitly named GDPR safeguard but keeps data in scope; only true anonymisation exits scope. The right to erasure is hard for trained models — you cannot easily un-train a record, so favour minimisation and consider machine unlearning or retraining for high-risk cases.
DPDP adds consent, data-localisation pressure (relevant to FL/TEE placement), and breach notification — a model regurgitating PII is a reportable event. DP and synthetic data help you argue you process less identifiable data, but they don't, on their own, satisfy consent or erasure. Map each technique to a duty; don't claim one tool covers all.
Q37 A user invokes right-to-erasure on a trained model — is retraining the only option? Explain SISA and approximate/certified unlearning.L3
Start by naming the hard problem: deleting the row from the database does not remove its influence from the weights — the model still memorised it, so erasure obligations (GDPR Art. 17, DPDP) bite the model, not just the store. Full retraining from the surviving data is the gold standard (provably removes the record) but is too slow/expensive to do per request. Machine unlearning is the field that does it cheaply. The cleanest exact method is SISA (Sharded, Isolated, Sliced, Aggregated): partition data into shards, train an independent sub-model per shard, and ensemble them — to forget a record you only retrain the one shard it lived in, cutting cost by roughly the shard count. The trade-off is some accuracy loss from sharding the data.
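A toy sketch of the sharding and aggregation parts of SISA. The sub-model here is a deliberately trivial per-label centroid on one feature, the data is made up, and the "Sliced" part (incremental checkpoints within a shard) is omitted; `SisaEnsemble` and its methods are hypothetical names. What it demonstrates is that forgetting one record retrains exactly one sub-model.

```python
from collections import Counter
from statistics import mean

def train_submodel(shard):
    """Toy sub-model: one centroid per label, fitted on this shard only."""
    by_label = {}
    for x, y in shard.values():
        by_label.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict_submodel(model, x):
    """Predict the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

class SisaEnsemble:
    def __init__(self, records, num_shards):
        # records: {record_id: (feature, label)}. Shard assignment is fixed and remembered.
        self.shards = [dict() for _ in range(num_shards)]
        self.shard_of = {}
        for i, (rid, rec) in enumerate(sorted(records.items())):
            s = i % num_shards
            self.shards[s][rid] = rec
            self.shard_of[rid] = s
        self.models = [train_submodel(s) for s in self.shards]
        self.retrain_count = 0

    def predict(self, x):
        """Aggregate by majority vote across the isolated sub-models."""
        votes = Counter(predict_submodel(m, x) for m in self.models)
        return votes.most_common(1)[0][0]

    def forget(self, rid):
        """Delete one record and retrain only the shard that held it."""
        s = self.shard_of.pop(rid)
        del self.shards[s][rid]
        self.models[s] = train_submodel(self.shards[s])
        self.retrain_count += 1

# Hypothetical data: one feature, two labels.
records = {
    "r0": (1.0, "ok"), "r1": (1.2, "ok"), "r2": (0.8, "ok"),
    "r3": (9.0, "fraud"), "r4": (9.5, "fraud"), "r5": (8.5, "fraud"),
    "r6": (1.1, "ok"), "r7": (9.2, "fraud"), "r8": (0.9, "ok"),
}

ens = SisaEnsemble(records, num_shards=3)
untouched = (ens.models[0], ens.models[2])
ens.forget("r4")  # r4 lives in shard 1, so only that sub-model is retrained
print(ens.retrain_count, ens.predict(9.0))  # 1 fraud
```

Because each sub-model only ever saw its own shard, the retrained ensemble is exactly what training without `r4` would have produced, which is why SISA counts as exact rather than approximate unlearning.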
When even shard-retraining is too costly, use approximate unlearning: directly edit the trained weights to estimate "the model you'd have without that point" — e.g. an influence-function / Newton step in the opposite direction of the record's contribution, or gradient-ascent-style forgetting. These are fast but heuristic. Certified removal is the rigorous subclass: it gives a provable (epsilon-style) bound that the unlearned model is statistically indistinguishable from a retrained one — conceptually DP applied to the act of forgetting. Gotchas to state: verify unlearning with a membership-inference test on the forgotten record (it should drop to chance); naive gradient-ascent "forgetting" can quietly wreck overall accuracy or fail to forget; and downstream artefacts — checkpoints, caches, embeddings/vector stores, and logs — must be purged too, or the data resurfaces. For high-risk data, favour data minimisation up front so there's less to unlearn later.
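The "DP applied to forgetting" intuition can be written down. One common formulation, sketched here along the lines of the epsilon-certified removal definition of Guo et al., uses symbols that are not in the text above: A is the (randomised) training algorithm, M is the removal mechanism, D is the dataset, x is the record to forget, and T ranges over sets of possible models.

```latex
% epsilon-certified removal: the unlearned model and a model retrained
% without x must be near-indistinguishable for every outcome set T.
e^{-\varepsilon}
\;\le\;
\frac{\Pr\big[\, M(A(D),\, D,\, x) \in T \,\big]}
     {\Pr\big[\, A(D \setminus \{x\}) \in T \,\big]}
\;\le\;
e^{\varepsilon}
\qquad \text{for all } T,\ D,\ x \in D .
```

Setting epsilon to 0 recovers exact unlearning (the two distributions coincide), which is what full retraining and SISA aim for; a small positive epsilon is the price of editing weights in place.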
Divya at a Hyderabad SOC finds that a support chatbot's fine-tuning logs contain customer PAN numbers, phone numbers and addresses in plain text, and the model sometimes repeats them. Predict what went wrong and the one pipeline change that fixes the blind spot.
⚡ Privacy-Preserving Machine Learning last-minute cheat-sheet
Membership: was I in the data? Inversion/attribute: what is my value? Extraction: LLM regurgitates verbatim PII. All worse with overfit/big/duplicated data. Report TPR@low-FPR, not just AUC.
ε smaller = stronger (exponential); pair with δ < 1/N (e.g. 1e-5). ε≤1 strong, 1-3 reasonable, tens = weak.
Opacus / TF-Privacy.
HE: math on ciphertext, slowest (bootstrapping). SMPC: split-trust, network-bound, no-collusion. TEE: enclave, near-native speed, trust hardware + side-channels. Use attestation.
OWASP LLM02 Sensitive Info Disclosure · NIST AI 100-2 adversarial taxonomy · MITRE ATLAS · GDPR Art.5 minimisation · DPDP Act 2023 consent/localisation/breach.
Glossary: terms an interviewer will probe
- Membership Inference (MIA): Attack that decides whether a specific record was in the training set, usually from confidence/loss signals.
- Model Inversion: Reconstructing representative input features (e.g., a face) for a class by optimising inputs against the model.
- Attribute Inference: Using a model to predict a missing sensitive field of a known individual.
- Memorisation / Extraction: An LLM reproducing training data verbatim; extraction prompts it to spill memorised PII/secrets.
- Differential Privacy (DP): Formal guarantee that any one record barely changes the output, bounded by epsilon (and delta).
- Epsilon (ε): DP privacy-loss budget; smaller means more noise and stronger privacy. Exponential scale.
- Delta (δ): Probability the epsilon bound fails; set below 1/N, often 1e-5.
- DP-SGD: Private training: per-example gradient clipping plus Gaussian noise, tracked by a privacy accountant.
- Privacy Budget / Composition: Total epsilon spent; privacy loss accumulates across queries/epochs (advanced composition grows ~sqrt(k)).
- Federated Learning (FL): Training across clients that share model updates, not raw data; aggregated with FedAvg.
- Gradient Leakage: Reconstructing training inputs from shared gradients (Deep Leakage from Gradients).
- Secure Aggregation: Protocol letting the server learn only the sum of client updates via cancelling masks/secret-sharing.
- Homomorphic Encryption (HE): Computing on encrypted data; decrypting the result equals computing on plaintext (CKKS/BFV).
- SMPC: Secure multiparty computation: parties jointly compute over secret-shared inputs, revealing only the output.
- TEE / Confidential Computing: Hardware enclave (SGX/SEV-SNP/H100 CC) running code+data encrypted in memory, with remote attestation.
- Pseudonymisation vs Anonymisation: Pseudonym = reversible token, still personal data; anonymisation = irreversible, out of legal scope.
- k-anonymity / l-diversity: Each record indistinguishable among k on quasi-identifiers; l distinct sensitive values per group.
- Presidio: Open-source PII Analyzer + Anonymizer using regex, NER, validators, and context to detect and redact PII.
Lock it in — explain it in your own words
📝 Self-explain · 2 minutes
In two sentences, explain the difference between differential privacy and secure aggregation in a federated-learning setup, and say which one bounds what the final model can leak about any single person.
📋 Final assessment — 10 questions, 70% to pass
1 Remember · 3 Apply · 4 Analyze · 2 Evaluate.
In differential privacy, what does a smaller value of epsilon mean?
A smaller epsilon is a tighter privacy-loss bound: more noise is added and less is revealed about any individual, usually at some accuracy cost. (a) inverts it: smaller epsilon is stronger, not weaker. (c) epsilon governs privacy, not loop speed. (d) DP adds noise; it is not encryption of weights.
Aditya at an Infosys account must train a churn model on customer data while limiting how much any one customer's record influences the model. Which single approach fits best?
epsilon. (a) encryption at rest protects stored bytes, not what the trained model leaks. (b) a CAPTCHA blocks bots, unrelated to per-record influence. (d) more epochs usually increases memorisation, worsening leakage.
Neha sets up federated learning across three Chennai clinics so patient records stay local. A reviewer notes the central server still receives each clinic's raw gradient update every round. Which change most reduces the leakage from those updates?
Vikram at a Flipkart team must ensure customer PII (PAN, phone, address) never enters the fine-tuning corpus or the chat logs. Which step best addresses this before any data is stored?
At a Mumbai bank, a researcher with only API access reconstructs recognisable facial features for a given identity class from Priya's face-recognition model. The training data was never exposed and the API returns full confidences. Which attack best fits?
Sneha at a TCS account finds a credit model reveals, with high accuracy, whether a specific person's record was in the training set; the model is heavily overfit and the API returns full softmax confidences. Which factor most enables this attack?
Aman at a Pune fintech runs the same DP-trained model through weekly retrains on a largely fixed customer base, each at epsilon 2, with no central tracking. After six months the privacy team cannot state the real guarantee. What is the root cause?
epsilon well past the per-run value. (a) a high noise multiplier would mean stronger privacy, not an unknown one. (b) DP is data-type agnostic; tabular runs spend budget too. (c) federated learning is unrelated to budget accounting across retrains.
Divya at a Hyderabad SOC must let two banks jointly compute a shared fraud score without either revealing its raw customer features, and a single bank should not be able to peek at the other's inputs. Which design best meets the requirement?
A Bangalore AI startup architect argues: We use federated learning, so patient data never leaves the hospitals — that means our system is fully privacy-compliant, no extra controls needed. Ananya must judge this for the panel. What is the best assessment?
For a Mumbai bank, a manager says: To stop membership inference, just round the confidence scores we return — that is enough, nothing else is needed. Aditya must respond to the panel. Which judgement is soundest?
Sources cited inline (re-checked 2026-06)
- OWASP Top 10 for LLM Applications 2025 (LLM02 Sensitive Information Disclosure): https://genai.owasp.org/llm-top-10/
- NIST AI 100-2 E2025, Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations: https://csrc.nist.gov/pubs/ai/100/2/e2025/final
- MITRE ATLAS (Adversarial Threat Landscape for AI Systems): https://atlas.mitre.org/
- Abadi et al., Deep Learning with Differential Privacy (DP-SGD): https://arxiv.org/abs/1607.00133
- Shokri et al., Membership Inference Attacks Against Machine Learning Models: https://arxiv.org/abs/1610.05820
- Carlini et al., Extracting Training Data from Large Language Models: https://arxiv.org/abs/2012.07805
- Zhu et al., Deep Leakage from Gradients: https://arxiv.org/abs/1906.08935
- Bonawitz et al., Practical Secure Aggregation for Privacy-Preserving Machine Learning: https://eprint.iacr.org/2017/281
- Microsoft Presidio, PII detection and anonymization docs: https://microsoft.github.io/presidio/
- India Digital Personal Data Protection Act, 2023 (DPDP), MeitY: https://www.meity.gov.in/data-protection-framework