Which serialization format for ML model weights cannot execute arbitrary code when it is loaded?

Correct answer: d) safetensors. It stores only tensor data and a JSON header, so loading it runs no code. pickle, joblib, and dill all execute opcodes on load and can carry an RCE payload via __reduce__.

Sneha at a Pune fintech must accept a partner's pretrained model that only ships as a .pkl file. Which loading practice should she apply before trusting it?

Correct answer: b) Scan it with ModelScan, then load only in an isolated sandbox with egress blocked. Scanning catches known unsafe opcodes, and a sandboxed load with egress blocked contains anything the scan misses. a) Loading in production triggers any RCE on a live host. c) pickle is binary, so reading it as text is unreliable. d) You cannot json.dumps() a pickle without first unpickling it, which is the dangerous step.

Rahul's team at a Bangalore AI startup deploys models to Kubernetes. He wants to guarantee only models signed by their CI ever start as pods. Which control should he apply?

Correct answer: a) An admission controller that runs cosign verify with the expected signer identity before a pod is admitted. A verifying admission controller blocks unsigned or mismatched artifacts before they ever run: enforcement at the gate. b) is detection after the fact, not prevention. c) A tag string proves nothing; anyone can type it. d) CPU limits do not check signatures.

Priya audits a Chennai ITES training pipeline whose GitHub Actions uses third-party/setup-ml@main. Her lead asks her to apply the standard hardening. What should she change?

Correct answer: c) Pin the action to a full commit SHA and enable Dependabot to propose reviewed SHA bumps. A full commit SHA freezes the exact code, and Dependabot raises reviewable bumps so you stay patched without trusting a mutable ref. a) @latest/@main is the very mutable-ref risk being fixed. b) Forking and freezing forever leaves known CVEs unpatched. d) continue-on-error hides failures and weakens the gate.
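The pinning rule above is easy to check mechanically. The sketch below is a minimal, assumption-laden illustration: the regex covers only the common `uses: owner/repo@ref` form, and the 40-character value in the sample workflow is a made-up placeholder, not a real commit.

```python
import re

# Matches "uses: owner/repo[/path]@ref" lines in a GitHub Actions workflow.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s@#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return 'action@ref' entries whose ref is not a full 40-hex commit SHA."""
    findings = []
    for action, ref in USES_RE.findall(workflow_text):
        if not FULL_SHA_RE.fullmatch(ref):
            findings.append(f"{action}@{ref}")
    return findings


# Sample workflow; the SHA is a placeholder invented for this example.
WORKFLOW = """
steps:
  - uses: third-party/setup-ml@main
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567  # pinned
  - uses: ./.github/actions/local-step
"""

print(unpinned_actions(WORKFLOW))
```

Run as a CI lint step, a non-empty result would fail the build; local actions without an `@ref` are deliberately ignored here.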

Neha, a Hyderabad SOC analyst, sees that right after a downloaded checkpoint loads, a training node makes one outbound connection to unfamiliar external infrastructure (an address in the 185.x range), then training finishes cleanly. Analyzing this, what is the most likely root cause?

Correct answer: b) A malicious deserialization payload in the checkpoint executed a beacon at load time. The connection firing the instant the checkpoint loads, plus a clean finish, is the signature of a pickle __reduce__ payload that beacons then yields control back. a) Framework telemetry would not be tied to one specific untrusted file. c) A resolver bug would not target external infra right after a load. d) Driver fetches do not correlate with loading a user-supplied checkpoint.

Aman finds that a Mumbai bank's model registry lets any engineer overwrite an existing version tag, and deploys pull by tag. Analyzing the supply-chain exposure, which risk is greatest?

Correct answer: b) An attacker (or mistake) silently swaps a verified model for a poisoned one under the same tag, and deploys pick it up. Mutable tags break integrity: the same name can now point to different bytes, so a swap bypasses any earlier review and ships a poisoned model. Deploying by immutable digest fixes it. a), c), and d) are housekeeping annoyances, not security failures.

Secure MLOps & AI Supply Chain Interview Q&A — Poisoned Models, Pickle RCE, Cosign & AI-BOM

Why this matters — you'd never eat a tiffin from a stranger, so why run their weights?

Think of a public model hub like a tiffin counter where anyone can drop a box. The label says bert-base-uncased, the smell is right, the box looks official. But one box was swapped by someone who wants into your kitchen. With ML, opening that box is torch.load() — and a malicious pickle runs their code on your machine before a single prediction happens.

Interviewers probe this because most candidates treat models as inert data files. They are not. A model is a program plus its training history. Panels want to hear that you check where it came from, what's inside the file, and whether it was tampered with — provenance, serialization safety and integrity, in that order.

Scenario · Sneha — junior ML security engineer at a Pune fintech

The panel slides over a laptop. "Our data team pulled a sentiment model off Hugging Face last week. The training box got crypto-mined two days later. Walk us through how that happened." Sneha knows the model was the entry point but freezes on the exact mechanism — pickle? layer? CI runner?

The fix is a clean mental model: untrusted weights are untrusted code, the load function is the trigger, and safetensors plus a scanner plus a signature break the chain. Learn the chain once and these questions stop being scary.

1. AI/ML Supply-Chain Risk

Every model carries a history: base weights, fine-tune data, dependencies, the person who uploaded it. The supply chain is all of that, and any link can be poisoned. These questions test whether you treat third-party models as untrusted artifacts.

Q1 What do we mean by the 'AI/ML supply chain', and why is it a security concern? (L1)

The AI/ML supply chain is everything that goes into a deployed model: the base/pre-trained weights, fine-tuning datasets, Python dependencies (transformers, torch), the training pipeline, and the registry that ships it. It's a concern because each link is a place to inject harm: a poisoned dataset, a trojaned weight file, a malicious dependency, or a tampered artifact in transit. Unlike normal software, a model is opaque: you can't read its 'source' line by line. So you must secure provenance (where each piece came from), integrity (it wasn't altered), and the load path (it can't run code on you). This maps to OWASP LLM03:2025 (Supply Chain) in the OWASP Top 10 for LLM Applications and to MITRE ATLAS supply-chain techniques.

They want the three pillars: provenance, integrity, safe loading — not just 'malware in models'.

Q2 A teammate at a Bangalore AI startup says 'just download the model and run it.' Why is that dangerous? (L2)

Because 'download and run' conflates a data file with a trusted program. Three things can go wrong before you ever get a prediction. One: the file format (pickle / PyTorch .bin) executes arbitrary code on load, so a malicious upload gets remote code execution on your box. Two: the weights may be poisoned or trojaned — clean on benchmarks but back-doored on a trigger phrase. Three: it could be a typosquat of a popular repo. So the safe pattern is: pin to a specific commit/revision, prefer safetensors, scan with ModelScan, verify a signature or hash, and only then load — ideally first in a sandbox. 'Download and run' skips every one of those gates.

RCE on load + poisoned weights + typosquats, plus the gated alternative.

Q3 Explain typosquatting on a model hub. How would you defend against it? (L2)

Typosquatting is uploading a repo whose name closely mimics a trusted one — meta-llama/Llama-3-8B vs a look-alike meta-IIama/Llama-3-8B (capital-i for lowercase-L), or openai-community vs openai-comunity. Pulling the wrong namespace gives an attacker your trust. Defences: pin the exact org + repo + revision SHA, never a moving tag; maintain an internal allow-list of vetted model IDs mirrored into your own registry; check the publisher's verified/org badge and download counts; and verify a known-good hash. At a Mumbai bank we'd block direct hub pulls in CI and only allow models that already passed review into the private registry.
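The allow-list idea above can be sketched in a few lines. This is a toy check under stated assumptions: the allow-list contents, the small confusable-character map, and the 0.9 similarity threshold are all illustrative choices, not a vetted detection rule.

```python
import difflib

# Hypothetical allow-list of vetted model IDs (org/repo).
ALLOWED = {"meta-llama/Llama-3-8B", "openai-community/gpt2"}

# Fold a few visually confusable characters onto one canonical form.
CONFUSABLE = str.maketrans({"I": "l", "1": "l", "0": "o", "O": "o"})


def skeleton(model_id: str) -> str:
    return model_id.translate(CONFUSABLE).lower()


def classify(model_id: str) -> str:
    """Return 'allowed', 'lookalike of <id>', or 'unknown'."""
    if model_id in ALLOWED:
        return "allowed"
    for good in ALLOWED:
        if skeleton(model_id) == skeleton(good):
            return f"lookalike of {good}"
        ratio = difflib.SequenceMatcher(None, model_id.lower(), good.lower()).ratio()
        if ratio >= 0.9:  # assumed threshold; tune against real repo names
            return f"lookalike of {good}"
    return "unknown"


print(classify("meta-IIama/Llama-3-8B"))
print(classify("openai-comunity/gpt2"))
```

In a CI gate, anything other than "allowed" would be rejected; the look-alike label simply makes the alert more useful to the reviewer.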

Concrete look-alike example + pin-by-SHA + internal mirror/allow-list.

Q4 Describe the PoisonGPT attack. What does it teach us about model provenance? (L3)

PoisonGPT (Mithril Security, 2023) was a proof-of-concept: they took a real open model (GPT-J-6B), surgically edited it to emit one specific false fact while behaving normally everywhere else, and re-uploaded it under a name resembling the legitimate one. It passed standard benchmarks, so a downstream user couldn't tell. The lesson: benchmarks are not provenance. A model can be 99% honest and still carry a planted lie or back-door. You cannot detect this by evaluating outputs alone. You need a verifiable chain — a signature from a known publisher, an attestation of how it was built, and an AI-BOM listing the base model and training data. Trust the origin and the build, not the vibe of the outputs.

Surgical edit + passes benchmarks + 'benchmarks ≠ provenance' → need signing/attestation.

Q5 How would you vet a third-party model before allowing it into production? (L3)

A staged gate, not a single check. (1) Source: verified publisher, pin org/repo/revision SHA, no typosquats. (2) Format: prefer safetensors; reject raw pickle where possible. (3) Scan: run ModelScan / Protect AI Guardian for serialization payloads and architectural back-doors. (4) Integrity: verify Sigstore signature or a published SHA-256. (5) BOM: capture/inspect the AI-BOM — base model, datasets, licence. (6) Behaviour: red-team with garak/PyRIT for triggers and jailbreaks; check licence and data-use terms. (7) Isolation: first load runs in a network-egress-blocked sandbox. Only then promote into the private registry. Document the decision so it's auditable.
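A staged gate like this reduces to an ordered list of checks that fails closed. The sketch below assumes a hypothetical `Candidate` record whose scan and signature results were produced by other tools; it covers only four of the stages and is meant to show the control flow, not a complete pipeline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    # Hypothetical record describing a model under review.
    repo: str
    revision: str
    filename: str
    scan_clean: bool      # result reported by a scanner run elsewhere
    signature_ok: bool    # result of a signature check run elsewhere


def pinned_revision(c: Candidate) -> bool:
    # Accept only a full 40-character hex commit SHA, never a branch or tag.
    return len(c.revision) == 40 and all(ch in "0123456789abcdef" for ch in c.revision)


def safe_format(c: Candidate) -> bool:
    # Extension check only; a real gate would also parse the file header.
    return c.filename.endswith(".safetensors")


GATES: list[tuple[str, Callable[[Candidate], bool]]] = [
    ("source", pinned_revision),
    ("format", safe_format),
    ("scan", lambda c: c.scan_clean),
    ("integrity", lambda c: c.signature_ok),
]


def run_gates(c: Candidate) -> tuple[bool, str]:
    """Return (True, 'promote') or (False, name of the first failed gate)."""
    for name, check in GATES:
        try:
            ok = check(c)
        except Exception:
            ok = False  # fail closed: an erroring check counts as a failure
        if not ok:
            return False, name
    return True, "promote"


good = Candidate("example-org/sentiment", "a" * 40, "model.safetensors", True, True)
print(run_gates(good))
```

Returning the name of the failed gate gives the audit trail a concrete reason for every rejection.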

A multi-stage pipeline ending in registry promotion, not a one-off scan.

Q6 What is dataset poisoning, and how does it differ from a poisoned model file? (L2)

Dataset poisoning corrupts the training data — an attacker injects mislabelled or trigger-bearing samples so the model learns a back-door or a bias. It happens before training and bakes the flaw into the weights. A poisoned model file is the artifact itself being tampered with — either malicious serialization (code that runs on load) or trojaned weights swapped in after training. Key difference: dataset poisoning is a training-time integrity problem (defend with data provenance, dedup, anomaly detection on labels, signed datasets in the BOM); a poisoned file is an artifact integrity/loading problem (defend with safetensors, scanning, signing). MITRE ATLAS treats these as distinct techniques.

Training-time data corruption vs artifact tampering — different defences.

Q7 Your CI pulls models straight from a public hub. Design a safer flow for a TCS-scale team. (L3)

Cut the direct dependency. (1) Stand up a private registry (MLflow / a model registry) as the single source of truth; CI and serving may pull only from it. (2) A separate, controlled ingestion pipeline pulls from the public hub by pinned revision SHA, runs ModelScan, verifies signatures, generates an AI-BOM, and red-teams before publishing internally. (3) Ingestion runs on an isolated runner with egress restricted to the hub only. (4) Sign the internal copy with cosign and require signature verification at deploy (admission policy). (5) Network policy blocks training/serving nodes from reaching public hubs at all. So nothing untrusted ever touches build or runtime directly.

Private registry as choke-point + isolated ingestion + egress controls + signed promotion.


Every hop into your model is an attacker entry point. Trace the path from a public hub down to your inference server, and notice each spot marked red where someone can poison data, swap a model, or tamper with a build.

Quick check · inline mini-quiz #1

Sneha, an ML engineer at a Pune fintech, pulls a sentiment model from Hugging Face to ship in production. Her panel asks how she should verify the artifact before trusting it. Which step gives the strongest supply-chain assurance?

a) Trust the download because the repo has 40,000 stars and a high download count.
b) Run ModelScan on the file and check the publisher's signed provenance with cosign verify against a pinned digest.
c) Load it in a notebook and watch CPU usage for anything odd.
d) Rename the file to .safetensors so it cannot run code.

Correct: b. Scanning with ModelScan catches unsafe deserialization payloads, and verifying signed provenance with Sigstore cosign against a pinned SHA-256 digest proves the artifact is the one the publisher actually built. a is popularity, not integrity — stars are gameable. c watching CPU is not a security control and misses dormant payloads. d renaming does nothing; the format is decided by the file contents, not the extension.

2. Model Serialization Attacks

The most direct way to own an ML box is the load step. Several common formats run arbitrary Python during deserialization. These questions test whether you know which formats are dangerous, why, and what to use instead.

Q8 Why can loading a Python pickle file execute arbitrary code? (L1)

Pickle is not just data — it's a tiny stack-based program for reconstructing Python objects. During pickle.load(), the __reduce__ mechanism and opcodes like REDUCE / GLOBAL can call any callable, for example os.system or subprocess.Popen. So a crafted pickle says 'when you unpack me, run this command,' and unpickling obeys — that's arbitrary code execution, before you ever use the model. This matters because torch.load(), joblib.load() and many model loaders sit on top of pickle. The rule interviewers want: never unpickle data you don't trust.
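To see the mechanism without running anything dangerous, here is a minimal sketch in which the "payload" is the harmless builtin `len`. The class name `Harmless` is made up for the example; a real attack would return something like `os.system` instead.

```python
import pickle
import pickletools


class Harmless:
    """Stand-in for a malicious object; the 'payload' is just len()."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call len('code ran on load')".
        return (len, ("code ran on load",))


blob = pickle.dumps(Harmless())

# Loading does not give back a Harmless instance: it gives back whatever
# the named callable returned, because unpickling called it.
result = pickle.loads(blob)
print(type(result).__name__, result)

# The REDUCE opcode is what triggers the call during load.
opcodes = [op.name for op, _arg, _pos in pickletools.genops(blob)]
print("REDUCE" in opcodes)
```

The key observation is that the caller never invoked `len`; `pickle.loads` did, on instructions carried inside the byte stream.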

__reduce__/REDUCE opcode runs arbitrary callables on load — RCE.

Q9 Which common ML formats can execute code on load, and which are safe? (L2)

Dangerous (code can run on load): raw pickle/.pkl, joblib, PyTorch .bin/.pt/.pth (pickle under the hood), Keras .h5/.keras with a Lambda layer (embeds arbitrary code), and TensorFlow SavedModel (can carry custom ops / restore-time code). Safe: safetensors, which stores only tensors plus a JSON header, with no code path and no Python execution on load. ONNX and GGUF are mostly data, but their parsers should still be treated with care. The senior answer: default to safetensors; if a format can run code, treat the file as untrusted code and scan/sandbox it.

Pickle/joblib/.bin/Keras-Lambda/SavedModel dangerous; safetensors safe; reason for each.

Q10 What is safetensors and why is it considered safe? (L2)

safetensors (Hugging Face) is a serialization format that stores just the raw tensor bytes and a small JSON header describing names, dtypes and shapes. There is no pickle, no Python objects, no callables — so loading cannot trigger code execution. It also supports zero-copy and lazy loading, so it's usually faster and more memory-efficient than pickle-based formats. Limits: it stores tensors only, not arbitrary Python objects, so custom model logic still lives in your code. But for weights — which is what most supply-chain attacks target — it removes the RCE-on-load risk entirely. That's why safetensors is the default download format on many hubs now and the answer interviewers expect.
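The "JSON header plus raw bytes" layout is simple enough to read with the standard library alone. The sketch below hand-builds a tiny file and parses it back, following the published layout as I understand it (an 8-byte little-endian header length, a JSON header, then the tensor bytes); the tensor name `w` is invented, and in practice you would use the safetensors library rather than this.

```python
import json
import struct


def build_minimal_safetensors() -> bytes:
    """Hand-build a tiny file: one float32 tensor 'w' with values [1.0, 2.0]."""
    data = struct.pack("<2f", 1.0, 2.0)
    header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    # 8-byte little-endian header length, then the header, then raw tensor bytes.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def read_header(blob: bytes) -> tuple[dict, bytes]:
    """Return (parsed JSON header, raw data buffer). Nothing is executed."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len])
    return header, blob[8 + header_len :]


blob = build_minimal_safetensors()
header, buffer = read_header(blob)
start, end = header["w"]["data_offsets"]
values = struct.unpack("<2f", buffer[start:end])
print(header["w"]["dtype"], header["w"]["shape"], values)
```

Everything the loader does here is length arithmetic and JSON parsing, which is why there is no hook for a payload to run.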

Tensors + JSON header, no code path; bonus: zero-copy/fast; limit: tensors only.

Q11 What does ModelScan do, and what are its limits? (L2)

ModelScan (open-source, Protect AI) statically inspects model files for unsafe code — it flags dangerous pickle opcodes/imports (e.g. os, subprocess) and similar payloads across H5, Pickle and SavedModel-style formats covering PyTorch, TensorFlow, Keras, sklearn and XGBoost. You run modelscan -p ./model.pkl in CI before promotion. Limits: it's static, so cleverly obfuscated or novel payloads can evade it — in 2025 researchers found bypasses in pickle scanners (e.g. CVE-2025-1716 and JFrog/Sonatype picklescan findings). So treat ModelScan as one gate, not the whole defence: combine it with preferring safetensors, signature verification, and sandboxed first-load.

Static scan of unsafe opcodes/imports + the CLI + honest limits (bypasses).

Q12 How does Protect AI Guardian extend ModelScan for enterprises? (L2)

Guardian builds on ModelScan but adds enforcement and breadth for teams. It scans 35+ formats — PyTorch, TensorFlow, ONNX, Keras, Pickle, GGUF, and safetensors — and detects deserialization payloads, architectural back-doors and runtime threats. The key difference is it acts as a secure gateway/policy point: it can sit in front of a hub and block models that fail policy before they reach your developers or registry, with central reporting. So ModelScan is the engine you run ad-hoc; Guardian is the managed gate that enforces 'no unsigned/unsafe model passes' across the org. For a regulated Mumbai bank, that central enforcement and audit trail is the selling point.

Enforcement gateway + 35+ formats + architectural backdoor detection + central policy/audit.

Q13 Aditya at a Hyderabad SOC must load a vendor's PyTorch .bin. How does he reduce risk? (L3)

First, push back: ask the vendor for safetensors and a signature/hash, which removes most of the risk. If he must load the .bin: (1) Verify the vendor's published SHA-256 before anything else. (2) Scan with ModelScan/Guardian. (3) Convert it to safetensors in a throwaway, network-egress-blocked sandbox (the conversion itself unpickles, so isolate it), running as a non-root user with no cloud credentials mounted. (4) When loading, use torch.load(..., weights_only=True) on modern PyTorch, which restricts unpickling to tensors and other allow-listed types and blocks arbitrary globals. After the scan and conversion pass, store the safetensors copy in the registry and never touch the .bin again.
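The principle behind a restricted load is an unpickler that refuses to resolve any global outside an allow-list. The sketch below uses the standard library's documented `find_class` override to show that idea; it is not PyTorch's implementation, and the one-entry allow-list is an assumption chosen for the demo.

```python
import collections
import io
import pickle

# Hypothetical allow-list: the only globals this loader will resolve.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(blob: bytes):
    return RestrictedUnpickler(io.BytesIO(blob)).load()


class Payload:
    def __reduce__(self):
        return (len, ("would run on load",))  # harmless stand-in for os.system


good = pickle.dumps(collections.OrderedDict(weight=[0.1, 0.2]))
bad = pickle.dumps(Payload())

print(restricted_loads(good))
try:
    restricted_loads(bad)
except pickle.UnpicklingError as exc:
    print(exc)
```

An allow-list is the safer default than a denylist here: anything the loader has not explicitly approved is rejected before it can be called.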

Ask for safetensors → sandbox-convert with egress off + weights_only=True + scan + hash.

Q14 Why isn't a model scanner enough on its own to trust a downloaded model? (L3)

Because a scanner answers one narrow question — 'does this file contain known-unsafe code patterns?' — and answers it statically. It can't prove the weights aren't poisoned or trojaned (PoisonGPT-style), can't confirm provenance (who built it, from what data), and can be bypassed by obfuscation or novel opcodes, as 2025 picklescan CVEs showed. Security is layered: prefer a format that can't run code (safetensors) so you're not relying on detecting payloads at all; verify a signature for provenance; check the AI-BOM for what went in; and red-team behaviour for back-doors. The scanner is a useful gate, but trust comes from origin + integrity + format, not detection alone.

Static detection has gaps (poisoning, provenance, bypass) → defence-in-depth.

A pickle file is code, not data — torch.load() runs it. Watch how __reduce__ turns a downloaded model into a reverse shell, and why switching to safetensors stops the chain dead.

Serialization security: recall the answers

Why is pickle dangerous?
Pickle runs code during load via __reduce__, so loading an untrusted .pkl or .bin is potential remote code execution, not just data parsing.

What is safetensors?
A format that stores only raw tensors plus JSON metadata, with no code path, so a malicious payload has nothing to execute on load.

What if pickle is unavoidable?
Scan with ModelScan or picklescan in CI, load inside a no-egress sandbox, and watch syscalls, so that any payload that does fire is contained.

Why sign models?
cosign signatures prove who built the artifact and that its bytes are unchanged, so you can reject swapped or tampered models before they load.

Why pin by digest?
Tags are mutable, so an attacker can repoint :latest; a sha256 digest is immutable, so you always load exactly what you reviewed.

Quick check · inline mini-quiz #2

Rahul at a Bangalore AI startup inherits a repo that loads model weights with pickle.load() from a partner's S3 bucket. In his interview he is asked the real risk. What is it?

a) Pickle is slower than JSON, so inference latency goes up.
b) Pickle files can be larger, wasting disk on the node.
c) Unpickling runs arbitrary code via __reduce__, so a tampered file gives remote code execution at load time.
d) Pickle drops floating-point precision, hurting accuracy.

Correct: c. Unpickling calls whatever callable a crafted __reduce__ names (via the REDUCE opcode), so a malicious weights file runs attacker code the moment Rahul calls pickle.load(): full RCE on the training or serving host. a and b are performance trivia, not the security issue. d is false; pickle preserves precision. The fix is to switch to safetensors, which stores only tensors and cannot execute code.

Pause & Predict #2

Neha, a SOC analyst in Hyderabad, sees a training node make an outbound DNS request to a random domain seconds after a teammate loaded a downloaded .pkl checkpoint. The job otherwise completes normally. Predict the cause and the fix.

A malicious pickle payload fired at load time. The crafted checkpoint embedded a __reduce__ hook that executed a callback (the DNS beacon) the instant pickle.load() ran, then let training continue so nothing looked broken. Diagnose by scanning the file with ModelScan, checking it for references to os.system or subprocess, and correlating the DNS request with the load timestamp. The fix: quarantine the node and the file, ban raw pickle, and require safetensors plus pre-load scanning in CI. Verify by replaying the load in an isolated sandbox with egress blocked and confirming that the outbound call is attempted and contained there.

3. Integrity, Signing & AI-BOM

Once a model is vetted, you need to prove it stays vetted and know exactly what's inside it. Signing answers 'is this the file we approved?'; the AI-BOM answers 'what is this model made of?' These questions test both.

Q15 Why hash a model file, and where does plain hashing fall short? (L1)

You hash (SHA-256) a vetted model so you can detect tampering: re-hash before load and compare to the approved value — any change means the bytes were altered. It's cheap and effective for integrity. Where it falls short: a hash only proves the file equals some known value; it says nothing about who produced it or how. If an attacker compromises wherever you store the 'known-good' hash, they can swap both file and hash. So hashing gives integrity but not authenticated provenance — for that you need a signature tied to an identity, ideally with a transparency log. Hash plus signature together is the real answer.
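The re-hash-and-compare step looks roughly like this. It is a minimal sketch: the file written here is a stand-in for real weights, and where the approved digest is stored (and who can change it) is the part the code cannot solve.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_approved(path, approved_hex: str) -> bool:
    return hmac.compare_digest(sha256_file(path), approved_hex.lower())


with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "model.safetensors"  # stand-in file for the demo
    model.write_bytes(b"weights")
    approved = hashlib.sha256(b"weights").hexdigest()
    before = matches_approved(model, approved)

    model.write_bytes(b"weights-tampered")
    after = matches_approved(model, approved)

print(before, after)
```

A single changed byte flips the result, which is the tamper detection hashing offers; it still says nothing about who produced the approved value.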

Hash = tamper detection but not provenance; need signing for 'who/how'.

Q16 How does Sigstore cosign sign and verify a model artifact? (L2)

cosign (Sigstore) signs artifacts using short-lived, keyless certificates: it authenticates you via OIDC (e.g. your CI identity), gets an ephemeral cert from Fulcio, signs, and records the signature in the Rekor transparency log — so there's no long-lived private key to leak. For a model you'd push it as an OCI artifact and run cosign sign <ref>, then verify with cosign verify --certificate-identity ... --certificate-oidc-issuer .... Verification checks the signature, the signer identity, and the Rekor log entry. The Sigstore model-transparency project applies this specifically to ML models. The win: anyone can prove a model came from your pipeline and wasn't altered, without managing key files.
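A deploy script might wrap the verify step as below. This is a sketch under assumptions: the registry reference, workflow identity and helper names are hypothetical, only the two identity flags named above are used, and the demo only builds the command rather than invoking cosign.

```python
import subprocess


def cosign_verify_cmd(ref: str, identity: str, issuer: str) -> list[str]:
    """Build the argv for a keyless verification of a digest-pinned reference."""
    if "@sha256:" not in ref:
        raise ValueError("verify and deploy by immutable digest, not by tag")
    return [
        "cosign", "verify",
        "--certificate-identity", identity,
        "--certificate-oidc-issuer", issuer,
        ref,
    ]


def is_signed(ref: str, identity: str, issuer: str) -> bool:
    # Requires the cosign binary on PATH; a non-zero exit code means "reject".
    result = subprocess.run(cosign_verify_cmd(ref, identity, issuer),
                            capture_output=True, text=True)
    return result.returncode == 0


# Hypothetical values for illustration only.
REF = "registry.example.com/models/fraud@sha256:" + "0" * 64
IDENTITY = "https://github.com/example-org/ml-pipeline/.github/workflows/release.yml@refs/heads/main"
ISSUER = "https://token.actions.githubusercontent.com"

print(" ".join(cosign_verify_cmd(REF, IDENTITY, ISSUER)))
```

Refusing tag-based references in the wrapper keeps the signature check and the deployed bytes tied to the same digest.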

Keyless OIDC + Fulcio cert + Rekor log + cosign sign/verify with identity flags.

Q17 What is SLSA provenance and how does it apply to an ML pipeline? (L3)

SLSA (Supply-chain Levels for Software Artifacts) is a framework for proving how an artifact was built. The core idea is signed provenance — a machine-readable attestation, generated by the build system, stating which sources, parameters and builder produced this exact output. For ML, the 'build' is training/fine-tuning: provenance records the base model, dataset references, hyperparameters, code commit and the runner identity, then it's signed (often via cosign/in-toto). Higher SLSA levels demand a hardened, non-falsifiable builder. The payoff: a verifier can confirm a model was produced by your trusted pipeline from approved inputs — defeating PoisonGPT-style swaps where the file looks fine but the origin is a lie.

Signed build attestation (sources/params/builder) extended to training inputs.

Q18 What is an AI-BOM / ML-BOM, and what does it list? (L2)

An AI-BOM / ML-BOM (Machine Learning Bill of Materials) is a structured inventory of everything a model is made of — the equivalent of an SBOM for ML. It records the base/pre-trained model, the datasets used (and their provenance/licences), the frameworks and dependencies, training/inference configuration, performance metrics, and licence terms. The standard most teams use is CycloneDX, which added native ML-BOM and model-card support (stable from v1.5, current spec v1.7 / ECMA-424). It matters for security and compliance: when a poisoned dataset or vulnerable dependency is disclosed, you can instantly answer 'which of our models used it?' — and EU AI Act auditors increasingly ask for it.

Inventory of base model + data + deps + config; CycloneDX as the standard.

Q19 A model in production turns out to use a poisoned dataset disclosed today. How does an AI-BOM help? (L3)

It turns a panic into a query. Because each model's AI-BOM lists its dataset references, you grep your BOM inventory for that dataset and immediately get the exact list of affected models, versions and the services serving them — no guessing. Then you can pull or quarantine those models, roll back to a clean version, and notify owners with evidence. Without a BOM you'd have to reconstruct training histories by hand, often impossible. Pair it with signing so the rollback target is provably the clean build, and with provenance so you can confirm which training run consumed the bad data. The BOM is what makes ML incident response fast and auditable.
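The blast-radius query can be sketched against a small in-memory inventory. The BOM shape below is simplified and CycloneDX-like; the field names follow my reading of the ML-BOM schema and should be checked against the spec, and the model and dataset names are invented.

```python
# Hypothetical inventory: one simplified BOM per deployed model.
BOMS = [
    {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": [
            {
                "type": "machine-learning-model",
                "bom-ref": "model:fraud-v3",
                "name": "fraud-v3",
                "modelCard": {"modelParameters": {"datasets": [{"ref": "data:txn-2024q1"}]}},
            },
            {"type": "data", "bom-ref": "data:txn-2024q1", "name": "txn-2024q1"},
        ],
    },
    {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": [
            {
                "type": "machine-learning-model",
                "bom-ref": "model:churn-v1",
                "name": "churn-v1",
                "modelCard": {"modelParameters": {"datasets": [{"ref": "data:crm-2023"}]}},
            },
        ],
    },
]


def models_using(dataset_ref: str, boms: list[dict]) -> list[str]:
    """Return the names of all models whose BOM references the given dataset."""
    affected = []
    for bom in boms:
        for comp in bom.get("components", []):
            if comp.get("type") != "machine-learning-model":
                continue
            datasets = comp.get("modelCard", {}).get("modelParameters", {}).get("datasets", [])
            if any(d.get("ref") == dataset_ref for d in datasets):
                affected.append(comp["name"])
    return sorted(affected)


print(models_using("data:txn-2024q1", BOMS))
```

The same lookup works for a vulnerable dependency: only the component type and the field being matched change.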

BOM = instant blast-radius query → quarantine/rollback to a signed clean version.

Q20 How do signing and a BOM work together to detect tampering and trace impact? (L2)

They cover different questions. A signature (cosign + Rekor) answers 'is this the exact file my pipeline produced, by the expected identity?' — that's tamper detection and authenticated provenance at load/deploy time. The AI-BOM answers 'what is inside it and what is it affected by?' — that's traceability and blast-radius. Best practice: generate the BOM during the build, then attest and sign it alongside the model (cosign supports CycloneDX attestations). At deploy, an admission policy verifies the signature and that the BOM attestation is present. So you both block altered models and, when a CVE or poisoning hits a component, instantly find every model that ships it.

Signature = integrity/provenance; BOM = traceability; sign the BOM as an attestation.

Untrusted model? Don't load it — run the gate first. Follow the four checkpoints (format, scan, signature, sandbox) and see where each path ends: accepted, sandboxed, or rejected.

Recreated console (github.com/protectai/modelscan): CI -> security gate -> modelscan -p ./llama-3-instrukt.bin

1. Command: modelscan -p ./llama-3-instrukt.bin
   Scanned file: llama-3-instrukt.bin (PyTorch pickle)
2. Result: Unsafe operator: posix.system (CRITICAL)
   Total issues: 1 critical, 0 high, 0 medium
   Verify (safe artifact): cosign verify-blob --key model.pub model.safetensors
   Gate decision: FAIL CLOSED -> quarantine bucket s3://mlsec-quarantine/
   Action: Fail build + quarantine

Pause & Predict #1

Aditya at a Mumbai bank deploys a fraud model. The serving pod's cosign verify step suddenly fails with error: no matching signatures, yet the model file looks unchanged and inference still works in staging. Predict the cause and the fix.

The signature no longer matches the artifact or the trusted identity. Either the image/model digest changed (a rebuild pushed a new layer) or someone re-tagged the artifact, so the pinned digest cosign checks against drifted. Diagnose by comparing the deployed digest (crane digest) with the signed one, and confirm the expected signer identity and issuer flags (--certificate-identity, --certificate-oidc-issuer) are still correct. The fix: re-sign the exact promoted artifact in CI and deploy by immutable digest, never by mutable tag. Verify by re-running cosign verify in the admission step — it should now find the matching signature before the pod starts.

4. Securing the ML Pipeline

The pipeline is where data becomes a model — and where secrets, IAM and runners become attack surface. These questions test whether you can secure CI/CD for ML the way you'd secure any production build, plus the ML-specific bits like feature stores and registries.

Q21 Why are hard-coded keys in notebooks a serious problem, and what's the fix? (L1)

Notebooks get committed, shared, copied into Slack and exported — so an API key or DB credential pasted into a cell leaks far beyond the author. Worse, notebook outputs and checkpoints can capture secrets too, and they rarely get rotated. The fix: never store secrets in code or notebooks. Pull them at runtime from a secrets manager (HashiCorp Vault, AWS Secrets Manager) or injected environment variables scoped to the job; use short-lived credentials / workload identity rather than static keys; and add a pre-commit secret scanner (gitleaks/detect-secrets) so a key can't be committed in the first place. Rotate anything that ever touched a notebook.
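A pre-commit check for notebooks has to look at outputs as well as source cells. The sketch below is a toy, not a replacement for gitleaks or detect-secrets: it matches only one key shape (AWS access key IDs), and the sample key is AWS's documented example value.

```python
import json
import re

# AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def _as_text(value) -> str:
    # Notebook fields may be a string or a list of lines.
    return "".join(value) if isinstance(value, list) else str(value)


def find_keys(notebook_json: str) -> list[tuple[int, str]]:
    """Return (cell index, 'source' or 'output') for each key-like match."""
    hits = []
    nb = json.loads(notebook_json)
    for i, cell in enumerate(nb.get("cells", [])):
        if AWS_KEY_RE.search(_as_text(cell.get("source", ""))):
            hits.append((i, "source"))
        for out in cell.get("outputs", []):
            if AWS_KEY_RE.search(_as_text(out.get("text", ""))):
                hits.append((i, "output"))
    return hits


NOTEBOOK = json.dumps({
    "cells": [
        {"cell_type": "code", "source": ['key = "AKIAIOSFODNN7EXAMPLE"\n'], "outputs": []},
        {"cell_type": "code", "source": ["print(key)\n"],
         "outputs": [{"output_type": "stream", "text": ["AKIAIOSFODNN7EXAMPLE\n"]}]},
        {"cell_type": "markdown", "source": ["No secrets here."]},
    ]
})

print(find_keys(NOTEBOOK))
```

The second hit is the instructive one: the cell's source is clean, but its saved output still leaks the key.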

Notebooks spread widely → use a secrets manager + short-lived creds + pre-commit scan.

Q22 What does least-privilege IAM look like for a training job and a model registry? (L2)

Each pipeline role gets only what its step needs, nothing more. A training job needs read on the specific training data path and write only to a scoped artifact location — not broad bucket access, not registry-admin, no production secrets. Pushing to the registry is a separate identity with write to one repo/path; serving gets read-only pull and cannot publish. Use per-job workload identity (IAM roles for service accounts / OIDC), not shared long-lived keys, so credentials are short-lived and attributable. Deny by default. The test interviewers apply: if the training runner is compromised, can it poison the registry or read prod data? With least-privilege, no.
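As a rough illustration, a training role's policy and a lint for obvious wildcards might look like this. The bucket names and paths are hypothetical, the policy uses the standard AWS IAM JSON shape, and the check is intentionally simple: it catches only blanket actions and resources, not every over-broad grant.

```python
# Hypothetical training-job policy: read one data path, write one artifact path.
TRAINING_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": ["arn:aws:s3:::example-train-data/fraud/v3/*"]},
        {"Effect": "Allow", "Action": ["s3:PutObject"],
         "Resource": ["arn:aws:s3:::example-artifacts/runs/*"]},
    ],
}


def _as_list(value):
    return value if isinstance(value, list) else [value]


def too_broad(policy: dict) -> list[str]:
    """Flag Allow statements that use blanket actions or resources."""
    problems = []
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                problems.append(f"wildcard action {action}")
        for resource in _as_list(stmt.get("Resource", [])):
            if resource == "*":
                problems.append("wildcard resource *")
    return problems


print(too_broad(TRAINING_POLICY))
```

Note what the policy leaves out: no registry write, no production data, no secrets access, which is what answers the "compromised runner" test.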

Separate scoped identities per step (train read/scoped-write, push, serve read-only) + workload identity.

Q23 Why isolate environments in an ML pipeline, and how would you do it? (L2)

Isolation contains blast radius: a compromised data-prep step shouldn't reach production weights, and an untrusted model's first load shouldn't touch your cloud creds. How: run dev/training/prod in separate accounts/projects with their own IAM; use ephemeral, single-use CI runners (not a long-lived shared box that accumulates secrets); apply network policies so training nodes can't reach the public internet or prod data stores they don't need; and load any untrusted/third-party model in a sandbox with egress blocked and no credentials mounted. Each environment is a boundary, so one breach doesn't become all of them.

Separate accounts/runners + network egress controls + sandbox for untrusted loads.

Q24 How do you protect the feature store and training data from tampering? (L3)

Treat the feature store and training data as production assets with integrity guarantees. Access: least-privilege — feature pipelines write, training reads, with separate identities; deny broad write. Integrity: version datasets and snapshot them (e.g. DVC / dataset versioning), record hashes in the AI-BOM, and prefer immutable, append-only stores so a row can't be silently rewritten. Provenance: log who/what produced each feature set; ideally sign datasets. Detection: validation and drift/anomaly checks on incoming data (range, schema, label distribution) to catch poisoning attempts. Audit: enable access logging. The goal: if data is altered or poisoned, you can both prevent it and prove which training run consumed it.
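One of the detection checks, a label-distribution comparison, can be sketched as follows. The 10-percentage-point tolerance and the label names are assumptions for illustration; real pipelines would pair this with schema and range validation and a proper statistical test.

```python
from collections import Counter


def label_shares(labels) -> dict:
    """Fraction of samples per label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def drifted_labels(baseline, incoming, tolerance: float = 0.10) -> list:
    """Labels whose share moved by more than `tolerance` versus the baseline."""
    base, new = label_shares(baseline), label_shares(incoming)
    return sorted(
        label
        for label in set(base) | set(new)
        if abs(base.get(label, 0.0) - new.get(label, 0.0)) > tolerance
    )


baseline = ["fraud"] * 10 + ["ok"] * 90
suspicious = ["fraud"] * 40 + ["ok"] * 60  # fraud share jumps from 10% to 40%

print(drifted_labels(baseline, suspicious))
```

A sudden shift like this does not prove poisoning, but it is a cheap signal for holding a batch until someone looks at it.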

Least-privilege writes + versioned/immutable + hashed in BOM + anomaly checks + audit logs.

Q25 How should access to MLflow or a model registry be controlled? (L2)

Lock it down like any production system — open MLflow servers have leaked models and credentials. Authentication: require auth (MLflow's auth or, better, put it behind an SSO/identity-aware proxy); never expose it unauthenticated on the internet. Authorization: role separation — data scientists log experiments, only a CI identity (or release role) can promote/register a production model; serving has read-only pull. Stage gates: transitions to Production require approval. Integrity: require signed artifacts and verify signatures on promotion. Network: private endpoint, not public. Audit: log every registration and stage change. So no one can quietly push a malicious model to Production.

Auth + RBAC (who can promote) + stage approvals + signed artifacts + private + audit.
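The role separation and stage gate described above might look something like this in outline. The sketch is a plain-Python policy check, not the MLflow API: the role names, action names, and the promote function are hypothetical and exist only to show the order of the checks.

```python
# Hypothetical roles and actions; a real registry would map these to its own RBAC.
ROLE_PERMISSIONS = {
    "data-scientist": {"log_experiment", "register_candidate"},
    "ci-release": {"register_candidate", "promote_to_production"},
    "serving": {"pull_model"},
}


def is_allowed(role, action):
    """Deny by default: unknown roles and unlisted actions are rejected."""
    return action in ROLE_PERMISSIONS.get(role, set())


def promote(role, model_version, approved_by, signature_verified):
    """Gate a Production promotion on role, approval, and a verified signature."""
    if not is_allowed(role, "promote_to_production"):
        raise PermissionError(f"{role} may not promote models")
    if not approved_by:
        raise PermissionError("promotion requires an approver")
    if not signature_verified:
        raise ValueError("artifact signature was not verified")
    # The returned record doubles as the audit-log entry for the stage change.
    return {
        "version": model_version,
        "stage": "Production",
        "promoted_by": role,
        "approved_by": approved_by,
    }
```

The design choice worth noting is deny-by-default: a role that is missing from the table gets no permissions, so a newly added identity cannot promote anything until someone grants it explicitly.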

Q26 Priya inherits a pipeline at a Chennai ITES with one shared CI key for everything. What's wrong and how do you fix it? L3

One shared key is a single point of total compromise: anyone who reads it (or any compromised step) gets every permission — read prod data, push to the registry, deploy. It's also un-rotatable without breaking everything and un-attributable in logs. Fix in stages. (1) Replace the static key with per-job workload identity / OIDC so each step gets short-lived, scoped credentials. (2) Split permissions by step (training read-only data, separate registry-push identity, read-only serving). (3) Move any remaining secrets into Vault, injected at runtime. (4) Rotate and revoke the old key, add secret scanning, and enable per-identity audit logging. Now a single compromise can't take the whole pipeline.

Identify single-point-of-failure → per-job OIDC, scoped identities, Vault, rotate + audit.

▶ Watch a malicious model get caught before prod — Rahul at a Bangalore AI startup

You will watch a typosquatted model pulled from a public hub get scanned, flagged, quarantined, and replaced with verified safetensors.

① PULL Rahul pulls a fine-tuned model named llama-3-instrukt from a public hub — a typosquat of the real one.

▼

② SCAN The CI security gate runs modelscan -p ./llama-3-instrukt.bin on the downloaded file.

▼

③ FLAG ModelScan flags an embedded pickle __reduce__ payload calling posix.system — rated CRITICAL.

▼

④ QUARANTINE The pipeline fails closed, moves the artifact to a quarantine bucket, and alerts the Hyderabad SOC.

▼

⑤ SWAP Rahul switches to the official model.safetensors weights from the verified source.

▼

⑥ VERIFY cosign verify-blob passes, so the model registry accepts the signed artifact for deploy.
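To make the SCAN and FLAG stages concrete, here is a minimal sketch of how a static scanner can spot a __reduce__ payload without ever unpickling the file. It uses only the standard library's pickletools and is not how ModelScan itself is implemented; the opcode list and the function name are assumptions for illustration. Real scanners go further and check which modules and names are imported, because legitimate framework pickles also use some of these opcodes.

```python
import os
import pickle
import pickletools

# Opcodes that import objects or call them during unpickling (illustrative list).
CODE_EXECUTION_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def flagged_opcodes(data: bytes) -> list:
    """Walk the pickle opcode stream statically; nothing is executed."""
    return [
        opcode.name
        for opcode, _arg, _pos in pickletools.genops(data)
        if opcode.name in CODE_EXECUTION_OPCODES
    ]


class Payload:
    """Stand-in for a malicious object: unpickling it would call os.system."""

    def __reduce__(self):
        return (os.system, ("echo pwned",))


# Serializing is safe; the command would only run on pickle.loads, which we never call.
malicious = pickle.dumps(Payload())
benign = pickle.dumps({"weights": [0.1, 0.2]})
```

A gate built on this idea would fail closed whenever the flagged list is non-empty and the imported names are not on an allow-list, which is the behaviour stage ④ describes.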


Quick check · inline mini-quiz #3

Priya, a DevSecOps engineer at a Chennai ITES firm, finds the model-training GitHub Actions job references a step by the floating tag actions/checkout@v4 and exports cloud creds as plaintext env vars. Which fix hardens the pipeline most?

a) Pin third-party actions to a full commit SHA and move secrets to OIDC short-lived tokens scoped to the job. b) Add a comment in the YAML warning teammates not to touch the creds. c) Make the repo private so nobody can read the workflow. d) Increase the runner's RAM so the job finishes before an attacker reacts.

Correct: a. A floating tag like @v4 lets a compromised action mutate under you; pinning to a full commit SHA freezes the exact code, and OIDC issues short-lived, job-scoped cloud tokens so a leaked log never exposes long-lived keys. b is a comment, not a control. c private repos still run compromised actions and leak secrets to logs. d RAM has nothing to do with credential exposure.

5. Securing Inference & Deployment

A vetted, signed model still has to be served safely. The inference layer is an exposed API plus a model server plus weights at rest — all attack surface. These questions test API hardening, model-server config, and protecting the model in production.

Q27 What are the basics of hardening an inference API? L1

Treat it like any production API, plus ML-specific risks. Authentication & authorization: require API keys/OAuth tokens, scope them per client, and allow no anonymous access. Rate limiting & quotas: cap requests per key to blunt abuse, denial-of-wallet, and model-extraction probing. Input validation: enforce size/shape/type limits and sanitize prompts/inputs. Output minimisation: return only what's needed, with no logits/embeddings or stack traces that aid extraction. Transport: TLS everywhere. Logging/monitoring: log access and watch for extraction-style query patterns. These map to OWASP Top 10 for LLM Applications risks such as unbounded consumption (LLM10) and sensitive information disclosure (LLM02).

Authn/z + rate limits + input validation + output minimisation + TLS + monitoring.
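Two of the controls above, per-key rate limiting and input validation, can be sketched in a few lines. This is an illustrative token bucket, one common way to cap requests per key; the class name, the prompt length limit, and the explicit clock argument (used so the behaviour is easy to test) are assumptions, not part of any particular framework.

```python
MAX_PROMPT_CHARS = 4000  # hypothetical limit; tune to the model's context budget


class TokenBucket:
    """Per-API-key bucket: each request spends one token, tokens refill over time."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def validate_prompt(prompt):
    """Enforce type and size limits before the input reaches the model."""
    if not isinstance(prompt, str):
        raise TypeError("prompt must be a string")
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds the size limit")
    return prompt
```

In a service, there would be one bucket per API key (for example in a dictionary keyed by the authenticated client), and a rejected request would return HTTP 429 rather than reaching the model server.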

Q28 What unsafe endpoints or features must you disable on Triton, TorchServe or vLLM? L2

Model servers ship developer-friendly features that are dangerous in prod. NVIDIA Triton: keep dynamic model loading off in production by running with --model-control-mode=none instead of explicit or poll (the explicit load API can let a caller load arbitrary models from a path), and restrict the metrics/management ports; don't expose them publicly. TorchServe: lock down the management API (default port 8081), which can register/load models from URLs: restrict allowed_urls, bind to localhost, and require auth; past CVEs (e.g. SSRF/RCE via model registration) came from leaving it open. vLLM: don't expose the OpenAI-compatible server unauthenticated; put it behind an auth gateway and rate-limit it. General rule: bind admin/management interfaces to internal only, require auth, and disable dynamic model loading from arbitrary sources.

Triton model-control API, TorchServe mgmt port 8081/allowed_urls, vLLM auth — disable dynamic load + restrict admin.

Q29 How do containers and GPU isolation protect a serving deployment? L2

Containers give you a hardened, minimal blast radius: run the server as a non-root user, read-only root filesystem, drop Linux capabilities, no privileged mode, and a minimal base image so there's little to exploit. Mount only the model and config — never cloud credentials the model doesn't need. GPU isolation matters because GPUs are often shared: use MIG (multi-instance GPU) or per-tenant scheduling so one tenant's workload can't read another's GPU memory or starve it. Add network policies so a serving pod can reach only its dependencies. The aim: even if the model server is exploited, the attacker is boxed into a low-privilege container with no path to other tenants or your cloud account.

Non-root/read-only/cap-drop container + minimal image + MIG/per-tenant GPU + network policy.
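One way to enforce the container settings above is to lint the pod spec before it is deployed. The sketch below checks a single container entry, parsed into a dictionary, against the hardening list. The securityContext field names are the standard Kubernetes ones; the function name and the wording of the findings are assumptions for illustration.

```python
def audit_container(container):
    """Return hardening findings for one container entry of a pod spec."""
    ctx = container.get("securityContext", {})
    findings = []
    if ctx.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not set to true")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("root filesystem is writable")
    if ctx.get("privileged") is True:
        findings.append("container runs in privileged mode")
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("privilege escalation is not disabled")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        findings.append("Linux capabilities are not dropped")
    return findings


hardened = {
    "name": "model-server",
    "securityContext": {
        "runAsNonRoot": True,
        "readOnlyRootFilesystem": True,
        "privileged": False,
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]},
    },
}

# A container with no securityContext at all inherits permissive defaults.
unhardened = {"name": "model-server"}
```

A check like this usually lives in CI or in an admission policy, so a serving pod that silently drops one of the settings never reaches the cluster.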

Q30 How do you manage secrets at runtime for an inference service? L2

Inference often needs secrets — a DB password, a downstream API key, a registry pull token. Don't bake them into the image or env files in the repo. Inject them at runtime from a secrets manager (Vault, AWS/GCP Secrets Manager) or the platform's secret store, mounted as a short-lived token or file the process reads on startup. Prefer workload identity so the pod authenticates as itself and fetches scoped, auto-rotating credentials — no long-lived keys at all. Scope each secret to just that service, enable access auditing, and rotate on schedule. If the container is compromised, the attacker gets only short-lived, narrowly-scoped credentials, not your master keys.

Runtime injection from a secrets manager + workload identity + scoped/short-lived + rotation.

Q31 How do you protect model weights at rest and in use in production? L3

Weights are valuable IP and an integrity target, so defend both. At rest: store in a private bucket/registry with encryption (KMS-backed), tight least-privilege read, and signature verification on pull so a swapped file is rejected. Keep them out of public images. In transit: TLS for every pull. In use: serving pulls a verified, signed copy; the serving identity is read-only and can't overwrite the source; the container is hardened so memory isn't trivially dumped, and on shared GPUs use isolation so another tenant can't scrape device memory. Verify on load: check the hash/signature each time, not just once. So weights can't be stolen, swapped, or read by a neighbour.

Encrypt + least-privilege + signature verify on pull/load + TLS + GPU isolation.
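The "verify on load" step can be sketched as a digest check that runs every time the weights are read. This is a simplified stand-in for full signature verification: it assumes the expected SHA-256 digest was pinned from a trusted source such as the signed AI-BOM, and the function and file names are hypothetical.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def load_verified(path, expected_sha256):
    """Read the weights once, then refuse to return them if the digest differs."""
    data = Path(path).read_bytes()
    actual = hashlib.sha256(data).hexdigest()
    # Hash the same bytes that will be used, so a swap between check and load is caught.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"digest mismatch for {path}")
    return data


# Demo: a matching file loads, a swapped file is rejected.
with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "model.safetensors"
    weights.write_bytes(b"demo-weights")
    pinned = hashlib.sha256(b"demo-weights").hexdigest()
    loaded = load_verified(weights, pinned)

    weights.write_bytes(b"swapped-weights")
    try:
        load_verified(weights, pinned)
        swap_rejected = False
    except ValueError:
        swap_rejected = True
```

Reading the whole file into memory keeps the sketch short; for multi-gigabyte weights a streaming hash over a file that the serving identity cannot modify would be the more practical variant.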

Q32 Why minimise model output, and what should an inference response withhold? L3

Rich outputs help attackers. Returning raw logits / confidence vectors / embeddings makes model extraction and membership-inference far easier — an attacker can clone or probe your model with fewer queries. Verbose errors and stack traces leak framework versions, file paths and internal structure. So withhold what the client doesn't strictly need: return the final label/answer, not full probability distributions or internal scores; strip debug detail and return generic error messages; never echo system prompts, training snippets or other users' data; and consider rounding/limiting confidence. Pair this with rate limits and query-pattern monitoring so even minimal outputs can't be farmed at scale. This addresses OWASP sensitive-information-disclosure and model-theft risks.

Logits/embeddings/verbose errors aid extraction & leakage → return minimal; pair with rate limits.
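As a rough illustration of output minimisation, the sketch below reduces a raw model result to a label plus a coarsely rounded confidence and replaces exception detail with a generic message. The shape of the raw result and both function names are assumptions made for the example.

```python
def minimise_response(raw, decimals=1):
    """Keep only the winning label and a rounded confidence; drop everything else."""
    probabilities = raw["probabilities"]
    label = max(probabilities, key=probabilities.get)
    return {"label": label, "confidence": round(probabilities[label], decimals)}


def safe_error(_exc):
    """Return a generic error; the real exception belongs in server-side logs only."""
    return {"error": "request could not be processed"}


# Hypothetical raw output from a model server, including fields that aid extraction.
raw_result = {
    "probabilities": {"fraud": 0.8731, "legit": 0.1269},
    "logits": [2.31, 0.38],
    "embedding": [0.12, -0.07, 0.44],
}
```

Rounding to one decimal place is a deliberately blunt choice here: the coarser the confidence, the less signal each query gives to someone trying to clone the model.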

The fast-recall cheat-sheet. Unsafe vs safe formats, what signing buys you, the scanners that matter, and how to harden the registry — memorise these tiles for the interview.

Pause & Predict #3

Karthik at a Flipkart-scale e-commerce team ships an LLM assistant. Within a day, support logs show the bot leaking another customer's order details. The model and prompt were untouched. Predict the most likely cause and the fix.

Cross-session context bleed from a shared or mis-scoped conversation store. The retrieval or memory layer is keyed too broadly (or cached responses across users), so one session pulls another user's data into context — an authorization gap, not a model flaw. Diagnose by tracing the session/tenant key through the cache and vector store and checking whether retrieval filters on the authenticated user ID. The fix: enforce per-user tenant isolation on every retrieval and cache key, and add an output guard (Presidio/Llama Guard) that blocks PII not belonging to the current user. Verify with two concurrent test users and confirm neither can ever surface the other's records.
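The fix described above, scoping every cache key and every retrieval to the authenticated user, can be sketched as follows. The class, the field names, and the sample orders are hypothetical; the point is only that the tenant and user identifiers are part of the key and the filter, never optional.

```python
class TenantScopedCache:
    """Response cache whose keys always include the tenant and user."""

    def __init__(self):
        self._store = {}

    def put(self, tenant_id, user_id, query, response):
        self._store[(tenant_id, user_id, query)] = response

    def get(self, tenant_id, user_id, query):
        # Another user asking the same question gets a miss, never someone else's answer.
        return self._store.get((tenant_id, user_id, query))


def retrieve_orders(orders, user_id):
    """Filter on the authenticated user before anything enters the model's context."""
    return [order for order in orders if order["owner_id"] == user_id]


orders = [
    {"order_id": "A100", "owner_id": "user-1"},
    {"order_id": "B200", "owner_id": "user-2"},
]

cache = TenantScopedCache()
cache.put("shop", "user-1", "where is my order?", "Order A100 ships tomorrow.")
```

This mirrors the two-user verification test suggested above: the same query from user-2 must miss the cache and must retrieve only user-2's records.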

⚡ Secure MLOps & AI Supply Chain last-minute cheat-sheet

Untrusted weights = untrusted code: pickle/joblib/.bin/.pt, Keras-Lambda, TF SavedModel can run code on load. Default to safetensors.

Vet before you load: Pin org/repo/revision SHA → ModelScan/Guardian → verify signature/hash → AI-BOM → sandbox first-load (egress off).

PoisonGPT lesson: Benchmarks ≠ provenance. A model can pass evals and still carry a planted lie/back-door. Trust origin + build.

Scanner limits: Static scans can be bypassed (2025 picklescan CVE-2025-1716 / JFrog). Use safetensors + signing + red-team too.

Sign + BOM: cosign sign/verify (keyless OIDC + Rekor) = integrity & provenance. CycloneDX ML-BOM = traceability. Sign the BOM.

Pipeline hygiene: No keys in notebooks → Vault + workload identity. Least-privilege per step. Auth-gate MLflow; only CI can promote to Production.

Harden model servers: Triton: disable model-control API. TorchServe: lock mgmt port 8081/allowed_urls. vLLM: never expose unauthenticated.

Inference + weights at rest: Authn/z, rate-limit, input validation, minimise output (no logits/embeddings). Encrypt weights, verify signature on pull/load, isolate GPUs (MIG).

Glossary — terms an interviewer will probe

AI/ML Supply Chain: Everything that goes into a deployed model: base weights, data, dependencies, pipeline and registry.
Pickle RCE: Arbitrary code execution triggered when Python unpickles a malicious file via __reduce__/opcodes.
safetensors: Tensor-only format with a JSON header and no code path, so loading can't execute code.
ModelScan: Protect AI open-source scanner that flags unsafe code in pickle/H5/SavedModel artifacts.
Protect AI Guardian: Enterprise gateway built on ModelScan; scans 35+ formats and enforces model-security policy.
PoisonGPT: PoC where a real model was surgically edited to emit a false fact yet pass benchmarks.
Typosquatting: Uploading a look-alike repo name to trick users into pulling a malicious model.
Dataset poisoning: Injecting bad/triggered samples into training data so the model learns a back-door.
Sigstore cosign: Keyless artifact signing via OIDC + Fulcio certs, logged in the Rekor transparency log.
Rekor: Sigstore's append-only transparency log recording signatures for later verification.
SLSA: Supply-chain Levels for Software Artifacts: signed provenance proving how an artifact was built.
AI-BOM / ML-BOM: Bill of materials listing a model's base model, datasets, deps and config; CycloneDX is the standard.
CycloneDX: SBOM standard with native ML-BOM and model-card support (spec v1.7 / ECMA-424).
MLflow registry: Model registry tracking versions and stages; must be auth-gated with promotion controls.
Triton / TorchServe / vLLM: Model servers; their model-control/management endpoints must be locked down in prod.
Model extraction: Cloning or stealing a model by querying it; eased by exposing logits/embeddings.

Lock it in — explain it in your own words

📝 Self-explain · 2 minutes

In two sentences, explain the difference between integrity (signing/verification) and provenance (where an artifact came from) in an AI supply chain, and why you need both.

Expert version: Integrity proves the artifact has not changed since it was signed — a cosign verify against a pinned digest tells you the bytes are exactly what the signer produced. Provenance proves who built it and from what — the dataset, base model, and pipeline behind those bytes; you need both because a perfectly intact artifact is still dangerous if it came from an untrusted or poisoned source.

Sources cited inline (re-checked 2026-06)

OWASP Top 10 for LLM Applications 2025 — LLM03 Supply Chain, LLM10 Unbounded Consumption: https://genai.owasp.org/llm-top-10/
MITRE ATLAS — adversarial-ML tactics & techniques incl. ML supply chain: https://atlas.mitre.org/
NIST AI 100-2 (2025) Adversarial Machine Learning taxonomy & NIST AI RMF: https://csrc.nist.gov/pubs/ai/100/2/e2025/final
Protect AI ModelScan — model serialization attack scanner: https://github.com/protectai/modelscan
Protect AI Guardian — secure model gateway, 35+ formats: https://protectai.com/guardian
Hugging Face safetensors — safe tensor serialization format: https://github.com/huggingface/safetensors
Sigstore cosign + model-transparency for ML model signing: https://docs.sigstore.dev/
CycloneDX Machine Learning Bill of Materials (ML-BOM), spec v1.7: https://cyclonedx.org/capabilities/mlbom/
JFrog Security — 3 zero-day picklescan bypasses (2025) & CVE-2025-1716: https://jfrog.com/blog/unveiling-3-zero-day-vulnerabilities-in-picklescan/
SLSA — Supply-chain Levels for Software Artifacts provenance: https://slsa.dev/

Next lesson · Secure MLOps & AI Supply Chain — Adversarial Robustness & Red-Teaming

You can now lock down the supply chain. Next we attack the model itself: evasion, extraction and back-door triggers with PyRIT, garak and the Adversarial Robustness Toolbox — and how to defend against them.

📋 Final assessment — 10 questions, 70% to pass