Insecure deserialization is one of the most dangerous vulnerability classes because it so frequently leads directly to remote code execution. Java’s native serialization and Python’s pickle module are both capable of executing arbitrary code during deserialization—and both have been exploited in high-profile breaches.
Why Deserialization Is Dangerous
Serialization converts an object to bytes for storage or transmission. Deserialization reconstructs the object. The vulnerability arises when:
- Untrusted data (from a user, network, or cookie) is deserialized.
- The deserialization process executes code as part of object reconstruction.
- An attacker crafts input that triggers that code execution with their payload.
Python: The pickle Problem
Python’s pickle is explicit in its documentation: “The pickle module is not secure. Only unpickle data you trust.” Yet it appears in caching layers, job queues, and ML model storage where the trust boundary isn’t obvious.
import pickle

# What pickle can do during deserialization:
class Exploit:
    def __reduce__(self):
        import os
        return (os.system, ('id',))  # Executes 'id' on unpickle

payload = pickle.dumps(Exploit())

# Any application that does this is vulnerable:
pickle.loads(user_supplied_data)  # Executes arbitrary code
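To see the mechanism without running a shell command, the same trick can be reproduced with a harmless builtin. This is a minimal sketch: the `Harmless` class is a hypothetical stand-in for an attacker's class, and `len` stands in for `os.system`.

```python
import pickle

class Harmless:
    """Stand-in for a malicious class: __reduce__ picks the callable that loads() runs."""
    def __reduce__(self):
        # The pickle stores "call len('abc')" rather than any object state.
        return (len, ("abc",))

payload = pickle.dumps(Harmless())
result = pickle.loads(payload)  # len("abc") is called during unpickling

print(type(result).__name__, result)  # int 3
print(b"Harmless" in payload)         # False
```

The payload never mentions `Harmless`, so the receiving process does not need the attacker's class to be importable. It only needs to be able to import the callable the payload names.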
Real-world pickle attacks appear in:
- Redis caches storing session data
- Celery task queues using pickle as the serializer (not the default in modern versions)
- Scikit-learn model files loaded from user uploads
- Flask session cookies in releases before 0.10, which serialized session data with pickle (a leaked secret key was enough for code execution)
Safe Alternatives to pickle
JSON for data interchange:
import json
# Instead of pickle.dumps(obj), use:
data = json.dumps({"user_id": 123, "role": "admin"})
# Instead of pickle.loads(data):
obj = json.loads(data)
dataclasses + JSON for structured objects:
from dataclasses import dataclass, asdict
import json
@dataclass
class UserSession:
    user_id: int
    role: str
    expires: float

def serialize_session(session: UserSession) -> str:
    return json.dumps(asdict(session))

def deserialize_session(data: str) -> UserSession:
    d = json.loads(data)
    return UserSession(**d)  # Safe: only constructs your known class
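Constructing a known class prevents code execution, but dataclasses do not check field types at runtime, so a string `user_id` would pass through unnoticed. One way to tighten this is sketched below. The `_EXPECTED` table and the `deserialize_session_strict` name are illustrative assumptions, not part of any library.

```python
import json
from dataclasses import dataclass

@dataclass
class UserSession:
    user_id: int
    role: str
    expires: float

# Hypothetical table of accepted JSON types per field (JSON numbers may arrive as int).
_EXPECTED = {"user_id": (int,), "role": (str,), "expires": (int, float)}

def deserialize_session_strict(data: str) -> UserSession:
    """Parse JSON and check keys and types before building a UserSession."""
    d = json.loads(data)
    if not isinstance(d, dict) or set(d) != set(_EXPECTED):
        raise ValueError("session payload has missing or unexpected fields")
    for name, allowed in _EXPECTED.items():
        value = d[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, allowed):
            raise ValueError(f"field {name!r} has the wrong type")
    return UserSession(d["user_id"], d["role"], float(d["expires"]))
```

For anything beyond a handful of fields, a validation library does this with less hand-written code.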
marshmallow or pydantic for untrusted input:
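from pydantic import BaseModel, ConfigDict

class TaskPayload(BaseModel):
    # pydantic ignores unknown fields by default; "forbid" turns them into errors
    model_config = ConfigDict(extra="forbid")

    task_name: str
    args: list[str]
    priority: int

# Validates types and rejects unexpected fields
payload = TaskPayload.model_validate_json(user_input)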
For ML models, prefer safetensors or ONNX over pickle:
# Vulnerable: .pkl files from users execute code on load
with open('uploaded_model.pkl', 'rb') as f:
    model = pickle.load(f)
# Safe — safetensors format
from safetensors.torch import load_file
model_weights = load_file('model.safetensors') # No code execution possible
Java: Native Serialization and Gadget Chains
Java’s ObjectInputStream.readObject() is the equivalent of pickle.loads() — it reconstructs an object graph and calls lifecycle methods along the way. The danger is “gadget chains”: sequences of classes already present in the JVM classpath that, when deserialized in a specific order, execute arbitrary commands.
The Apache Commons Collections gadget chain (publicized in 2015) affected WebSphere, WebLogic, JBoss, Jenkins, and many others. Any server with a vulnerable commons-collections release on the classpath and an endpoint that deserialized untrusted data was exposed to RCE.
// Any code that does this with untrusted input is a critical vulnerability
ObjectInputStream ois = new ObjectInputStream(inputStream);
Object obj = ois.readObject(); // Gadget chains can execute here
Detecting Vulnerable Endpoints
Look for these patterns in your codebase:
// Dangerous patterns
new ObjectInputStream(socket.getInputStream()).readObject()
SerializationUtils.deserialize(bytes) // Apache Commons Lang
new ObjectInputStream(new ByteArrayInputStream(Base64.getDecoder().decode(cookieValue))).readObject() // Cookie-based session serialization
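Code search finds the call sites. Captured traffic and stored cookies can be checked from the other direction, because a Java serialization stream starts with the bytes AC ED 00 05, which show up as the prefix rO0AB once Base64-encoded. The helper below is a rough sketch for triaging captured values; the function name is hypothetical.

```python
import base64
import binascii

# STREAM_MAGIC (0xACED) followed by STREAM_VERSION (0x0005)
JAVA_STREAM_MAGIC = bytes.fromhex("aced0005")

def looks_like_java_serialized(value: bytes) -> bool:
    """Return True if value is a raw or Base64-encoded Java serialization stream."""
    if value.startswith(JAVA_STREAM_MAGIC):
        return True
    try:
        # Cookies usually carry the stream Base64-encoded; 8 characters decode to 6 bytes.
        head = base64.b64decode(value[:8], validate=True)
    except (binascii.Error, ValueError):
        return False
    return head.startswith(JAVA_STREAM_MAGIC)
```

A match only says the value is a serialized Java object. Whether the endpoint that receives it is exploitable still depends on what it does with the bytes and which classes are on its classpath.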
Using ysoserial to Test (Safely)
# ysoserial generates payloads for known gadget chains
java -jar ysoserial.jar CommonsCollections6 "id" > payload.ser
# Test against your staging endpoint — if you get code execution, patch immediately
curl -X POST https://staging.myapp.com/api/session \
  -H "Content-Type: application/octet-stream" \
  --data-binary @payload.ser
Safe Alternatives: JSON and Protocol Buffers
// Replace ObjectInputStream with Jackson JSON
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();

// DANGEROUS: the JSON itself names the class to instantiate
class Envelope {
    @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS) // Never use Id.CLASS with untrusted input
    public Object payload;
}
Envelope envelope = mapper.readValue(json, Envelope.class);

// SAFE: deserialize to a known, specific class
UserSession session = mapper.readValue(json, UserSession.class);
Protocol Buffers (protobuf):
// Define the schema in a .proto file; parseFrom only builds the generated message type
UserSession session = UserSession.parseFrom(bytes);
// No arbitrary classes are instantiated, but field values still need application-level validation
If You Must Use Java Serialization: Deserialization Filters
Java 9 added serialization filters (JEP 290), which were later backported to update releases of Java 6, 7, and 8. Use them as a defense-in-depth measure:
// Allowlist approach: only permit specific classes
// The trailing "!*" rejects everything else; without it, unmatched classes are still allowed
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.myapp.model.*;java.util.*;java.lang.*;!*"
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);

// Or use the JVM-wide filter
// -Djdk.serialFilter=com.myapp.model.*;!*
SerialKiller / NotSoSerial: Third-party libraries that provide configurable deserialization filters for older Java versions.
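The allowlist idea is not specific to Java. On the Python side, the pickle documentation describes overriding `Unpickler.find_class` to restrict which globals a pickle may import. The sketch below assumes a hypothetical `ALLOWED_GLOBALS` set. Like the Java filters, it narrows the attack surface for legacy data but does not make pickle a suitable format for untrusted input.

```python
import io
import pickle

# Hypothetical allowlist: the only (module, name) globals this app expects in its pickles.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    ("datetime", "datetime"),
}

class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")

def restricted_loads(data: bytes):
    """Replacement for pickle.loads() that enforces the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()

class Exploit:
    def __reduce__(self):
        import os
        return (os.system, ("id",))

try:
    restricted_loads(pickle.dumps(Exploit()))
except pickle.UnpicklingError as exc:
    print("blocked:", exc)  # os.system is never resolved, so nothing runs
```

Plain containers such as dicts, lists, and strings need no globals at all, so they load without any allowlist entry.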
Finding Deserialization Vulnerabilities
# Search your Java codebase for dangerous patterns
grep -r "ObjectInputStream\|readObject\|SerializationUtils.deserialize" src/
# Python — find pickle usage
grep -r "pickle.loads\|pickle.load\|cPickle" . --include="*.py"
# Check for pickle-serialized Redis keys (--scan avoids the blocking KEYS command)
redis-cli --scan | head -20
# Inspect values for a leading \x80\x04\x95 (the usual start of a protocol 4 pickle; protocol 5 starts with \x80\x05)
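Once a suspicious value has been pulled out of a cache or queue, the standard library's `pickletools` module can walk its opcodes without unpickling anything. The following is a rough triage sketch; the helper names and the opcode list are assumptions chosen for illustration, not an exhaustive detector.

```python
import pickle
import pickletools

# Opcodes that import a global or call one while the pickle is loading.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}

def looks_like_pickle(value: bytes) -> bool:
    """Heuristic: protocol 2+ pickles start with PROTO (0x80) and a version byte."""
    return len(value) >= 2 and value[0] == 0x80 and 2 <= value[1] <= 5

def suspicious_opcodes(value: bytes) -> set:
    """Collect import/call opcodes by walking the stream; nothing is executed."""
    seen = {opcode.name for opcode, _arg, _pos in pickletools.genops(value)}
    return seen & SUSPICIOUS_OPCODES

class Harmless:
    def __reduce__(self):
        return (len, ("abc",))  # benign stand-in for an attacker's callable

print(suspicious_opcodes(pickle.dumps({"user_id": 123})))  # set()
print(sorted(suspicious_opcodes(pickle.dumps(Harmless()))))
```

Legitimate pickles of custom classes also use these opcodes, so a hit means "this value can import and call things", not "this value is malicious". `pickletools.genops` raises `ValueError` on malformed input, which callers may want to catch.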
Key Takeaways
- Never deserialize pickle data from untrusted sources; there is no safe way to do so.
- Replace pickle-based caches and queues with JSON or msgpack immediately.
- Java’s ObjectInputStream.readObject() with untrusted input is a critical RCE vulnerability; migrate to JSON/protobuf.
- If you must use Java serialization, implement allowlist-based deserialization filters.
- For ML models, use safetensors or ONNX format; never load .pkl model files from untrusted sources.