Insecure Deserialization: Java Gadget Chains, Python Pickle, and Safe Alternatives

Insecure deserialization is dangerous because it so frequently leads directly to remote code execution. Java’s native serialization and Python’s pickle module can both run code chosen by whoever produced the serialized bytes: pickle by design, and Java through methods that classes on the classpath run while they are being reconstructed. Both have been exploited in high-profile breaches.

Why Deserialization Is Dangerous

Serialization converts an object to bytes for storage or transmission. Deserialization reconstructs the object. The vulnerability arises when:

Untrusted data (from a user, network, or cookie) is deserialized.
The deserialization process executes code as part of object reconstruction.
An attacker crafts input that triggers that code execution with their payload.

Python: The `pickle` Problem

Python’s pickle is explicit in its documentation: “The pickle module is not secure. Only unpickle data you trust.” Yet it appears in caching layers, job queues, and ML model storage where the trust boundary isn’t obvious.

import pickle

# What pickle can do during deserialization:
class Exploit:
    def __reduce__(self):
        import os
        return (os.system, ('id',))  # Executes 'id' on unpickle

payload = pickle.dumps(Exploit())

# Any application that does this is vulnerable:
pickle.loads(user_supplied_data)  # Executes arbitrary code
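
To see what such a payload contains without running it, the bytes can be disassembled instead of loaded. The following is a minimal sketch using the standard library's pickletools module. The Harmless class and the suspicious_opcodes helper are hypothetical names used only for illustration, and an opcode listing like this is a triage aid, not a security control.

```python
import pickle
import pickletools

# Opcodes that import a global or call/construct something during unpickling.
CODE_RUNNING_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


class Harmless:
    """Stand-in for an exploit: __reduce__ names a callable to run on load."""

    def __reduce__(self):
        return (print, ("this ran during unpickling",))


def suspicious_opcodes(data: bytes) -> set:
    """Return the code-running opcode names found in a pickle, without loading it."""
    seen = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return seen & CODE_RUNNING_OPCODES


plain = pickle.dumps({"user_id": 123, "role": "admin"})
crafted = pickle.dumps(Harmless())

print(suspicious_opcodes(plain))            # plain dicts, strings, and ints need none of these
print(sorted(suspicious_opcodes(crafted)))  # the crafted payload imports and calls a function
```

A pickle of plain containers and scalars has no reason to contain these opcodes, which is why data that is only ever dicts, lists, strings, and numbers can usually move to JSON without loss.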

Real-world pickle attacks appear in:

Redis caches storing session data
Celery task queues using pickle as the serializer (not the default in modern versions)
Scikit-learn model files loaded from user uploads
Session cookies serialized with pickle (as Flask did before version 0.10), where a leaked signing key becomes code execution
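
Where pickle cannot be removed from places like these right away, the Python documentation describes restricting which globals an unpickler may resolve. The sketch below follows that pattern. The allowlist contents and the names RestrictedUnpickler and restricted_loads are assumptions for illustration, and this should be treated as a stopgap that narrows the attack surface rather than a guarantee of safety.

```python
import builtins
import io
import pickle

# Assumption: the only non-primitive types this application ever pickles.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle asks to import a global (class or function).
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else, including os.system, is refused before it can run.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, range(3)])))
```

The allowlist should stay as small as the data allows: every additional callable is something an attacker can invoke with arguments of their choosing.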

Safe Alternatives to pickle

JSON for data interchange:

import json

# Instead of pickle.dumps(obj), use:
data = json.dumps({"user_id": 123, "role": "admin"})
# Instead of pickle.loads(data):
obj = json.loads(data)

dataclasses + JSON for structured objects:

from dataclasses import dataclass, asdict
import json

@dataclass
class UserSession:
    user_id: int
    role: str
    expires: float

def serialize_session(session: UserSession) -> str:
    return json.dumps(asdict(session))

def deserialize_session(data: str) -> UserSession:
    d = json.loads(data)
    return UserSession(**d)  # Constructs only your known class; unknown keys raise TypeError, but field types are not checked
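
A dataclass constructor does not check field types, so untrusted JSON still needs explicit validation before it is used. The following is a minimal sketch of that step; parse_session is a hypothetical helper name, and the specific rules (exact field set, no booleans passed off as numbers) are assumptions about what a session should look like.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UserSession:
    user_id: int
    role: str
    expires: float


def parse_session(data: str) -> UserSession:
    """Parse untrusted JSON into a UserSession, rejecting anything unexpected."""
    raw = json.loads(data)
    if not isinstance(raw, dict):
        raise ValueError("session must be a JSON object")

    expected = {f.name for f in fields(UserSession)}
    if set(raw) != expected:
        raise ValueError(f"unexpected or missing fields: {sorted(set(raw) ^ expected)}")

    user_id, role, expires = raw["user_id"], raw["role"], raw["expires"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(user_id, bool) or not isinstance(user_id, int):
        raise ValueError("user_id must be an integer")
    if not isinstance(role, str):
        raise ValueError("role must be a string")
    if isinstance(expires, bool) or not isinstance(expires, (int, float)):
        raise ValueError("expires must be a number")

    return UserSession(user_id=user_id, role=role, expires=float(expires))


print(parse_session('{"user_id": 123, "role": "admin", "expires": 1700000000}'))
```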

marshmallow or pydantic for untrusted input:

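from pydantic import BaseModel, ConfigDict

class TaskPayload(BaseModel):
    # Pydantic ignores unknown fields by default; "forbid" turns them into an error
    model_config = ConfigDict(extra="forbid")

    task_name: str
    args: list[str]
    priority: int

# Validates types and, because of extra="forbid", rejects unexpected fields
payload = TaskPayload.model_validate_json(user_input)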

For ML models, prefer safetensors or ONNX over pickle:

# Vulnerable: .pkl files from users execute code on load
model = pickle.load(open('uploaded_model.pkl', 'rb'))

# Safer: the safetensors format stores raw tensors only
from safetensors.torch import load_file
model_weights = load_file('model.safetensors')  # Returns a dict of tensors; nothing in the file is executed

Java: Native Serialization and Gadget Chains

Java’s ObjectInputStream.readObject() is the equivalent of pickle.loads(): it reconstructs an object graph and calls class-defined hooks such as readObject() and readResolve() along the way. The danger is “gadget chains”: sequences of classes already present on the JVM classpath whose methods, once an attacker arranges them into a crafted object graph, call into one another during deserialization until they execute arbitrary commands.

The Apache Commons Collections gadget chain (publicly demonstrated in 2015) affected WebSphere, WebLogic, JBoss, Jenkins, and many others. Any server with a vulnerable commons-collections version on the classpath and an endpoint that deserialized untrusted data was open to RCE.

// Any code that does this with untrusted input is a critical vulnerability
ObjectInputStream ois = new ObjectInputStream(inputStream);
Object obj = ois.readObject();  // Gadget chains can execute here

Detecting Vulnerable Endpoints

Look for these patterns in your codebase:

// Dangerous patterns
new ObjectInputStream(socket.getInputStream()).readObject()
SerializationUtils.deserialize(bytes)  // Apache Commons Lang
// Cookie-based session serialization
new ObjectInputStream(new ByteArrayInputStream(
    Base64.getDecoder().decode(cookieValue))).readObject()
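
Beyond reading code, serialized Java objects can often be spotted in traffic, cookies, or stored values by their header: a Java serialization stream begins with the bytes AC ED 00 05, which show up as "rO0AB" once Base64-encoded. The following is a small sketch of that check; looks_like_java_serialized is a hypothetical helper name, and a match is only a hint that an endpoint deserves a closer look.

```python
import base64
from typing import Union

JAVA_STREAM_MAGIC = bytes.fromhex("aced0005")  # STREAM_MAGIC followed by STREAM_VERSION
JAVA_STREAM_MAGIC_B64 = "rO0AB"                # how those bytes begin once Base64-encoded


def looks_like_java_serialized(value: Union[bytes, str]) -> bool:
    """Heuristic check for a raw or Base64-encoded Java serialization stream."""
    if isinstance(value, bytes):
        if value.startswith(JAVA_STREAM_MAGIC):
            return True
        # Bytes may also carry Base64 text, e.g. a cookie read off the wire.
        value = value.decode("ascii", errors="ignore")
    return value.lstrip().startswith(JAVA_STREAM_MAGIC_B64)


# Illustrative value: a stream header followed by the start of a class descriptor.
sample = JAVA_STREAM_MAGIC + b"sr\x00\x11java.util.HashMap"
print(looks_like_java_serialized(sample))
print(looks_like_java_serialized(base64.b64encode(sample).decode("ascii")))
print(looks_like_java_serialized('{"user_id": 123}'))
```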

Using `ysoserial` to Test (Safely)

# ysoserial generates payloads for known gadget chains
java -jar ysoserial.jar CommonsCollections6 "id" > payload.ser

# Test against your staging endpoint  --  if you get code execution, patch immediately
curl -X POST https://staging.myapp.com/api/session \
  -H "Content-Type: application/octet-stream" \
  --data-binary @payload.ser

Safe Alternatives: JSON and Protocol Buffers

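// Replace ObjectInputStream with Jackson JSON
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();

// DANGEROUS: polymorphic typing lets the JSON name the class to instantiate
// (Envelope is a hypothetical wrapper class used only to show the annotation)
class Envelope {
    @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS)  // Never use Id.CLASS with untrusted input
    public Object payload;
}
Envelope envelope = mapper.readValue(json, Envelope.class);

// SAFE: deserialize to a known, specific class with no polymorphic typing
UserSession session = mapper.readValue(json, UserSession.class);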

Protocol Buffers (protobuf):

// Define the schema in a .proto file; parsing fills only the fields it declares
UserSession session = UserSession.parseFrom(bytes);
// No arbitrary classes are instantiated; field values still need application-level validation

If You Must Use Java Serialization: Deserialization Filters

Java 9 added serialization filters (JEP 290, also backported to 8u121, 7u131, and 6u141). Use them as a defense-in-depth measure:

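// Allowlist approach: permit specific packages, then reject everything else.
// Without the trailing "!*", unmatched classes are left undecided and are still allowed.
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "com.myapp.model.*;java.util.*;java.lang.*;!*"
);

ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);

// Or use the JVM-wide filter (quote it so the shell does not interpret ; ! or *)
// -Djdk.serialFilter='com.myapp.model.*;!*'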

SerialKiller / NotSoSerial: Third-party libraries that provide configurable deserialization filters for older Java versions.

Finding Deserialization Vulnerabilities

# Search your Java codebase for dangerous patterns
grep -r "ObjectInputStream\|readObject\|SerializationUtils.deserialize" src/

# Python  --  find pickle usage
grep -r "pickle.loads\|pickle.load\|cPickle" . --include="*.py"

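# Sample Redis keys to check for pickle-serialized values
# (--scan iterates incrementally, so it does not block the server the way KEYS can)
redis-cli --scan | head -20
# Inspect values that start with \x80 plus a protocol byte, e.g. \x80\x04\x95 (the pickle protocol 4 header)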
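
The byte check described above can be automated once values have been fetched from the cache. The following is a minimal sketch of the heuristic on raw bytes; looks_like_pickle is a hypothetical helper name, and how the values are read from Redis is left out. It only recognizes protocol 2 and later, since protocols 0 and 1 have no header.

```python
import pickle


def looks_like_pickle(value: bytes) -> bool:
    """Heuristic: protocol 2+ pickles start with PROTO (0x80) plus a version byte and end with STOP ('.')."""
    return (
        len(value) >= 3
        and value[0] == 0x80
        and 2 <= value[1] <= pickle.HIGHEST_PROTOCOL
        and value.endswith(b".")
    )


print(looks_like_pickle(pickle.dumps({"user_id": 123})))
print(looks_like_pickle(b'{"user_id": 123}'))
```

A match means the value should be traced back to the code that wrote it; it does not by itself show that an attacker can control that value.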

What to Prioritise

Search your codebase for pickle.loads, pickle.load, ObjectInputStream, and SerializationUtils.deserialize right now. Any of those processing untrusted input is a critical RCE vulnerability.
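
For Python code, a text search misses aliased imports such as "import pickle as p" or "from pickle import loads". The following sketch uses the standard library's ast module to catch those cases; find_pickle_loads and the module list are assumptions for illustration, and it does not follow values passed around indirectly (for example a loader function stored in a variable).

```python
import ast

PICKLE_MODULES = {"pickle", "cPickle", "_pickle"}
RISKY_NAMES = {"load", "loads", "Unpickler"}


def find_pickle_loads(source: str, filename: str = "<string>") -> list:
    """Return sorted (line, call) pairs for pickle load calls, resolving import aliases."""
    tree = ast.parse(source, filename)

    module_aliases = set()   # local names bound to a pickle module
    function_aliases = {}    # local names bound to pickle.load / loads / Unpickler
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name in PICKLE_MODULES:
                    module_aliases.add(alias.asname or alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module in PICKLE_MODULES:
            for alias in node.names:
                if alias.name in RISKY_NAMES:
                    function_aliases[alias.asname or alias.name] = f"{node.module}.{alias.name}"

    hits = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id in module_aliases
            and func.attr in RISKY_NAMES
        ):
            hits.append((node.lineno, f"{func.value.id}.{func.attr}"))
        elif isinstance(func, ast.Name) and func.id in function_aliases:
            hits.append((node.lineno, function_aliases[func.id]))
    return sorted(hits)


example = (
    "import pickle as p\n"
    "from pickle import loads\n"
    "a = p.load(fh)\n"
    "b = loads(blob)\n"
    "c = p.dumps(obj)\n"
)
for line, call in find_pickle_loads(example):
    print(f"line {line}: {call}")
```

Each hit still needs a manual check of where its input comes from, since only calls that can receive untrusted data are exploitable.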

For Python: replace pickle-based caches and queues with JSON or msgpack. For ML model storage, use safetensors or ONNX — never load .pkl files from user uploads or external sources.

For Java: migrate to Jackson (deserializing to specific known classes, never to Object.class or other polymorphic types whose class is named by the input) or protobuf. If you’re stuck with native serialization for a period, add allowlist-based deserialization filters that end in a reject-all rule as a stopgap.