If a server calls pickle.dumps before pickle.loads, is there any way to achieve RCE?

Disclaimer: This question is not for malicious purposes!! I am working on my OWN virtual machine!

The article here demonstrates how loading untrusted pickle data can lead to remote code execution. I am investigating ways of using this workflow without the security issues.

My question is as follows: if the webapp receives a request in Flask, calls pickle.dumps() on request.form, and then calls pickle.loads() on what was just dumped, is there still a way to execute malicious code?

Example server code:

@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test=pickle.dumps(request.form) 
    test2=pickle.loads(test) # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...

Is this workflow still vulnerable? From my understanding, the most common type of pickle exploit occurs when a base64 string is passed in and interpreted by pickle.loads(). However, is it possible to achieve the same result if pickle.dumps() is called on the form prior to pickle.loads()?

I have tried a couple things, but nothing has panned out. Please let me know if you know the secret code :)

Here is an example of malicious client code from the same article:

    import pickle
    import base64
    import os
    
    
    class RCE:
        def __reduce__(self):
            cmd = ('echo EXECUTED THIS STATEMENT')
            return os.system, (cmd,)
    
    
    if __name__ == '__main__':
        pickled = pickle.dumps(RCE())
        print(base64.urlsafe_b64encode(pickled))
        # Running pickle.loads(pickle.dumps(RCE())) would execute 'echo EXECUTED THIS STATEMENT'
        # I need to pass through RCE() because pickle.dumps() and pickle.loads() are server-side

That prints a base64 string that, once decoded and passed to pickle.loads(), would execute the command in cmd.

But how can you pass an RCE() instance in a request so that it is dumped by pickle.dumps() on the server side before pickle.loads() is called, and still have the malicious code execute?

Example (This code does not work):

client code

import os
import requests

class RCE:
    def __reduce__(self):
        cmd = ('echo EXECUTED THIS STATEMENT')
        return os.system, (cmd, )

data = {
    'test': RCE()
}
s = requests.Session()
r = s.post(URL + "/test", data=data)

server-side code

@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test=pickle.dumps(request.form) 
    test2=pickle.loads(test) # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...

Example (This code works):

client code

import os
import pickle
import requests

class RCE:
    def __reduce__(self):
        cmd = ('echo EXECUTED THIS STATEMENT')
        return os.system, (cmd, )

data = {
    'test': pickle.dumps(RCE())
}
s = requests.Session()
r = s.post(URL + "/test", data=data)

server-side code

@blueprint.route('/test', methods=['GET', 'POST'])
def test():
    test2=pickle.loads(request.form['test']) # THE CODE SHOULD BE EXECUTED AT THIS POINT
    return ...

My thinking is the following: is it possible to craft a string that, when serialized by pickle.dumps() on the server side, produces the same bytes as pickle.dumps(RCE()) executed on the client side? Of course, the result of pickle.dumps() on the server side would be a little different because of the request.form wrapper. From my understanding, as long as there is executable code in the string, pickle.loads() will execute it.

User response:

No, there is no way to get the server to execute remote code by just dumping the input and then loading it. The catch is that this workflow cannot be used to load pickled data structures sent by the client either.

I will use pickletools.dis to demonstrate what would actually happen:

import pickle
import pickletools
class RCE:
    def __reduce__(self):
        return eval, ("print('MALICIOUS PYTHON CODE HERE')",)

pickled_malicious = pickle.dumps(RCE())
print("what is executed when loading malicious pickle:")
pickletools.dis(pickled_malicious)
print("pickle is type:", type(pickled_malicious))

pickled_string = pickle.dumps(pickled_malicious)
print("what is executed when loading the dump of malicious")
pickletools.dis(pickled_string)

When the malicious pickle is loaded, the function (eval here, or os.system in the question) and its argument are pushed onto the stack, and then the REDUCE opcode calls that function with that argument:

what is executed when loading malicious pickle:
    0: \x80 PROTO      3
    2: c    GLOBAL     'builtins eval'
   17: q    BINPUT     0
   19: X    BINUNICODE "print('MALICIOUS PYTHON CODE HERE')"
   59: q    BINPUT     1
   61: \x85 TUPLE1
   62: q    BINPUT     2
   64: R    REDUCE
   65: q    BINPUT     3
   67: .    STOP
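
The same thing can be checked in code instead of by reading the disassembly. This is only a sketch: the Harmless class is a hypothetical stand-in for RCE that calls the builtin len rather than eval or os.system, and protocol 3 is chosen only to match the output above.

```python
import pickle
import pickletools


class Harmless:
    # Hypothetical stand-in for the RCE class: same mechanism, harmless callable.
    def __reduce__(self):
        return len, ("abc",)


payload = pickle.dumps(Harmless(), protocol=3)

# genops walks the opcodes without executing anything.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]
print(opcode_names)

# Loading the payload directly is what makes REDUCE call len("abc").
result = pickle.loads(payload)
print(result)
```

The presence of GLOBAL followed by REDUCE is what makes a pickle able to call a function during loading.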

The pickled malicious code itself though is just a bytes object,

pickle is type: <class 'bytes'>

so if you dump that, loading the result just yields a literal bytes object (or a string if you base64-encode it first; either way it is just a literal at this point):

what is executed when loading the dump of malicious
    0: \x80 PROTO      3
    2: C    SHORT_BINBYTES b"\x80\x03cbuiltins\neval\nq\x00X#\x00\x00\x00print('MALICIOUS PYTHON CODE HERE')q\x01\x85q\x02Rq\x03."
   72: q    BINPUT     0
   74: .    STOP
highest protocol among opcodes = 3
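
A round trip confirms what this disassembly shows. Again this is a sketch with a hypothetical Harmless class (calling len) standing in for RCE, so nothing dangerous runs.

```python
import pickle
import pickletools


class Harmless:
    # Hypothetical stand-in for the RCE class: loads() would call len("abc").
    def __reduce__(self):
        return len, ("abc",)


# What a client could send: raw pickle bytes.
user_input = pickle.dumps(Harmless())

# Server side: dump first, then load the result of that dump.
outer = pickle.dumps(user_input)
round_tripped = pickle.loads(outer)

# The payload comes back as inert bytes; it was never interpreted.
print(type(round_tripped), round_tripped == user_input)

# The outer pickle contains no REDUCE opcode, so no callable is invoked.
outer_ops = {op.name for op, arg, pos in pickletools.genops(outer)}
print("REDUCE" in outer_ops)

# Only loading the user input directly runs the embedded call.
direct = pickle.loads(user_input)
print(direct)
```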

This means that if the server just calls pickle.dumps on the input (a base64 string or a bytes object containing pickle data; either way it is just a literal value when it is dumped), then calling pickle.loads on that result simply returns the original input.

Any scenario that interprets user input as pickle data is vulnerable, but you aren't doing that here: you are creating pickle data from known safe input (the input string) and then loading that.
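
To relate this to the form case in the question, here is a rough sketch where a plain dict of strings stands in for request.form. That is an assumption made to keep the example self-contained: the real object is a Werkzeug multi-dict, whose pickle would also reference its own (trusted) class. The Harmless class is again a hypothetical stand-in for RCE.

```python
import base64
import pickle
import pickletools


class Harmless:
    # Hypothetical stand-in for the RCE class: loads() would call len("abc").
    def __reduce__(self):
        return len, ("abc",)


# Form values arrive as text, so the client sends the payload base64-encoded.
encoded_payload = base64.urlsafe_b64encode(pickle.dumps(Harmless())).decode("ascii")

# Assumption: a plain dict of strings stands in for Flask's request.form.
form = {"test": encoded_payload}

dumped = pickle.dumps(form)
ops = {op.name for op, arg, pos in pickletools.genops(dumped)}
loaded = pickle.loads(dumped)

# The attacker's text is stored as string data, never as opcodes.
print(sorted(ops))
print(loaded == form)
```

Whatever text the client puts in the form, pickle.dumps encodes it as a string literal, so the attacker never gets to choose the opcodes that pickle.loads runs.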