Attacks Against Keystroke Data - An Overview (mostly Linux specific)
Motivation
An obvious way to retrieve sensitive information (passwords, passphrases) is to capture the data directly as it is entered. This document gives a short introduction to such methods available under the GNU/Linux operating system. Its purpose is to call attention to vulnerabilities often disregarded in everyday system administration. It is not meant to be complete or to cover all aspects of a given attack; please read the corresponding papers, which are linked at the end of this document. The document is separated into four main categories:
- theoretical background
- attacks that require root access to the target host
- attacks that don't require root access to the target host
- attacks that don't require access to the target host at all
Theoretical Background
The following text is an excerpt from [1]. Read it carefully; it is easy to comprehend and will help you understand the rest of this document.
Lets take a look at below figure to know how user inputs from console
keyboard are processed:
  _____________            _________             _________
 /             \ put_queue|         |receive_buf|         |tty_read
/handle_scancode\-------->|tty_queue|---------->|tty_ldisc|------->
\               /         |         |           |buffer   |
 \_____________/          |_________|           |_________|

     _________          ____________
    |         |sys_read|            |
--->|/dev/ttyX|------->|user process|
    |         |        |            |
    |_________|        |____________|

                            Figure 1
First, when you press a key on the keyboard, the keyboard will send
corresponding scancodes to keyboard driver. A single key press can produce
a sequence of up to six scancodes.
The handle_scancode() function in the keyboard driver parses the stream
of scancodes and converts it into a series of key press and key release
events called keycode by using a translation-table via kbd_translate()
function. Each key is provided with a unique keycode k in the range 1-127.
Pressing key k produces keycode k, while releasing it produces keycode
k+128.
For example, keycode of 'a' is 30. Pressing key 'a' produces keycode 30.
Releasing 'a' produces keycode 158 (128+30).
Next, keycodes are converted to key symbols by looking them up on the
appropriate keymap. This is a quite complex process. There are eight
possible modifiers (shift keys - Shift , AltGr, Control, Alt, ShiftL,
ShiftR, CtrlL and CtrlR), and the combination of currently active modifiers
and locks determines the keymap used.
After the above handling, the obtained characters are put into the raw
tty queue - tty_flip_buffer.
In the tty line discipline, receive_buf() function is called periodically
to get characters from tty_flip_buffer then put them into tty read queue.
When user process want to get user input, it calls read() function on
stdin of the process. sys_read() function will calls read() function
defined in file_operations structure (which is pointed to tty_read) of
corresponding tty (ex /dev/tty0) to read input characters and return to the
process.
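The keycode convention from the excerpt (pressing key k produces keycode k, releasing it produces k+128) can be illustrated with a minimal Python sketch. The function name decode_keycode and the constant RELEASE_OFFSET are hypothetical names chosen for this illustration; they do not appear in the kernel sources or in [1].

```python
RELEASE_OFFSET = 128  # per the excerpt: releasing key k produces keycode k + 128


def decode_keycode(code):
    """Split a keycode value into (key number, event type).

    Keys are numbered 1..127, so valid values are 1..127 (press)
    and 129..255 (release).
    """
    if not 1 <= code <= 255 or code == RELEASE_OFFSET:
        raise ValueError("not a valid keycode: %r" % (code,))
    if code > RELEASE_OFFSET:
        return code - RELEASE_OFFSET, "release"
    return code, "press"
```

With the example from the excerpt, decode_keycode(30) gives (30, "press") and decode_keycode(158) gives (30, "release") for the key 'a'.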
Attacker with root access
If an attacker has access to the root account of the target host, obtaining keystroke data is no problem at all; several techniques usually exist:
- filtering the output of strace(1) [9] / ptrace(2) [10]:
  Try "strace -t -fF -p <PID> -eread" where <PID> is the ID of the process to be monitored, e.g. bash or vi.
- modifying the Linux Kernel Keystroke Interrupt Handler to log all keystrokes:
  This can be done in several ways; see [1] and [5] for details and source code, and [6], [7] and [8] for kernel module development in general.
- redirecting system calls:
  An attacker modifies the kernel to change the default behaviour of certain system calls such as read/write; see the papers above, and also [11].
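To show what "filtering the output of strace" can look like in practice, here is a minimal Python sketch that reassembles typed input from traced read() calls. It assumes strace's usual line layout, for example 12:00:01 read(0, "l", 1) = 1. The names READ_RE, unescape and typed_input, and the sample lines, are made up for this illustration. Only the common C-style escapes are handled.

```python
import re

# One traced call, e.g.:  12:00:01 read(0, "l", 1) = 1
READ_RE = re.compile(r'\bread\((\d+), "((?:[^"\\]|\\.)*)"(?:\.\.\.)?, \d+\)\s+= (\d+)')
ESCAPE_RE = re.compile(r'\\([0-7]{1,3}|x[0-9a-fA-F]{2}|.)')
SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "v": "\v", "f": "\f",
          "\\": "\\", '"': '"'}


def unescape(text):
    """Undo the C-style escapes used inside strace's quoted buffers."""
    def repl(match):
        tok = match.group(1)
        if tok[0] in "01234567":            # octal escape, e.g. \33 for ESC
            return chr(int(tok, 8))
        if tok[0] == "x" and len(tok) == 3:  # hex escape, e.g. \x1b
            return chr(int(tok[1:], 16))
        return SIMPLE.get(tok, tok)
    return ESCAPE_RE.sub(repl, text)


def typed_input(lines, fd=0):
    """Concatenate the data returned by successful read() calls on fd."""
    parts = []
    for line in lines:
        m = READ_RE.search(line)
        if m and int(m.group(1)) == fd and int(m.group(3)) > 0:
            parts.append(unescape(m.group(2)))
    return "".join(parts)


# Hypothetical trace of a shell reading "ls" followed by Enter.
sample = [
    '12:00:01 read(0, "l", 1) = 1',
    '12:00:02 read(0, "s", 1) = 1',
    '12:00:03 read(0, "\\r", 1) = 1',
    '12:00:04 read(3, "# /etc/passwd", 4096) = 13',
]
print(repr(typed_input(sample)))  # 'ls\r'
```

Restricting the filter to file descriptor 0 is a simplification; a monitored program may read its terminal through a different descriptor.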
Attacker without root access
The previous attacks assume that the attacker has root privileges on the target host. In most cases, he doesn't. The processing described in the background section consumes CPU cycles, so it can be monitored by a program that runs in an endless loop, reads the time and compares it to the previous measurement. A significant deviation from the average difference indicates that another process was being executed in between. This was mentioned in [2], which also describes characteristics of keystrokes and how the search space for an 8-letter password can be reduced from 56 bits to 17 bits (from 2^56 to 2^17 = 131072 candidates), which dramatically decreases the time a brute force attack needs. Of course, this attack is quite likely to be discovered, because the loop must execute as fast and as often as possible, slowing things down. Also, swap activity and other programs (such as window managers, updatedb, gpm and the like) interfere heavily with the accuracy of the measurement; we are talking about timers with microsecond (1/1000000 second) resolution.
Attacker without access
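The timing loop from the section on attackers without root access can be sketched as follows. This only illustrates the principle. The names sample_clock and find_gaps, the use of the median instead of the average as the "typical" gap, and the factor of 20 are assumptions made for this sketch and are not taken from [2]. An interpreted Python loop is also far too coarse for real keystroke timing.

```python
import time
from statistics import median


def sample_clock(n):
    """Busy-loop n times and return the raw timestamps in nanoseconds."""
    return [time.perf_counter_ns() for _ in range(n)]


def find_gaps(stamps, factor=20):
    """Return (index, gap) for every gap far above the typical gap.

    A large gap between two consecutive clock readings means the loop
    was not running for a while, i.e. the CPU was busy with something else.
    """
    if len(stamps) < 2:
        return []
    gaps = [b - a for a, b in zip(stamps, stamps[1:])]
    typical = max(median(gaps), 1)  # avoid a zero threshold on coarse clocks
    return [(i, g) for i, g in enumerate(gaps) if g > factor * typical]
```

Calling find_gaps(sample_clock(1000000)) on an otherwise idle machine returns the positions at which the loop was interrupted; deciding which of those interruptions are keystrokes is the hard part and is what [2] discusses.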
Another way to reveal password information is to analyze ssh traffic. When a user logs into another machine over ssh, the password is sent in one single packet, which SSH2 usually pads to a standard size so as not to reveal the password length. Every keystroke thereafter is transmitted individually (for compatibility with auto-completion/ncurses-like interfaces), which allows analysis of keystroke patterns as described before, provided the attacker has access to a machine on the route between the two hosts. More sophisticated methods include analysis of user typing patterns. See [3] and [4] for a more in-depth discussion.
Major drawbacks
All attacks described here are very likely to be discovered by a cautious system administrator either by looking at the process table or by examining current CPU usage. Modifying parts of the kernel that are not available as modules usually involves recompiling the whole kernel and rebooting the machine. At first glance, this would render all described attacks useless, but certain techniques such as modifying the process table hide an attacker's activities. See [12] for hiding techniques, [13] for revealing techniques.
Available Publications
These papers contain detailed information about all topics being presented in this document:
[1] Writing Linux Kernel Keylogger
http://www.phrack.com/phrack/59/p59-0x0e.txt
[2] Timing Attacks Against Trusted Path
http://www.cse.ogi.edu/colloquia/event/223.html
[3] Timing Analysis of Keystrokes and Timing Attacks on SSH
http://paris.cs.berkeley.edu/~dawnsong/papers/ssh-timing.pdf
[4] Passive Analysis of SSH (Secure Shell) Traffic - Solar Designer
http://www.openwall.com/advisories/OW-003-ssh-traffic-analysis.txt
[5] Kernel Based Keylogger - Mercenary
http://packetstorm.decepticons.org/UNIX/security/kernel.keylogger.txt
[6] The Linux keyboard driver - Andries Brouwer
http://www.linuxjournal.com/lj-issues/issue14/1080.html
[7] Linux Kernel Module Programming
http://www.tldp.org/LDP/lkmpg/
[8] Kernel function hijacking - Silvio Cesare
http://www.big.net.au/~silvio/kernel-hijack.txt
[9] Linux Manual: strace(1)
http://www.die.net/doc/linux/man/man1/strace.1.html
[10] Linux Manual: ptrace(2)
http://www.die.net/doc/linux/man/man2/ptrace.2.html
[11] Abuse of the Linux Kernel for Fun and Profit - Halflife
http://www.phrack.com/phrack/50/P50-05
[12] Backdoor and Linux LKM Rootkit - smashing the kernel at your own risk
http://it.rising.com.cn/safety/safetyschool/ywyb/020129lkm.htm
[13] chkrootkit - checks for signs of a rootkit
http://www.chkrootkit.org/