
LKRG

LKRG is a free and open source project distributed primarily in source code form. You can download it and prepare a custom build yourself. However, if you would rather use a commercial product tailored for your specific operating system, please consider LKRG Pro, which is distributed primarily in the form of “native” packages for the target operating systems and is generally meant to be easier to install and use while delivering optimal performance. Purchasing it also supports the development of the project financially. LKRG Pro is available <here>.

What is LKRG?

The Linux Kernel Runtime Guard protects the system by calculating hashes of the most important kernel regions, sections and structures and comparing them with the hashes stored in its internal database. Additionally, special efforts have been made to individually protect all extensions of the kernel (modules). To make the project fully functional, the module should initially be loaded on a clean system – e.g. directly after installation or after booting a clean system. At that moment it is possible to create a trusted database of hashes.
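The baseline-then-compare idea can be illustrated with a minimal user-space sketch. This is not LKRG code: the hash function (SHA-256), the region names and the byte strings are assumptions chosen only for illustration, since the page does not say which hash is used.

```python
import hashlib

def build_database(regions):
    """Return {region name: hex digest} for every guarded region (bytes)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in regions.items()}

def verify(regions, database):
    """Return the sorted names of regions whose current hash differs from the baseline."""
    current = build_database(regions)
    return sorted(name for name in database if current.get(name) != database[name])

# Baseline taken on a clean system (hypothetical contents).
clean = {".text": b"\x55\x48\x89\xe5", ".rodata": b"Linux version"}
database = build_database(clean)

# Later, the first byte of .text has been overwritten.
patched = dict(clean)
patched[".text"] = b"\xcc\x48\x89\xe5"
mismatches = verify(patched, database)
```

The sketch also shows why the baseline must be taken on a clean system: whatever is hashed at that moment becomes the trusted reference.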

Aim

There are two main ideas behind the Linux Kernel Runtime Guard (LKRG):

Prevent unsupported modifications of the Linux kernel – it enforces “rules” which need to be followed to develop extensions/extra functionality for the Linux operating system. These “rules” are the official Linux API, which must be used to provide specific functionality instead of relying on unsupported modifications of the running kernel (patching). Patching the kernel has a direct impact on security, system stability and performance. This project was never designed to be a perfect solution (it can be bypassed) and its weaknesses are known, but correct usage may significantly improve the security, stability and performance of the entire OS / platform.
Given that LKRG correctly prevents unauthorized modifications, we implement three useful user-mode protections (called “Protected Features”):
- Protected process – the process can’t be intercepted in any form by anyone from user mode (even by the superuser – root).
- Protected file – the file can’t be modified / deleted by anyone, even by the superuser – root.
- Protected logs – the log file can’t be modified / deleted by anyone, even by the superuser – root, but it's still possible to append new log entries.

Currently, we maintain two versions of the LKRG project:

p_lkrg-beta - beta version of the LKRG project which fully includes “Protected Features”. This version is more functional but it has some side effects (please read the “Protected features” chapter).
p_lkrg-light - light version of the LKRG project - it does NOT include “Protected Features”.

Security promises (threat model) of the project

The security promises delivered by LKRG can't be briefly summarized and need to be discussed in detail. Please navigate to the following page describing the threat model of the project:

Threat model.

Guarded regions

The database contains hashes calculated from the following parts of the system:

Critical CPU/core data (currently only x86/amd64 arch is supported)

This component is critical from the stability and security point of view for various reasons. It is very common that a CPU has more than one core. Each core of each CPU must be protected separately, and the functions generating or checking hashes must be run on each individual core as an exclusive task to guarantee that nothing changes the data in the meantime.
NOTE: the component implementing this functionality is one of the most complicated in the system. Since multiple cores might be present in the system, any core can go offline and/or online at any time. Whenever this happens, LKRG must rebuild its internal database and protect the most critical data of the new cores (if they came online). Additionally, an entire CPU might go offline or be hot-plugged in. This is especially true on Virtual Machines (VMs), where it is not rare for new virtual CPUs to be assigned to a currently running VM. A new (V)CPU might have multiple cores as well, and LKRG must handle (V)CPU hot-plugging correctly to remain effective and provide the same functionality whenever this situation appears.
Hot-plugging a (V)CPU forces another corner case to be taken into account: the currently running system has only 1 (V)CPU with 1 core and a new core or (V)CPU is plugged in. Linux boots as a UniProcessor (UP) kernel if the platform has only 1 (V)CPU with only 1 core. However, as soon as an additional core or (V)CPU appears, the dynamic SMP boot process of the kernel is launched. The newly, dynamically loaded SMP kernel overwrites some UP macros and changes the assembly code of the kernel AND all loaded modules! This means the entire database of hashes must be recalculated! LKRG takes all of these situations into account and handles all of these corner cases.
For each individual core in all (V)CPUs, an IPI is sent to exclusively run the LKRG function which gathers critical CPU data and calculates hashes from it, including:

IDT entry point and size
IDT itself (as blob of memory)
MSR:
- MSR_IA32_SYSENTER_CS
- MSR_IA32_SYSENTER_ESP
- MSR_IA32_SYSENTER_EIP
- MSR_IA32_CR_PAT
- MSR_IA32_APICBASE
- MSR_EFER
- MSR_STAR
- MSR_LSTAR
- MSR_CSTAR
- MSR_SYSCALL_MASK

Additionally, LKRG keeps information about:

How many (V)CPUs/cores are available in the system
How many online (V)CPUs/cores are available in the system
How many offline (V)CPUs/cores are available in the system
How many possible (V)CPUs/cores might be in total available in the system
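The CPU counts listed above have a user-space counterpart in the sysfs files under /sys/devices/system/cpu (present, online, offline, possible), which hold CPU lists such as "0-3,6". The sketch below is only an illustration of tracking those sets and noticing when a rebuild would be needed; LKRG itself works inside the kernel, and the function names here are hypothetical. It assumes the simple "a-b,c" list form.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,6" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs are in this state
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def read_cpu_state(base="/sys/devices/system/cpu"):
    """Read the present/online/offline/possible CPU sets from sysfs."""
    state = {}
    for name in ("present", "online", "offline", "possible"):
        try:
            with open(f"{base}/{name}") as handle:
                state[name] = parse_cpu_list(handle.read())
        except OSError:
            state[name] = []  # file not available on this system
    return state

def needs_rebuild(old_state, new_state):
    """A change in the online set means the per-core data must be recollected."""
    return old_state["online"] != new_state["online"]
```

Polling like this is only a stand-in for illustration; reacting to hot-plug events as they happen is what the text above describes.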

Entire Linux Kernel .text section

This covers almost the entire Linux kernel itself: syscall tables, all procedures, all functions, all IRQ handlers, etc.

Exceptions

Linux Kernel exception table

Read only section

Entire Linux Kernel .rodata section – it should never change while the system is running.

IOMMU

This can optionally be enabled, but by default it is not taken into account. The reason is that the assignment of some memory ranges might be changed by dynamically loaded drivers. In corner cases the kernel itself may also change it. An administrator who is 100% sure of what they are doing may enable this section.

Modules

LKRG tries to discover how many modules are currently present in the system and keeps tracking them. For each individual module the following information is tracked based on the module linked list:

Struct module pointer (a.k.a. THIS_MODULE)
Name
Pointer to the module_core
Size of the .text section
Hash from the entire .text section for that module

For each individual module the following information is tracked based on the KOBJs:

Struct module pointer (a.k.a. THIS_MODULE)
Pointer to the ‘module_kobject’ structure
Entire KOBJ structure (except for list_head and kref information)
Name
Pointer to the module_core
Size of the .text section
Hash from the entire .text section for that module

Both sets of information must match (if the module exists in both places), and each of them is tracked individually. Additionally, the following information is tracked:

Number of entries in module list
Number of KOBJs in specific KSET
Specific order of linked list in module list
Specific order in KSET for KOBJs
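The cross-check between the two views can be sketched as follows. This is a simplified model, not LKRG's data layout: `ModuleInfo`, `cross_check` and `snapshot` are hypothetical names, and only a few of the tracked fields are modelled. Entries found in only one view are reported separately because the text says a module need not exist in both places.

```python
from collections import namedtuple

# Hypothetical record holding a subset of the tracked per-module fields.
ModuleInfo = namedtuple("ModuleInfo", "name core_addr text_size text_hash")

def cross_check(module_list, kset):
    """Compare the module-list view with the KOBJ view and return findings."""
    in_list = {m.name: m for m in module_list}
    in_kset = {m.name: m for m in kset}
    findings = []
    for name in sorted(set(in_list) & set(in_kset)):
        if in_list[name] != in_kset[name]:
            findings.append(("mismatch", name))  # present in both, fields differ
    for name in sorted(set(in_list) - set(in_kset)):
        findings.append(("only_in_list", name))
    for name in sorted(set(in_kset) - set(in_list)):
        findings.append(("only_in_kset", name))
    return findings

def snapshot(view):
    """Record the number of entries and their order for later comparison."""
    return (len(view), tuple(m.name for m in view))

ext4 = ModuleInfo("ext4", 0xFFFFFFFFC0100000, 4096, "aa")
e1000 = ModuleInfo("e1000", 0xFFFFFFFFC0200000, 8192, "bb")
baseline = snapshot([ext4, e1000])
```

A module that unlinks itself from the module list but is still visible through its KOBJ would show up as `only_in_kset`, and a reordered list would change the snapshot.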

TODO

Hash from the internal database
Hash from LKRG itself
APIC / Local APIC
MADT / FADT / RSDT / ACPI
Call gates
Integrity of processes
Check if callbacks / notification routines point to the modules which we know and are tracking down
Data integrity for critical structures like:
- proc_root
- Critical files (like /etc/shadow, /etc/passwd, etc.)
- TTY hooks.

When is the LKRG validation routine executed?

The function for checking the system integrity will be executed:

By a kernel timer interrupt, which generates a work item and inserts it into the shared workqueue (WQ)
On demand via a dedicated control command from the communication channel
Whenever module activity is detected
Whenever new (V)CPU or core activity is detected
On various events happening in the system. For performance reasons each event is assigned a probability that the integrity routine will be fired. The following events are monitored (using notification chains):
- CPU idle – probability 0.005%
- CPU frequency – probability 10%
- CPU power management – probability 10%
- Network device – probability 1%
- Network event – probability 5%
- Network device IPv4 changes – probability 50%
- Network device IPv6 changes – probability 50%
- Task structure handing off – probability 0.01%
- Task going out – probability 0.01%
- Task calling do_munmap() – probability 0.005%
- USB changes – probability 50%
- Global AC events – probability 50%

This list is not final and will be extended.
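The probabilistic trigger described above can be sketched in a few lines. The percentages are copied from the list; the event key names and the function name are hypothetical, and how LKRG draws its random numbers in the kernel is not described on this page.

```python
import random

# Probability (in percent) that an event fires the integrity routine.
# Values are taken from the list above; the key names are made up for this sketch.
EVENT_PROBABILITY = {
    "cpu_idle": 0.005,
    "cpu_frequency": 10,
    "cpu_power_management": 10,
    "network_device": 1,
    "network_event": 5,
    "netdev_ipv4_change": 50,
    "netdev_ipv6_change": 50,
    "task_handing_off": 0.01,
    "task_going_out": 0.01,
    "task_do_munmap": 0.005,
    "usb_change": 50,
    "global_ac_event": 50,
}

def should_validate(event, rng=random):
    """Return True if this occurrence of the event should run the integrity check."""
    percent = EVENT_PROBABILITY[event]
    # rng.random() is uniform in [0, 1), so scale it to a percentage.
    return rng.random() * 100.0 < percent
```

Frequent events such as CPU idle get a very small probability so that the check stays cheap, while rare events such as USB changes trigger it about half of the time.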

Protected features

Based on the assumption that prevention of unsupported modifications of the Linux kernel is correctly implemented, LKRG “exports” three very useful features to user mode:

Protected process

The main idea behind this feature is to be able to harden certain processes even against the most highly privileged accounts (like the “root” account) and to guarantee that some secrets remain inaccessible. Proper implementation of a Protected Process (PP) is not trivial for several reasons:

Linux was never designed to have any protection from the superuser account (as opposed to SELinux).
Linux exposes various interfaces which might be used to interact directly or indirectly with a process and its memory.
By definition, the superuser can apply/change/remove any limitation in the system.
Some processes might require access to other processes for proper functionality.

There are a couple of “official” ways to interact with processes, and all of them needed to be hardened:

Syscalls – e.g. ptrace()
Signals – e.g. [t[g]]kill()
TODO: /proc interface – e.g. /proc/<pid>/mem

Additionally, a process can be affected via direct memory access:

Raw memory access – e.g. via /dev/mem device
Kernel memory access – e.g. via /dev/kmem device

LKRG leverages the *kprobes interfaces to “lock down” all possible ways of interacting with processes. As an end result, no one from user space is able to interact with a process protected by LKRG. There is one exception for compatibility reasons:

A protected process might fork, and for security reasons, if this ever happens, the child must be automatically protected as well. Because parent and child often must communicate with each other, protected processes may interact with one another using the official API, exactly as non-protected processes do in Linux. In practice LKRG creates a “new group” of processes which are isolated from “normal” Linux processes. These “new” processes have higher privileges since they can interact normally with all types of processes (including PP), whereas a normal process has no access to Protected Processes at all.

LKRG maintains its own red-black tree to track all PPs. When a process dies it is automatically removed from the tree. There are 2 ways a process might become a Protected Process:

The communication channel has an option to provide the PID of an already existing process to be dynamically added to the set of PPs. As soon as LKRG receives that message, the process will be protected.
Every executable file which is part of the “Protected File” feature automatically becomes a PP when it is executed. There are 2 main reasons for that:
- One of the ways to compromise the PP feature is to overwrite the executable file from which the process was created. The next time this file is executed, it might include the attacker's code. To mitigate this problem, the “Protected File” feature can be used.
- Some processes might not exist yet, but the administrator might still want them to be PP when they are executed. This is the way to inform LKRG which executable file must be protected end-to-end.
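The bookkeeping and the access rule for Protected Processes can be modelled in user space as shown below. This is a sketch of the rules stated above, not of LKRG's implementation: the class and method names are hypothetical, and a plain Python set stands in for the red-black tree.

```python
class ProtectedSet:
    """Toy model of Protected Process tracking."""

    def __init__(self, protected_files=()):
        self.pids = set()                      # stand-in for the red-black tree
        self.protected_files = set(protected_files)

    def protect_pid(self, pid):
        """Way 1: a PID is supplied through the communication channel."""
        self.pids.add(pid)

    def on_exec(self, pid, path):
        """Way 2: executing a Protected File makes the process protected."""
        if path in self.protected_files:
            self.pids.add(pid)

    def on_fork(self, parent, child):
        """Children of protected processes are protected automatically."""
        if parent in self.pids:
            self.pids.add(child)

    def on_exit(self, pid):
        """A dead process is removed from the tracking structure."""
        self.pids.discard(pid)

    def may_interact(self, source, target):
        """Protected targets accept interaction only from protected sources."""
        return target not in self.pids or source in self.pids
```

For example, after `on_exec(100, "/usr/bin/ssh-agent")` with that path registered as a Protected File, `may_interact(1, 100)` is false for an unprotected PID 1, while `may_interact(100, 1)` stays true.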

Protected file

The main idea behind this feature is to be able to harden certain files even against the most highly privileged accounts (like the “root” account) and to guarantee that some secrets remain inaccessible. Proper implementation of a Protected File (PF) is not trivial for several reasons:

Linux was never designed to have any protection from the superuser account (as opposed to SELinux).
Linux exposes various interfaces which might be used to interact directly or indirectly with files.
By definition, the superuser can apply/change/remove any limitation in the system.

Files can be modified / deleted using the official API, as well as indirectly changed via raw disk access. Both methods must be stopped:

Official API – LKRG leverages the IMMUTABLE bit in the “inode” structure in VFS to “lock down” the file. Special efforts are taken to make sure that the “inode” and the corresponding “dcache” entry are never dropped or deleted by the Linux kernel during the entire lifetime. Even if the superuser (root) forces the kernel to drop all caches and make a “clean” file access, LKRG prevents PF inodes from being deleted and the Linux kernel won’t do it. Additionally, specific file operation function pointers for the protected file (and the directory holding this file) are replaced by LKRG functions for security reasons.
Raw disk access – LKRG does not allow any process (excluding protected processes) to have raw access to memory or to the disk. This blocks indirect file and process modification. LKRG virtually extends the CAP_SYS_RAWIO capability to achieve this, since in a normal Linux system this capability does NOT block raw disk access. Additionally, LKRG forces all processes (excluding protected processes) to drop CAP_SYS_RAWIO.

Together, the two features (PP and PF) can create an end-to-end trust chain that protects specific files and executables from superuser (root) attacks. The Protected File feature might be used to “lock down” critical configuration files (e.g. /etc/ld.so.preload, /etc/ssh/sshd_config, and so on) as well as executable files. An executable file in the end creates a Protected Process, which is also protected from superuser (root) attacks. Some critical processes might benefit heavily from that (e.g. ssh-agent), but the protection can be fully trusted only if all shared objects (external libraries) are protected as well.

There are 2 ways of doing it:

Individually add to the Protected File list all libraries which are used by the processes we want to protect
Compile the executable as a static binary – it won’t use any dynamically linked libraries.

The 2nd option is used by the LKRG client user-mode application. Additionally, LKRG makes it a PF by default.

Protected logs

Works exactly the same as the PF feature but allows the file to be opened only in “append” mode. The main idea behind this feature is to be able to harden certain files even against the most highly privileged accounts (like the “root” account) and to guarantee that some secrets remain inaccessible, while at the same time making it possible to append new information. Such functionality might be desired for certain files (e.g. log files) to guarantee that information which was already written won't be modified or deleted.
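The append-only rule can be expressed as a check on open(2) flags. The sketch below is an assumption about how such a policy could look, not a description of LKRG's hook: the function name is hypothetical, and allowing read-only opens is a choice made for this illustration (they cannot modify the file), which the text above does not confirm.

```python
import os

# Bits that indicate write access. On Linux O_RDONLY is 0, so a flags value
# with neither of these bits set is a read-only open.
WRITE_BITS = os.O_WRONLY | os.O_RDWR

def open_allowed(flags):
    """Return True if these open(2) flags are acceptable for a protected log."""
    if flags & WRITE_BITS == 0:
        return True                    # read-only: assumed harmless in this sketch
    if flags & os.O_TRUNC:
        return False                   # truncation would destroy existing entries
    return bool(flags & os.O_APPEND)   # writes must go to the end of the file
```

With O_APPEND set, every write lands at the current end of the file, so earlier entries cannot be overwritten through that file descriptor.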

Limitation

Proper implementation of the Protected Features requires LKRG to leverage and virtually extend the CAP_SYS_RAWIO capability. Unfortunately, some software (e.g. Xorg or dosemu) requires this capability for proper behavior (e.g. accessing the /dev/mem device which LKRG must lock down). If Protected Features are enabled, such software won't work properly.

Mitigation

If you still want to use Protected Features and run software which requires access to the /dev/mem device (uses the CAP_SYS_RAWIO capability), you might initialize and run the desired software before you load LKRG into the kernel. This way the software should properly acquire the necessary resources and use them correctly, and LKRG will lock down access to the device afterwards. Note: If for any reason such software tries to regain access to the restricted device, it won't be allowed to and its functionality will be broken.

HOWTO use it...

Some examples of how to use LKRG can be found here.

Caveats

(*JUMP_LABEL)

As soon as LKRG is loaded in the system, no modifications of the .text section are allowed. This is also true for the official Linux APIs which sometimes do patch(!) it. One example is the kprobes interface, which injects an int3 instruction (opcode 0xCC) into the monitored function. That’s why all modules using kprobes must be loaded BEFORE LKRG.

The Linux kernel can be compiled to make heavy use of a low-level runtime patching mechanism called “jump label” (CONFIG_JUMP_LABEL=y). Most Linux distributions provide a kernel compiled with this option (sometimes with extra sub-options - *_JUMP_LABEL). It turns the Linux kernel into heavily self-modifying code, which is very troublesome for this project (since we are comparing hashes!). In particular, *_JUMP_LABEL is just a low-level mechanism which might be used by higher-level kernel layers or macros, e.g.:

Dynamic kernel debugging
Tracepoints
Optimization of “highly unlikely” code branches (static keys)
Macro DO_ONCE()
Probably more ;)

To mitigate this problem LKRG does the following:

Keeps a copy of the entire .text section made during module installation (it is guarded in the same way as the original .text section)
If we detect that the kernel's .text section was changed, we try to find the offset where the modifications were made. We use this offset to calculate the virtual address (VA) of the modified code. If the modification happened because of the *_JUMP_LABEL options, either a long NOP or a relative 'jmp' instruction was injected (both are 5 bytes long):
- If a NOP was replaced by a 'jmp', the destination of the instruction must still point inside the same function (symbol name) where the modification happened. We decode this 'jmp' instruction to validate that the target still lies inside the same symbol's address range. If it does, it is most likely a 'legit' modification.
- If a 'jmp' instruction was changed, we only allow it to be replaced by a long NOP instruction.
Any other modifications are banned
If LKRG detects a “whitelisted” modification, the copy of the .text section is updated and new hashes are calculated.
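The whitelist check for jump-label patches can be sketched as below. It assumes the x86 'jmp rel32' encoding (opcode 0xE9 followed by a little-endian signed 32-bit displacement, relative to the end of the instruction) and a single 5-byte NOP encoding (0f 1f 44 00 00); real kernels may use other NOP variants, and the function names are hypothetical.

```python
import struct

NOP5 = bytes.fromhex("0f1f440000")  # one common 5-byte NOP (assumed encoding)
JMP_REL32 = 0xE9                    # opcode of 'jmp rel32'

def jmp_target(va, insn):
    """Return the destination of a 5-byte 'jmp rel32' located at address va."""
    (rel,) = struct.unpack("<i", insn[1:5])
    return va + 5 + rel             # displacement is relative to the next instruction

def is_whitelisted(va, old, new, sym_start, sym_end):
    """Decide whether replacing `old` with `new` at `va` looks like a jump-label patch.

    sym_start/sym_end delimit the symbol (function) that contains va.
    """
    if len(old) != 5 or len(new) != 5:
        return False
    if old == NOP5 and new[0] == JMP_REL32:
        # NOP -> jmp: the target must stay inside the same symbol.
        return sym_start <= jmp_target(va, new) < sym_end
    if old[0] == JMP_REL32 and new == NOP5:
        # jmp -> NOP is the only allowed change to an existing jmp.
        return True
    return False                    # any other modification is banned
```

A caller would locate `va` from the offset of the first differing byte between the guarded copy and the live .text section, and refresh the copy and its hashes only when the check passes.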

NOTE: The following attack is still possible – find NOPs injected by *_JUMP_LABEL and overwrite them with a 'jmp' instruction. The destination of this instruction must point to some address within the same “symbol name range”. It allows the attacker to create a rootkit based only on ROP. A few comments are needed here:

It is a very difficult task to create a fully functional rootkit based only on 1-function ROP – but possible
At a random point in time the kernel WILL overwrite these modifications anyway – so they won't be persistent. Moreover, the timing won't be deterministic, so the risk is very low.

IPI Problem

There is an undesirable situation on SMP Linux machines when sending an IPI. Unfortunately, it might influence the state of the kernel and generate very confusing logs. The logs appear to suggest that the problem resides in the correct execution context, which is killed and dumped, rather than in the actually problematic context, which might not be dumped. This makes it hard to root-cause the problem even if one is aware of this shortcoming of the killing and the logging. More details can be found here:

http://lkml.iu.edu/hypermail/linux/kernel/1609.2/03265.html

LKRG TODO

Lock down compat_* syscall interfaces
Track down PTEs for the kernel .text section (detect malicious modifications)
Better self-protection
Implement additional features:
- Independently expose the list of processes visible for the kernel - can be used to detect hidden processes
- Independently expose the list of all objects in the specific directory - can be used to detect hidden files / directories
- Network hooks detection
- IMPORTANT: Detect kernel exploitation process by detecting specific data corruption in the kernel

License

GPLv[2/3] - need to decide.

Credits

Donation

Patreon.

Greetings

I would like to thank the following people who helped me at some point during development of this project:

Alexander Peslyak a.k.a. Solar Designer
Rafał 'n3rgal' Wojtczuk
Brad 'spender' Spengler
PaX Team… I mean pipacs :)
