osquery Linux Tutorial and Tips

Ninja Level Monitoring and System Visibility

Osquery is a monitoring framework. It provides detailed visibility into the operating system, processes, and network connections of a computer system.

Osquery can be used in production environments on both workstations and servers. A major selling point is its minimal overhead on Linux, macOS (OS X), and Windows systems.

There are many advantages for both IT and Security Operations. We will focus on the Security Operations and DFIR (Digital Forensics and Incident Response) features as part of this tutorial.

Install osquery on Ubuntu Linux

Originally developed by Facebook, osquery is a well-supported and documented tool. It has straightforward installation steps for a variety of operating systems and Linux distributions. In this tutorial, we will focus on installation on Ubuntu from the official repository. If you are using Fedora or other Linux distros the initial steps are well documented.

These steps can be used on Debian or Ubuntu based systems. They add the apt repository to the system and install the package. After that, the regular system-level apt upgrade will keep the package up to date.

~$ export OSQUERY_KEY=1484120AC4E9F8A1A577AEEE97A80C63C9D8B80B
~$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys $OSQUERY_KEY
~$ sudo add-apt-repository 'deb [arch=amd64] https://pkg.osquery.io/deb deb main'
~$ sudo apt-get update
~$ sudo apt-get install osquery

Following this installation, the /etc/osquery directory is created for configuration files, but it is not populated at this stage.

Interactive Shell for Immediate Testing (osqueryi)

Before doing any configuration, we can load the interactive shell to perform test queries.

Using SQL queries (the syntax is based on SQLite), we can query tables to gather information about the operating system. The query below returns a list of users (the output has been snipped).

vagrant@ubuntu-focal:~$ osqueryi
Using a virtual database. Need help, type '.help'
osquery> select * from users;
+-------+-------+------------+------------+------------------+------------------------------------+--------------------------+-------------------+------+
| uid   | gid   | uid_signed | gid_signed | username         | description                        | directory                | shell             | uuid |
+-------+-------+------------+------------+------------------+------------------------------------+--------------------------+-------------------+------+
| 0     | 0     | 0          | 0          | root             | root                               | /root                    | /bin/bash         |      |
| 1     | 1     | 1          | 1          | daemon           | daemon                             | /usr/sbin                | /usr/sbin/nologin |      |
| 2     | 2     | 2          | 2          | bin              | bin                                | /bin                     | /usr/sbin/nologin |      |
| 33    | 33    | 33         | 33         | www-data         | www-data                           | /var/www                 | /usr/sbin/nologin |      |
| 1001  | 1001  | 1001       | 1001       | ubuntu           | Ubuntu                             | /home/ubuntu             | /bin/bash         |      |
| 998   | 100   | 998        | 100        | lxd              |                                    | /var/snap/lxd/common/lxd | /bin/false        |      |
+-------+-------+------------+------------+------------------+------------------------------------+--------------------------+-------------------+------+

Another example, this time with selected fields and a LIMIT:

osquery> select uid, username, directory from users LIMIT 5;
+-------+------------------+--------------------------+
| uid   | username         | directory                |
+-------+------------------+--------------------------+
| 0     | root             | /root                    |
| 1     | daemon           | /usr/sbin                |
| 2     | bin              | /bin                     |
| 3     | sys              | /dev                     |
| 4     | sync             | /bin                     |
+-------+------------------+--------------------------+

Take some time to explore the information available. Execute .tables within osqueryi to list all tables and .schema to show the schema (fields).

Running osqueryi from the Command Line

Executing queries directly from the command line with osqueryi can be useful. See the following examples.

~$ osqueryi "SELECT * FROM users;"
~$ echo "SELECT * FROM users;" | osqueryi
~$ osqueryi --json "SELECT * FROM users;"

In the third example above, we used the --json parameter to change the output format. This is a great trick for getting operating system telemetry into JSON for use in bash scripts and command line processing.
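
As a rough sketch of that scripting use case, the JSON output can be loaded with the standard json module in Python. The helper names run_query and login_users are hypothetical, and the sample data only mimics the shape of osqueryi --json output, where every column value is a string.

```python
import json
import subprocess

def run_query(sql):
    """Run a query through osqueryi and return the rows as a list of dicts.
    Assumes osqueryi is installed and on the PATH."""
    out = subprocess.run(
        ["osqueryi", "--json", sql],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

def login_users(rows):
    """Return usernames whose shell is not a nologin/false shell."""
    blocked = ("nologin", "false")
    return [r["username"] for r in rows if not r["shell"].endswith(blocked)]

# Illustrative sample in the same shape as `osqueryi --json` output.
SAMPLE = '''[
  {"uid": "0", "username": "root", "shell": "/bin/bash"},
  {"uid": "1", "username": "daemon", "shell": "/usr/sbin/nologin"},
  {"uid": "1001", "username": "ubuntu", "shell": "/bin/bash"}
]'''

print(login_users(json.loads(SAMPLE)))
```

On a host with osquery installed, the same filter could be applied to live data with login_users(run_query("SELECT username, shell FROM users;")).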

Quick osquery Linux Example Queries

Examples are the best way to showcase a framework with so much flexibility. As the examples highlight, the use cases for osquery are very broad.

Example Queries

SELECT version FROM os_version;
get the operating system version (SELECT * returns the full set of fields)

SELECT * FROM processes;
list running processes, similar to the ps -ef command

SELECT * FROM logged_in_users;
show logged in users, similar to the who command

SELECT hostname, cpu_brand, cpu_physical_cores, cpu_logical_cores, physical_memory FROM system_info;
gather physical system information

SELECT * FROM deb_packages WHERE name LIKE 'python3%';
list installed packages with a filter

SELECT url, round_trip_time, response_code FROM curl WHERE url = 'https://github.com/';
execute curl and report time / HTTP response code

SELECT md5 FROM hash WHERE path = '/etc/passwd';
calculate the md5 hash of a file

SELECT * FROM hardware_events;
show usb, hard drive changes and other hardware state changes

SELECT * FROM process_events WHERE cmdline LIKE 'nmap%';
retrieve commands from the process event table that match a filter (audit events)

SELECT * FROM process_open_sockets;
show open sockets / network connections, similar to netstat

osqueryi --json "SELECT * FROM curl_certificate WHERE hostname = 'api.hackertarget.com:443';"
retrieve certificate information using curl and dump json output to the shell

SELECT * FROM file WHERE path = '/etc/passwd';
gather file attributes and details

SELECT name, path, pid FROM processes WHERE on_disk = 0;
a well documented example that shows running processes where the binary has been deleted from disk (common in malware)

SELECT containers, containers_running, containers_paused, containers_stopped FROM docker_info;
gather information on running containers (docker)

SELECT pid, cmdline FROM docker_container_processes WHERE id = '$container_id';
show processes running in the container that matches the id

Using Math to Calculate Disk Space

A slightly more complicated query calculates the free space on a partition.

osquery> SELECT path, ROUND( (10e-10 * blocks_available * blocks_size), 1) AS gb_free, 100 - ROUND ((blocks_available * 1.0 / blocks * 1.0) * 100, 1) AS percent_used, device, type FROM mounts WHERE path = '/';
+------+---------+--------------+-----------+------+
| path | gb_free | percent_used | device    | type |
+------+---------+--------------+-----------+------+
| /    | 39.8    | 4.3          | /dev/sda1 | ext4 |
+------+---------+--------------+-----------+------+
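
To make the arithmetic easier to follow, here is a small sketch that runs the same expressions with Python's sqlite3 module against a mock mounts table. The block counts are hypothetical values chosen for illustration. The factor 10e-10 is simply 1e-9, which converts bytes into (decimal) gigabytes.

```python
import sqlite3

# Mock of the osquery `mounts` table with hypothetical block counts.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE mounts (path TEXT, device TEXT, type TEXT, "
    "blocks_size INTEGER, blocks INTEGER, blocks_available INTEGER)"
)
db.execute(
    "INSERT INTO mounts VALUES ('/', '/dev/sda1', 'ext4', 4096, 10000000, 9570000)"
)

QUERY = """
SELECT path,
       ROUND(10e-10 * blocks_available * blocks_size, 1) AS gb_free,
       100 - ROUND((blocks_available * 1.0 / blocks) * 100, 1) AS percent_used
FROM mounts WHERE path = '/'
"""
path, gb_free, percent_used = db.execute(QUERY).fetchone()

# 9,570,000 blocks * 4096 bytes = 39,198,720,000 bytes, about 39.2 GB free.
# 95.7% of the blocks are available, so about 4.3% is used.
print(path, gb_free, round(percent_used, 1))
```

Multiplying by 1.0 forces floating point division; without it, the integer division of blocks_available by blocks would truncate to 0.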

JOIN Example Showing LISTENING services with Executable Path

An example that shows the value of a SQL JOIN statement combining data from two tables.

osquery> SELECT p.path, local_port FROM process_open_sockets s JOIN processes p ON s.pid = p.pid WHERE s.state = 'LISTEN';
+-----------------------------------+------------+
| path                              | local_port |
+-----------------------------------+------------+
| /usr/lib/systemd/systemd-resolved | 53         |
| /usr/sbin/sshd                    | 22         |
| /usr/bin/nc.openbsd               | 4000       |
| /usr/sbin/sshd                    | 22         |
+-----------------------------------+------------+
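
The output above lists sshd twice, most likely one row per listening socket (for example IPv4 and IPv6). As a sketch of how the JOIN behaves, the following runs a similar query with Python's sqlite3 module on two mock tables. The pids and rows are made up for illustration, and DISTINCT is added here to collapse the duplicate rows.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE processes (pid INTEGER, path TEXT);
CREATE TABLE process_open_sockets (pid INTEGER, local_port INTEGER, state TEXT);
INSERT INTO processes VALUES (610, '/usr/sbin/sshd'), (2041, '/usr/bin/nc.openbsd');
INSERT INTO process_open_sockets VALUES
  (610, 22, 'LISTEN'),        -- first listener
  (610, 22, 'LISTEN'),        -- second listener, same pid and port
  (2041, 4000, 'LISTEN'),
  (610, 22, 'ESTABLISHED');   -- filtered out by the WHERE clause
""")

# Each socket row is matched to the process row with the same pid.
rows = db.execute("""
SELECT DISTINCT p.path, s.local_port
FROM process_open_sockets s JOIN processes p ON s.pid = p.pid
WHERE s.state = 'LISTEN'
ORDER BY s.local_port
""").fetchall()
print(rows)
```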

Hardware Monitoring

Rather than digging through log files (or the Windows Registry), you can use osquery to monitor for hardware changes.

This is particularly important for high security environments (classified networks), or for IT departments that just want to know when someone plugs in a malware-ridden USB device.

osquery> select driver,vendor,model from hardware_events;
+-------------+-------------------+---------------------------+
| driver      | vendor            | model                     |
+-------------+-------------------+---------------------------+
| usb         | Lexar Media, Inc. | LJDTT16G [JumpDrive 16GB] |
| usb-storage | Lexar Media, Inc. | LJDTT16G [JumpDrive 16GB] |
+-------------+-------------------+---------------------------+

Another table of interest is usb_devices:

osquery> select usb_port, vendor, model, serial from usb_devices;
+----------+-------------------+---------------------------+------------------+
| usb_port | vendor            | model                     | serial           |
+----------+-------------------+---------------------------+------------------+
| 1        | Linux Foundation  | 1.1 root hub              | 0000:00:06.0     |
| 2        | Lexar Media, Inc. | LJDTT16G [JumpDrive 16GB] | AAXNSQBA0WN23C34 |
+----------+-------------------+---------------------------+------------------+

Query these tables on a schedule to know when users plug in a USB drive, either for immediate alerting to the SOC or for historical purposes during incident handling.
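
The idea of comparing scheduled results can be sketched in a few lines of Python. This is only an illustration of the comparison, not how osquery implements it; the function name new_devices is hypothetical and the rows reuse the values from the usb_devices output above.

```python
def new_devices(previous, current):
    """Return rows present in `current` but not in `previous`, keyed by serial."""
    seen = {row["serial"] for row in previous}
    return [row for row in current if row["serial"] not in seen]

# Two snapshots of usb_devices rows, before and after a drive is plugged in.
before = [
    {"vendor": "Linux Foundation", "model": "1.1 root hub", "serial": "0000:00:06.0"},
]
after = before + [
    {"vendor": "Lexar Media, Inc.", "model": "LJDTT16G [JumpDrive 16GB]",
     "serial": "AAXNSQBA0WN23C34"},
]

for dev in new_devices(before, after):
    print("new USB device:", dev["vendor"], dev["model"])
```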

osquery Configuration

Getting osquery working optimally requires an understanding of the configuration options (/etc/osquery/osquery.conf) as well as the runtime flags (/etc/osquery/osquery.flags).

The flags file is a convenient way to control runtime parameters as there can be quite a few required.

During initial testing the flags that you will want to pay attention to are those that control the logging and events.

Events vs Scheduled SQL Statement

Most of the table data is generated when an SQL statement requests data. Events are used to populate real time audit data such as process execution, network auditing, and filesystem changes (file integrity monitoring). Without the event (audit) option, a network or process event that occurred between two scheduled SQL queries may be missed.

By default, the event (pubsub) framework is disabled. Depending on the host configuration, other process auditing (auditd) may also be in use. Note that auditd and the osquery auditing cannot be used at the same time (see eBPF as an alternative).

While auditing is very helpful to capture activity, it can introduce CPU overhead and will increase the volume of logs generated by osquery. Be sure to test any configuration before deploying to production.

In this example, we get a warning when attempting to query an events table while events are disabled.

osquery> select * from socket_events;
W0809 06:38:53.354483  5130 virtual_table.cpp:969] Table socket_events is event-based but events are disabled
W0809 06:38:53.354588  5130 virtual_table.cpp:976] Please see the table documentation: https://osquery.io/schema/#socket_events

For this query to work we need to either pass parameters to the command line of osqueryi as shown below, or we can set the parameters in the /etc/osquery/osquery.flags file.

~$ osqueryi --audit_allow_config=true --audit_allow_sockets=true --audit_persist=true --disable_audit=false --events_expiry=1 --events_max=50000 --disable_events=false
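
The flags file holds one flag per line. As a minimal sketch, the command line above can be converted into flags file content with a few lines of Python. The helper name to_flagfile is hypothetical, and the output would still need to be written to /etc/osquery/osquery.flags by hand.

```python
import shlex

COMMAND = (
    "osqueryi --audit_allow_config=true --audit_allow_sockets=true "
    "--audit_persist=true --disable_audit=false --events_expiry=1 "
    "--events_max=50000 --disable_events=false"
)

def to_flagfile(command):
    """Return flags file text: every --flag=value argument on its own line."""
    flags = [arg for arg in shlex.split(command) if arg.startswith("--")]
    return "\n".join(flags) + "\n"

print(to_flagfile(COMMAND), end="")
```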

eBPF and osquery

eBPF is a newer alternative way to capture the auditing data on Linux systems (available since osquery 4.6.0). It uses kernel functionality (eBPF) to capture process, socket, and other types of events.

There is a great YouTube presentation that covers the technical details of how eBPF and osquery work.

To use eBPF the kernel will need to be 4.18 or newer.
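
A quick way to check the kernel requirement from a script is to compare the major and minor numbers of the release string. The following is a minimal sketch; the function name kernel_at_least is hypothetical and the release strings are examples only.

```python
import re

def kernel_at_least(release, minimum=(4, 18)):
    """True if a release string such as '5.4.0-80-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum

for release in ("5.4.0-80-generic", "4.15.0-20-generic"):
    print(release, kernel_at_least(release))
```

On a live Linux host, the string returned by platform.release() (the same value as uname -r) could be passed in instead of the examples.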

With eBPF enabled we will have access to tables bpf_process_events and bpf_socket_events that are equivalent to the standard process_events and socket_events tables.

Enabling eBPF for osquery on Linux requires the following flags:

--disable_events=false --enable_bpf_events=true

Container Monitoring

A further advantage when using eBPF rather than the audit subsystem is greater visibility into containers and management systems including both Docker and Kubernetes.

Testing eBPF & osquery on Ubuntu 20.04

When first testing eBPF, you will want to ensure it works on a test system. Using osqueryi is a great way to try things out.

~$ sudo osqueryi --disable_events=false --enable_bpf_events=true --verbose

If you run this on a default Ubuntu 20.04 install, you may hit the following error:

I0819 00:01:57.169797 86613 bpfeventpublisher.cpp:297] Failed to load the BPF probe for syscall __x64_sys_execve: The 'enter' program could not be loaded: Failed to open the Linux kernel version header: /usr/include/linux/version.h
I0819 00:01:57.169965 86613 eventfactory.cpp:156] Event publisher not enabled: BPFEventPublisher: Failed to create the function tracer: The 'enter' program could not be loaded: Failed to open the Linux kernel version header: /usr/include/linux/version.h

This error is only shown when running with the --verbose flag. Without it, the bpf_process_events table will simply be empty.

Installing the linux-libc-dev package resolves the issue, as it includes the missing version.h file.

~$ sudo apt install linux-libc-dev

The output from osqueryi will now show:

I0819 00:14:05.886425 88447 eventfactory.cpp:390] Starting event publisher run loop: BPFEventPublisher

After a period of time, or after a command has been executed on the host, events appear in the table.

osquery> select uid,cmdline,duration,ntime from bpf_process_events;
+-----+-----------------+----------+----------------+
| uid | cmdline         | duration | ntime          |
+-----+-----------------+----------+----------------+
| 0   | cat /etc/passwd | 1014624  | 84133214411451 |
+-----+-----------------+----------+----------------+

Another potential error is if the osqueryi process does not have permission to access the kernel memory space.

Event publisher not enabled: BPFEventPublisher: Failed to setup the memory lock limits. The BPF tables may not work correctly.

Ensure you have used sudo when executing osqueryi.

Download an example configuration for Linux servers from GitHub. Based on the Palantir Linux server configuration, it has been modified for use with eBPF events and tables.

osquery daemon

Running osquery as a service allows ongoing recording of data points using scheduled queries and event collection (process execution / network sockets).

By default under Linux, the daemon loads the flags file and configuration file from their default locations. Typically, the configuration file then includes the location of additional query packs.

Following are two example queries that could be included in the configuration file:

{
  "schedule": {
    "users_snapshot": {
      "query": "SELECT * FROM users;",
      "description": "Returns full list of users on the system.",
      "interval": 86400,
      "snapshot": true
    }
  }
}

Notice the snapshot key. It tells osquery to log the full result set each time the query runs. The interval is 86400 seconds (a daily snapshot).

{
  "schedule": {
    "users_differential": {
      "query": "SELECT * FROM users;",
      "description": "List any new users or changes in the users table.",
      "interval": 3600
    }
  }
}

Without the snapshot key, the query runs in differential mode: each run is compared with the previous results and only the changes are logged. The interval is 3600 seconds (hourly).
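
Hand-edited JSON is easy to break with a stray comma or a missing brace. As a sketch, both queries can be generated from Python and serialised with the json module, which always produces valid JSON. This assumes the top-level schedule key used by osquery configuration files; the helper name scheduled is hypothetical.

```python
import json

def scheduled(query, description, interval, snapshot=False):
    """Build one scheduled query entry; `snapshot` is only emitted when True."""
    entry = {"query": query, "description": description, "interval": interval}
    if snapshot:
        entry["snapshot"] = True
    return entry

config = {
    "schedule": {
        "users_snapshot": scheduled(
            "SELECT * FROM users;",
            "Returns full list of users on the system.",
            86400, snapshot=True),
        "users_differential": scheduled(
            "SELECT * FROM users;",
            "List any new users or changes in the users table.",
            3600),
    }
}

text = json.dumps(config, indent=2)
json.loads(text)  # round-trip to confirm the output parses as JSON
print(text)
```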

The osquery daemon runs the scheduled queries, logging locally to /var/log/osquery/osqueryd.results.log (with the default filesystem logger) or to whatever logging plugins are configured.

Configuring osqueryd for a quick start

With a default (example) configuration we see a number of errors about the Event publisher not being enabled.

I0810 05:14:07.526832 278165 eventfactory.cpp:156] Event publisher not enabled: BPFEventPublisher: Publisher disabled via configuration
I0810 05:14:07.527535 278165 eventfactory.cpp:156] Event publisher not enabled: auditeventpublisher: Publisher disabled via configuration
I0810 05:14:07.527607 278165 eventfactory.cpp:156] Event publisher not enabled: inotify: Publisher disabled via configuration
I0810 05:14:07.527662 278165 eventfactory.cpp:156] Event publisher not enabled: syslog: Publisher disabled via configuration

A better option for getting up and running is to use a working example configuration from Palantir. They have published a solid Linux Server configuration that includes an osquery.flags and osquery.conf file.

Palantir Github https://github.com/palantir/osquery-configuration/tree/master/Classic/Servers/Linux

Put these files in /etc/osquery/ and change the location of the ossec-rootkit pack in the osquery.conf file to the one at /usr/share/osquery/packs/.

Restart osqueryd and you will start getting logs. This configuration enables process monitoring, socket events and a number of other useful monitoring queries.

It is a great starting point. There are also configurations here for both Windows endpoints and macOS, but we have primarily tested and deployed on Linux servers.

osquery Logging

There are a number of logging plugins for osquery. The default plugin for the daemon is the filesystem logger. Logging for osquery is based on delivering a JSON log entry per query, making the logs easy to parse, ship or process with any logging processor or platform.

It does not matter which logging platform you use, whether it's Splunk or another commercial option, or an open source solution such as Elastic Stack or Graylog. The fact that osquery outputs simple JSON makes upstream processing straightforward and flexible.
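
To illustrate how little is needed to process these logs, here is a minimal Python sketch that filters differential results by query name and action. The sample lines only approximate the shape of osquery result entries; real entries carry more fields, and the values here are made up.

```python
import json

# Illustrative log lines in the shape of osquery differential results.
SAMPLE_LOG = """\
{"name": "users_differential", "hostIdentifier": "ubuntu-focal", "unixTime": 1628572447, "columns": {"uid": "1002", "username": "mallory"}, "action": "added"}
{"name": "users_differential", "hostIdentifier": "ubuntu-focal", "unixTime": 1628576047, "columns": {"uid": "998", "username": "lxd"}, "action": "removed"}
"""

def changes(log_text, query_name, action):
    """Yield the `columns` dict of each entry matching a query name and action."""
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry = json.loads(line)
        if entry.get("name") == query_name and entry.get("action") == action:
            yield entry["columns"]

for row in changes(SAMPLE_LOG, "users_differential", "added"):
    print("user added:", row["username"])
```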

One logging pipeline example showing this flexibility:

[Figure: Example Logging Pipeline for osquery]

Post-processing or filtering of the logs could occur at the filebeat or logstash stages. Analysis using the MITRE ATT&CK framework or Sigma rules, for example, could then occur at Graylog.

The example pipeline could certainly be simplified depending on the infrastructure and requirements.

Centralized Management & Logging

There are a number of solutions for management of an osquery "fleet". The open source fleetdm is a fork of the Kolide platform. There are also Zentral and Uptycs as commercial offerings.

These all have a TLS endpoint that the osquery client connects to; both configuration and logging can then be controlled from the centralized platform.

osquery packs

osquery packs are sets of grouped queries that can be used for different use cases. A number of default packs are included:

~$ ls /usr/share/osquery/packs/
hardware-monitoring.conf  osquery-monitoring.conf  unwanted-chrome-extensions.conf  windows-hardening.conf
incident-response.conf    ossec-rootkit.conf       vuln-management.conf
it-compliance.conf        osx-attacks.conf         windows-attacks.conf

The packs cover different use cases and operating systems. They are included in the install by default but are not enabled in the default (example) configuration file /usr/share/osquery/osquery.example.conf. To start from that example, copy it into place:

~$ sudo cp /usr/share/osquery/osquery.example.conf /etc/osquery/osquery.conf

osquery is built to be very performant with low impact on the system. However, every query does require system resources, so there is an impact. Testing queries prior to production deployment is essential.

When creating queries, do not repeat yourself on the client. If you have process monitoring available through event logging, you do not need to query for malicious processes on the host; better to send those process event logs back to your SIEM and run specific queries on the centralised logs.

Third parties may release osquery packs allowing the sharing of queries within the community.

File Integrity Monitoring (FIM)

Another event-based auditing option is File Integrity Monitoring. The locations and files to be monitored must be specified in the configuration.

Enabling the File Integrity Monitoring requires the following flags for the file_events and process_file_events tables.

--enable_file_events=true --disable_audit=false

File Integrity Monitoring can be tested with osqueryi. During load with --verbose enabled, we see the file paths being monitored.

~$ sudo osqueryi --disable_events=false --enable_bpf_events=true --verbose --enable_file_events
<snip>
I0819 05:27:30.656767  1829 file_events.cpp:87] Added file event listener to: /usr/sbin/**
I0819 05:27:30.656881  1829 file_events.cpp:87] Added file event listener to: /usr/local/bin/**
I0819 05:27:30.656985  1829 file_events.cpp:87] Added file event listener to: /usr/local/sbin/**
I0819 05:27:30.657066  1829 file_events.cpp:87] Added file event listener to: /etc/hosts
<snip>

These paths are set in the osquery.conf file. In the following example, you can see that the file event was captured in the file_events table and reported by the query.

osquery> select target_path, category, action, atime, mtime from file_events;
+-------------+---------------+---------------------+------------+------------+
| target_path | category      | action              | atime      | mtime      |
+-------------+---------------+---------------------+------------+------------+
| /etc/hosts  | configuration | ATTRIBUTES_MODIFIED | 1629350896 | 1629350896 |
+-------------+---------------+---------------------+------------+------------+
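
The monitored paths come from the file_paths section of osquery.conf, where each category name maps to a list of paths. The sketch below builds such a section in Python and prints it as JSON. The category name binaries and the 300 second interval are assumptions for illustration; the configuration category and the paths are taken from the output above.

```python
import json

fim_config = {
    # Category name -> list of paths; % matches one level, %% recurses.
    "file_paths": {
        "binaries": ["/usr/sbin/%%", "/usr/local/bin/%%", "/usr/local/sbin/%%"],
        "configuration": ["/etc/hosts"],
    },
    # A scheduled query on file_events collects the buffered events.
    "schedule": {
        "file_events": {
            "query": "SELECT target_path, category, action, atime, mtime FROM file_events;",
            "interval": 300,
        }
    },
}

print(json.dumps(fim_config, indent=2))
```

The category name is what appears in the category column of file_events, so it is worth choosing names that will make sense in the logs.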

Augeas

Augeas is an interesting open-source project that is packaged with osquery. Enabled by default are a number of configuration file "lenses". These allow osquery to parse configuration files and show the status of parameters on the system. This is a very helpful tool for compliance monitoring across a fleet of systems.

The default lenses are located in /usr/share/osquery/lenses and can be reviewed to see what is possible.

osquery> SELECT label, value FROM augeas WHERE path = '/etc/ssh/sshd_config' AND label = 'PasswordAuthentication';
+------------------------+-------+
| label                  | value |
+------------------------+-------+
| PasswordAuthentication | yes   |
+------------------------+-------+
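
For compliance monitoring, rows like these can be compared against an expected policy. The following is a minimal sketch; the policy values and the function name violations are hypothetical, and the rows mimic the label and value columns returned by the augeas table.

```python
# Expected sshd_config settings; the policy itself is a hypothetical example.
POLICY = {"PasswordAuthentication": "no", "PermitRootLogin": "no"}

def violations(rows, policy):
    """Return (label, actual, expected) for each setting that differs from policy.
    Settings missing from `rows` are reported with actual=None."""
    actual = {row["label"]: row["value"] for row in rows}
    return [
        (label, actual.get(label), expected)
        for label, expected in policy.items()
        if actual.get(label) != expected
    ]

rows = [
    {"label": "PasswordAuthentication", "value": "yes"},
    {"label": "PermitRootLogin", "value": "no"},
]
print(violations(rows, POLICY))
```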

A key concept is that this information is being collected at the time of the query. For many use cases, the query will be run on a schedule with the results being compared to a previous result in order to identify changes in the system state (new user account, logins, new network connections).

Yara and osquery

YARA is a powerful malware and file scanning framework. It can be incorporated into an osquery configuration, allowing:
- automatic scanning when a file system change occurs (triggered from file_events)
- a yara table for on-demand YARA scanning.

Configuring YARA requires that the osquery.conf identifies the signatures to use and the file_paths to monitor.
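
As a sketch of how those two pieces relate, the following Python builds a yara section that maps signature groups to rule files and file_paths categories to signature groups, then checks that the names line up. The rule file path, the group name and the category name are hypothetical, and check_yara is an illustrative helper rather than part of osquery.

```python
import json

config = {
    "file_paths": {"system_binaries": ["/usr/bin/%", "/usr/sbin/%"]},
    "yara": {
        # Signature group name -> rule files (hypothetical path).
        "signatures": {"sig_group_1": ["/etc/osquery/yara/example.yar"]},
        # file_paths category -> signature groups to apply on file events.
        "file_paths": {"system_binaries": ["sig_group_1"]},
    },
}

def check_yara(cfg):
    """Return a list of problems: unknown categories or signature groups."""
    problems = []
    for category, groups in cfg["yara"]["file_paths"].items():
        if category not in cfg["file_paths"]:
            problems.append(f"unknown file_paths category: {category}")
        for group in groups:
            if group not in cfg["yara"]["signatures"]:
                problems.append(f"unknown signature group: {group}")
    return problems

print(check_yara(config) or "config is consistent")
print(json.dumps(config["yara"], indent=2))
```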

Conclusion

This tutorial provided a quick start guide for getting a usable osquery up and running. At the same time, we have covered the building blocks needed for a more complicated deployment.

There are significant benefits to be found with osquery whether you are looking to manage a fleet of servers, tens of thousands of workstations, or a handful of endpoints. Get in contact if you find this tutorial useful or have any feedback.

Work across the teams in your organisation to find advantages for more than security operations. DevOps and IT will love it. Increase productivity, security visibility, and inter-team communication, all with one deployment project.