Building atop Osquery. Compliance, monitoring, threat hunting and auditing.

16 min readJan 27, 2020

Osquery, a tool initially developed by Facebook. Lets you query your local machine just like a database. This flexibility to extract information from your assets using SQL leads to several use cases which I’d like to explore in this blog post. If you’re new to Osquery check out our blog post on “What is Osquery?”.

I gave this talk at Osquery Con 2019 in June last year, the video for which you can watch here.

The slides for this presentation are available here including the SQL queries used in the examples.

Remote management of Osquery

This article assumes you are making use of Osquery’s remote API to manage and query numerous assets running the Osquery daemon.

This article is written using Osquery 3.3.2.

Inventory management

Given Osquery’s broad (and ever-increasing) number of virtual tables, many of these tables used to build up a comprehensive view of your IT estate.

Information captured from systems using Osquery

System information

Osquery provides access to several tables relating to various aspects of system information. The following system queries will run across all platforms.

The system_info table will provide basic system information around the CPU and available memory. The system_info table also provides the system UUID to help uniquely identify the asset within your IT estate. The hardware_serial column will also provide the manufactures hardware identifier. A serial number in the case of Apple or Dell service tag.

SELECT uuid, hardware_serial, hostname, cpu_subtype, cpu_brand, physical_memory, hardware_vendor, hardware_model FROM system_info;

The os_version table provides operating system information and the current patch level. This is useful for identifying systems that may not be running the latest OS release.

SELECT name, version, build, platform FROM os_version;

Processes and process events

We published a previous post on this which covers collecting process information from the processes table and the corresponding application hashes as well as event information from process_events table. However, I want to cover a few neat tricks taking advantage of the SQLite engine used by OSquery.

WITH RECURSIVE rc(pid, parent, name) AS (
  SELECT pid, parent, name FROM processes WHERE pid = 55334
  UNION ALL
  SELECT p.pid, p.parent, p.name FROM processes AS p, rc
  WHERE p.pid = rc.parent
  AND p.pid != 0
)
SELECT pid, parent, name FROM rc LIMIT 20;

You can use query recursion to iterate through parent processes to identify a given processes execution path. The process table also includes a column called on_disk which might help identify suspicious processes spawned from memory. One thing to note with this query is that if the parent process has exited, you won’t be able to iterate through the process tree. The process_events table is helpful to capture every process that’s been spawned as the processes table is only a snapshot in time. Setting up the process_events table does require some additional command flags which we covered in the fore-mentioned blog post.

Also, just a note on Windows when querying or joining the groups or users table. Depending on how you’re Windows events are configured you may see lots of the following events especially if a low polling interval set:

4798: A user's local group membership was enumerated.

Installed Applications

Understanding whats installed across your IT estate is useful to identify licensing issues (Mixing installed applications with running processes you can also work out the amount of time spent per application), un-patched applications or applications that haven’t been authorized to be run.

Showing application information for Mac OS X

Windows Programs
Installed windows programs can be queried using the following query. When pooling these results the identifying_number will help identify unique versions of installed programs.

SELECT name, version, identifying_number FROM programs;

Or even find programs installed in non-standard Windows locations.

SELECT name, install_location FROM programs WHERE install_location NOT LIKE 'C:\Program Files%'

Linux packages
The query used to find Linux packages depends on the package manager being used. At the moment both RPM and DEB based package managers are supported.

SELECT name, release, sha1 FROM rpm_packages;

And for Debian.

SELECT name, version, source FROM deb_packages;

What’s cool about being able to retrieve the installed package details (including the hash sha1/sha256) is that you can tie this against the public RPM and DEB package managers. You can even use Osquery to check which repositories are in use using yum_sources for RPMs.

SELECT name, baseurl, enabled FROM yum_sources;

… and apt_sources for DPKGs.

SELECT name, base_url FROM apt_sources;

With this information, you can in fact query those external repositories to check for updates and use the vendor information to check for vulnerabilities.

We’ve have provided a public Github repository which maps packages to CVEs.

Mac OSX apps
Lastly, you can grab all your installed Mac applications with:

SELECT bundle_name FROM apps;

This table provides a last_opened_time column which again is useful for mapping application activity for licensing rather than having to tie process information together.

It’s also worth mentioning python_packages and npm_packages . If your company employs developers or runs server side applications making use of these technologies. Then mapping these tables against common package databases will yield package updates or known vulnerabilities for a given package to help improve your security hygiene.

Security information event monitoring (SIEM)

Osquery offers several tables for collecting events for monitoring system security. There are a lot of neurosciences for collecting system events with Osquery and certian command line flags are required. Though before we look at the avaliable event tables I want to quickly look at the last table for both Linux and Mac OS X. Which in most use cases will be sufficient.

Using the Osquery last table to gather user logins

Linux and Mac OS X login information with Osquery

The last table in Osquery provides login and logout events for Linux and Mac OS X systems whilst providing the process id and the connecting host.

The following query joins the processes table to get the entry process name showing all sessions with a host identifier.

SELECT username, p.name AS process,
DATETIME(time, 'unixepoch') AS datetime, host FROM last AS l
LEFT JOIN processes AS p ON p.pid = l.pid
WHERE host <> '' AND host NOT LIKE ':pts%'
ORDER BY time DESC;

You can modify this query further to grab only external IP addresses.

AND host NOT IN (':1', '127.0.0.1')

Using both the last table and data from the processes table you can capture information from both tables to construct a timeline of the commands being executed by the remote user. This is useful for endpoint detection and response (EDR).

System logs with Osquery

Osquery provides virtual tables for Linux syslogs syslog_events, Mac OS X Apple system logs asl and Windows event logs windows_events .

Osquery events

For the next set of tables, you’ll need to enable Osquery events with the command flag --disable_events=false . Osquery will dump logs into its local database. To prevent table bloat you can set the following flags,
--events_expiry=3600to expire events after a given time in seconds has passed and--events_max=50000to remove older events falling outside a given row limit. The flag--events_optimize=true can also be set as Osquery will create “removed” records for expired events as per Osquery’s diff engine. When a remote query is executed only “added” events will be returned.

Linux syslog

Osquery will read syslog events from a system pipe. Both rsyslog or syslog-ng can be configured to write to a system pipe in a format Osquery can read. Configuration steps can be found here. Once you’ve configured the syslog daemon you’ll need to set the additional command line flag --syslog_pipe_path=/etc/osquery/syslog_pipe . Once done you’ll be able to query the syslog_events table like so:

SELECT * FROM syslog_events LIMIT 5;

Apple system logs (ASL)

ASL does not require any additional configuration of Osquery. You can simply query your Apple system logs like so:

SELECT * FROM asl ORDER BY time LIMIT 5;

The ASL also provides the pid, uid and gid which you can use to JOIN against processes, users and groups respectively.

Windows event logs (WEL)

Windows event logging requires no additional system configuration. Just add --windows_event_channels=System,Application,Setup,Security to your Osquery flags file. Be aware that applying all these channels to a server either ingesting logs from other servers or an Active Domain controller will result in a huge amount of data being both written locally and sent back to your Osquery server.

You can of course filter the events returned using the Windows eventid or the channel the event was captured from. Returning login events around logon and logout events similar to the last table described earlier.

SELECT * FROM windows_events WHERE source IN ('Security') AND eventid IN (4625, 4634, 4647, 4672) LIMIT 5;

Network events

Osquery provides several tables to access network connections across Windows, Linux and Mac OS X. There are really cool uses for network event data, not least what you’d do with standard netflow logs. However, Osquery can provide a little more context such as the originating application for the network event which can be married against existing netflow logs or for EDR incidents as shown in the screenshot below.

As depicted above the following query will join open sockets with the originating process id. You can also join the users and groups from the process to provide attribution.

SELECT processes.pid, processes.name, remote_address, remote_port FROM process_open_sockets LEFT JOIN processes ON processes.pid = process_open_sockets.pid WHERE remote_address <> '' AND remote_address != '::' AND remote_address != '127.0.0.1' AND remote_address != '0.0.0.0' AND remote_port = 443 LIMIT 10;

The socket_events table will give you every CONNECT, BIND and CLOSE event on a Linux system.

SELECT action, pid, local_address, local_port, remote_address, remote_port, family, protocol, path, time AS timestamp 
FROM socket_events
WHERE remote_address <> "" AND remote_port != 0 AND pid > 0;

However, as socket_events is only provided for Linux you can construct the following query for process_open_sockets (which is universal) to mimic the socket_events table on Linux if you’re trying to maintain a consistent query across platforms.

SELECT DISTINCT "CONNECT" AS action, pid, local_address, local_port, remote_address, remote_port, family, protocol, path, NULL AS timestamp
FROM process_open_sockets
WHERE remote_address <> "" AND remote_port != 0 AND pid > 0;

The listening_ports table can be used to mimic the BIND action seen in the sockets table.

Geo-location with WiFi surveys

Ever wondered where your Mac OS X devices are? Well, Osquery has two really interesting tables for you. Firstly, the wifi_status table will let you see what WiFi networks a device is connected to. A router will actually broadcast its country_code which can be used to finding the rough location of a device. This data is useful for comparing this against Google G-suite logins or other single-sign-on (SSO) providers to verify authentication requests.

However, you can get even more detail through the wifi_survey table with the help of Google’s geolocate API. When Google drove around taking photos for street view, they also collected everyone’s Wifi BSSIDs along with the current GPS location of their car. Which means that if you have the bssid, rssi and noise information you can work out within a few meters where a particular device is via Google’s API. As the rssi and noise columns will help calculate the distance of a user from the discovered networks. The user doesn’t even have to be connected to a WiFi network for this to work.

Data loss protection (USB device monitoring)

A really cool application we built using Osquery was to monitor for USB devices being plugged into a machine and then monitoring for files being either copied to or from that USB device using File integrity monitoring (FIM). This application of Osquery can somewhat provide a basic data loss protection.

There are two parts to getting this to work and unfortunately, it only works on Linux and Mac OS X at the moment.

In this example, we’re going to be focusing solely on Mac OS X.

File integrity monitoring (FIM)

Firstly, file integrity monitoring (FIM) uses inotify (Linux) and FSEvents(Mac OS X) to monitor files and directories for changes. As files/directories are written, read and deleted events are created. Osquery will also provide (where appropriate) a hash of the file as well. Which can be used to identify sensitive files such as key-chains. Osquery can update its FIM configuration on the fly via the remote API. Wildcards for directories can also be used.

When a USB device if plugged into a Mac, its automatically mounted into the /Volumes/ directory. The following Osquery FIM configuration will create an event as files are created, modified and deleted within the /Volumes/<device> directory. An important note is that this configuration is done for simplicity. In some cases, secondary system volumes may be mounted within /Volumes/ leading to a large number of events being generated, just something to bear in mind.

# osq.conf
{
  "options": {
  },
  "file_paths": {
    "homes": [
      "/Volumes/%%"
    ]
  }
}

Right let us find some devices! Fire up the osqueryi command line like so:

sudo osqueryi --disable_audit=false --verbose --disable_events=false --config_path ./osq.conf

USB devices

There are two tables we need for this next part disk_events and mounts this will help us find newly inserted USB devices and most importantly their mount point so that we know what files are being copied to which device.

Insert your USB storage device. This query should yield a result showing the newly inserted device event.

SELECT action, DATETIME(time, 'unixepoch') AS datetime, vendor, mounts.path FROM disk_events LEFT JOIN mounts ON mounts.device = disk_events.device;

Now the path column is what we’re interested in. This is where the USB device has been mounted on your system. Now copy any file you like onto your USB storage device and then run the following query:

SELECT action, uid, SUBSTR(target_path, 18) AS path, SUBSTR(md5, 0, 8) AS hash, time FROM file_events WHERE sha1 <> '' AND target_path NOT LIKE '%DS_Store';

Voila! You should now see some file events. The query above uses the SBSTR SQLite function to shorten the result text but you can remove these to get the full file hash and path as need be. How cool is that? We’ve used this across Mac OS X to assure that sensitive files are not being copied to non-encrypted USB devices (which you can determine using Osquery by checking the volume type).

Compliance

Well done for getting this far and whilst compliance isn’t the sexiest of security topics Osquery does make compliance a breeze. In the screenshot below each cube represents a compliance check executed against a collection of systems. Red being zero or little compliance against the tested systems and green representing most or all systems comply.

Most of the queries below have been written to test part of the Center for information security (CIS) cybersecurity framework. Designed to provide peer-reviewed guidance best practice for system configurations. Zercurity provides compliance rules for over 20 frameworks including Windows, major Linux flavours and Mac OS X.

Screenshot showing compliance checks for several CIS frameworks

I’m going to cover a few of the common tables we use to check the configuration of systems. However, you can really use just about all of the virtual tables that Osquery provides to check the configuration or uses of a given system. Some tables are more performant than others.

For each of our queries ,we’ll return a single column named passed containing either a 1 or 0 indicating whether the rule passed or failed.

Windows registry

Osquery provides access to the Windows registry via the registry table. In Windows you can use the regedit.exe tool to inspect your systems configuration as well as check that executed queries are working correctly. This can also be used to check the vast majority of Group Policy objects (GPOs) within the Windows ecosystem.

You can also monitor for system registry changes to identify which registry keys have been modified in order to query for them later.

As an example, we’re going to take the “CIS 18.2.1 (L1) Ensure Local administrator password service ( LAPS) AdmPwd GPO Extension / CSE is installed”. This can be done by checking that the following registry key is present.

SELECT
  COUNT(*) AS passed
FROM
  registry
WHERE
  key = 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\GPExtensions\{D76B9641-3288-4f75-942D-087DE603E3EA}'
  AND name = 'DllName';

In the event that LAPS is not installed then no registry record will be found and thus no results will be returned. Which will result in the COUNT(*) AS passed returning 0. Or 1 in the event the registry key is present.

Mac OS X Plist

Osquery can parse Apple plist flies (used to hold both system and application configuration information) via the plist table. As an example, we’re going to take the “CIS 1.5 Enable system data files and security update installs” compliance check. Which checks to ensure critical updates are automatically installed.

SELECT
  CASE WHEN SUM(passed) = 2 THEN 1 ELSE 0 END AS passed
FROM (
  SELECT key, value AS passed
  FROM plist
  WHERE path = '/Library/Preferences/com.apple.SoftwareUpdate.plist'
  AND (
    key = 'ConfigDataInstall'
    OR key = 'CriticalUpdateInstall'
  )
);

This query checks for two keys within the com.apple.SoftwareUpdate.plist file. If the values for both options are 1 then auto-updates are enabled. Which means we can SUM the two values to get our passed value. Which will result in the CASE WHEN SUM(passed) = 2 THEN 1 ELSE 0 END AS passed returning 1 if both rules conditions are met.

Linux & Mac OS X System controls

Osquery can use system_controls provide access to sysctl names, values, and settings information. As an example, we’re going to create a compliance rule for the “CIS 3.1.1 Ensure IP forwarding is disabled” rule.

SELECT COUNT(*) AS passed 
FROM system_controls 
WHERE name = 'net.ipv4.ip_forward' 
AND current_value = 0 AND config_value = 0;

This query simply checks the system_controls table ensuring that the net.ipv4.ip_forward is both currently disabled and disabled in the config. Unlike the Mac OS X plist we’re only search for one key so the COUNT(*) will only return 1 if net.ipv4.ip_forward is disabled.

Augeas

One of Osquery’s lesser-known tables but very cool nevertheless, augeas will automatically parse known file types. In this example we’re going to check that the sshd_config file is configured to useProtocol 2 for SSH communication as per: “CIS 5.2.2 Ensure SSH Protocol is set to 2”

SELECT COUNT(*) AS passed FROM augeas WHERE path = '/etc/ssh/sshd_config' AND label = 'Protocol' AND value = '2';

In the event the Protocol 2 is present in the config file a single record will be returned.

Carves

There are some file formats that augeas cannot parse. Osquery carves table lets you fetch files from the system which you can then parse remotely. For example we can fetch all the config files from /etc/ . To make use of the carves table, you need to add a few other command-line arguments. We’ll cover this in more detail in a future blog post.

SELECT * FROM carves WHERE path LIKE '/etc/%' AND carve=1;

A few honourable mentions

As an example, of a few other Osquery tables you can use. The Mac OS X alf table lets you check the configuration of OS Xs application layer firewall (ALF) service details. The following query will ensure as per “CIS 2.6.4 Enable Firewall Stealth Mode” that stealth mode is enabled.

SELECT stealth_enabled AS passed FROM alf;

Osquery can also check the user_ssh_keys of you’re developers to ensure that all ssh keys are encrypted.

SELECT CASE WHEN COUNT(*) = SUM(encrypted) THEN 1 ELSE 0 END AS passed 
FROM user_ssh_keys;

Closing

Lastly, I just wanted to cover a few of the things that we’ve been building atop all of this data to help automate security operations. You can really build quite a comprehensive view of your systems and cybersecurity posture. Helping grade risks to your IT estate and group those risks to understand areas of improvement. At Zercurity the results we’re interpenetrating, can not only be weighted be also add a cost of implementation added to help identify quick wins when improving your cybersecurity maturity as well as monitoring that progress week on week to provide reporting.

Workflows & User-focused security (USF)

We’re also using workflows, heavily building off of AWS Lambda functions and AWS state machines to hook Osquery events. Triggering actions whether that be an email notification, JIRA ticket, slack message or any other third party integration. The possibilities here are endless. We’ve also been using Osquery to identify system users. Mapping this data against Active Directory and Google G-suite to push compliance and other system configuration alerts back onto users. Making them aware both of the risks to their system and corperate policies but also how to fix it. In the example above around USB devices. A user can be notified of a new device that’s been plugged in. They can then either acknowledge this new device as their own or escalate the ticket. We can also check to ensure its encrypted and if not provide instructions on how to correctly setup the device.

Automated recommendations and maturity tracking

We’ve also been working to have Zercurity automatically raise issues and track changes providing a complete audit trail. Moreover, Zercurity can find common issues across systems and raise this as a company-wide security goal which can be actioned based on the level of effort required. We’ve been providing these bi-weekly objectives to help companies improve their security posture whilst ensuring coverage of the major cybersecurity themes.

Conclusion

Osquery is the Swiss army knife for cybersecurity teams. There is a whole lot you can do. Osquery can perform as an EDR, Compliance tool, audit system and let you query machines to both preempt threats and find them. A very modern and powerful tool for IT administrations and security teams.

Thanks for reading.

Footnote

Zercurity is a cybersecurity platform aimed to help companies understand their cybersecurity posture and help them improve their cybersecurity hygiene proactively through remediation advice and user-focused security recommendations to find out more visit us at https://www.zercurity.com.