Logging course

Welcome and thanks for joining!

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC0 1.0)

What we will cover

How logging and other observability tools help us audit activity in IT systems.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Detect and investigate malicious activity.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Centrally collect and analyze log data from a wide range of sources.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Learn how logging helps us comply with rules and regulations.

© Course authors (CC BY-SA 4.0) - Image: © Pedro Mendes (CC BY-SA 2.0)

Dip our toes into benefits in other areas, such as cost savings and increased availability.

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Requires basic knowledge of...

  • OS and application management
  • Networking
  • The Linux shell
  • Docker and Docker Compose

You'll also need access to a Linux system
(any distribution), a web browser and an SSH client.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

How we will do it

  • Lectures and Q&A
  • Group presentations
  • Graded labs
  • Continuous reflection
  • Quizzes and scored tests
© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

For detailed notes, glossary, labs and similar, see:
t.menacit.se/log.zip.

These resources should be seen as a complement to an instructor-led course, not a replacement.

© Course authors (CC BY-SA 4.0)

Acknowledgements

Thanks to IT-Högskolan and Särimner for enabling development of the course.

Hats off to all FOSS developers and free culture contributors making it possible.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Free as in beer and speech

Is anything unclear? Got ideas for improvements? Don't fancy the images in the slides?

Create an issue or submit a pull request to
the repository on GitHub!

© Course authors (CC BY-SA 4.0) - Image: © George N (CC BY 2.0)

Let us dig in!

© Course authors (CC BY-SA 4.0) - Image: © Kevin Doncaster (CC BY 2.0)

Vocabulary and basics

© Course authors (CC BY-SA 4.0) - Image: © James Johnstone (CC BY 2.0)

Ship captains have kept logbooks
for hundreds of years.

© Course authors (CC BY-SA 4.0) - Image: © James Johnstone (CC BY 2.0)

The airplane "black box" is another good example.

© Course authors (CC BY-SA 4.0) - Image: © Forsaken Fotos (CC BY 2.0)

Why do we log?

  • Detection of malicious activity
  • Deterrence of bad behavior
  • Review and optimization
  • Legal/Compliance requirements
© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

What makes a good log entry/event?

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Operational logs

Enable us to understand what is
happening in a system.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Audit logs

Enable us to "reenact" events of interest.

© Course authors (CC BY-SA 4.0) - Image: © A Loves DC (CC BY 2.0)

When,
Who,
What,
Where and possibly
Why?

© Course authors (CC BY-SA 4.0) - Image: © A Loves DC (CC BY 2.0)

When?

Time and date when something occurred.

Can be used to create an event timeline across different applications and systems.

Finely tuned/synchronized clocks are vital,
solutions like the Network Time Protocol help.

© Course authors (CC BY-SA 4.0) - Image: © Yellowcloud (CC BY 2.0)

Who?

Which human/computer/application caused the event?

Preferably backed by strong authentication.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

What?

Explaining what activity caused the log entry to be created.

May communicate why the log event might be of interest.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

(from) Where?

Information related to the location of the event causer.

Name of room/building, GPS coordinates,
phone number, IP address, device identifier...

© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

Why?

What made the actor perform the action that generated the log event?

Provided by the event causer and/or sources helping reviewers to put the log entry in a context.

Many systems, such as electronic health records, require this to minimize/detect misuse.

© Course authors (CC BY-SA 4.0) - Image: © Pedro Mendes (CC BY-SA 2.0)

Preferably readable by man and machine alike!

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Torres (CC BY 4.0)

In practice, both may contain events of interest for security analysts.

Many systems don't differentiate between them.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

How do we implement logging?

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

"Inspection-based"

Behavior of non-cooperating applications/systems is observed by an external application.

Network traffic sniffing, raw database queries,
syscall interception, resource consumption...

Doesn't require (costly) changes to applications/systems.

© Course authors (CC BY-SA 4.0) - Image: © Johannes P1hde (CC BY 2.0)

Instrumented

Applications are responsible for producing log events when activity of interest occurs.

Less guesswork and more details/context compared to inspection-based logging.

Requires that the application provides trustworthy information.

Sometimes tricky/costly to implement.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Who is looking at the logs?

© Course authors (CC BY-SA 4.0) - Image: © Randy Adams (CC BY-SA 2.0)

Development and operations

Software developers commonly use debug logging to verify expected behavior and identify bugs.

Quality assurance teams use logs to identify performance regressions.

System administrators use logs to understand behavior of systems and implement optimizations.

© Course authors (CC BY-SA 4.0) - Image: © Todd Van Hoosear (CC BY-SA 2.0)

Security personnel

Analysts and threat hunters in
Security Operation Centers
monitor logs to detect malicious activity.

Incident Responders dig through historical logs to build timelines of threat actor actions.

(Likely your first job in the sector)

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Data analysts and scientists

Modern IT environments produce a lot of log data.

Companies employ specialists to find useful insights from logs.

Helps with A/B testing and understanding of user behavior.

Improve operations or sell to third parties.

© Course authors (CC BY-SA 4.0) - Image: © IAEA (CC BY 2.0)

How do we analyze the logs?

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

"Pattern-based"

Search for occurrence of strings/patterns that are of interest.

May be things like known error codes or
Indicators of Compromise.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Aggregation and correlation

Not all insights can be gained by simply looking for patterns in logs.

Some require counting patterns/field values and correlation between different logs/systems.

May be things like...

  • Common web server paths causing errors
  • Number of failed logins per username/source IP address
  • Approximated physical location of app users
© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

"Anomaly-based"

Surface log events that haven't been seen
before or those that contain information
which is "unusual".

Relies heavily on automated detection,
more about that later!

© Course authors (CC BY-SA 4.0) - Image: © Timothy J Toal (CC BY 4.0)

How can we make our logs more useful?

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC BY 2.0)

Normalization

Massage log events from different sources
to ensure that "field names", timestamps
and similar are formatted the same way.

Makes digging and correlation easier!

© Course authors (CC BY-SA 4.0) - Image: © Joel Rangsmo (CC BY-SA 4.0)
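
A minimal sketch using date - the input
timestamps below are made up, but show two
formats being massaged into RFC 3339 (UTC):

$ date -u --date "2023-11-01 18:47 CET" --rfc-3339=s

2023-11-01 17:47:00+00:00

$ date -u --date "Nov 1 18:47:00 2023 CET" --rfc-3339=s

2023-11-01 17:47:00+00:00
© Course authors (CC BY-SA 4.0)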

Enrichment

Automation may be used to extend log events with information that may aid in analysis.

Employee position/role, system owner/purpose,
geographic IP lookups, domain/address/file occurrence in IoC lists...

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)
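
As a small sketch, geographic IP lookups can be
performed with the geoiplookup tool (from the
"geoip-bin" package on Debian-likes) - the
address and result below are made up:

$ geoiplookup 198.51.100.7

GeoIP Country Edition: SE, Sweden
© Course authors (CC BY-SA 4.0)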

Visualization

Sometimes a picture says more than
a thousand words.

Usage of visual tools like graphs, charts and maps can help humans identify interesting events/patterns.

© Course authors (CC BY-SA 4.0) - Image: © ESA (CC BY-SA 3.0 IGO)

Where do we analyze logs?

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)

Local analysis

Most applications store logs in a database or simple text files.

Shell utilities and simple scripts can be used to gain important insights.

© Course authors (CC BY-SA 4.0) - Image: © Randy Adams (CC BY-SA 2.0)

Centralized analysis

Instead of storing/analyzing logs on the producing systems, do it centrally.

Standardized protocols and software agents can be used to collect/transfer logs over the network.

What are the benefits?

© Course authors (CC BY-SA 4.0) - Image: © Sbmeaper1 (CC0 1.0)

Ease correlation

Almost every network connected device can produce logs.

Manually checking everything is
time-consuming (see "expensive").

Some insights require a wider perspective than individual systems.

© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

Minimizes risk of tampering

Log data can be manipulated to hide malicious activity or implicate individuals.

Once hacked/modified, system logs can no longer be trusted.

Centralized logging enables us to preserve the information.

© Course authors (CC BY-SA 4.0) - Image: © Yellowcloud (CC BY 2.0)

Optimize performance and cost

Not all computers are optimized for log storage/analysis.

Logs can be stored on different storage mediums depending on needs.

Retention policies can be managed in one place.

© Course authors (CC BY-SA 4.0) - Image: © Kevin Dooley (CC BY 2.0)

Didn't you mention
machines looking at logs?

© Course authors (CC BY-SA 4.0) - Image: © Meddygarnet (CC BY 2.0)

Alerting

Scheduled searches can continuously look for known bad patterns in log events.

Once identified, automated actions can be taken or humans notified for manual analysis.

© Course authors (CC BY-SA 4.0) - Image: © Torkild Retvedt (CC BY-SA 2.0)
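
A naive cron-based sketch (paths, address and
mailx's "-E" flag for skipping empty messages
are assumptions - a real setup would keep track
of already-seen events):

$ crontab -l

*/5 * * * * tail -n 1000 /var/log/auth.log | grep "Failed password" | mail -E -s "Login alert" soc@example.com
© Course authors (CC BY-SA 4.0)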

Anomaly detection

Humans are quite good at identifying things out of the ordinary.

They have neither the time nor attention span to analyze the logs from modern IT environments.

Algorithms and Machine Learning can help us, especially with centralized logging.

© Course authors (CC BY-SA 4.0) - Image: © Mauricio Snap (CC BY 2.0)

Centralized logging services often serve as the basis for a...

Security
Information and
Event
Management system.

(Terms are often used interchangeably)

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

SIEMs commonly serve as a source for...

Security
Orchestration,
Automation and
Response systems.

(SIEM + semi-automated handling)

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC0 1.0)

Observability is not just logging.

Metrics are another example.

Typically not used for security purposes.

© Course authors (CC BY-SA 4.0) - Image: © Chris Dlugosz (CC BY 2.0)
$ curl http://server.example.com:9100/metrics | grep "errs_total"

# HELP node_network_receive_errs_total Network device statistic receive_errs.
# TYPE node_network_receive_errs_total counter
node_network_receive_errs_total{device="enp2s0"} 10185

# HELP node_network_transmit_errs_total Network device statistic transmit_errs.
# TYPE node_network_transmit_errs_total counter
node_network_transmit_errs_total{device="enp2s0"} 1249
© Course authors (CC BY-SA 4.0)

Sounds great - what's the catch?

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)

Logging overhead

Producing/storing/transferring log events requires CPU cycles, I/O operations, bandwidth, etc.

May require more expensive hardware and impact user experience/performance.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Storage and processing costs

We're tempted to log everything - it may contain useful information.

Storing and processing these ain't free.

Many logging systems use volume-based licensing.

© Course authors (CC BY-SA 4.0) - Image: © Scott Merrill (CC BY-SA 2.0)

Collection of sensitive data

Logs may contain
Personally Identifiable Information,
credentials and other sensitive information.

That may be of interest to malicious actors,
especially if centralized.

Anonymization/Pseudonymization
may help, but are not a panacea.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Legal/Compliance challenges

While some laws/compliance frameworks require us to log activity, others prevent it.

Some examples are...

  • Banking/Attorney privacy protection
  • EU employee monitoring laws
  • Storage of credit card numbers under PCI DSS
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Cost of analysis

Someone needs to analyze/act on the collected data/alerts.

Expensive and hard to recruit, especially for 24/7 operations.

Even when using managed cloud services,
some analysis is likely required due to the
shared responsibility model.

© Course authors (CC BY-SA 4.0) - Image: © Jeena Paradies (CC BY 2.0)

We'll dig more into these topics during the course.

Our primary focus will be security related logging.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Group exercise

Putting knowledge to use

© Course authors (CC BY-SA 4.0) - Image: © Tero Karppinen (CC BY 2.0)

Exercise: Sell 'em logging

Participants are split into four or more groups.

Each group will be assigned an example organization that they'll pitch an aspect
of logging to.

Try involving as many use-cases as possible -
add challenges/needs to the scenario if it helps.

After presentation, send slides to
courses+log_010201@0x00.lt

© Course authors (CC BY-SA 4.0) - Image: © Tero Karppinen (CC BY 2.0)

Org. 1: Xample Bank & Finance

Company providing banking and payment services - both to consumers and B2B.

Pitch security and audit logging.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Org. 2: Examplezon Inc.

Company providing a web shop for all kinds of physical and virtual goods.

Pitch log analysis for UX improvements and monetary gains.

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Org. 3: Exemplum Medical

Company that conducts research and manufacturing of medical devices/medicine.

Pitch security, audit and operational logging.

© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Org. 4: Examplx Web Services

Company providing cloud infrastructure and applications as a service.

Pitch operational logging.

© Course authors (CC BY-SA 4.0) - Image: © OLCF at ORNL (CC BY 2.0)

Reflections exercise

What have we learned so far?

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

Answer the following questions

  • What are your most important takeaways?
  • Did you have any "Ahaaa!"-moments?
  • Was anything unclear or were there specifics you didn't understand?

courses+log_010301@0x00.lt

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

Basics recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Sergio Delgado (CC BY 2.0)

Logging helps us...

  • Debug and optimize systems
  • Understand user behavior and adapt UX
  • Find another venue for revenue
  • Comply with rules and regulation
  • Detect/Investigate malicious activity
© Course authors (CC BY-SA 4.0) - Image: © Raphaël Vinot (CC BY 2.0)

Operational logs

Enable us to understand what is
happening in a system.

Audit logs

When, who, what, where
and (possibly) why?

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Both may contain security-related events!

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Centralized logging services often serve as the basis for a...

Security
Information and
Event
Management system.

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Benefits of centralized logging

  • Ease correlation
  • Minimizes risk of tampering
  • Optimize performance and cost
© Course authors (CC BY-SA 4.0) - Image: © Jan Hrdina (CC BY-SA 2.0)

Beware of...

  • Performance overhead
  • Cost of processing, storage and analysis
  • Logging of sensitive information
  • Legal/Compliance challenges
© Course authors (CC BY-SA 4.0) - Image: © Rick Harris (CC BY-SA 2.0)

Let's move on,
we got a lot to cover!

© Course authors (CC BY-SA 4.0) - Image: © Yves Sorge (CC BY-SA 2.0)

Time and clocks

A not so scary introduction

© Course authors (CC BY-SA 4.0) - Image: © Kenny Cole (CC BY 2.0)

IT systems rely on time and clocks
for a wide variety of important tasks.

Authentication protocols, banking applications, industrial control systems...

Allows us to correlate events/activity in different computers and the real world.

© Course authors (CC BY-SA 4.0) - Image: © Kenny Cole (CC BY 2.0)

What kind of time?

Wall time / Real time.

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Keeping it simple

Most computers count the number of seconds
elapsed since the first of January 1970 (UTC).

Commonly called "UNIX time"/"Epoch".

Converted into local time/calendar date
by OS/applications.

(Await the horrors of 2038!)

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)
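
Let's illustrate with the date command -
"%s" prints a timestamp as UNIX time and
an "@" prefix converts one back:

$ date -u --date "2023-11-03 09:50:41 UTC" +%s

1699005041

$ date -u --date "@1699005041"

Fri Nov  3 09:50:41 AM UTC 2023
© Course authors (CC BY-SA 4.0)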

What is a second anyway?

Something something the sun and moon.

In the late 1800s, physicists tried
to properly define a second.

Atomic clocks measure the resonant
frequency of atoms very precisely,
and ain't that expensive these days.

Since 1967, BIPM defines it as
~9 billion frequency transitions of
Cesium 133 at -273 Celsius.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Sounds quite straightforward, doesn't it?

You're not getting away that easily.

Let's talk about time zones and dates...

© Course authors (CC BY-SA 4.0) - Image: © Eric Savage (CC BY-SA 2.0)

Time zones

You wanna eat lunch around 12, right?

Not straight lines, quite a lot of politics involved.

Important to keep track of if we're operating internationally.

© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Daylight savings

Many of us love a bit of sun,
but hate being confused.

Not everyone changes at the same time.

Many plan to get rid of it, few have succeeded.

...and some of those who've done it
did it in a very annoying way.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Let's make it more exciting!

Some time zones differ by
30 or 45 minutes.

(Some places don't even want
24 hour days!)

© Course authors (CC BY-SA 4.0) - Image: © Kenny Cole (CC BY 2.0)

Why not throw in
leap years and leap seconds?

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

These are not static things and can
change (back and forth) over time.

Not just the Gregorian calendar.

Must be remembered when performing time calculations.

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Is all hope lost?

Are we doomed to live in a confusing time warp?

Could any somewhat sane person wrap their head around this?

© Course authors (CC BY-SA 4.0) - Image: © NASA (CC BY 2.0)

Let's meet
Arthur David Olson and Paul Eggert.

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Torres (CC BY 4.0)

tz database

Dataset and reference code for working with international calendar time.

Continuously updated for an ever-changing world.

Maintained by IANA since 2011.

© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

Challenges solved?

© Course authors (CC BY-SA 4.0) - Image: © Joel Rangsmo (CC BY-SA 4.0)

Time/Date representation

Many different formats exist for dates and timestamps.

Which part is the year, month and day?
What time zone are we talking about?

Some are more/less readable
by humans and machines alike,
like RFC 3339 and ISO 8601.

(please use one of these!)

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Okay okay -
Time is messy but important, we get it!

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

The two challenges

  1. All clocks show the same time
  2. All clocks show the right time
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

In theory, if we solve the second
we should automatically solve the first.

In practice, this is tricky - just trust me for now.

Let's start with the first problem...

© Course authors (CC BY-SA 4.0) - Image: © Jan Bommes (CC BY 2.0)

NTP

Network Time Protocol.

Standard for clock synchronization.
Actively developed since the 1980s.

Replicates time over UDP port 123.
Uses a bag of tricks to calculate
and adjust for network delay.

Mitigates clock drift/skew.

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Example clients/servers

  • ntpd
  • NTPsec
  • OpenNTPD
  • chrony
  • systemd-timesyncd

Some only implement Simple NTP.

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)
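
If chrony happens to be your client, checking how
synchronization is going is a one-liner
(output abbreviated and illustrative):

$ chronyc tracking

Reference ID    : C23ACA14 (gbg1.ntp.se)
Stratum         : 2
System time     : 0.000081733 seconds slow of NTP time
[...]
© Course authors (CC BY-SA 4.0)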

Weaknesses

Plain-text protocol* vulnerable
to Man-In-The-Middle attacks.

Precision typically limited
to milliseconds.

© Course authors (CC BY-SA 4.0) - Image: © Nikki Tysoe (CC BY 2.0)

NTS

Network Time Security.

Uses TLS and PKI to exchange key
for symmetric authenticated encryption.

Extension to NTP, like HTTPS for HTTP.

Limited software support and a bit more
resource intensive than plain NTP.

© Course authors (CC BY-SA 4.0) - Image: © Christian Siedler (CC BY-SA 2.0)
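
chrony (version 4.0 or later) is one of the
clients supporting NTS - a minimal sketch of
"/etc/chrony/chrony.conf" pointing at Cloudflare's
NTS-enabled service:

# Authenticate the time source using NTS
server time.cloudflare.com iburst nts
© Course authors (CC BY-SA 4.0)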

PTP

Precision Time Protocol.

Version 2 can synchronize clocks
with ~nanosecond precision.

Enabled by special handling in
Network Interface Cards
and Operating Systems.

© Course authors (CC BY-SA 4.0) - Image: © Carl Davies (CSIRO) (CC BY 3.0)

Our clocks are in sync!

Let's focus on the second problem...

© Course authors (CC BY-SA 4.0) - Image: © Andrew Hart (CC BY-SA 2.0)

What's the correct time?

In the basement of BIPM,
atomic clocks tick to define...

Universal
Time
Coordinated.

(not a time zone, but ~matches
GMT except no daylight savings)

© Course authors (CC BY-SA 4.0) - Image: © Warren LeMay (CC BY-SA 2.0)

How does my time server know
what the correct time is?

Ask another one perhaps?

© Course authors (CC BY-SA 4.0) - Image: © Helsinki Hacklab (CC BY 2.0)

Getting reference time

  • Dedicated signaling cable
  • Radio broadcast
  • Satellite navigation system (GNSS, like GPS)
  • Locally connected atomic clock
© Course authors (CC BY-SA 4.0) - Image: © Freed eXplorer (CC BY 2.0)

Clocks break, radio communication can be spoofed/jammed and NTP peers may lie.

What's the solution?

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Use multiple sources and calculate an average!

© Course authors (CC BY-SA 4.0) - Image: © Charles Hoisington, GSFC (CC BY 2.0)

Kool - let's grab some time!

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Using pool.ntp.org

Used as default by many operating systems
and IoT appliances.

Run by volunteers, anyone* can join and contribute!

Region specific aliases, like "se.pool.ntp.org", can be used in attempts to find servers nearby.

© Course authors (CC BY-SA 4.0) - Image: © John K. Thorne (CC0 1.0)
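
A minimal chrony sketch for the Swedish pool zone
(the "maxsources" value is just an example):

# /etc/chrony/chrony.conf
pool se.pool.ntp.org iburst maxsources 4
© Course authors (CC BY-SA 4.0)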

Cloudflare and NIST provide
good alternatives/complements.

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Using ntp.se

Also known as the
Swedish Distributed Time Service.

Funded by PTS and operated by Netnod.

Provides highly accurate time via
Anycast from several redundant sites
spread over Sweden.

Relies on an open-source FPGA-based
solution for NTP and NTS. Also offers PTP.

© Course authors (CC BY-SA 4.0) - Image: © Bengt Nyman (CC BY 2.0)

Wanna geek out on time?

Join the annual
Netnod Tech Meeting!

© Course authors (CC BY-SA 4.0) - Image: © Jesse James (CC BY 2.0)

Questions and/or thoughts?

© Course authors (CC BY-SA 4.0) - Image: © Wonderlane (CC BY 2.0)

Network traffic logging

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Capturing/Sniffing of network traffic
allows us to implement
inspection-based logging.

Freely available tools like
tcpdump and Wireshark/TShark are
easy to use on your own computer/server.

© Course authors (CC BY-SA 4.0) - Image: © Freed eXplorer (CC BY 2.0)
$ tcpdump -v -n -A -i "enp2s0" -- port 80              

tcpdump: listening on enp2s0,
link-type EN10MB (Ethernet),
snapshot length 262144 bytes          

[...]
11:16:57.522596 IP (tos 0x0, ttl 64, id 33090,
offset 0, flags [DF], proto TCP (6), length 127)
198.18.100.3.45924 > 93.184.216.34.80: Flags [P.],
[...]
  GET / HTTP/1.1
  Host: example.com
  User-Agent: curl/7.81.0 
© Course authors (CC BY-SA 4.0)

Most enterprise-grade switches support
configuration of a "span/mirror/tap" port.

Just connect a computer and start sniffing.

(Works even if other hosts are owned!)

© Course authors (CC BY-SA 4.0) - Image: © Dave Herholz (CC BY-SA 2.0)

Public-cloud features like
Azure Virtual Network TAP and
AWS Traffic Mirroring
provide similar capabilities.

© Course authors (CC BY-SA 4.0) - Image: © Jon Evans (CC BY 2.0)

Enables fairly low-friction
logging implementations!

© Course authors (CC BY-SA 4.0) - Image: © Oklahoma National Guard (CC BY 2.0)

Sounds good! What's the problem?

© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

Resource intensive

Requires many CPU cycles and lots of storage I/O operations to handle gigabits of traffic.

You'll also need quite a bit of disk space to store all that data.

(~43GB for each hour of 100Mbit/s)

© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)

Prevalence of encryption

These days, most interesting network traffic is encrypted.

Interception boxes exist to decrypt, inspect and re-encrypt data streams.

These interceptors require tricky and risky configuration changes on all networked systems.

© Course authors (CC BY-SA 4.0) - Image: © Mario Hoppmann (CC BY 2.0)

Full network capture and storage may
be reasonable in highly sensitive
environments with limited traffic.

Partly alleviate the costs by using
solutions like tc and (e)BPF filters
to minimize processing/storage.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)
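
As a sketch, tcpdump's filter expressions are
compiled to BPF and applied before anything hits
disk - here (interface name assumed) only DNS
traffic is captured to file:

$ tcpdump -i enp2s0 -w dns_only.pcap -- udp port 53
© Course authors (CC BY-SA 4.0)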

Any alternatives?

© Course authors (CC BY-SA 4.0) - Image: © Michael Garlick (CC BY-SA 2.0)

NIDS

Network Intrusion Detection System.

Looks for suspicious network traffic using IoCs.

Functionality typically provided by
enterprise-grade firewalls, dedicated appliances
and open-source software (Snort, Suricata,
Zeek/Bro IDS, etc.)

Doesn't require storage of (all) traffic.

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

IPS

Often extended to act as an
Intrusion Prevention System.

Don't just detect attacks, block them.

Sounds great, but introduces some
availability risks.

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Shared problems

Requires lots of computing resources.

Limited by wide-spread use of encryption.

© Course authors (CC BY-SA 4.0) - Image: © Pumpkinmook (CC BY 2.0)

Mayhaps we don't need to store all traffic,
but only metadata?

Let me introduce
Network flow logging.

© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

The basics

Limit collection and storage to information
about network communication and not its content.

Many solutions define a flow as the
same peers, protocol and
(when applicable) port.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Argus

Highly configurable open-source software.

Only requires a span/mirror/tap port and
commodity server hardware.

© Course authors (CC BY-SA 4.0) - Image: © Helsinki Hacklab (CC BY 2.0)

NetFlow

Proprietary protocol developed by Cisco
for traffic metadata logging.

Typically implemented in hardware,
resulting in low overhead.

Routers logging traffic (called "exporters")
send flow information to a "collector"
over the network.

May optionally use sampling to minimize
transfer/processing/storage costs.

© Course authors (CC BY-SA 4.0) - Image: © NASA/Bill Stafford (CC BY 2.0)

IPFIX

Formally/Openly standardized by IETF as

Internet
Protocol
Flow
Information
eXport.

More or less the same as NetFlow.

If you wanna play with it but don't have
an "enterprise-grade" switch, try softflowd.

© Course authors (CC BY-SA 4.0) - Image: © NASA (CC BY 2.0)
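
A minimal softflowd sketch - interface and
collector address are assumptions and "-v 10"
selects IPFIX on recent versions
(check your man page):

$ softflowd -i enp2s0 -n 127.0.0.1:2055 -v 10
© Course authors (CC BY-SA 4.0)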

That's neat, but how is this useful?

© Course authors (CC BY-SA 4.0) - Image: © Kristina Hoeppner (CC BY-SA 2.0)

Operational benefits

Understand how systems really
communicate with each other.

© Course authors (CC BY-SA 4.0) - Image: © Jan Hrdina (CC BY-SA 2.0)

Can we decommission this server or does anything still seem to be using it?

Is traffic really reaching the web server or is it blocked somewhere along the path?

Which database server was the application using during a performance incident?

© Course authors (CC BY-SA 4.0) - Image: © Jan Hrdina (CC BY-SA 2.0)

Security benefits

Detect and investigate suspicious traffic patterns.

© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

A laptop has been infected with malware, did it try to communicate with anything?

Has any device performed port scanning in our network?

Has any of our systems communicated with a known bad host?

© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Conclusions

Complements host-based logging.

Makes more or less sense depending on your network architecture/security model.

© Course authors (CC BY-SA 4.0) - Image: © Wendelin Jacober (CC0 1.0)

Remember that you may need permission
to capture/log network traffic.

Questions or thoughts?

© Course authors (CC BY-SA 4.0) - Image: © Bixentro (CC BY 2.0)

Log analysis with Coreutils

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

UNIX-like systems have historically produced tons of text files containing log events.

Take a peek in "/var/log".

Many different tools exist to tame them.

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

Meet GNU Coreutils

The GNU Core Utilities are the basic file,
shell and text manipulation utilities of
the GNU operating system.

These are the core utilities which are
expected to exist on every operating system.

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)
basename fold split
cat head tail
comm join tac
cut md5sum tee
date []/test tr
dirname paste touch
echo pr true
expand/unexpand seq uniq
false sleep wc
fmt sort ...
© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

Let's play around with some of them!

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

cat

$ cat fruits.txt

apple
banana

tac

$ tac fruits.txt

banana
apple
© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Introducing grep

Almost part of Coreutils - I'm cheating a bit.

Only output lines matching pattern:

$ cat favourite_countries.txt

1. Iceland
2. Kazakhstan
3. Greece
4. Turkmenistan

$ cat favourite_countries.txt | grep stan

2. Kazakhstan
4. Turkmenistan 
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Case-insensitive

$ cat logins.log | grep -i admin

18:49 - User "Administrator" logged in

Multiple patterns

$ cat logins.log | grep -e Admin -e root

08:22 - Failed login for user "root"
18:49 - User "Administrator" logged in
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Inverted/Excluding match

$ cat berries.txt

Raspberry
Tomato
Cloudberry

$ cat berries.txt | grep -v Tomato

Raspberry
Cloudberry
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Including and excluding patterns

$ cat berries.txt

Raspberry
Tomato
Cloudberry

$ cat berries.txt | grep berry | grep -v Ras

Cloudberry
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Matching files

$ grep -l password /etc/my_app/*.conf

/etc/my_app/cache.xml
/etc/my_app/db.conf

Files without matches

$ grep -L "Completed" /var/backup/*.log

/var/backup/mail-5_20230904.log
/var/backup/mail-5_20230905.log
/var/backup/www-2_20230904.log
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Include line before match

$ cat auth.log | grep -B 1 root

08:11 Successful login for:
root@127.0.0.1

Include line after match

$ cat auth.log | grep -A 1 "login for"

08:11 Successful login for:
root@127.0.0.1
08:12 Failed login for:
backup@66.96.149.32
© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Wanna know how many matches you got?

wc got you covered!

$ cat logins.log | grep "Error" | wc -l

9001
© Course authors (CC BY-SA 4.0) - Image: © Takomabibelot (CC BY 2.0)

Perhaps you're only interested in parts of the matching lines?

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Meet cut

Splits lines into fields at every occurrence of a delimiter character.

© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Extract second space-delimited field

$ cat friends.txt

Eddard "Ned" Stark
Jon "Bastard" Snow

$ cat friends.txt | cut -d " " -f 2

"Ned"
"Bastard"
© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Extract third comma-separated field

$ cat taxi_rides.log

Date,From,Destination,Cost
0930,Cityterminalen,Granö,1959
1005,Sickla,Liljeholmen,201

$ cat taxi_rides.log | cut -d "," -f 3

Destination
Granö
Liljeholmen
© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Extract field three and all before

$ cat taxi_rides.log | cut -d "," -f -3

Date,From,Destination
0930,Cityterminalen,Granö
1005,Sickla,Liljeholmen

Extract field three and all after

$ cat taxi_rides.log | cut -d "," -f 3-

Destination,Cost
Granö,1959
Liljeholmen,201
© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Advanced field selection

$ cat numbers.txt

one two three four five six seven eight nine ten

$ cat numbers.txt | cut -d " " -f 1,3-5,8-

one three four five eight nine ten
© Course authors (CC BY-SA 4.0) - Image: © Sergei F (CC BY 2.0)

Need to clean up your input/output?

tr may be able to help!

© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Replace occurrences of character

$ cat names.txt

Jöel
Jönas

$ cat names.txt | tr ö o

Joel
Jonas
© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Change casing of letters

$ cat methods.txt

get
post

$ cat methods.txt | tr "[[:lower:]]" "[[:upper:]]"

GET
POST
© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Delete specified characters

$ cat username.txt

__--bogdan--__

$ cat username.txt | tr -d "_-"

bogdan
© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Remove repeating character

$ cat friends.txt

Eddard  "Ned"      Stark
Jon     "Bastard"  Snow

$ cat friends.txt | tr -s " "

Eddard "Ned" Stark
Jon "Bastard" Snow
© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Let's make things a bit more interesting by performing basic aggregation.

sort and uniq are common couple for the task!

© Course authors (CC BY-SA 4.0) - Image: © Mike Grauer Jr (CC BY 2.0)

Counting unique occurrences

$ cat logins.txt

root
root
bob
root
backup
backup
backup
root
bob

$ cat logins.txt | sort | uniq -c | sort

2 bob
3 backup
4 root
© Course authors (CC BY-SA 4.0) - Image: © Mike Grauer Jr (CC BY 2.0)

Let's combine them!

$ cat auth.log

Invalid password for user >Foobar< from 10.1.1.3:4121
Untrusted key for user >Admin< from 10.1.1.3:5124
Invalid password for user >Foobar< from 127.0.0.1:3155

$ cat auth.log \
  | grep -i -e "Invalid password" -e "Untrusted key" \
  | cut -d " " -f 5 \
  | tr -d "><" \
  | sort | uniq -c | sort

1 Admin
2 Foobar
© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Many other great tools exist for
working with text filtering.

awk is amazing, but uses its own custom programming language.

sed is another, but heavily relies on
regular expressions which we'll cover later in the course!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

As previously mentioned, out-of-sync clocks are a common problem.

Let's see how we can modify and correct timestamps in logs.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

What is date?

Command-line tool for working with
calendar time.

Uses the tz database under the hood.

Useful for manual and automated
time/date conversion.

© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Convert to different TZ

$ date -u --date "18:47 CET"

Wed Nov  1 05:47:00 PM UTC 2023
© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Convert to sane format

$ date -u --date "18:47 CET" --rfc-3339=s

2023-11-01 17:47:00+00:00
© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Manually correct time skew

$ date -u --date "09:50 UTC - 1 hour - 5 minutes"

Wed Nov  1 08:45:00 AM UTC 2023
© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Output custom time format

$ date -u --date "09:50:41 UTC" "+%H_%M (==%s)"

09_50 (==1699005041)
© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Advanced time expressions

$ date --date "tuesday next week 13:30 PST"

Tue Nov  7 10:30:00 PM CET 2023
© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Great, but how do we fix a log file?

Let's combine for-loops, cut and date!

© Course authors (CC BY-SA 4.0) - Image: © William Murphy (CC BY-SA 2.0)

Looping in bash

$ cat fruits.txt

apple
banana

$ IFS=$'\n'
$ for LINE in $(cat fruits.txt); do
    $ echo "It's an ${LINE}"
$ done

It's an apple
It's an banana
© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

Putting it all together

$ cat clock_skewed_log.txt

08:14=User "root" logged in
08:15=User "anna" logged out

$ IFS=$'\n'
$ for LINE in $(cat clock_skewed_log.txt); do
  $ TIMESTAMP="$(echo "${LINE}" | cut -d = -f 1)"
  $ MESSAGE="$(echo "${LINE}" | cut -d = -f 2-)"
  $ FIXED_TIMESTAMP="$(date -u --date "${TIMESTAMP} UTC + 45 minutes" "+%H:%M")"
  $ echo "${FIXED_TIMESTAMP}=${MESSAGE}"
$ done

08:59=User "root" logged in
09:00=User "anna" logged out
© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Wanna store the output to a file?

Just use basic redirection:

$ cat auth.log | grep "Failed to" > failed.txt

To prevent overwriting the output file,
use ">>" instead.

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Or even better, with tee:

$ cat auth.log | grep "Failed to" | tee failed.txt

13:37 - Failed to authenticate "boba"
13:38 - Failed to authenticate "fatty"

$ cat failed.txt

13:37 - Failed to authenticate "boba"
13:38 - Failed to authenticate "fatty"

To prevent overwriting the output file,
add the "-a" option to tee.

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Wrapping it up

Learning the ins and outs of Coreutils is a worthwhile investment.

When in doubt,
use the man and info commands.

grep -F == grep --fixed-strings

With time, you'll grow your own toolbox for efficiently working with data filtering and analysis.

© Course authors (CC BY-SA 4.0) - Image: © Kristina Hoeppner (CC BY-SA 2.0)

Lab: Analysis with Coreutils

© Course authors (CC BY-SA 4.0) - Image: © Luis Zuno (CC0 1.0)

Lab description

Graded exercise to use GNU Coreutils and grep for analyzing logs and extracting insights.

For detailed instructions, see:
"resources/labs/coreutils/README.md".

Remember to download the latest version of
the resources archive! ("log.zip")

© Course authors (CC BY-SA 4.0) - Image: © Luis Zuno (CC0 1.0)

Course recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Helsinki Hacklab (CC BY 2.0)

Time and clocks

Very messy, but important - especially for enabling log correlation!

Whenever possible, normalize time zone configuration (preferably UTC).

NTP helps us keep our clocks in sync.

NTS prevents MITM attacks and
PTP improves precision.

© Course authors (CC BY-SA 4.0) - Image: © Bruno Cordioli (CC BY 2.0)

Network traffic logging

Capture all traffic flowing across the network
using tap/mirror/span functionality in switches.

Easy to implement inspection-based logging.

Requires lots of computing resources and storage,
encrypted traffic is a challenge.

NIDS are a middle-ground that just looks for
suspicious traffic using IoCs/rulesets.

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Flow logging

Only log and store traffic metadata.

Network flow ~=
same peers, protocol and
(when applicable) port.

Routers and networking gear provide
HW-based support for NetFlow/IPFIX.

(In many cases extremely) useful for
both NOC and SOC.

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

UNIX-like systems have historically populated
/var/log with a bunch of text files.

GNU Coreutils provides several useful tools
for text data filtration/extraction.

  • cut: Split/filter lines into distinct fields
  • wc: Count lines/bytes of input data
  • uniq: Basic data aggregation
  • tr: Various clean-up tasks
  • date: Voodoo-magic with dates and times

And let's not forget GNU grep!
("sed" is not a part of Coreutils)

© Course authors (CC BY-SA 4.0) - Image: © ETC Project (CC0 1.0)

Choo choo - let's move on!

© Course authors (CC BY-SA 4.0) - Image: © Kristoffer Trolle (CC BY 2.0)

Centralized logging

A somewhat gentle introduction

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

What does it take to implement and operate
a centralized logging solution?

Architectural choices and their pros/cons.

We won't talk about specific
products/projects for now.

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

Let's begin with the soft stuff!

© Course authors (CC BY-SA 4.0) - Image: © Jesse James (CC BY 2.0)

Ingestion

How many MB/GB/TB do we need
to process and store per day?

Events Per Second
is another very useful metric.

May require quite a bit of
research and guesstimation.

Influences HW requirements and
cluster architecture.

© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Availability requirements

Must the system always be available for ingestion of logs or can we handle buffering?

Are we required to store logs for a specific amount of time? Can we afford to lose some?

Is it acceptable if analysis/alerting capabilities aren't always working?

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Who will use it?

Security analysts, operations personnel, developers, marketing, data scientists...

May affect the need for capabilities like visualization, reporting and machine learning.

© Course authors (CC BY-SA 4.0) - Image: © Stacy B.H (CC BY 2.0)

Managed VS Self-hosted

Should we operate the solution ourselves or rely on a managed service?

Do we have the expertise, time and interest required?

Are there any legal/contractual/policy considerations?

Is there a good fit available?

© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)

Support needs

Do you have the risk appetite to go through this alone?

Potential to save quite a bit of money.

Not just about the vendor - are consultants/experts available nearby?

© Course authors (CC BY-SA 4.0) - Image: © Shannon Kringen (CC BY 2.0)

Access control

Should all logs be accessible to every analyst?

Do we need to support multiple tenants?

Take a second to meditate upon your needs for vertical and horizontal access control.

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 4.0)

Primary motivations

Remember to be honest with yourselves.
Are you doing this just to tick a box?

Influences performance and feature requirements.

© Course authors (CC BY-SA 4.0) - Image: © Andrew Hart (CC BY-SA 2.0)

Common fee/licensing schemes

  • None: Do what you want, yay!
  • Volume: Amount ingested and/or stored
  • Events: Number of ingested events
  • Features: Enable/Disable functionality
  • Per-seat: Number of users/analysts
© Course authors (CC BY-SA 4.0) - Image: © Freestocks.org (CC0 1.0)

Enough of this - let's have a look at
some of the technical considerations,
shall we?

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Pull VS Push

Should our logging solution
reach out and collect logs?

The majority of solutions require
log producers/intermediaries to
send/deliver the data (push-based).

Generally simpler to implement.

(We'll get back to agents/protocols)

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Where do we parse logs?

Should the producing systems massage logs
to extract relevant data and make them
machine-readable?

Distributes the load and can save bandwidth.

Introduces friction during implementation,
software/management requirements
and processing overhead.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

When do we parse logs?

Index-time VS Search-time.

© Course authors (CC BY-SA 4.0) - Image: © Sbmeaper1 (CC0 1.0)

Index-time parsing

Interesting data is parsed/extracted
before storage.

The heavy-lifting is done once,
improves search performance/cost.

Requires knowledge of log format
beforehand and a bit extra disk space.

© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)

Search-time parsing

Interesting data is parsed/extracted
during each query.

Enables low-friction ingestion and
less storage space.

Increases query time/cost.

© Course authors (CC BY-SA 4.0) - Image: © Freed eXplorer (CC BY 2.0)

While logging solutions tend to focus on one of the approaches, many support a hybrid solution.

© Course authors (CC BY-SA 4.0) - Image: © Johannes P1hde (CC BY 2.0)

Retention / Rotation strategies

Time-based

Keep logs around for X amount of days.

Volume-based

Store X gigabytes of events and delete the oldest ones if that's not enough.

Capacity-based

Cram as many events as we can fit into
X% of total disk space.

© Course authors (CC BY-SA 4.0) - Image: © Will Buckner (CC BY 2.0)
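
On a single host, logrotate can express a
combination of these strategies - a minimal
sketch with example paths/values:

/var/log/my_app/*.log {
    daily
    rotate 14
    maxsize 500M
    compress
    missingok
}
© Course authors (CC BY-SA 4.0)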

In practice, we tend to combine these strategies.

Store sensitive logs for at least two weeks,
but longer if possible.

Do we want a development system or DDoS attack to overwrite our authentication logs?

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Storage tiers

Not all log events are equally interesting.

We can utilize different storage tiers
to optimize cost and performance.

Let's talk about
hot, warm, cold and frozen storage
in somewhat general terms.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Hot and warm storage

Log data frequently accessed during
manual queries and automated analysis.

Backed by a fast storage medium such as
SSDs in RAID configuration.

Multiple replicas can also be used to
improve query performance.

Typically capacity-based retention.

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Cold and frozen storage

Log data that is rarely/never analyzed.

Migration typically happens automatically.

Backed by high-capacity storage mediums,
such as HDDs, cloud object stores and tape.

Useful for compliance and incident response.

© Course authors (CC BY-SA 4.0) - Image: © Lydur Skulason (CC BY 2.0)

Scaling beyond a cluster

Find it hard to cram all events and
users into one log system?

Consider using multiple independent
servers/clusters.

Not just for performance reasons -
aids autonomy/decentralization and
simplifies access control.

Let's look at some solutions!

© Course authors (CC BY-SA 4.0) - Image: © NASA/JPL-Caltech (CC BY 2.0)

Selective forwarding

Specific event types and/or sources
are replicated to other log systems.

Enables usage of different logging
applications based on needs/budget.

Can help with optimization of
retention and bandwidth usage.

© Course authors (CC BY-SA 4.0) - Image: © Freed eXplorer (CC BY 2.0)
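
With rsyslog, selective forwarding can be a
one-liner - this sketch (hostname assumed) ships
only auth-related events to a second system:

# /etc/rsyslog.d/forward_auth.conf ("@@" means TCP, "@" means UDP)
auth,authpriv.* @@siem.example.com:514
© Course authors (CC BY-SA 4.0)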

The downsides

Hard to determine what might be of
interest before shit hits the fan.

Should forwarded events also be
kept around locally?

Can require quite a bit of coordination.

© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Federated searching

Some logging solutions support
"cross-cluster"/"federated" queries.

Enables us to decentralize
collection/retention/access control,
while allowing centralized alerting
and analysis.

Distributes query load and eliminates
need of unnecessary data duplication.

© Course authors (CC BY-SA 4.0) - Image: © A. Gerst, ESA (CC BY-SA 2.0)

Ain't perfect either

Analysis is gonna be painful if data
ain't normalized.

No cross-solution standard/protocol,
often requires versions to be in sync.

© Course authors (CC BY-SA 4.0) - Image: © Ron Cogswell (CC BY 2.0)

Let's summarize

There are many different paths to
choose from - make sure to know
your needs and wants.

Learning how to architect/operate
(and not merely use) logging solutions
can be a great career move.

© Course authors (CC BY-SA 4.0) - Image: © Rob Hurson (CC BY-SA 2.0)

Laws and standards

Staying compliant with logging

© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

As previously mentioned,
much regulation requires us
to log security/privacy related
activity, either directly or indirectly.

We'll limit our scope to
IT related laws/compliance standards.

For more details, check out
our "threat intelligence" course.

© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

GDPR

General Data Protection Regulation.

Restricts how PII can be stored/processed.

Requires logging when PII is accessed by employees/third-parties.

Especially tricky/costly considering the wide definition of personal data.

(Who watches the watchers?)

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

NIS(2)

Directive on security of
Network and Information Systems.

Puts security-related requirements on
operators of critical infrastructure/services.

Following the baseline and reporting requirements
without logging is near impossible.

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

DORA

Digital Operational Resilience Act.

Attempt to harmonize IT security regulation for
financial-sector companies within the EU.

Puts requirements on monitoring, logging and incident reporting.

Similar rules exist in most finance/banking regulation.

© Course authors (CC BY-SA 4.0) - Image: © Bill Badzo (CC BY-SA 2.0)

LEK and Data Retention Directive

Lagen om Elektronisk Kommunikation.

Swedish regulation declaring metadata logging
requirements for Internet/telephony providers,
among other things.

The Data Retention Directive tried to unify
similar regulation within the EU, but was
declared unlawful by the Court of Justice.

Logs must be stored for
two, six or ten months depending on content.
Longer retention may be illegal.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

PCI DSS

Payment Card Industry
Data Security Standard.

Requirement #10:

Track and monitor all access to
network resources and cardholder data

Scope managed by using an isolated
Card Data Environment.

© Course authors (CC BY-SA 4.0) - Image: © Alan Levine (CC0 1.0)

ISO 27001

Requires regular review of
security-related events from
sensitive systems.

NTP or equivalent solution for
accurate time is mandatory.

For details, check out
A.12.4: "Logging and monitoring".

(If you can buy/borrow the standard!)

© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

When not to log

Privacy laws and labor unions protect
certain employee activity on corporate
IT systems, regardless of ownership.

Be especially considerate of
legitimate interest when working
with logs from sources like
End-point Detection and Response tools,
mail servers, HTTP(S) proxies and
SSL/TLS interceptors.

Ensure that signed and understood
usage policies are in place to CYA.

© Course authors (CC BY-SA 4.0) - Image: © Pumpkinmook (CC BY 2.0)

Wrapping up

Dig deeper into the subject
in future courses.

Questions or thoughts?

© Course authors (CC BY-SA 4.0) - Image: © Loco Steve (CC BY-SA 2.0)

Protecting log data

CIA and privacy

© Course authors (CC BY-SA 4.0) - Image: © Gobi (CC BY 2.0)

The basics

Your logs may contain juicy information
that is of interest to adversaries.

Protection of their integrity is
desirable and may be a (legal) requirement.

Besides regular system hardening,
what else can we do?

© Course authors (CC BY-SA 4.0) - Image: © Gobi (CC BY 2.0)

Data classification

Not all log events are created equal.

Proper tagging/classification of
ingested sources aids enforcement of...

  • Access control
  • Retention policies
  • Backups

Preferably done by producer, if possible -
make sure guidelines are available.

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Append-only / Write-once storage

Certain storage mediums, like tape,
optical disks and special-purpose HDDs can
prevent data manipulation after write.

"Write Once Read Many".

A reasonable compromise could be to
utilize external object storage like
AWS S3 (with strict access permissions)
or a hash-chain protected database
such as immudb.

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Anonymization / Pseudonymization

The best way to handle storage
of PII is to not.

Anonymization removes/replaces information
that could be tied to an individual/entity.

Pseudonymization works similarly,
except that a "lookup table" exist to
revert the process if needed.

Preferably done by producer, if possible.

© Course authors (CC BY-SA 4.0) - Image: © Wendelin Jacober (CC0 1.0)

Data scrubbing

Logs may contain secrets like passwords,
API keys/tokens and credit card numbers,
especially from applications configured
in "debugging" mode.

By configuring "search-and-replace"
rules for known patterns, this sensitive
information may be automatically removed.

As previously mentioned, preferably
performed by the log producer.

(Reducing verbosity reduces retention costs)

© Course authors (CC BY-SA 4.0) - Image: © The Preiser Project (CC BY 2.0)
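
A naive sketch of such a rule using sed - real
card number detection needs more care
(think Luhn checks and separator variations):

$ echo 'Payment failed for card 4012888888881881' \
  | sed -E 's/[0-9]{13,16}/<REDACTED-PAN>/g'

Payment failed for card <REDACTED-PAN>
© Course authors (CC BY-SA 4.0)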

Availability

Can you access your logs during an attack
or outage to investigate the incident?

What dependencies exist on networking
infrastructure, centralized storage,
authentication providers, etc?

May be a decent reason for outsourcing
hosting to a third-party.

© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

Conclusions

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

Course recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Centralized logging requirements

  • Ingestion amount (volume/EPS)
  • Availability requirements
  • Use-cases and intended end-users
  • Hosting and sovereignty
  • Support/Competence needs
  • Security and access control
© Course authors (CC BY-SA 4.0) - Image: © Yellowcloud (CC BY 2.0)

Collection and parsing

Most solutions available utilize "push-based"
collection and centralized parsing.

Index-time parsing helps query performance,
but increases onboarding and storage costs*.

Search-time parsing adds a per-query cost
but increases flexibility and
lowers storage costs.

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Retention and storage tiers

Storing log data using
time-based, volume-based
or capacity-based retention strategies.

Optimizing cost/performance using
hot, warm, cold, frozen storage tiers.

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Selective forwarding or
federated/cross-cluster querying
enables us to analyze logs from
multiple independent solutions.

Helps us decentralize management,
scale better and embrace autonomy.

(not without some issues/caveats!)

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

Many laws and compliance frameworks require us to log and monitor sensitive activity.

Some also prevent/restrict logging.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Protecting log data

Some example approaches are...

  • Confidentiality: Hardening, pseudonymization
  • Integrity: Forwarding, append-only storage
  • Availability: Replication, offline backups

Heavily dependent on properly
categorizing log sources.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Let's continue, shall we?

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Regular expressions

Data filtering and extraction

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

As humans with a bit of technical know-how,
mentally parsing this event is easy:

13:36 - "Johan" logged in from 192.0.121.195
13:38 - "Sanna" logged in from 192.0.121.203

Mayhaps you wanna extract fields,
normalize log formats or filter events?

How can we make computers do the same
without lots of sh, grep, cut and tr?

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Introducing regular expressions

Language for matching/extracting
patterns in text/data.

Also known as re, regex and regexp.

Exists in several different flavors;
we'll focus on the widely used
Extended Regular Expressions and
Perl Compatible Regular Expressions.

Used by almost all logging software
for advanced field extraction/validation.

© Course authors (CC BY-SA 4.0) - Image: © C. Watts (CC BY 2.0)

Let's take it for a spin!

© Course authors (CC BY-SA 4.0) - Image: © C. Watts (CC BY 2.0)

Simple string matching

$ cat auth.log | pcregrep 'login failed'

13:49 - Admin login failed using password
13:49 - Katey login failed using password
13:51 - Admin login failed using key
13:53 - Admin login failed using TOTP
© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Specifying variations

$ cat auth.log | pcregrep \
  'login failed using (password|TOTP)'

13:49 - Admin login failed using password
13:49 - Katey login failed using password
13:53 - Admin login failed using TOTP
© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)
$ cat auth.log | pcregrep \
  '^\d\d:\d\d - [A-Z][a-z]+ login .+ using (TOTP|key)$'

13:51 - Admin login failed using key
13:53 - Admin login failed using TOTP
13:53 - Admin login succeeded using TOTP
^\d\d:\d\d →
Line begins with two digits, a colon and two more digits.

[A-Z][a-z]+ →
Word begins with an uppercase letter, followed by one or
more lowercase letters.

.+ →
Match one or more of any character.

(TOTP|key)$ →
Line ending with "TOTP" or "key".
© Course authors (CC BY-SA 4.0)

A note about wildcards

. →
Matches one character of any kind.

.* →
Matches any character zero or more times.

.+ →
Matches any character one or more times.
© Course authors (CC BY-SA 4.0) - Image: © Ted Eytan (CC BY-SA 2.0)

Named capture groups

$ cat auth.log | pcregrep --only-matching=2 \
  '(?<time>.+) - (?<user>.+) login (?<result>.+) using (?<method>.+)'

Admin
Katey
Admin
Admin
Admin

(Typically turned into log field names, like "time" and "method")

© Course authors (CC BY-SA 4.0)

Repetition ranges and negation

$ cat auth.log

23:51 backup logged in
9:52 janne logged in
11:52 monitor logged in

$ cat auth.log | pcregrep --only-matching=2 \
  '^(?<time>\d{1,2}:\d{1,2}) (?<user>(?!backup|monitor).+) logged in$'

janne
© Course authors (CC BY-SA 4.0)

Advanced features

  • Multi-line matches
  • "Lookahead" / "Lookbehind"
  • UTF-8 character ranges
    ...

Support/Implementations differ between
flavors of regular expression.

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

Any alternatives?

Nothing that has taken off, but the
Simple Regex Language is a kool attempt:

/^(?:[0-9]|[a-z]|[\._%\+-])+(?:@)(?:[0-9]|[a-z]|[\.-])+(?:\.)[a-z]{2,}$/i
begin with any of (digit, letter, one of "._%+-") once or more,
literally "@",
any of (digit, letter, one of ".-") once or more,
literally ".",
letter at least 2 times,
must end, case insensitive
© Course authors (CC BY-SA 4.0)

Regardless of its imperfections,
mastering regex is a very
worthwhile investment.

Scary at first, but a fundamental
skill for developers and
log/data analysts.

A good resource is Deeecode's
"Simplified Regular Expressions" course

Just remember to also include
negative test cases.

© Course authors (CC BY-SA 4.0) - Image: © Amy Nelson (CC BY 3.0)

RegEx exercise

Let's try it out!

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Why re-invent the wheel?

Pop open your web browser and visit
RegexOne (https://regexone.com)!

Feeling brave? Check out the
"Practice Problems" section.

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Log formats and protocols

Pros/Cons of common approaches

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

An ideal log format should enable
analysis by humans and machines alike.

We also want to collect/transfer logs
for centralized analysis/storage.

...all while being mindful of security risks,
performance impact and storage costs.

How do we achieve this?

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Let's begin by discussing
pros/cons of log formats!

© Course authors (CC BY-SA 4.0) - Image: © Joel Rangsmo (CC BY-SA 4.0)

Keeping it simple

Log events delimited by a new line:

13:36 - "Johan" logged in from 192.0.121.195
13:38 - "Sanna" logged in from 192.0.121.203
16:20 - Invalid key from 127.0.0.1 for "Bob"

(Important to remove/escape new line characters
in your log messages - I'm looking at you, Java!)

© Course authors (CC BY-SA 4.0) - Image: © Nicholas Day (CC BY 2.0)

We can try to parse it using some regular expressions:

^(?<time>\d\d:\d\d) - "(?<user>.+)" logged in from (?<ip>\d+\.\d+\.\d+\.\d+)$
© Course authors (CC BY-SA 4.0)

Ain't all that trivial

Most log files contain multiple
different event types - in this
case, perhaps runtime errors?

We could write a bunch more
complex regular expressions,
but it soon gets out of hand.

We're assuming that the format
is stable/won't change - should
developers really guarantee this?

© Course authors (CC BY-SA 4.0) - Image: © Nicholas Day (CC BY 2.0)

Simple key-values (KV)

time=13:36 type=login user=Johan ip=192.0.121.195 success=yes
time=13:38 type=login user=Sanna ip=192.0.121.203 success=yes
time=16:20 type=login user=Bob ip=127.0.0.1 success=no
© Course authors (CC BY-SA 4.0)

Clear and easily parsable fields.

Requires many bytes just to describe
the log structure (don't make Greta cry).

Need to handle escaping/quoting of
special characters like spaces.
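
For example (fabricated event), a value containing
spaces is ambiguous unless quoted:

time=16:20 type=login user=Bob the Builder success=no
time=16:20 type=login user="Bob the Builder" success=no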

© Course authors (CC BY-SA 4.0) - Image: © Aka Tman (CC BY 2.0)

CSV

Let's try Comma Separated Values
with a "header" line instead!

LoginTime,User,DurationSeconds,Commands
09:42,Joe,139,apk-update apk-upgrade ps
12:52,Tim,300,dmesg top kill top reboot
22:19,Adam,36,top dmesg
© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

Requires many bytes just to describe
the log structure (wasteful).

Need to keep the header around
and handle escaping/quoting of
delimiter character.

(Confusingly, commas aren't always
used as the field delimiter)

Does this field contain a
string, number, boolean or list?

© Course authors (CC BY-SA 4.0) - Image: © Quinn Dombrowski (CC BY-SA 2.0)

JSON

JavaScript Object Notation got you covered-ish!

[
  {
    "LoginTime": "09:42",
    "User": "Joe",
    "DurationSeconds": 139,
    "Commands": ["apk-update", "apk-upgrade", "ps -faux"]
  },
  {
    "LoginTime": "12:52",
    "User": "Tim",
    "DurationSeconds": 300,
    "Commands": ["dmesg", "top", "kill", "top", "reboot"]
  },
  [...]
]
© Course authors (CC BY-SA 4.0)

NDJSON

Newline Delimited JavaScript Object Notation
is probably a better fit for log events.

{"LoginTime": "09:42", "User": "Joe", "DurationSeconds": 139, [...]}
{"LoginTime": "12:52", "User": "Tim", "DurationSeconds": 300, [...]}
{"LoginTime": "22:19", "User": "Adam", "DurationSeconds": 36, [...]}
© Course authors (CC BY-SA 4.0)

Filtering (ND)JSON with jq

$ cat logins_log.ndjson | jq -r '.
  | select( 
    .User != "monitor"
    and .DurationSeconds > 60
    and (.Commands | index("dmesg"))
  ) | .User'
  
Tim

(Consider adding "jq" to your
list of "things to learn"!)

© Course authors (CC BY-SA 4.0) - Image: © Scott McCallum (CC BY-SA 2.0)

Swapping problems

Does this field contain a
string, number or list?

Requires many bytes just to describe
the log structure (you made her cry).

Besides data type, JSON doesn't
tell us what the field contains.

© Course authors (CC BY-SA 4.0) - Image: © Scott McCallum (CC BY-SA 2.0)

The Graylog Extended Log Format
and Elastic Common Schema try to
solve the latter problem for JSON.

Many network/security products support
the Common Event Format,
which relies on an odd mix of
CSV and key-values.

© Course authors (CC BY-SA 4.0) - Image: © ESA (CC BY 3.0 IGO)

XML

eXtensible Markup Language.

Deserves an honorable mention.

Similar to JSON, but with more
complexity/bells and whistles.

Used for log storage by Windows
and other enterprise software.

© Course authors (CC BY-SA 4.0) - Image: © Edenpictures (CC BY 2.0)

Hold on a second, we still haven't solved the
"byte wasting problem"!

Let's talk about binary logging formats.

© Course authors (CC BY-SA 4.0) - Image: © Milan Bhatt (CC BY-SA 2.0)

Store and/or transfer key-values.

Requires external schema/lookup table
to (de)serialize/parse data.

Fabricated/Mock example:

  • Byte 1 to 7: UNIX timestamp
  • Byte 8: Event type (Firewall reject)
  • Byte 9 to 12: Source IP address
  • Byte 13 to 14: Destination port
  • Byte 15 to 18: Destination IP address
  • Byte 19 to 20: FW rule identifier
© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

The pros

$ echo "\
2023-11-07T10:53:16+00:00 \
FW_BLOCK 10.13.37.142 3389 \
192.168.119.231 #146" \
| wc --bytes

74

Can save you a lot of storage/bandwidth
and many CPU cycles. Greta is happy!

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

The cons

Practically unreadable for humans without a
schema and translation layer.

Limited support in off-the-shelf
centralized logging solutions.

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

If you wanna learn more, check out the
"Protocol Buffers Documentation" website.

© Course authors (CC BY-SA 4.0) - Image: © Wolfgang Stief (CC0 1.0)

Alright, we've chosen a log format.

How do we get these events to the
centralized logging server?

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

syslog

In the beginning (since the 80s),
there was syslog.

Local service and network protocol
for log collection/transfer.

Port 514/UDP (and TCP, sometimes).

Still dominant method for sending
logs from network equipment and
embedded appliances.
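
As an example, most Linux systems ship util-linux's logger,
which can send a message to a remote syslog server
(hostname below is made up):

$ logger --server logs.example.com --port 514 --udp \
  --tag myapp --priority user.warning \
  "Disk utilization at 84%"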

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

Message content

  • Timestamp
  • Hostname/IP address
  • Facility: 0-23 (11 == FTP daemon)
  • Severity: 0-7 (0 == emergency)
  • Process ID
  • Message

+ perhaps more, depending on which
flavor/standard is followed...
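
The facility and severity are packed into a single "PRI"
number (facility * 8 + severity) prefixing the raw message.
A fabricated example for an FTP daemon error (11 * 8 + 3 = 91):

<91>Nov 12 15:14:14 host-1 ftpd[4242]: Transfer failed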

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

What's the problem?

Loosely defined protocol.

In practice, this results in:

  • Bad support for traffic encryption
  • Insufficient authentication capabilities
  • Tight message size restrictions
  • Limited signaling capabilities

Furthermore, it doesn't really specify
how the message part is formatted.

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

Logging over HTTPS

Well understood/supported protocol.

Supports authentication, encryption,
compression and large messages.

Quite a bit of overhead/bloat for
the logging use-case.

Doesn't define a message format.
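
A sketch of what shipping a log event over HTTPS often
looks like (endpoint and credentials are made up):

$ curl "https://logs.example.com/ingest" \
  --request POST \
  --user "shipper:hunter_2" \
  --header "Content-Type: application/json" \
  --data '{"time": "13:36", "user": "Johan", "event": "login"}'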

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

GELF

The Graylog Extended Log Format
doesn't just define a message structure.

Supports transfer via...

  • UDP
  • TCP
  • TCP + TLS
  • HTTP
  • HTTPS

Hasn't yet taken over as a
syslog replacement.
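
A minimal sketch of shipping a GELF message over HTTP,
assuming a Graylog-style input listening on port 12201:

$ curl "http://graylog.example.com:12201/gelf" \
  --request POST \
  --header "Content-Type: application/json" \
  --data '{"version": "1.1", "host": "web-1", "short_message": "Backup completed", "level": 6}'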

© Course authors (CC BY-SA 4.0) - Image: © Mike Grauer Jr (CC BY 2.0)

For better or worse, most logging solutions
use their own custom agents/protocols.

More about those later.

© Course authors (CC BY-SA 4.0) - Image: © RoboticSpider (CC BY 4.0)

Wrapping it up

Log events should have a
clearly defined structure.

Each format has its own
benefits and trade-offs.

Insufficient standardization and
a wide range of user requirements
results in custom agents/protocols
for log transmission over networks.

© Course authors (CC BY-SA 4.0) - Image: © Rising Damp (CC BY 2.0)

Windows logging

© Course authors (CC BY-SA 4.0) - Image: © Daniel Oliva Barbero (CC BY 2.0)

Like any other platform worth its name,
Microsoft Windows produces logs.

The way it does this is however slightly different.

© Course authors (CC BY-SA 4.0) - Image: © Daniel Oliva Barbero (CC BY 2.0)

Event logging subsystem

Provides APIs for structured (key-values)
operating system and application logging.

Handles event retention/rotation.

Stores logs on disk in a custom binary format, exposed to consumers as XML.

Supports log forwarding.

Tries to solve many of the same problems as syslog daemons on Linux/UNIX.

© Course authors (CC BY-SA 4.0) - Image: © Ted Eytan (CC BY-SA 2.0)

Let's have a look, shall we?

© Course authors (CC BY-SA 4.0) - Image: © Ted Eytan (CC BY-SA 2.0)

Let's generate an event

> eventcreate.exe `
  /L APPLICATION /SO MyMockApp `
  /ID 123 /T WARNING `
  /D "Look at me ma - I'm logging!"
© Course authors (CC BY-SA 4.0) - Image: © Roy Luck (CC BY 2.0)

What does the OS log?

Primarily configured using the "audit policy".

Most capabilities are turned off by default; enabling them all makes the system unusable.

Requires customization depending on
your use-case.

Have a good look at
Microsoft's "Audit Policy Recommendations"
and CIS's Windows benchmark
for guidance/inspiration.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

That's neat, but not too exciting.

Let's have a look at the
File Integrity Monitoring
capabilities provided by "Object auditing".

© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

Log forwarding/collection

Windows Event Forwarding and
Windows Event Collector.

Built-in capability, slightly
tricky to configure.

Supports "source initiated" (push)
and "collector initiated" (pull).

Events are transferred using
authenticated/encrypted HTTP(S).

Often used in combination with
a third-party logging agent.

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Brandt (CC0 1.0)

Conclusions

Capable, but quite different.

Structured logging, but often
requiring a schema/lookup table.

Highly configurable, for better
and/or worse.

For configuration guidance,
check out the CIS benchmark.

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Extending Windows auditing

Sysmon and PowerShell logging

© Course authors (CC BY-SA 4.0) - Image: © Tofoli Douglas (CC0 1.0)

Windows provides many knobs for
adjusting logging besides the
"audit policy".

Many third-party tools/products
exist to extend the capabilities
even further.

We'll look at two examples -
Sysmon and PowerShell audit logging.

© Course authors (CC BY-SA 4.0) - Image: © Tofoli Douglas (CC0 1.0)

Meet Sysmon

Released in 2014 as part
of Microsoft "Sysinternals".

Hooks into Windows kernel
to intercept activity, just like many
End-point Detection and Response tools do.

Gratis, but very capable!

No official support provided by
Microsoft, sadly.

© Course authors (CC BY-SA 4.0) - Image: © Edenpictures (CC BY 2.0)

Most of Sysmon's super powers are
not enabled by default.

Granular configuration options
result in complexity.

Community resources like
"sysmon-modular" are
here to help!

© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

Still not all that convinced?

Let's mock an investigation!

© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

But what was happening
inside PowerShell?!

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

PowerShell is loved by
admins and hackers alike.

Utilizes Windows APIs instead
of spawning processes that are
easily picked up by tools like Sysmon.

Supports extensive audit logging,
but it needs to be enabled.

© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)

Log queries from PowerShell

> Get-WinEvent -LogName "Microsoft-Windows-Sysmon/Operational" |
    where {$_.Id -eq 22} |
    Select-Object -ExpandProperty Message

Dns query:
UtcTime: 2023-11-12 15:14:14.430
ProcessGuid: "{944e8c49-ebbf-6550-340b-000000000700}"
QueryName: windows.metasploit.com
QueryResults: type: 5 download2.rapid7.com.edgekey.net;
              type: 5 ::ffff:92.123.206.32;
Image: C:\Windows\System32\WindowsPowerShell\v1\powershell.exe
User: "LAB-LOG-WIN-1\Administrator"
[...]
© Course authors (CC BY-SA 4.0)

Yo dawg, I heard you like logs...

© Course authors (CC BY-SA 4.0) - Image: © Stefan Brending (CC BY-SA 3.0 DE)

Closing thoughts

While many third-party tools exist to aid
threat detection and incident response on
MS Windows, we can get far for free with
official tools and a bit of configuration.

© Course authors (CC BY-SA 4.0) - Image: © Tofoli Douglas (CC0 1.0)

Logging platform overview

Comparison of common log solutions

© Course authors (CC BY-SA 4.0) - Image: © Counselman Collection (CC BY-SA 2.0)

Many products and services are available
for centralized logging.

Let's take a look at some of them
and their pros/cons/trivia.

© Course authors (CC BY-SA 4.0) - Image: © Counselman Collection (CC BY-SA 2.0)

ArcSight

Released by Micro Focus in
early 2000s.

Defined the SIEM product category.

Artifacts such as the
Common Event Format
still lives on.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Splunk

SIEM and data analytics platform.

Provides search-time parsing and
agent configuration management.

Relatively easy to scale.

Used to be the king, but was
also priced accordingly.

Provided as on-prem or SaaS.

Acquired by Cisco in 2023.

© Course authors (CC BY-SA 4.0) - Image: © Marco Verch (CC BY 2.0)

Elastic / "ELK" stack

Elasticsearch, Logstash, Kibana
+ Beats (logging agents).

Index-time parsing backed by
very capable search engine.

Resource hungry, but quite easy to scale.

Support and plugins for "enterprise"
features (like auth and TLS) available
from Elastic.

Closed-sourced in 2021, resulting in the
"OpenSearch" fork. Became open again in 2024.

© Course authors (CC BY-SA 4.0) - Image: © Tim Green (CC BY 2.0)

Graylog

Logging solutions with
batteries included.

Uses Elasticsearch/OpenSearch
under the hood.

Less capable than the Elastic stack,
but easier to setup/manage.

Closed-sourced in 2021.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Wazuh

Builds upon the ELK stack/OpenSearch and
a fork of the HIDS "OSSEC" to provide
an open source solution for...

*drum roll*

eXtended Detection and Response.

Freely available, but developed
by a company that sells SaaS,
consulting and support.

© Course authors (CC BY-SA 4.0) - Image: © Joel Rangsmo (CC BY-SA 4.0)

Loki

FOSS logging solution built
by Grafana Labs.

Trying to take a fresh approach
and learn from previous mistakes.

Easier and cheaper to operate.

Less flexible, but good enough
for most operational logging
use-cases. A SIEM? Not yet.

Also available as SaaS.

© Course authors (CC BY-SA 4.0) - Image: © Maja Dumat (CC BY 2.0)

Microsoft Sentinel

Security-focused solution for
centralized logging.

Provides lots of features OoB,
things usually in the realm
of SOC operators/analysts.

Only provided as cloud service.

© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

Many options exist with
pros and cons.

Our focus moving forward
will be OpenSearch.

© Course authors (CC BY-SA 4.0) - Image: © Maja Dumat (CC BY 2.0)

Course recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Brandt (CC0 1.0)

Regular expressions

Language for pattern matching and
data extraction.

De facto standard for massaging
unstructured log data.

Practice your skills using sites
like RegexOne!

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Miske (CC BY-SA 2.0)

Common formats

We want structured data to enable
better querying/filtering.

Key-values, CSV, (ND)JSON and XML
are commonly used formats.

Usage of binary storage/transfer of
logs decreases overhead/cost.

The Common Event Format
and Elastic Common Schema
exist to provide standardized naming
and data types for fields in logs.

© Course authors (CC BY-SA 4.0) - Image: © Ninara (CC BY 2.0)

Transfer methods and protocols

Syslog is a common, but flawed, protocol
for sending log events over a network.

Graylog Extended Log Format
aims to replace syslog and supports several
different transfer mechanisms,
such as UDP and TCP + TLS.

HTTP(S) is a popular option due to its
wide support, but introduces overhead.

© Course authors (CC BY-SA 4.0) - Image: © Jan Bocek (CC BY 2.0)

Logging on Windows

The kernel, system services and
most applications rely on the
Windows event log subsystem.

Log data is structured and stored on
disk in a custom binary XML format.

Windows Event Forwarding uses
HTTP or HTTPS for transferring events to
a Windows Event Collector.

Tweaked through audit policies and
can be extended using tools like Sysmon.

© Course authors (CC BY-SA 4.0) - Image: © Edenpictures (CC BY 2.0)

Thoughts and/or questions?

© Course authors (CC BY-SA 4.0) - Image: © Maximilien Brice / CERN (CC BY-SA 3.0)

OpenSearch introduction

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

The OpenSearch project aims to
provide an open-source software suite for
data processing, analysis and visualization.

Popular choice for centralized logging.

Platform used in coming labs and presentations.

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

History / Background

In the beginning, there was
Elasticsearch, Kibana and Logstash,
which formed the open-source "ELK stack".

Loved by devops teams, security analysts
and data scientists alike.

The company leading development, Elastic,
made money by selling proprietary plugins
(called "X-Pack") and support/services.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

The community developed freely available
plugins that matched many of Elastic's
proprietary features.

The "Open Distro" project packaged
these together with the open-source
ELK-components to provide a fully
usable out-of-the-box experience.

Elastic didn't like this and were mad
at the big cloud providers for selling
"ELK as a Service" without willingly
"giving them their fair share".

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

In 2021, Elastic closed-sourced
Elasticsearch and Kibana.

This made the community and
companies basing their services
on the ELK stack a bit grumpy.

The OpenSearch project was formed
to provide an open-source fork.

Developed by the community, supported
by the OpenSearch Software Foundation.

(Elastic open sourced it again in 2024)

© Course authors (CC BY-SA 4.0) - Image: © Price Capsule (CC BY-SA 2.0)

What does our implementation
of the stack look like?

© Course authors (CC BY-SA 4.0) - Image: © Rick Massey (CC BY 2.0)

OpenSearch

Fork of Elastic's "Elasticsearch".

It's a search engine, powered by
Apache Lucene, but you can
think of it as a database.

Users can submit arbitrary JSON objects
to an index. Once stored, they are
called "documents" and become
available for queries/analysis.

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Ehmm, perhaps a demonstration
might be reasonable?

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)
$ curl \
  "https://teacher:hunter_2@search-engine.logs.labs.teaching.sh/" \
  --request GET
{
  "name" : "893e26a2db10",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "dVw9XlAYQk2laTvVSv7LpA",
  "version" : {
    [...]
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
© Course authors (CC BY-SA 4.0)
$ BASE_URL="https://teacher:hunter_2@search-engine.logs.labs.teaching.sh"
$ curl "${BASE_URL}/" --request GET

{
  "name" : "893e26a2db10",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "dVw9XlAYQk2laTvVSv7LpA",
  "version" : {
    [...]
    "minimum_wire_compatibility_version" : "7.10.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "The OpenSearch Project: https://opensearch.org/"
}
© Course authors (CC BY-SA 4.0)
{
  "name": "Test Examplesson",
  "kool": false,
  "age": 42
}
$ curl "${BASE_URL}/myindex/_doc?pretty" \
  --request POST \
  --header "Content-Type: application/json" \
  --data @example-document.json
{
  "_index" : "myindex",
  "_id" : "RXE06IsBQrucVyA52bmU",
  "_version" : 1,
  "result" : "created",
  [...]
}
© Course authors (CC BY-SA 4.0) - Image: © Fandrey (CC BY 2.0)
$ curl \
  "${BASE_URL}/myindex/_search?pretty" \
  --request GET
{
  [...]
  "hits" : {
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "myindex",
        "_id" : "h3Ey6IsBQrucVyA5xLjN",
        "_score" : 1.0,
        "_source" : {
          "name" : "Test Examplesson",
          "kool" : false,
          "age" : 42
        }
        [...]
© Course authors (CC BY-SA 4.0) - Image: © Fandrey (CC BY 2.0)

Why is it kool?

Amazing analytics capabilities
and quite easy to scale!

© Course authors (CC BY-SA 4.0) - Image: © Fibreman (CC0 1.0)

Data storage

Documents are grouped in indices
(plural of "index").

Documents in an index are
stored in one or more shards.

Shards are spread over
one or more nodes.
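
A sketch of how shard and replica counts can be specified
when creating an index (values are arbitrary):

$ curl "${BASE_URL}/myindex" \
  --request PUT \
  --header "Content-Type: application/json" \
  --data '{"settings": {"number_of_shards": 3, "number_of_replicas": 1}}'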

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Node clustering

OpenSearch is "cluster-first".

Adding more nodes and shards can improve
capacity, performance and availability.

Can be scaled down to save
money/electricity.

Node types can be mixed and matched to
implement storage/processing tiers.
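
The _cat API provides a quick, human-readable view of the
cluster's nodes and their roles:

$ curl "${BASE_URL}/_cat/nodes?v" --request GET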

© Course authors (CC BY-SA 4.0) - Image: © ORNL (CC BY 2.0)

Searching capabilities

Swiss army knife of data analysis.

Full-text queries and advanced aggregation.

Includes plugins out-of-the-box for
machine learning, correlation, etc.

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Detailed field mapping

Besides the JSON data types, such as strings,
integers and arrays, several others are supported
to aid the search engine find relevant results.

{
  "mappings": {
    "properties": {
      "source.ip": {
        "type": "ip"
      },
      "source.geo.location": {
        "type": "geo_point"
      }
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)

OpenSearch is a complex beast.

Takes quite a bit of expertise
to manage and use.

Don't feel bad if you're
getting confused.

We'll focus on the use-case of
centralized logging.

© Course authors (CC BY-SA 4.0) - Image: © Jan Helebrant (CC0 1.0)

Let's move on, shall we?

Next up is data ingestion!

© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)

Logstash

Helps us to build log processing pipelines!

Supports wide range of "inputs", "filters"
and "outputs" to enable centralized logging
in heterogeneous IT environments.

Development led by Elastic,
available under an open-source license.

© Course authors (CC BY-SA 4.0) - Image: © Kevin Dooley (CC BY 2.0)

What is a pipeline?

Pipelines consist of three stages:

  1. Input: Specify how to receive/collect events
  2. Filter: Manipulate, enrich and drop events
  3. Output: Do something with processed events

The filter stage effectively acts as a
script being executed for each log event.

A Logstash instance can run one or more
processing pipelines simultaneously.
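
As a minimal sketch (assuming logstash is available on
your PATH), a pipeline with all three stages can be as
small as:

$ cat > minimal.conf <<'EOF'
input { stdin { } }
filter { mutate { add_tag => ["minimal_example"] } }
output { stdout { codec => rubydebug } }
EOF

$ echo "Hello log" | logstash -f minimal.conf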

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Input stage

A single pipeline can have one or more
inputs configured, such as:

  • HTTP(S)
  • Syslog
  • Netflow / IPFIX
  • S3 object storage
  • Redis / Kafka
  • Scheduled command execution
  • Twitter/X feed
© Course authors (CC BY-SA 4.0) - Image: © Gytis B (CC BY-SA 2.0)

Filter stage

Conditionally execute zero or more
filter plugins depending on event
content or external factors.

Can look a lot like a script with
if-else conditions and "function calls"
to plugins for manipulating/enriching
log events.

© Course authors (CC BY-SA 4.0) - Image: © Julie Cotinaud (CC BY-SA 2.0)
Some commonly used filter plugins:

  • mutate: Rename/normalize fields, change casing, remove fields
  • drop: Stop processing/forwarding of event
  • grok: Use regex to match events and extract field data
  • cipher: Pseudonymization of PII/credentials, data decoding
  • ruby: Whatever you can think of!
© Course authors (CC BY-SA 4.0) - Image: © Julie Cotinaud (CC BY-SA 2.0)

Output stage

A single pipeline can have one or more
outputs configured, such as:

  • File
  • Elasticsearch / OpenSearch
  • Syslog / GELF / Logstash
  • Email / IRC / Slack

Conditional statements can be used to
control which output is used based on
event content.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Still a bit confused?

Let's peek at a real example
from the lab environment!

© Course authors (CC BY-SA 4.0) - Image: © Peter Black (CC BY-SA 2.0)
input {
  # JSON objects POSTed via HTTP
  http {
    type => "json_events"
    port => 8080
    codec => json
  }

  # Events from logging agents
  beats {
    type => "local_events"
    port => 5044
  }
  
[...]
© Course authors (CC BY-SA 4.0) - Image: © Raphaël Vinot (CC BY 2.0)
filter {

  # Parse NGINX web server logs
  if [event][dataset] == "nginx.access" {
    # Tag added to all web server access logs,
    # regardless of server software used
    mutate {
      add_tag => [
        "web_server_access"
      ]
    }

    # Use Grok (regular expressions on steroids)
    # to extract fields from event.
    grok {
      match => {
        "message" => "%{IPORHOST:[source][ip]} - \
        %{DATA:user} \[%{HTTPDATE:time}\] [...]"

[...]
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)
[...]

  # Parse time from request and replace existing
  # timestamp, as the former is based on when
  # the log event was picked up by the shipping
  # agent and not when the request actually
  # happened
  date {
    match => ["time", "dd/MMM/YYYY:HH:mm:ss Z"]
    target => "@timestamp"
    remove_field => "time"
  }

[...]
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)
[...]

  # Tag and normalize pre-structured IIS web server logs
  if [event][provider] == "Microsoft-Windows-IIS-Logging" {
    # Normalize field names by creating copies and tag as
    # web server log
    mutate {
      add_tag => [
        "web_server_access"
      ]
  
      # We're making a copy of the fields instead of changing
      # their names, as some already existing queries may
      # rely upon them
      copy => {
        "[winlog][event_data][c-ip]" => "[source][ip]"
        "[winlog][event_data][csUser-Agent]" => "raw_user_agent"
        "[winlog][event_data][sc-status]" => "response_code"
      }
  
[...]
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)
[...]

  # If event is a web server access log, parse
  # user agent string and extract information,
  # such as browser and operating system version
  if ("web_server_access" in [tags]) {
    useragent {
      source => "raw_user_agent"
      target => "user_agent"
    }
  }
  
  # If event contains a field called "source.ip",
  # look up the IP address in a GeoIP database to find
  # information about its approximate location
  if [source][ip] {
    geoip {
      source => "[source][ip]"
    }
  }

[...]
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)
[...]

  # The remaining parts of the filter section
  # are just used to control which index
  # the event will be stored in
  if ("web_server_access" in [tags]) {
    mutate {
      add_field => {
        "index_suffix" => "web_servers"
      }
    }
    	
  } else if [type] == "json_events" {
    mutate {
      add_field => {
        "index_suffix" => "json" 
      }
    }

[...]
© Course authors (CC BY-SA 4.0) - Image: © Rod Waddington (CC BY-SA 2.0)
output {
  opensearch {
    hosts => ["https://opensearch:9200"]

    # Use variable specified during event
    # filtering and date expression to
    # control which index is used for data
    # storage. The date expression will
    # result in new indexes being created
    # each day, making rotation/retention
    # easier to control
    index => "logs-%{index_suffix}-%{+YYYY.MM.dd}"

    user => "logger"
    password => "G0d="
    ssl => true
    ssl_certificate_verification => false
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Nacho Jorganes (CC BY-SA 2.0)

While extremely flexible, Logstash is
quite complex and resource hungry.

Alternatives exist, such as
OpenSearch Data Prepper
and Fluentd.

© Course authors (CC BY-SA 4.0) - Image: © Jorge Franganillo (CC BY 2.0)

How do we get our logs to Logstash?

Drop 'em Beats!

© Course authors (CC BY-SA 4.0) - Image: © Sbmeaper1 (CC0 1.0)

Beats

Family of light-weight log agents.

Filebeat, Winlogbeat, Auditbeat,
Metricbeat, Packetbeat, Heartbeat...

Development led by Elastic,
available under an open-source license.

We'll talk more about these later!

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

I'm tired of staring at text
in the terminal!

Let's have a look at
OpenSearch Dashboards.

© Course authors (CC BY-SA 4.0) - Image: © Fredrik Rubensson (CC BY-SA 2.0)

OpenSearch Dashboards

Fork of Elastic's "Kibana".

Web application that exposes analytics
capabilities and provides several data
visualization features.

The main interface humans use for
interaction with OpenSearch.

© Course authors (CC BY-SA 4.0) - Image: © NASA/Chris Gunn (CC BY 2.0)

We've just dipped our toes so far.

During the rest of the course we'll
continue exploring!

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

OpenSearch basics

Let's search that data!

© Course authors (CC BY-SA 4.0) - Image: © Steven Kay (CC BY-SA 2.0)

If you have an OpenSearch instance running,
chances are that you wanna make some searches.

We'll look at some common use-cases and how
its searching super-powers can help us.

(We'll start with general usage before
getting into logging-specific considerations)

© Course authors (CC BY-SA 4.0) - Image: © Steven Kay (CC BY-SA 2.0)
{
  "cve_identifier": "CVE-2023-20273",
  "description": "Management interface code injection",
  "cvss_score": 7.2,
  "included_in_kev": true,
  "category": "Remote code execution",
  "date_published": "2023-10-25",
  "date_updated": "2023-11-15",
  "affected_software": [
    "Cisco IOS",
    "Cisco IOS XE"
  ]
}
$ for CVE_FILE in CVE-*; do
  curl "${BASE_URL}/myvulns/_doc/${CVE_FILE}?pretty" \
  --request PUT \
  --header "Content-Type: application/json" \
  --data @${CVE_FILE}
done
© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)
$ curl "${BASE_URL}/myvulns/_search?pretty" --request GET
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    
[...]
© Course authors (CC BY-SA 4.0) - Image: © Mathias Appel (CC0 1.0)
    
[...]

    "hits" : [
      {
        "_index" : "myvulns",
        "_id" : "CVE-2021-1675",
        "_score" : 1.0,
        "_source" : {
          "cve_identifier" : "CVE-2021-1675",
          "description" : "Code injection in print spooler",
          "cvss_score" : 9.3,
          "included_in_kev" : true,
          "category" : "Remote code execution",
          "date_published" : "2021-06-08",
          "date_updated" : "2022-08-01",
          "affected_software" : [
            "Microsoft Windows"
          ]
        }
[...]
© Course authors (CC BY-SA 4.0) - Image: © Mathias Appel (CC0 1.0)
$ curl \
  "${BASE_URL}/myvulns/_doc/CVE-2023-36036?pretty" --request GET
{
  "_index" : "myvulns",
  "_id" : "CVE-2023-36036",
  "_version" : 1,
  "_seq_no" : 4,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "cve_identifier" : "CVE-2023-36036",
    "description" : "Flaw in Windows Cloud files driver",
    "cvss_score" : 7.8,
    "included_in_kev" : true,
    "category" : "Privilege escalation",
    "date_published" : "2023-11-14",
    "date_updated" : "2023-11-14",
    "affected_software" : [
      "Microsoft Windows"
    ]
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Jason Hall (CC BY 2.0)
$ curl \
  "${BASE_URL}/myvulns/_doc/CVE-2023-36036/_update" \
  --request POST \
  --header "Content-Type: application/json" \
  --data '{"doc": {"cvss_score": 7.5}}'
{
  "_index" : "myvulns",
  "_type" : "_doc"
  "_id" : "CVE-2023-36036",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}
© Course authors (CC BY-SA 4.0) - Image: © Randy Adams (CC BY-SA 2.0)

Let's get searching with
Lucene Query Language!

(AKA "Query DSL")

© Course authors (CC BY-SA 4.0) - Image: © Jeena Paradies (CC BY 2.0)
{
  "query": {
    "range": {
      "date_updated": {
        "gte": "2023-11-01"
      }
    }
  }
}
$ curl \
  "${BASE_URL}/myvulns/_search?pretty" \
  --request GET \
  --header "Content-Type: application/json" \
  --data @query.json
© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)
[...]

"hits" : {
  "total" : {
    "value" : 3,
    "relation" : "eq"
  },
  "max_score" : 1.0,
  "hits" : [
    {
      "_index" : "myvulns",
      "_id" : "CVE-2023-20273",
      "_score" : 1.0,
      "_source" : {
        "cve_identifier" : "CVE-2023-20273",
        "description" : "Management interface code injection",
        "cvss_score" : 7.2,
        "included_in_kev" : true,
        "category" : "Remote code execution",
        "date_published" : "2023-10-25",
        "date_updated" : "2023-11-15",
        "affected_software" : [
          "Cisco IOS",
          "Cisco IOS XE"
        ]
      }
    },

[...]
© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)
{
  "query": {
    "match": {
      "affected_software": {
        "query": "windows"
      }
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Jack Lawrence (CC BY 2.0)
[...]

"hits" : {
  "total" : {
    "value" : 2,
    "relation" : "eq"
  },
  "max_score" : 1.5532583,
  "hits" : [
    {
      "_index" : "myvulns",
      "_id" : "CVE-2021-1675",
      "_score" : 1.5532583,
      "_source" : {
        [...]
        "affected_software" : [
          "Microsoft Windows"
        ]
      }
    },
      
[...]
© Course authors (CC BY-SA 4.0) - Image: © Jack Lawrence (CC BY 2.0)
{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "date_published": {
              "gte": "2018",
              "lte": "2022"
            }
          }
        }
      ],
      "should": [
        {
          "term": {
            "included_in_kev": {
              "value": true
            }
          }
        },
        {
          "range": {
            "cvss_score": {
              "gte": 5.5
            }
          }
        }
      ]
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 2.3254223,

[...]
© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)
[...]

"hits" : [
  {
    "_index" : "myvulns",
    "_id" : "CVE-2021-1675",
    "_score" : 2.3254223,
    "_source" : {
      "cve_identifier" : "CVE-2021-1675",
      "description" : "Code injection in print spooler",
      "cvss_score" : 9.3,
      "included_in_kev" : true,
      "category" : "Remote code execution",
      "date_published" : "2021-06-08",
      "date_updated" : "2022-08-01",
      "affected_software" : [
        "Microsoft Windows"
      ]
    }
  },

[...]
© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)
[...]

  {
    "_index" : "myvulns",
    "_id" : "CVE-2019-0233",
    "_score" : 1.0,
    "_source" : {
      "cve_identifier" : "CVE-2019-0233",
      "description" : "Faulty validation in file upload",
      "cvss_score" : 5.0,
      "included_in_kev" : false,
      "category" : "Denial of service",
      "date_published" : "2020-09-14",
      "date_updated" : "2022-04-18",
      "affected_software" : [
        "Apache Struts",
        "Oracle MySQL Enterprise Monitor",
        "Oracle Financial Services Data Hub"
      ]
    }
  }

[...]
© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Searches in OpenSearch are
"queries", "aggregations"
or a combination of both.

Queries return matching documents.

Aggregations returns statistics
about document fields.

They can be combined to filter
data for statistical analysis.

© Course authors (CC BY-SA 4.0) - Image: © Simon Claessen (CC BY-SA 2.0)
{
  "aggs": {
    "vulnerability_categories": {
      "terms": {
        "field": "category"
      }
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Lisa Brewster (CC BY-SA 2.0)
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Text fields are not optimised for operations
                    that require per-document field data like
                    aggregations and sorting, so these operations
                    are disabled by default. Please use a
                    keyword field instead. [...]

© Course authors (CC BY-SA 4.0) - Image: © Lisa Brewster (CC BY-SA 2.0)
$ curl "${BASE_URL}/myvulns/_mapping?pretty" --request GET
{
  "myvulns" : {
    "mappings" : {
      "properties" : {
        [...]
      
        "cvss_score" : {
           "type" : "float"
        },
        "date_published" : {
          "type" : "date"
        },
        "date_updated" : {
          "type" : "date"
        },
        "category" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
        }
[...]
© Course authors (CC BY-SA 4.0) - Image: © Lisa Brewster (CC BY-SA 2.0)
{
  "aggs": {
    "vulnerability_categories": {
      "terms": {
        "field": "category.keyword"
      }
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Lisa Brewster (CC BY-SA 2.0)
[...]

"aggregations" : {
   "vulnerability_categories" : {
     "doc_count_error_upper_bound" : 0,
     "sum_other_doc_count" : 0,
     "buckets" : [
       {
         "key" : "Remote code execution",
         "doc_count" : 3
       },
       {
         "key" : "Privilege escalation",
         "doc_count" : 2
       },
       {
         "key" : "Authentication bypass",
         "doc_count" : 1
       },
       {
         "key" : "Cross-site scripting",
         "doc_count" : 1
       },
       {
         "key" : "Denial of service",
         "doc_count" : 1
       }
     ]

[...]
© Course authors (CC BY-SA 4.0) - Image: © Lisa Brewster (CC BY-SA 2.0)
{
  "query": {
    "match": {
      "affected_software": {
        "query": "Juniper Junos"
      }
    }
  },
  "aggs": {
    "vulnerability_categories": {
      "terms": {
        "field": "category.keyword"
      }
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)
[...]

"aggregations" : {
  "vulnerability_categories" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "Authentication bypass",
        "doc_count" : 1
      },
      {
        "key" : "Remote code execution",
        "doc_count" : 1
      }
    ]
    
[...]
© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)

Besides Lucene Query Language,
OpenSearch provides plugins for other ways
to express your searches/aggregations.

Depending on your preferences, you can use
Dashboard Query Language,
Pipe Processing Language or
Structured Query Language.

Under the hood, these get translated to
LQL with varying degrees of success.

(Sigma is also an option!)

© Course authors (CC BY-SA 4.0) - Image: © Wendelin Jacober (CC0 1.0)

Are you missing some of that eye candy?

Let's do some searches and visualizations
in OpenSearch Dashboards!

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

Wrapping up

Don't be scared, you'll have plenty
of time to get friendly with OpenSearch.

In your day-to-day digging, DQL is
likely the best choice. For advanced
queries and aggregations, learning
LQL is a worthwhile investment.

Having a hard time finding documentation
and guides? Try Googling for Elasticsearch.

© Course authors (CC BY-SA 4.0) - Image: © Guilhem Vellut (CC BY 2.0)

Lab: OpenSearch

Data filtering and visualization

© Course authors (CC BY-SA 4.0) - Image: © Qubodup (CC BY 2.0)

Lab description

Graded exercise to use Logstash for data extraction and OpenSearch Dashboards
for visualization.

For detailed instructions, see:
"resources/labs/opensearch/README.md".

Remember to download the latest version of
the resources archive! ("log.zip")

© Course authors (CC BY-SA 4.0) - Image: © Qubodup (CC BY 2.0)

Enriching logs

Aiding our data analysis

© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

"Enrichment" is the process of improving
the value of our logs.

Often this means providing useful context
for analysts and machines alike.

We've already played around with adding
GeoIP information.

Let's look at some more examples and
how to implement them in OpenSearch.

© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

What about that source/dest?

  • IP reputation
  • IP type (residential, cloud, proxy, etc.)
  • Current host patch level
  • Vulnerability scan and/or Shodan results
  • All kinds of CMDB data!
© Course authors (CC BY-SA 4.0) - Image: © Enrique Jiménez (CC BY-SA 2.0)

Let's not forget humans!

  • Role description
  • Employment location / Timezone
  • Occurrence in data leaks
  • Contact information
© Course authors (CC BY-SA 4.0) - Image: © Randy Adams (CC BY-SA 2.0)

Enrichment can be performed during
ingestion or at search-time.

Like with field parsing, both have
their pros/cons.

Current relevance VS Historic accuracy.

"This IP address is used by evildoers now"
VS
"This IP address was used by evildoers then".

© Course authors (CC BY-SA 4.0) - Image: © Asparukh Akanayev (CC BY 2.0)

Useful filter plugins

  • GeoIP and user agent
  • DNS (forward/reverse lookups)
  • "Translate"
  • JDBC and Memcached
  • HTTP client (for APIs)

...and as always, "ruby"!

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Why may DNS be interesting?

# Forward lookup
$ host suspicious.example.com

suspicious.example.com has address 93.184.215.14
suspicious.example.com has IPv6 address
2606:2800:21f:cb07:6820:80da:af6b:8b2c

# Reverse lookup
$ host 93.184.215.14

14.215.184.93.in-addr.arpa domain name pointer
suspicious.example.com.
© Course authors (CC BY-SA 4.0)

Erghh - less talk, more examples!

© Course authors (CC BY-SA 4.0) - Image: © Mike Grauer Jr (CC BY 2.0)

/var/ioc/evil_ip.csv ("key-value")

157.245.96.121,Observed in logs during 2025 Xmplify incident
185.120.19.98,Associated with Explum spear phishing campaign
194.61.40.74,Has been trying to brute-force our VPN for years!

Logstash filter pipeline

[...]

if [source][ip] {
  translate {
    source => "[source][ip]"
    target => "ip_related_to_incident"
    dictionary_path => "/var/ioc/evil_ip.csv"
  }
}

[...]
© Course authors (CC BY-SA 4.0) - Image: © Theo Crazzolara (CC BY 2.0)
[...]

"must": [
  {
    "match_phrase": {
      "tags.keyword": "web_server_access"
    }
  },
  {
    "exists": {
      "field": "ip_related_to_incident"
    }
  }
]

[...]
© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)
[...]

"hits" : [
  {
    "_index" : "logs-web_servers-2025.10.19",
    "_id" : "6C0B74sB7PKVx7m-L2xx",
    "_score" : 1.0048822,
    "_source" : {
      "url" : "/internal/nuke_control.aspx",
      "ip_related_to_incident" : "Associated with Explum spear phishing campaign",
      "source" : {
        "ip" : "185.120.19.98",
      [...]
© Course authors (CC BY-SA 4.0)

While OpenSearch relies heavily on
parsing/enrichment during ingestion,
there are some neat things we can do
at search-time.

© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)
{
  "known_evil_ip_addresses": [
    "34.76.96.55",
    "198.235.24.39",
    "157.245.96.121",
    "143.198.117.36"
  ],
  "scripted_http_clients": [
    "curl",
    "Go-http-client",
    "Python Requests",
    "Nmap Network Scanner"
  ]
}
$ curl \
  "${BASE_URL}/mylookupdata/_doc/ioc" \
  --request PUT --data @ioc.json \
  --header 'Content-Type: application/json'
© Course authors (CC BY-SA 4.0) - Image: © Lord Jaraxxus (CC BY-SA 4.0)
{
  "query": {
    "bool": {
      "must": [
        {
          "match_phrase": {
            "tags.keyword": "web_server_access"
          }
        },
        {
          "terms": {
            "source.ip": {
              "index": "mylookupdata",
              "id": "ioc",
              "path": "known_evil_ip_addresses"
            }
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "raw_user_agent": {
              "query": "CensysInspect"
            }
          }
        }
      ],
      "should": [
        {
          "terms": {
            "user_agent.name": {
              "index": "mylookupdata",
              "id": "ioc",
              "path": "scripted_http_clients"
            }
          }
        }
      ]
    }
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)
[...]

   "must": [
     {
       "match_phrase": {
         "tags.keyword": "web_server_access"
       }
     },
     {
       "terms": {
         "source.ip": {
           "index": "mylookupdata",
           "id": "ioc",
           "path": "known_evil_ip_addresses"
         }
       }
     }
   ],

[...]
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)
[...]

  "must_not": [
    {
      "match": {
        "raw_user_agent": {
          "query": "CensysInspect"
        }
      }
    }
  ],

[...]
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)
[...]

  "should": [
    {
      "terms": {
        "user_agent.name": {
          "index": "mylookupdata",
          "id": "ioc",
          "path": "scripted_http_clients"
        }
      }
    }
  ]

[...]
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)
[...]

  "hits" : {
    "total" : {
      "value" : 28,
      "relation" : "eq"
    },
    "max_score" : 2.0053382,
    "hits" : [
      {
        "_index" : "logs-web_servers-2025.10.27",
        "_id" : "53JE6osBQrucVyA5EqK1",
        "_score" : 2.0053382,
        "_source" : {
          "request_method": "GET"
          "request_path" : "/admin.php",
          "raw_user_agent" : "curl/8.1.2",
          "source" : {
            "ip" : "143.198.117.36",
            "geo" : {
              "country_iso_code" : "US",
              "continent_code" : "NA",
              "country_name" : "United States"
            }

[...]
© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)

"Search pipelines" and Painless scripts
may be able to help, but a bit out of
scope for this course.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Elastic has since the fork added a
feature to Elasticsearch called
"runtime fields".

Returns selected fields from another
document in query results, not just
checking if they contain values.

Acts a bit like JOIN statements do in
traditional SQL databases.

Very useful for enrichment and OpenSearch
is working on a similar solution.

© Course authors (CC BY-SA 4.0) - Image: © Wendelin Jacober (CC0 1.0)
{
  "query": {
    "match": {
      "ids_alert_title": {
        "query": "exploit attempt"
      }
    }
  },
  "runtime_mappings": {
    "cve_details": {
      "type": "lookup",
      "target_index": "myvulns",
      "input_field": "related_cve",
      "target_field": "id", 
      "fetch_fields": [
        "cvss_score",
        "description",
        "included_in_kev"
      ]
    } 
  }
}
© Course authors (CC BY-SA 4.0) - Image: © Wendelin Jacober (CC0 1.0)

The middle path

input {
  opensearch {
    hosts => ["https://opensearch:9200"]

    schedule => "00 03 * * *"
    index => "logs-*"
    query => '{"query": {"match_all": {}}}'
  }
}

[...]

(Refresh stored enrichment information
on a schedule - best of both worlds?)

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC0 1.0)

Beware of the cost

Doing all that processing ain't free
and will add latency.

Increased query and storage costs.

Complexity in ingestion pipelines
increase the risk of disturbances.

© Course authors (CC BY-SA 4.0) - Image: © OLCF at ORNL (CC BY 2.0)

Conclusion

You've hopefully tasted the
sweet fruit of possibilities!

Most organizations have tons
of potentially useful data
laying around - let's use it!

Computers are cheap,
humans are not.

© Course authors (CC BY-SA 4.0) - Image: © M. Zamani, ESO (CC BY 2.0)

Reporting and alerting

Automating the boring stuff

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

You ain't got time nor interest
enough to stare at those logs
all day long.

Alerting and scheduled reporting
are two common methods that make
computers do the heavy lifting.

What are some relevant considerations?

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Alerts are typically triggered by
scheduled searches, which are called
"monitors" in OpenSearch.

Triggered if search returns results
or if result is above/below threshold.

"Alert me if a log contains evil IPs"
VS
"Alert me if the number of failed logins
for a user are >5 during 10 minutes".

Sliding time-span rather than "real-time",
required for aggregations.
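
A rough sketch of creating such a monitor via the alerting
plugin's REST API (names, index pattern and threshold are
made up, and the exact schema varies between versions):

$ curl "${BASE_URL}/_plugins/_alerting/monitors" \
  --request POST \
  --header "Content-Type: application/json" \
  --data '{
    "type": "monitor",
    "name": "failed-logins-per-user",
    "enabled": true,
    "schedule": {"period": {"interval": 10, "unit": "MINUTES"}},
    "inputs": [{"search": {
      "indices": ["logs-*"],
      "query": {"query": {"match": {"event_type": "login_failed"}}}
    }}],
    "triggers": [{
      "name": "too-many-failures",
      "severity": "2",
      "condition": {"script": {
        "source": "ctx.results[0].hits.total.value > 5",
        "lang": "painless"
      }},
      "actions": []
    }]
  }'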

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Let's talk about
the Zen of alerting.

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

Alert fatigue

It's a serious problem.

Sucks the fun out of life and
may result in real problems
being ignored/missed.

Just ask Target.

Significant time
should be dedicated to
tweaking thresholds and
minimizing false-positives.

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Is this a good time to go
off-topic and talk a bit about
true/false-positives/negatives?

Probably not, but let's do it anyway.

© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)

Taming notifications

Minimize interruptions and
context-switching.

Can it wait? Can / Must I do something now?
Some predictive extrapolation may help.

Think about alert priorities, target groups,
alert methods and scheduling.

Focus on end-to-end tests and
high-signal alerts to minimize
risk of false-positives.
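
A minimal sketch of such predictive
extrapolation in Python (two utilization
samples, linear growth assumed):

# Hours until 100% utilization, given two samples
def hours_until_full(percent_now, percent_then, hours_between):
    growth = (percent_now - percent_then) / hours_between
    if growth <= 0:
        return None  # Not growing - nothing to alert about

    return (100 - percent_now) / growth

# 80% -> 84% over 24 hours: full in roughly 96 hours
print(hours_until_full(84, 80, 24))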

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

Actionable information

What is the recipient expected
to do with the information?

Explain why the alert was
triggered and what it could mean.

Link to documentation, run-book
and other relevant guidance.

Even better: consider some
automated remediation.

© Course authors (CC BY-SA 4.0) - Image: © Eric Kilby (CC BY-SA 2.0)

Interrupt-driven work should be
avoided whenever possible.

Scheduled reports may be a
good alternative.

Aid recurring tasks such as
capacity planning and
threat hunting.

Export relevant information
to external systems.

Show that we are actually doing
things without noisy alerts.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Let's look at some example alerts
and meditate upon them!

© Course authors (CC BY-SA 4.0) - Image: © Nirvana Studios (CC BY 4.0)

Time: Sunday 02:42
Recipient: IT department
Notification method: Email
Count last 7 days: 1337
Message content:

Storage utilization on host "sat-06" is 84%.

© Course authors (CC BY-SA 4.0) - Image: © Bruno Sanchez-Andrade Nuño (CC BY 2.0)

Time: Sunday 02:42
Recipient: Satellite operations
Notification method: Ticket
Count last 7 days: 1
Message content:

Storage utilization on host "sat-06" is 84%.

© Course authors (CC BY-SA 4.0) - Image: © Bruno Sanchez-Andrade Nuño (CC BY 2.0)

Severity: Warning!

Storage utilization on host "sat-06" is 84%.

If max capacity is reached, the system may
become unstable and unable to operate.

Automated remediation was not able to
reclaim sufficient disk space.

Based on predictive extrapolation, the max
capacity will be reached on Tuesday 12:12
at the current rate.

For more information about this alert and
troubleshooting guidance, see https://....

© Course authors (CC BY-SA 4.0) - Image: © Bruno Sanchez-Andrade Nuño (CC BY 2.0)

Any other suggestions?

© Course authors (CC BY-SA 4.0) - Image: © Bruno Sanchez-Andrade Nuño (CC BY 2.0)

Let's have a look at how alerts
and reporting can be configured
in OpenSearch Dashboards!

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Wrapping up

Taming alerts and notifications
is a continuously ongoing battle,
not a one-off effort.

Just giving it some thought
will take you quite far!

If you have multiple tools producing
alerts, have a look at tools like
PagerDuty and Grafana OnCall.

© Course authors (CC BY-SA 4.0) - Image: © Adam Lusch (CC BY-SA 2.0)

Linux auditing

Peaking beyond /var/log/*

© Course authors (CC BY-SA 4.0) - Image: © Micah Elizabeth Scott (CC BY-SA 2.0)

Applications on Linux commonly produce
security related log events and store
them in text-files or syslog.

Pluggable Authentication Modules
provides logging of (most) login attempts.

What about when sensitive configuration
files are modified or suspicious
processes are executed?

Let's look at some more options for
inspection-based auditing on Linux.

© Course authors (CC BY-SA 4.0) - Image: © Micah Elizabeth Scott (CC BY-SA 2.0)

We'll talk about....

  • FIM and inotify
  • SELinux and AppArmor
  • Audit framework
  • eBPF and kprobes
© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

FIM

File Integrity Monitoring.

Detects attempts to
Create, Read, Update and Delete
important files/directories.

Good fit for Linux since "everything is a file"*.

Typically implemented by using a database
of file hashes and scheduled checking.
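
A minimal sketch of the idea in Python
(the watched paths and database location
are made up for illustration):

import hashlib
import json

WATCHED = ['/etc/passwd', '/etc/ssh/sshd_config']
DATABASE = '/var/lib/myfim/baseline.json'

def checksum(path):
    # Hash file contents to detect modifications
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

def build_baseline():
    with open(DATABASE, 'w') as f:
        json.dump({path: checksum(path) for path in WATCHED}, f)

def check_baseline():
    # Run on a schedule, for example through cron
    with open(DATABASE) as f:
        baseline = json.load(f)

    for path, digest in baseline.items():
        if checksum(path) != digest:
            print(f'WARNING: {path} has been modified!')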

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

inotify / fanotify

Features in the Linux kernel
to monitor file access.

Watchers can be registered to notify
a user-space application about
any CRUD operation.

Provides ability to monitor reads and
get instant notice without expensive
scheduled hashing.

Similar to "object access" auditing
on Windows.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)
$ sudo inotifywatch \
  --event access --event modify --event delete \
  --timeout 30 /etc/super_sensitive.conf 

Establishing watches...
Finished establishing watches, now collecting statistics.

total  access  modify  delete  filename
7      5       1       1       /etc/super_sensitive.conf
© Course authors (CC BY-SA 4.0)

SELinux and AppArmor

Security Enhanced Linux.

Extends the basic access control system
consisting of file permissions.

Policies define what a user or program
can do on the system, like opening
network sockets or spawning new processes.

Both are examples of
Linux Security Modules.

"Permissive mode" can be used to only log
(and not block) policy violations.

© Course authors (CC BY-SA 4.0) - Image: © Kārlis Dambrāns (CC BY 2.0)
AVC avc: denied  { name_connect } for pid=1338
comm="nginx" dest=8080
scontext=system_u:system_r:httpd_t:s0
tcontext=system_u:object_r:http_cache_port_t:s0
tclass=tcp_socket permissive=0

[...]

AVC avc: denied { read } for name="sdcard" dev="tmpfs" ino=6474
scontext=u:r:untrusted_app_29:s0:c244,c256,c512,c768
tcontext=u:object_r:mnt_sdcard_file:s0
tclass=lnk_file permissive=0 app=com.example.evilapp

[...]

AVC avc: denied  { execheap } for pid=3675
comm="chromium-browse"
scontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
tclass=process permissive=0
© Course authors (CC BY-SA 4.0) - Image: © Kārlis Dambrāns (CC BY 2.0)

Audit framework

Feature in Linux kernel for activity auditing.

Designed to primarily monitor security related events.

Generated audit records can be consumed by a user-space application for processing/storage.

Only supports one consumer at a time*.

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

Auditd

Historically the main consumer of
audit framework events.

Provides "rule configuration" and
logging to file/remote hosts.

Monitors system calls, file access
and "various interesting things".

Performs basic event correlation,
allowing user activity tracing
even if tools like sudo are used.

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)
type=USER_CMD msg=audit(1700115169.839:611):
pid=8527 uid=1900 auid=1900 ses=1
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
msg='cwd="/var/www/html/cgi.bin" cmd="whoami" exe="/usr/bin/sudo"
terminal=pts/3 res=success' UID="webapp" AUID="webapp"

[...]

type=NETFILTER_CFG msg=audit(1700164312.524:77):
table=nat:2 family=2 entries=7 op=nft_register_chain pid=1337
subj=system_u:system_r:iptables_t:s0 comm="nft-manager"

[...]

type=ANOM_PROMISCUOUS msg=audit(1700115655.202:694):
dev=wlan0 prom=256 old_prom=0
auid=901 uid=0 gid=0 ses=1 AUID="persbrandt" UID="root" GID="root"
© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)
-D
-b 8192
-f 1
-a exit,always -F arch=b32 -S mount -S umount -k mount
-a exit,always -F arch=b64 -S mount -S umount2 -k mount
-w /bin/su -p x -k priv_esc
-w /usr/bin/sudo -p x -k priv_esc
-w /usr/sbin/stunnel -p x -k stunnel
-w /etc/cron.weekly/ -p wa -k cron
-w /etc/shadow -k etcpasswd
-a exit,always -F arch=b64 -F euid=0 -S execve -k rootexec
-a exit,always -F arch=b32 -F euid=0 -S execve -k rootexec
-w /etc/sudoers -p rw -k priv_esc
-e 2
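
Events matching a rule key can later be
queried using the ausearch tool, for example:

$ sudo ausearch --key priv_esc --start today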
© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

auditbeat and osquery
are other audit framework consumers.

(We'll get back to auditbeat later!)

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

kprobes and eBPF

kprobes can be used to dynamically instrument
most kernel functions/routines.

eBPF enables developers to create small programs
that can be executed in "kernel-space" when
hooked events occur and do anything*!

Starting to replace audit framework, LSM and
similar features due to its flexibility.
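
As a small taste, a bpftrace one-liner
(assuming bpftrace is installed) that logs
every program execution on the system:

$ sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve
    { printf("%s executed %s\n", comm, str(args->filename)); }'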

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Notable users

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Many amazing, such wow!

There are however some downsides,
as always...

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Ain't all chocolate and roses

Auditing all system activity requires
a bunch of CPU cycles and storage space.

As with other inspection-based logging,
it ain't always easy to understand
why something is happening.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Wrapping up

We'll play with auditbeat
in the next lab.

If you can't wait, I recommend
installing and configuring Falco
to detect if a Docker container
tries to spawn a shell/initiate
a network connection.

© Course authors (CC BY-SA 4.0) - Image: © Jorge Franganillo (CC BY 2.0)

Course recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Bixentro (CC BY 2.0)

Our logging stack

Open-source fork of the Elastic/"ELK" stack.

The course lab environment consists of:

  • Beats for collecting and transferring log data from servers/producers
  • Logstash for parsing, filtering, normalization and enrichment
  • OpenSearch for log storage and analysis
  • OpenSearch Dashboards for GUI and visualization capabilities
© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Elastic Beats

Family of "logging agents" built on
the "libbeat" code library.

Each has a specific task -
read logs from text files (filebeat),
read logs from Event Log (winlogbeat),
record network traffic (packetbeat)...

Responsible for shipping logs
from our servers.

© Course authors (CC BY-SA 4.0) - Image: © Jesse James (CC BY 2.0)

Logstash

Receives logs from Beats, Syslog, HTTP...

Filter, parse/extract, normalize and
enrich log events.

Logs pass through scriptable "pipelines",
which consist of three stages:
"input", "filter" and "output".

Plugins ("functions") are provided to
ingest, manipulate and store data.

Writes log events to one or more
OpenSearch indices.

© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

OpenSearch

Search engine/"document database"
based on Apache Lucene.

Fork of Elastic's Elasticsearch -
much of the documentation/tutorials
are still usable.

Used to persistently store and
analyze/monitor log events.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Log events are stored as "documents".

These are tied to an "index", which can be used to group similar documents.

An index consists of one or more "shards" that are used to actually store the data on disk.

Shards can be spread over multiple nodes to improve resiliency and search performance.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Query languages

Besides Lucene Query Language,
OpenSearch provides plugins for other ways
to express your searches/aggregations.

Depending on your preferences, you can use
Dashboard Query Language,
Pipe Processing Language or
Structured Query Language.

LQL and DQL are the most commonly used.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Search types

Searches in OpenSearch are
"queries", "aggregations"
or a combination of both.

Queries return matching documents.

Aggregations return statistics
about document fields.

They can be combined to filter
data for statistical analysis.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Scored results

Query results ("hits") can be scored
to help us find the most relevant
matches first.

Methods like "bool" queries can be
utilized to affect the score
("must", "must_not", "should").

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Aggregations provide statistical insights
into our documents/log events.

Most basic is to count number
of documents matching a filter.

"avg" can be used to calculate the
average value of a specific field
(metric aggregation example).

"terms" works a bit like uniq -c:
counts the occurences of unique
field values in documents
(bucket aggregation example)
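
For example, a sketch combining both
aggregation types (field names assumed):

{
  "size": 0,
  "aggs": {
    "logins_per_user": {
      "terms": { "field": "user.keyword" }
    },
    "average_processing_time": {
      "avg": { "field": "processing_time" }
    }
  }
}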

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Due to the way OpenSearch stores
"free text" fields, we can't
aggregate on their value.

Luckily, a field of the type
"keyword" is automatically created
for all text fields with a value
shorter than 256 characters.

To aggregate on text field "login_method",
utilize the field "login_method.keyword".

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

OpenSearch Dashboards

Web-based GUI for OpenSearch.

Query and visualize stored data.

Administration and configuration tasks.

© Course authors (CC BY-SA 4.0) - Image: © Steven Kay (CC BY-SA 2.0)

Enrichment

Aims to provide useful context for
log analysis and alerting.

Log events can be enriched at
ingestion-time, search-time
or on a scheduled basis.

Each approach has its own
pros/cons.

In our stack, enrichment is most
commonly implemented in Logstash
filter pipelines.

© Course authors (CC BY-SA 4.0) - Image: © O. Hainaut, ESO (CC BY 2.0)

The Zen of alerting involves
careful use of notifications,
verbose problem descriptions
and significant efforts to
minimize false-positives.

Great alerts provide context and
actionable information.

Scheduled reports can be used to
minimize "interrupt-driven" work,
improve planning/prognostication
and integrate with other systems
using standardized formats.

© Course authors (CC BY-SA 4.0) - Image: © Graham Drew (CC BY 2.0)

Scheduled searches used for alerts
are called "monitors" in OpenSearch.

"Triggers" can be used to configure
different severity levels/thresholds
(number of results/value of result).

"Channels" are used to send alerts
using email, Slack/Teams, etc.

"Composite monitor" check the status
of other two or more other monitor,
which can be used to minimize
notification volume/interruptions.

© Course authors (CC BY-SA 4.0) - Image: © Graham Drew (CC BY 2.0)

Advanced Linux auditing

TBD!

© Course authors (CC BY-SA 4.0) - Image: © Sbmeaper1 (CC0 1.0)

We got lots of things to cover -
let's move on!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Query languages

Alternatives for data exploration

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

Through plugins, OpenSearch provides
several different query languages
besides Lucene for querying and
aggregating documents.

Let's have a look at these!

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 3.0)

DQL

Dashboard Query Language.

Default option in OpenSearch Dashboards.

Aims to simplify common use-cases
for data filtering.

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)
# Search for documents containing
# specified string in username field
user:mallory

# Combine multiple search terms using
# conditional statements and make use
# of wildcards and numeric filters
hostname:db-*.int.example.org
and (log_level >= 5 or type:exception)
© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)

PPL

Piped Processing Language.

Comfortable for UNIX power-users
and veterans of Splunk/Logpoint.

Supports easy runtime field creation.

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)
# Query all documents in index
# pattern, filter results and
# choose specific output fields
search source=logs-auth-* 
| where status='failed'
| fields user, source_ip

# Perform search-time field
# parsing and filter results
search source=logs-auth-* 
| parse user '.+@(?<domain>.+)'
| where domain='example.com'
| fields user, source_ip
© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

SQL

Structured Query Language.

Many developers and sysadmins are
already proficient in SQL,
making it a great option.

© Course authors (CC BY-SA 4.0) - Image: © Jennifer Morrow (CC BY 2.0)
-- Treat index pattern as
-- table name, document as
-- row and field as
-- column name
SELECT user, source_ip
FROM `logs-auth-*`
WHERE status = 'failed';

-- Basic aggregation for
-- failed logins per IP
SELECT COUNT(user), source_ip
FROM `logs-auth-*`
WHERE status = 'failed'
GROUP BY source_ip;
© Course authors (CC BY-SA 4.0) - Image: © Jennifer Morrow (CC BY 2.0)

If you wanna learn more, check out the
OpenSearch documentation for
DQL and SQL/PPL

Feel like giving 'em a try?
Have a look at the query workbench
and "SQL and PPL" CLI tool.

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Exercise: Playing with SQL/PPL

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Open "Query Workbench" page in
your instance of OpenSearch Dashboards
or use the "SQL and PPL" CLI tool.

Develop queries using both SQL and PPL
to find/filter relevant log events:

  • Web server requests from Chrome browsers
  • Country and IP address for each failed Windows login attempt
  • Top 10 usernames observed during failed Windows login attempts

Send SQL and PPL query strings to
courses+log_012901@0x00.lt

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

AI and ML

Can it help us analyze logs?

© Course authors (CC BY-SA 4.0) - Image: © Eric Chan (CC BY 2.0)

Our centralized logging solution can act
as a data source for machine learning
and other types of AI.

But can it help us improve searching
and analysis?

Let's look at common use-cases and
how they're implemented in OpenSearch!

© Course authors (CC BY-SA 4.0) - Image: © Eric Chan (CC BY 2.0)

Example use-cases

  • Anomaly detection
  • Semantic queries
  • Conversational searching
© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Anomaly detection

Human brains are trained to identify
things out of the ordinary.

With a bit of work, we can make
computers do the same thing.

Enables us to sift through enormous
amounts of logs and act before a
nuisance becomes a catastrophe.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Help us identify things like...

  • Unusually high API latency
  • Web server spawning shell process
  • User from finance department logging in to database in the middle of the night

...and things we didn't know could be
interesting - that's the whole point!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

And as usual...

Computationally expensive
and quite opaque process.

Shit in, shit out -
we need a good "baseline".

Perhaps best as guidance for
development of static detection.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

We'll soon look at how anomaly
detection can be implemented
using OpenSearch.

Let's talk a bit about improving
searching and analysis first!

© Course authors (CC BY-SA 4.0) - Image: © Guilhem Vellut (CC BY 2.0)

Semantic queries

Traditionally, we've relied on
lexical/"keyword"-based searching.

Give me all logs containing the
string "authentication".

Natural Language Processing
helps us fetch more relevant results.

A good model understands the connection
between words like "authentication" and
"login"/"logout". It can also separate
the meaningful parts of a query from filler.

© Course authors (CC BY-SA 4.0) - Image: © Pyntofmyld (CC BY 2.0)

Conversational searching

Takes NLP one stage further by performing
a similar process for search results.

Often involves usage of a
Large Language Model (LLM), like ChatGPT.

Uses search results to provide answers,
not just pre-trained model data.

Context/previous dialogs should be
considered to improve the experience.

© Course authors (CC BY-SA 4.0) - Image: © Kojach (CC BY 2.0)

With that background covered,
let's look at how OpenSearch can help!

© Course authors (CC BY-SA 4.0) - Image: © John Regan (CC BY 2.0)

Managing machine learning

Most functionality is provided
by the included "ML Commons" plugin.

Ability to run (pre-trained) models
on searches and indexed documents.

May use "local" or "remote" models.

Supports "node tagging" to optimize
things like I/O performance
and GPU/accelerator access.

Primarily accessible using the API.

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Anomaly detection

Provided as a high-level feature accessible
through OpenSearch Dashboards.

The easiest one to use relies on the
unsupervised Random Cut Forest algorithm
to compute anomaly grades/confidence scores.

Let's take it for a spin!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)
(Screenshots: creating and running an anomaly detector in OpenSearch Dashboards)

© Course authors (CC BY-SA 4.0)

Things to consider

If you can't represent it as a number or aggregation,
the easy-to-use anomaly detection won't help.

Still needs quite a bit of guidance - in many cases,
that effort could be better spent on statically
configured thresholds/outliers.

But it's kinda kool?

Curious to learn more? Have a look at the
"supported algorithms" documentation.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Semantic queries

Provides pre-trained
sentence transformation models.

Processing implemented through
OpenSearch ingest and search pipelines
(not to be confused with Logstash pipelines).

Can be combined with traditional
keyword-based approaches to
create "hybrid queries".

If you wanna play around, check out the
semantic search tutorial.

© Course authors (CC BY-SA 4.0) - Image: © Eric Nielsen (CC BY 2.0)

Conversational searching

Utilize a third-party provider like
ChatGPT, Amazon Bedrock or DeepSeek.

Option to use a self-hostable solution
like Cohere ($$$).

"Experimental support" for "open models"
that may be self-hosted.

No nice "ChatGPT"-like UI provided
out-of-the-box, mainly APIs.

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Just an appetizer, I'm far from
an expert in this area!

The features are right there,
especially anomaly detection -
take them for a spin if you're interested.

© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

Instrumenting applications

Considerations for implementation

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

We've talked about the pros/cons of
inspection-based and
instrumented logging.

Let's try to put our knowledge
to good use!

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Introducing the example app

Small HTTP API to support
gift procurement during Christmas.

Kids can add an item
to their wish list.

Elves can review wish list
and add items to gift list.

Santa Claus is root and can
do whatever he pleases!

© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

Available end-points

@app.route('/api/wishes', methods=['GET', 'POST'])
def handle_wishes():
  user, privileges = authenticate(request)

[...]

@app.route('/api/gifts', methods=['GET', 'POST', 'DELETE'])
def handle_gifts():
  user, privileges = authenticate(request)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

User/Privilege configuration

users = {
  '7b58a15b': 'santa',
  '5e07deaf': 'elfie',
  'e2c853dc': 'sindy',
  '85181af2': 'greta'
}

privileges = {
  'santa': ['admin'],
  'elfie': ['review_wishes', 'add_gift'],
  'sindy': ['make_wish'],
  'greta': []
}
© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

Authentication mechanism

def authenticate(request):
  if not 'X-Key' in request.headers:
    abort(401)

  api_key = request.headers['X-Key']

  if not api_key in users.keys():
    abort(403)

  user = users[api_key]
  return user, privileges[user]
© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

Authorization control

def handle_wishes():
  user, privileges = authenticate(request)

  if request.method == 'POST':
    if 'admin' in privileges:
      pass
  
    elif not 'make_wish' in privileges:
      abort(403)

    description = request.get_json()
    set_wish(user, description)

    return Response(status=204)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

Let's implement some logging!

© Course authors (CC BY-SA 4.0) - Image: © Theo Crazzolara (CC BY 2.0)

Authentication failure logging

def authenticate(request):
  client_ip = request.remote_addr

  if not 'X-Key' in request.headers:
    print(
      f'Request without key from {client_ip}',
      file=sys.stderr)
      
    abort(401)
    
  api_key = request.headers['X-Key']
  if not api_key in users.keys():
    print(
      f'Request with invalid key from {client_ip}',
      file=sys.stderr)
     
    abort(403)

  return users[api_key], privileges[users[api_key]]
© Course authors (CC BY-SA 4.0) - Image: © Bill Smith (CC BY 2.0)

Request without key

$ curl --request GET "${BASE_URL}/api/gifts"
Request without key from 87.242.66.56
© Course authors (CC BY-SA 4.0) - Image: © Bill Smith (CC BY 2.0)

Request with invalid key

$ curl \
  --request GET "${BASE_URL}/api/gifts" \
  --header 'X-Key: hunter_2'
Request with invalid key from 23.61.227.126
© Course authors (CC BY-SA 4.0) - Image: © Bill Smith (CC BY 2.0)

Logging successful authentication

[...]

user = users[api_key]
print(
  f'Authenticated request from {client_ip} as {user}',
  file=sys.stderr)

return user, privileges[user]

[...]
Authenticated request from 104.26.0.74 as sindy
© Course authors (CC BY-SA 4.0) - Image: © Solarbotics (CC BY 2.0)

"logging" module

Utilities to aid with log creation,
filtering and rotation.

Can log to file, syslog, HTTP, etc.

Included in Python standard library.

© Course authors (CC BY-SA 4.0) - Image: © Solarbotics (CC BY 2.0)

Configuring basic logger

[...]

import logging as log

log.basicConfig(
  format='%(levelname)s: %(message)s',
  level=log.DEBUG)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Solarbotics (CC BY 2.0)

Updating logging statements

[...]

if not api_key in users.keys():
  log.warning(f'Request with invalid key from {client_ip}')
  abort(403)
  
user = users[api_key]
log.info(f'Authenticated request from {client_ip} as {user}')
return user, privileges[user]

[...]
© Course authors (CC BY-SA 4.0) - Image: © Solarbotics (CC BY 2.0)

Enjoying log levels

INFO: Authenticated request from 172.25.0.3 as elfie
WARNING: Request without key from 65.9.55.4
WARNING: Request with invalid key from 194.18.169.38
© Course authors (CC BY-SA 4.0) - Image: © Solarbotics (CC BY 2.0)

Logging application activity

While authentication attempts are
important to log, so are the user activities.

Can help us operate the service
and detect abuse.

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Adding write logging

[...]

if request.method == 'POST':
  if 'admin' in privileges:
    pass

  elif not 'make_wish' in privileges:
    log.warning((
      f'User {user} from {client_ip}'
      ' tried to add a wish to their wish list'
      ' but did not have sufficient privileges'))
    
    abort(403)
    
  description = request.get_json()
  set_wish(user, description)
  log.info((
    f'User {user} from {client_ip} added '
    f'{description} to their wish list'))
    
  return Response(status=204)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Adding read logging

[...]

elif request.method == 'GET':
  if 'admin' in privileges:
    pass

  elif not 'review_wishes' in privileges:
    log.warning((
      f'User {user} from {client_ip}'
      ' tried to get/review the wish list'
      ' but did not have sufficient privileges'))
      
    abort(403)
    
  log.info((
    f'User {user} from {client_ip}'
    ' got/reviewed wish list'))

  return wishes

[...]
© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Inspecting the result

INFO: User sindy from 104.26.1.74 added
      Wine to their wish list
WARNING: User greta from 145.235.0.55 tried
         to add a wish to their wish list
         but did not have sufficient
         privileges
WARNING: User sindy from 104.26.0.74 tried to
         get/review the wish list but did not
         have sufficient privileges
© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Templating log messages

All these format strings are getting quite repetitive.

If we want to include information such as the requesting user agent, all lines must be modified.

Let's try to solve it in a better way!

© Course authors (CC BY-SA 4.0) - Image: © USGS EROS (CC BY 2.0)

Building prefix string

def handle_wishes():
  user, privileges = authenticate(request)
  log_prefix = (
    f'{request.remote_addr} - {user} - '
    f'{request.method} - {request.path}: ')

[...]
© Course authors (CC BY-SA 4.0) - Image: © USGS EROS (CC BY 2.0)

Updating log statements

[...]

elif not 'review_wishes' in privileges:
  log.warning(
    log_prefix +
    'Tried to get/review the wish list '
    'but did not have sufficient privileges')

  abort(403)
  
log.info(log_prefix + 'Got/reviewed wish list')
return wishes

[...]
© Course authors (CC BY-SA 4.0) - Image: © USGS EROS (CC BY 2.0)

Look at those entries!

INFO: 104.26.0.74 - sindy - POST - /api/wishes:  
      Added Wine to their wish list
INFO: 172.25.0.3 - elfie - GET - /api/wishes:   
      Got/reviewed wish list
WARNING: 145.235.0.55 - greta - POST - /api/wishes:
         Tried to add a wish to their wish list
         but did not have sufficient privileges
© Course authors (CC BY-SA 4.0) - Image: © USGS EROS (CC BY 2.0)

Structured logging

We've spent lots of time talking
about its many benefits.

For Python, the freely available
"structlog" library can be used.

© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Importing/Configuring library

import structlog

structlog.configure(
  processors=[
    structlog.processors.add_log_level,
    structlog.processors.TimeStamper(fmt='iso'),
    structlog.processors.JSONRenderer(indent=2)])

log = structlog.get_logger() 

[...]
© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Binding shared data

def handle_wishes():
  user, privileges = authenticate(request)
  rlog = log.bind(
    source={
      'ip': request.remote_addr},
    user=user,
    privileges=privileges,
    method=request.method,
    path=request.path,
    has_required_privilege=False,
    user_agent=request.headers.get(
      'User-Agent', 'unknown'))

[...]
© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Producing structured logs

[...]

elif not 'make_wish' in privileges:
  rlog.warning(
    'Tried to add a wish to their wish list'
    ' but did not have sufficient privileges',
    required_privilege='make_wish')
  
  abort(403)

description = request.get_json()
set_wish(user, description)

rlog.info(
  f'Added {description} to their wish list',
  has_required_privilege=True,
  wish_list_item=description)
 
return Response(status=204)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Such structure, many wow!

{
  "source": {
    "ip": "104.26.0.74"
  },
  "user": "sindy",
  "privileges": [
    "make_wish"
  ],
  "method": "POST",
  "path": "/api/wishes",
  "has_required_privilege": true,
  "user_agent": "Firefox (Mac OS/x86_64)",
  "wish_list_item": "Wine",
  "event": "Added Wine to their wish list",
  "level": "info",
  "timestamp": "2023-11-25T14:28:42.907819Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Reid Campbell (CC0 1.0)

Things are starting to look quite nice!

I did however notice some things
while peeking at the logs...

© Course authors (CC BY-SA 4.0) - Image: © Rolf Dietrich Brecher (CC BY 2.0)
{
  "source": {
    "ip": "172.25.0.99"
  },
  "user": "santa",
  "privileges": [
    "admin"
  ],
  "method": "POST",
  "path": "/api/gifts",
  "has_required_privilege": true,
  "user_agent": "curl/8.4.0",
  "gift_item": "Gold chain",
  "gift_recipient": "santa",
  "event": "Granting gift to santa",
  "level": "info",
  "timestamp": "2023-11-25T14:58:12.807811Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Adding abuse detection

[...]

if recipient == user:
  rlog = rlog.bind(is_suspicious=True)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)
{
  "source": {
    "ip": "172.25.0.99"
  },
  "user": "santa",
  "privileges": [
    "admin"
  ],
  "method": "DELETE",
  "path": "/api/gifts",
  "has_required_privilege": true,
  "user_agent": "curl/8.4.0",
  "recipient": "sindy",
  "event": "Deleting gift grant for sindy",
  "level": "info",
  "timestamp": "2023-11-25T14:59:42.807811Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)

Improving accountability

[...]

elif request.method == 'DELETE':
  data = request.get_json()

  if not data.get('reason'):
    rlog.warning(
      'User did not specify a reason'
      ' for deleting gift grant')

    abort(403)
  
  rlog.info(
    'Deleting gift grant for '
    + data['recipient'],
    has_required_privilege=True,
    recipient=data['recipient'],
    reason=data['reason'])

[...]
© Course authors (CC BY-SA 4.0) - Image: © Brendan J (CC BY 2.0)
{
  "source": {
    "ip": "172.25.0.3"
  },
  "user": "elfie",
  "privileges": [
    "review_wishes",
    "add_gift"
  ],
  "method": "POST",
  "path": "/api/gifts",
  "has_required_privilege": true,
  "user_agent": "NetScape Explorer 0.3",
  "gift_recipient": "sindy",
  "gift_item": "Winegums",
  "event": "Granting gift to sindy",
  "level": "info",
  "timestamp": "2023-11-25T15:13:02.807811Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)
{
  "source": {
    "ip": "172.25.0.3"
  },
  "user": "elfie",
  "privileges": [
    "review_wishes",
    "add_gift"
  ],
  "method": "POST",
  "path": "/api/gifts",
  "has_required_privilege": true,
  "user_agent": "NetScape Explorer 0.3",
  "gift_recipient": "soc_analyst",
  "gift_item": "Huge raise",
  "event": "Granting gift to soc_analyst",
  "level": "info",
  "timestamp": "2023-11-25T17:58:12.102311Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

Masking sensitive data

[...]

if recipient == 'soc_analyst':
  description = '*******'

[...]
© Course authors (CC BY-SA 4.0) - Image: © Jason Thibault (CC BY 2.0)

Besides audit information,
let's take the opportunity to
implement operational metrics!

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Measuring performance

[...]

start_time = time.time()
set_gift(recipient, description)
end_time = time.time()
seconds_elapsed = end_time - start_time

rlog.info(
  f'Granting gift to {recipient}',
  processing_time=seconds_elapsed,
  has_required_privilege=True,
  gift_recipient=recipient,
  gift_item=description)

[...]
© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)
{
  "source": {
    "ip": "172.25.0.3"
  },
  "user": "elfie",
  "privileges": [
    "review_wishes",
    "add_gift"
  ],
  "method": "POST",
  "path": "/api/gifts",
  "has_required_privilege": true,
  "user_agent": "NetScape Explorer 0.3",
  "is_suspicious": true,
  "processing_time": 0.4201374053955078,
  "gift_recipient": "elfie",
  "gift_item": "Respect",
  "event": "Granting gift to elfie",
  "level": "info",
  "timestamp": "2023-11-25T18:01:32.102311Z"
}
© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

Let's wrap this up, shall we?

© Course authors (CC BY-SA 4.0) - Image: © Jonathan Brandt (CC0 1.0)

Integrity monitoring

A somewhat gentle introduction

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

So you wanna protect the integrity of
your files and perhaps the whole system?

Several options besides simple FIMs.

Let's talk about some of them!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

General considerations

Continuous or scheduled detection?

Full CRUD detection or just Create/Update/Delete?

Platform support?

Overlap with other security software?

© Course authors (CC BY-SA 4.0) - Image: © Kevin Dooley (CC BY 2.0)

FOSS providing FIM

  • Tripwire Open Source
  • Samhain
  • OSSEC / Wazuh agent
  • osquery
  • auditd
  • Auditbeat
© Course authors (CC BY-SA 4.0) - Image: © Adam Greig (CC BY-SA 2.0)

Regardless which solution we choose,
there are some shared challenges...

© Course authors (CC BY-SA 4.0) - Image: © Adam Lusch (CC BY-SA 2.0)

The state of systems changes over time as
applications get installed/updated and
administrators modify configuration.

FIM tells us that a file has changed,
but not necessarily its contents
before and after.

(Usage of immutable systems, such as Docker
containers, can greatly reduce the burden -
especially when combined with a read-only
file system configuration).

© Course authors (CC BY-SA 4.0) - Image: © Kurayba (CC BY-SA 2.0)

Knowing which files to include on
the FIM watch-list can be tricky -
especially doing so with confidence
when performing incident recovery.

Some sensitive files, like databases,
are modified continuously during
normal usage of the system.

During forensic analysis, it may be
interesting to inspect changes in
files that aren't critical to system
integrity, like web browser history.

© Course authors (CC BY-SA 4.0) - Image: © Jesse James (CC BY 2.0)

Disk/File system snapshots can
be a useful complement...

© Course authors (CC BY-SA 4.0) - Image: © Brocken Inaglory (CC BY-SA 3.0)

Disk snapshots

Copy storage medium bit-by-bit
to create a clone/replica.

Provides something pristine we can
analyze without fear of changes.

Hardware-based solutions exist
that have been certified to not
affect the original medium.

Preferable for forensic use-cases,
but requires lots of storage space.

© Course authors (CC BY-SA 4.0) - Image: © Cory Doctorow (CC BY-SA 2.0)

File system snapshots

Copy allocated parts of file system,
enables incremental backups limited
to files that have been changed.

May be performed continuously.

Windows' Volume Shadow Copy and
file systems using Copy-on-Write,
such as APFS, BTRFS and ZFS, make
the process quite efficient!

© Course authors (CC BY-SA 4.0) - Image: © Joel Rangsmo (CC BY-SA 4.0)

Once we have two snapshots that we
wanna compare, tools like diffoscope,
"The Sleuth Kit" and commercial
offerings such as OpenText Forensic
can help us make sense of it.

(Advanced forensic analysis is an
interesting topic, but out-of-scope
for this course, I'm afraid!)

© Course authors (CC BY-SA 4.0) - Image: © Nirvana Studios (CC BY 4.0)

How can we trust the FIM or
any other logs if the system
has been compromised?

I've told you that you shouldn't,
but that ain't always very practical.

Let's talk a bit about boot and
runtime integrity protection...

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

What's "secure boot"?

Not just a thing to make neckbeards mad!

Utilizes cryptographic signatures during
the computer's boot process to prevent
execution of untrusted firmware,
loaders and operating systems.

Each component in the boot chain is
responsible for verifying the next.

Most systems ship with a trust store
managed by Microsoft, some support
configuration of custom keys/CAs.
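
On a Linux system with the mokutil tool
installed, you can check whether secure
boot is currently active:

$ mokutil --sb-state
SecureBoot enabled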

© Course authors (CC BY-SA 4.0) - Image: © Jusotil 1943 (CC0 1.0)

"Measure boot" takes it a further!

Each component in the boot chain is
responsible for verifying the next
and storing its hash digest in a
hash chain, typically provided
by a TPM or similar HSM.

Enables us to know what software
was booted, not just that it was
cryptographically signed.

Sometimes used in combination with
"TPM attestation" to verify system state
before providing access to secrets.

© Course authors (CC BY-SA 4.0) - Image: © Adam Lusch (CC BY-SA 2.0)

What about integrity monitoring
and protection post-boot?

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Linux's IMA

Integrity Measurement Architecture.

Monitors execution of programs and performs
hashing of their content ("measurement")
before execution.

Can be used to verify runtime integrity and
notify administrators if unexpected
applications are run.

May also be used to block execution of
modified/untrusted programs, but comes
with many gotchas and complexity.
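
On systems with IMA measurement enabled,
the measurement list (one line per measured
file, including its hash digest) can be
inspected through securityfs:

$ sudo head /sys/kernel/security/ima/ascii_runtime_measurements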

© Course authors (CC BY-SA 4.0) - Image: © Tobin (CC BY-SA 2.0)

If you think this sounds cool/useful,
check out Keylime and the
"System Transparency" project.

You can also check out Joel's talk
from SEC-T, which is available on YouTube.

© Course authors (CC BY-SA 4.0) - Image: © Kuhnmi (CC BY 2.0)

Conclusions

Integrity protection ain't just
about basic FIMs.

Usage of immutable systems
surely simplifies monitoring
of state changes.

TPM + Measured boot + IMA ~= <3

While improving trust in the
system, there may still be
vulnerabilities affecting
its trustworthiness.

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)

Course recap

Let's refresh our memory

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC0 1.0)

Linux auditing

Applications on Linux commonly log
security related events to syslog.

FIM software can be combined with
"inotify" feature for efficiency
and ability to detect reads.

The Linux audit framework enables
logging syscalls and other kernel
activity of interest.

Features like "eBPF" are beginning to
replace current audit functionality,
mainly due to their flexibility.

© Course authors (CC BY-SA 4.0) - Image: © Lars Juhl Jensen (CC BY 2.0)

OpenSearch provides support for queries
using LQL, DQL, PPL and SQL.

DQL aims to be easy to use for filtering,
but lacks advanced aggregation features.

PPL is designed to ease on-boarding of
shell lovers and users of other
SIEMs like Splunk.

SQL provides a query language known by
many developers and data scientists.

DQL, PPL and SQL get translated to LQL,
with varying degrees of success and
quality of error messages.

© Course authors (CC BY-SA 4.0) - Image: © Sergei Gussev (CC BY 2.0)

AI/ML for log analysis

While a centralized log solution can
act as a data source for AI/ML, it
may also improve the search and
analysis experience.

OpenSearch provides several pre-trained
models, freely available for usage*.

The RCF algorithm is commonly used
for anomaly detection.

NLP and LLMs can be used to provide
semantic and conversational queries.

© Course authors (CC BY-SA 4.0) - Image: © Yellowcloud (CC BY 2.0)

Instrumenting applications

Usage of templating/data binding to reduce
repetition and ease the instrumentation process.

Usage of libraries capable of producing
structured logs using JSON or similar
well-supported format.

Using domain-specific knowledge to implement
detection of malicious/suspicious behavior.

© Course authors (CC BY-SA 4.0) - Image: © Rob Hurson (CC BY-SA 2.0)

Integrity monitoring

If we can't trust the integrity of a system,
we can't put much faith in its logs.

Using a FIM is a good start, but not enough.

Solutions like secure boot/measured boot tries
to prevent/detect manipulation of low-level
software, such as the UEFI implementation.

Features like Linux's IMA enables runtime
detection of untrusted/manipulated software.

Not bulletproof, but an improvement.

(Side-quest into disk cloning/forensics)

© Course authors (CC BY-SA 4.0) - Image: © William Warby (CC BY 2.0)

All caught up?
Let's move on!

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Logging agents

Collecting and shipping data

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Some applications/devices support
logging using standard protocols
like Syslog, GELF and plain HTTP.

In many cases, we need to utilize
end-point software to collect and
ship the event data.

These are called logging agents.

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Choosing an agent

Input sources?

Filtering/Manipulation capabilities?

Buffering support?

Network encryption/authentication?

Remote configuration management?

Platform/Operating system support?

© Course authors (CC BY-SA 4.0) - Image: © OLCF at ORNL (CC BY 2.0)

When using OpenSearch,
Fluent Bit and Elastic Beats
are the most common choices.

© Course authors (CC BY-SA 4.0) - Image: © Mike Grauer Jr (CC BY 2.0)

Fluent Bit

Very light-weight open source logging agent.

Supports several common data sources,
such as file, systemd's journal and
the Windows event log.

Supports a wide range of outputs, including
Fluentd, Logstash, Data Prepper and
writing directly to the OpenSearch API.
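
A minimal configuration sketch in Fluent Bit's
classic format (host and paths are assumptions),
tailing a log file and writing to OpenSearch:

[INPUT]
    Name   tail
    Path   /var/log/myapp/*.log

[OUTPUT]
    Name   opensearch
    Match  *
    Host   opensearch.example.com
    Port   9200
    Index  logs-myapp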

© Course authors (CC BY-SA 4.0) - Image: © Ludm (CC BY-SA 2.0)

Elastic Beats

Family of "data shippers" developed by Elastic
and community members to replace Logstash
on end-points/log producers.

Built using "libbeat", sharing common features
and configuration file format.

Available in proprietary and open-source versions.
Access to some features is restricted, like the
Auditbeat "system" module.

© Course authors (CC BY-SA 4.0) - Image: © Jesse James (CC BY 2.0)

Some neat features

  • Fairly light-weight and portable
  • Straight-forward to install and configure
  • Provides several different "processors" for filtering/enrichment/anonymization
  • Great control of buffering behavior and Kafka/Redis output support
© Course authors (CC BY-SA 4.0) - Image: © Alan Shearman (CC BY 2.0)

Let's have a look at...

  • Filebeat
  • Winlogbeat
  • Packetbeat
  • Auditbeat
© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Honorable mentions

  • Serialbeat
  • Nvidiagpubeat
  • Openvpnbeat
  • Browserbeat
  • Discobeat
© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Filebeat

Originally designed to read logs and other data
from text files.

Provides built-in parsers for common formats.

Has integrated functionality from "journalbeat"
to read logs directly from systemd's journal.

(Suffering from dissociative identity disorder,
supports reading logs from message queues,
Office365, NetFlow, TCP, etc.)

© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

/etc/filebeat/filebeat.yml

---
filebeat.inputs:
  - type: "filestream"
    id: "my-fancy-app"
    paths: ["/var/log/myapp/*"]

  - type: "journald"
    id: "everything"
    enabled: true

filebeat.modules:
  - module: "nginx"
    error:
      enabled: true
      var.paths: ["/var/log/nginx/error.log"]
    access:
      enabled: true
      var.paths: ["/var/log/nginx/access.log"]
      input.processors:
        - replace:
            fields:
              - field: "message"
                pattern: "ss_nr=[0-9]+"
                replacement: "ss_nr=**********"

output.logstash:
  enabled: true
  ssl.enabled: false
  hosts:
    - "logs-a.example.com:5044"
    - "logs-b.example.com:5044"
© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

/etc/filebeat/filebeat.yml

---
filebeat.inputs:
  - type: "filestream"
    id: "my-fancy-app"
    paths: ["/var/log/myapp/*"]

  - type: "journald"
    id: "everything"
    enabled: true

[...]
© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

/etc/filebeat/filebeat.yml

[...]

filebeat.modules:
  - module: "nginx"
    error:
      enabled: true
      var.paths: ["/var/log/nginx/error.log"]
    access:
      enabled: true
      var.paths: ["/var/log/nginx/access.log"]
      input.processors:
        - replace:
            fields:
              - field: "message"
                pattern: "ss_nr=[0-9]+"
                replacement: "ss_nr=**********"

[...]
© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

/etc/filebeat/filebeat.yml

[...]

output.logstash:
  enabled: true
  ssl.enabled: false
  hosts:
    - "logs-a.example.com:5044"
    - "logs-b.example.com:5044"
© Course authors (CC BY-SA 4.0) - Image: © Kylie Jaxxon (CC BY-SA 2.0)

Winlogbeat

Collects data from the Windows event log.

Handles messy schema/field mapping.

Commonly paired with Filebeat, as many
applications don't utilize the event log.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

C:\ProgramData\Elastic\Beats
\winlogbeat\winlogbeat.yml

---
winlogbeat.event_logs:
  - name: "Security"
  - name: "Microsoft-Windows-Sysmon/Operational"
  - name: "Windows PowerShell"
    event_id: 400, 403, 600, 800

  - name: "ForwardedEvents"
    tags: ["forwarded"]

output.logstash:
  enabled: true
  ssl.enabled: false
  hosts:
    - "logs-a.example.com:5044"
    - "logs-b.example.com:5044"
© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Packetbeat

Agent providing inspection-based
network logging.

Can record flow data and decode
a handful of common protocols
for payload monitoring.

Commonly used as a bandage for
systems that have limited log
production capabilities.

Can monitor span/tap port.

© Course authors (CC BY-SA 4.0) - Image: © Scott Schiller (CC BY 2.0)

/etc/packetbeat/packetbeat.yml

---
packetbeat.interfaces:
  type: "af_packet"
  device: "ens3"

packetbeat.flows:
  timeout: "30s"
  period: "10s"

packetbeat.protocols:
  - type: "icmp"
    enabled: true

  - type: "dns"
    ports: [53]

output.logstash:
  enabled: true
  ssl.enabled: false
  hosts:
    - "logs-a.example.com:5044"
    - "logs-b.example.com:5044"
© Course authors (CC BY-SA 4.0) - Image: © Scott Schiller (CC BY 2.0)

Auditbeat

Provides FIM functionality on
Linux, Windows and Mac OS.

Consumer of audit framework on
Linux, compatible with auditd
rules configuration syntax.

The closed-source version includes a
"system" module, providing much of
the same functionality but with
simpler configuration and support
for Windows/Mac OS as well.
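
A configuration sketch in the same spirit as
the other Beats examples (paths and rules are
illustrative):

/etc/auditbeat/auditbeat.yml

---
auditbeat.modules:
  - module: "file_integrity"
    paths:
      - "/etc"
      - "/usr/local/bin"

  - module: "auditd"
    audit_rules: |
      -w /etc/sudoers -p rw -k priv_esc
      -a always,exit -F arch=b64 -F euid=0 -S execve -k rootexec

output.logstash:
  enabled: true
  ssl.enabled: false
  hosts:
    - "logs-a.example.com:5044"
    - "logs-b.example.com:5044"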

© Course authors (CC BY-SA 4.0) - Image: © Johan Neven (CC BY 2.0)

Just remember that...

There's overhead associated with processing,
especially audit events and network data.

Neither Fluent Bit nor Elastic Beats comes with
built-in remote configuration management.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Wrapping up

You'll soon get a chance to try them!

Questions and/or half-baked thoughts?

© Course authors (CC BY-SA 4.0) - Image: © Randy Adams (CC BY-SA 2.0)

Lab: Using Elastic Beats

Extracting logs and audit information

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Lab description

Graded exercise to use Auditbeat for
FIM/process activity logging and
OpenSearch monitors for alerting.

For detailed instructions, see:
"resources/labs/beats/README.md".

© Course authors (CC BY-SA 4.0) - Image: © Halfrain (CC BY-SA 2.0)

Best (and worst) practices

More or less painful lessons

© Course authors (CC BY-SA 4.0) - Image: © Greg Lloy (CC BY 2.0)

There's lots more to learn about successfully
implementing logging solutions.

The following (opinionated) slides cover some
of the lessons I've (painfully) learned.

© Course authors (CC BY-SA 4.0) - Image: © Greg Lloy (CC BY 2.0)

Setup retention/rotation

No one (but the NSA) can afford
to store logs forever.

Before ingesting a new log source,
make sure to check and communicate
retention requirements/policy.

Backup log data whenever required,
but be aware of the cost.

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Retention in OpenSearch

Backups are provided by the
"Scheduled snapshots" feature.

While possible to delete specific
documents (log events) in OpenSearch,
the most straightforward way is to
rotate (delete) whole indices.

Don't store log events with different
retention requirements in the same index.

Retention/Rotation/Storage tier migration
is handled by "State management policies".
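
A sketch of a state management policy that
deletes indices after 30 days (the retention
period is chosen arbitrarily):

{
  "policy": {
    "description": "Delete log indices after 30 days",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          {
            "state_name": "delete",
            "conditions": { "min_index_age": "30d" }
          }
        ]
      },
      {
        "name": "delete",
        "actions": [ { "delete": {} } ],
        "transitions": []
      }
    ]
  }
}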

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)
(Screenshots: configuring snapshot and index state management policies in OpenSearch Dashboards)

© Course authors (CC BY-SA 4.0)

Tagging log types

Grouping log sources commonly
searched together.

Spend the time before everything
is burning during an incident.

In OpenSearch, we can utilize
"index patterns" (sometimes)
or "index aliases".

© Course authors (CC BY-SA 4.0) - Image: © Thierry Ehrmann (CC BY 2.0)
(Screenshots: working with index patterns and aliases in OpenSearch Dashboards)

© Course authors (CC BY-SA 4.0)

Monitoring ingestion

© Course authors (CC BY-SA 4.0) - Image: © Pelle Sten (CC BY 2.0)

Documenting known unknowns

© Course authors (CC BY-SA 4.0) - Image: © Steve Jurvetson (CC BY 2.0)

Working with Sigma

© Course authors (CC BY-SA 4.0) - Image: © David Revoy (CC BY 4.0)

Schedule alert-review

© Course authors (CC BY-SA 4.0) - Image: © Pedro Ribeiro Simões (CC BY 2.0)

Source/Query cost analysis

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Make it a procurement requirement

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Including logging in SDLC

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Sell it as BI!

© Course authors (CC BY-SA 4.0) - Image: © Chris Gunn, NASA (CC BY 2.0)

UTC is your friend

© Course authors (CC BY-SA 4.0) - Image: © Martin Fisch (CC BY 2.0)

While just scratching the surface,
I hope these lessons gave you
some useful insights!

© Course authors (CC BY-SA 4.0) - Image: © Stéphane Gallay (CC BY 2.0)

What's next?

Leveling up your knowledge

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

System auditing and log analysis are
useful (but complex) areas of expertise.

Let's look at possible future steps
to serve as guidance on your journey.

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

Keep on playing

OpenSearch is free as in speech
and free as in beer.

Grab the Docker Compose file and
keep going where you left off!

© Course authors (CC BY-SA 4.0) - Image: © Miguel Discart (CC BY-SA 2.0)

Online training and tutorials

As previously mentioned, lots of the
documentation/guides designed for the
Elastic/ELK stack (pre the 7.11 release)
also apply to OpenSearch.

Checkout the "Elastic Training Portal",
George Bridgeman's Elastic/OpenSearch tutorials
and courses on sites like Udemy.

© Course authors (CC BY-SA 4.0) - Image: © Paris Buttfield-Addison (CC BY 2.0)

Boss of the SOC

Dataset provided by Splunk containing
security-related logs for practicing
detection/analysis.

Versions 1 to 3 are freely available!

Ported to work with Elastic/OpenSearch
by the "BOTES project".

© Course authors (CC BY-SA 4.0) - Image: © Edenpictures (CC BY 2.0)

Trying something else

Splunk provides free trials.

5.0GB per day for 14 days on Splunk Cloud.
0.5GB per day for 60 days for self-hosting.

Loki is available as FOSS
and Grafana Cloud is free up to 50GB storage.

© Course authors (CC BY-SA 4.0) - Image: © Bret Bernhoft (CC0 1.0)

Course summary

Let's wrap this up!

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

We've talked about a lot of topics so far.

Let's try to summarize them!

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Types of logs

© Course authors (CC BY-SA 4.0) - Image: © David J (CC BY 2.0)

Inspection-based logging

© Course authors (CC BY-SA 4.0) - Image: © Crazy Crusty (CC0 1.0)

Rules and regulations

© Course authors (CC BY-SA 4.0) - Image: © Nicholas A. Tonelli (CC BY 2.0)

Time and calendars

© Course authors (CC BY-SA 4.0) - Image: © Edd Thomas (CC BY 2.0)

Integrity monitoring

© Course authors (CC BY-SA 4.0) - Image: © Fritzchens Fritz (CC0 1.0)

Log formats and protocols

© Course authors (CC BY-SA 4.0) - Image: © Håkan Dahlström (CC BY 2.0)

Log analysis in the shell

© Course authors (CC BY-SA 4.0) - Image: © Marcin Wichary (CC BY 2.0)

Wielding OpenSearch

© Course authors (CC BY-SA 4.0) - Image: © Tom Held (CC BY 2.0)


Reflections exercise

What have you learned?

© Course authors (CC BY-SA 4.0) - Image: © Freestocks.org (CC0 1.0)

Answer the following questions

  • What are your most important takeaways?
  • Did you have any "Ahaaa!"-moments?
  • Was anything unclear or were there specifics you didn't understand?

courses+log_013901@0x00.lt

© Course authors (CC BY-SA 4.0) - Image: © Austin Design (CC BY-SA 2.0)

Group exercise

Putting knowledge to use

© Course authors (CC BY-SA 4.0) - Image: © Loco Steve (CC BY-SA 2.0)

Exercise: Logless bank

Participants are split into one or more groups.

Each group is tasked with presenting a logging
implementation plan for a technically skilled
CISO at a high-security organization.

Focus on security-related aspects and try
to provide concrete examples of solutions.

After presentation, send slides to
courses+log_014001@0x00.lt

© Course authors (CC BY-SA 4.0) - Image: © Loco Steve (CC BY-SA 2.0)

You're <INSERT NAME HERE>, a consulting
company that helps clients improve their
security by implementing logging solutions.

Customers appreciate that you're able to
offer everything from developer guidance
about "audit log design" to legal advice
regarding storage of sensitive information.

Occasionally, you also need to soothe budget
concerns by highlighting logging's "value add".

In just a few minutes, you're going to try
convincing your biggest client yet...

© Course authors (CC BY-SA 4.0) - Image: © Indrora (CC BY 2.0)

Xample Bank & Finance is a retail bank
that provides lending and payment services
to customers in the Nordics and Baltic states.

Headquarters in Sweden with customer support
staff in Spanish call-centers and an IT department
outsourced to India. 1336 workers in total.

While they claim to have "customers' privacy
and security" as their highest priority,
the previous CISO didn't think logging
was important - hence, no logs exist!

He has been brutally fired and replaced.

© Course authors (CC BY-SA 4.0) - Image: © Cory Doctorow (CC BY-SA 2.0)

Their IT environment consists of an on-prem
data center in the HQ basement for the core
banking software and credit card handling,
IaaS provided by AWS for the web/mobile
apps that customers use, and SaaS for
communication/collaboration (Office 365).

Their servers mainly run Linux, but there
are some Windows systems for user management
and supporting services (Active Directory).

They have several offices (with equipment
like printers) and remote workers that are
connected to an internal network using VPNs.

The client devices run Windows or macOS.

© Course authors (CC BY-SA 4.0) - Image: © Cory Doctorow (CC BY-SA 2.0)

You've been asked to provide a somewhat
detailed implementation plan as a
presentation to help them monitor
their complex IT environment.

Suggestions of software/products,
instrumented logging in bank apps,
legal/compliance advice, tweaks of
system configuration and everything
in between are highly appreciated.

The presentation should contain a
prioritized list of recommended efforts.

Keep it technical, but feel free to provide
scary examples to help seal the deal.

© Course authors (CC BY-SA 4.0) - Image: © Dennis van Zuijlekom (CC BY-SA 2.0)

Any questions?

© Course authors (CC BY-SA 4.0) - Image: © Stig Nygaard (CC BY 2.0)

Welcome participants and wait for everyone to get settled. Introduction of the lecturers and their background. Segue: In this course we'll talk about logging...

Understand the run-time behavior and activity in modern IT systems.

Attacks or misuse. Fundamental for threat detection (TD) and incident response (IR).

Discover the benefits of centralized collection, normalization and analysis of log data.

Essential for complying with most IT-related laws/compliance schemes.

With the exception of the first lab, a remote system is provided by the teacher. Use Vagrant or whatever you're comfortable with to set up a lab system.

- We'll cover lots of things in a short amount of time
- In order to be able to do this we'll use scientifically proven methods to Make It Stick
- Basically what the slide says
- Don't forget to have fun!
- If available, show detailed course schedule

- There are several resources to help you learn
- Speaker notes in slides are heavily recommended for recaps/deep diving
- May also be available through an LMS, depending on how the course is consumed
- The course is designed to be instructor-led; you won't make the most of it on your own - see it as an aid
- Presentations may be recorded, but only the speaker side, for good and bad

The course wouldn't be available if it wasn't for financial support - Thanks!

- Encourage participants to make the course better
- Learners are likely best positioned to provide critique; lecturers are likely a bit blind to the material's flaws
- No cats or dogs allowed!
- Feel free to share it with friends or use it yourself later in your career

The term "log book" is old. Use as an anecdote: https://upload.wikimedia.org/wikipedia/commons/a/a8/Speyer_Handlog.jpg

- Confusing name, far from black!
- Semi-automated system used to record what happened in/to the airplane.
- Helps us understand accidents and prevent future ones.
Segue: We also use logging in computer systems...

- Review logs for IoCs and undesired activity.
- Just the knowledge that activity is monitored may deter undesired behavior.
- Helps us understand how things actually work, why they don't, and where to improve
- Behavior of users/customers in our services
- GDPR/PCI DSS require logging of access to PII/credit card information

- Depends on what type of log we are talking about.
Segue: Two broad categories...

- Why is the system inaccessible?
- What is causing request latency?
- Typically helps developers, system administrators and business analysts
- A good operational log helps these people do their jobs

- Primarily interesting for security-related roles.
- Play detective with red strings and a "crazy wall"
Segue: So what makes a good audit log entry?

- The 5 W's of audit logging
- Each log entry should ideally answer these questions.

- Essential for putting events in chronological order
- In distributed systems, accurate time is crucial for correlation of events
- More about time/clocks later!

- Useful context for events
- Which user/system administrator could I ask about the event?
- Is it reasonable that a guy in sales is trying to access IT management systems?
- How about a recently fired (disgruntled) employee who is trying to download all shared files?
- Goes without saying, but the better the authentication, the more we can trust this

- "Failed to authenticate against database due to wrong password" - "Could not delete file due to insufficient privileges" - "Safe-door unlocked"

- Event causer == human/computer
- Helps us make sense of the event
- Is it reasonable that Janne is trying to access their email from Murmansk?
- Was the action performed from an IP address or computer controlled by the organization?
- Can't always trust this information
Segue: And lastly... why?

- Searches in the police data registry
- Ticket ID/documentation for why a firewall exception was added
- May not be provided by a human, but rather by another system to make sense of events
- "This database entry was deleted due to user X performing action Y in system Z"
Segue: Now that we know what should be in the audit log entry, how do we present the info?

- These logs will most likely be monitored by computers and analyzed by humans
- Clear separation of individual events
- Clear separation of the 5 W's - it should be easy to differentiate between when, what...
- More about different log formats and their pros/cons later
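
To make this concrete, a hedged sketch (field names are made up, not any standard) of emitting one machine-parseable JSON object per line, with the W's clearly separated:

```python
import json
import sys
from datetime import datetime, timezone
from typing import Optional

def audit_event(who: str, what: str, where: str, why: Optional[str] = None) -> None:
    """Write one JSON object per line - one clearly separated field per 'W'."""
    event = {
        "when": datetime.now(timezone.utc).isoformat(),  # when did it happen?
        "who": who,      # authenticated user/system causing the event
        "what": what,    # action performed and its outcome
        "where": where,  # source address/host the action came from
        "why": why,      # optional context, e.g. a ticket ID
    }
    json.dump(event, sys.stdout)
    sys.stdout.write("\n")

audit_event("jane", "firewall_exception_added", "10.0.0.42", why="ticket CHG-1234")
```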

- Some types of events are hard to categorize.
- An application's permission failure to access a database may be of interest to both ops and sec
- Often all logs are written to the same file/database table
- A large part of the job in a SOC is filtering logs for relevant events

We know what we want, how do we actually get ahold of these logs?

- Sometimes known as "black box observability" (not to be confused with airplanes)
- Useful for legacy systems that haven't been designed to produce desired logs
Segue: Quite low-level, may be hard to answer the W's except when and where...

- The application
- Preferred, but may be costly/very hard to implement
- Requires cooperation from software/system developers

- As we've talked about audit logging, security personnel are a given consumer
Segue: But there are also others who are interested...

- Let's be a bit more specific

- A/B testing == What effect did change X have on metric Y?
- Some businesses make their living on selling user behavior data to others

Are we reading each log event, row for row? No.

- Why is it so neat to have computers monitor the logs for us?

Example: Fail2Ban, automated ordering of disks based on total utilization
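
A toy sketch of the Fail2Ban idea in Python (log path, regex and threshold are assumptions, and it only prints instead of actually banning):

```python
import collections
import re

FAILED = re.compile(r"Failed password for .* from (?P<ip>[0-9.]+)")
THRESHOLD = 5  # failed attempts before an address is flagged

failures = collections.Counter()

# A real tool would follow the file continuously, like "tail -f"
with open("/var/log/auth.log") as log:  # typical path on Debian-like systems
    for line in log:
        match = FAILED.search(line)
        if match:
            failures[match.group("ip")] += 1

for ip, count in failures.most_common():
    if count >= THRESHOLD:
        print(f"Ban candidate: {ip} ({count} failed logins)")
```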

- Usually simple counters or gauges
- Scraped and stored in a time-series database
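
For example, using the prometheus_client Python library an application can expose a counter for a time-series database to scrape (metric name and port are illustrative):

```python
import random
import time

from prometheus_client import Counter, start_http_server

# Counters only ever increase; gauges would be used for values that go up and down
LOGIN_FAILURES = Counter(
    "app_login_failures_total",
    "Number of failed login attempts",
)

start_http_server(8000)  # serves the metrics on http://localhost:8000/metrics

while True:
    time.sleep(1)
    if random.random() < 0.3:  # stand-in for an actual failed login
        LOGIN_FAILURES.inc()
```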

Fail open or closed? Auditd is an example

https://upload.wikimedia.org/wikipedia/commons/e/ec/World_Time_Zones_Map.svg

https://www.netnod.se/sites/default/files/2022-06/NTS-FPGA-presentation-christer.pdf

https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_commands

Examples of vertical access control:
- Delete data (retention rules)
- Modify detection/scrubbing rules

https://www.riksdagen.se/sv/dokument-och-lagar/dokument/svensk-forfattningssamling/lag-2022482-om-elektronisk-kommunikation_sfs-2022-482/

- Origins in the 1950s, used heavily in computers since the late 1960s

https://user-images.githubusercontent.com/20878432/43869313-29afa944-9b72-11e8-83fa-f8e8859875fc.png

https://go2docs.graylog.org/current/getting_in_log_data/gelf.html#GELFPayloadSpecification https://www.elastic.co/docs/reference/ecs https://www.elastic.co/docs/current/en/integrations/cef https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/cef-implementation-standard/Content/CEF/Chapter%201%20What%20is%20CEF.htm
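
To make GELF a bit more concrete, a minimal sketch sending an uncompressed GELF 1.1 message over UDP (host, port and field values are assumptions; real deployments often compress the payload):

```python
import json
import socket

# GELF 1.1 requires "version", "host" and "short_message".
# Custom fields must be prefixed with an underscore.
message = {
    "version": "1.1",
    "host": "web-01.example.org",
    "short_message": "Failed password for jane",
    "level": 4,                  # syslog-style severity (4 = warning)
    "_source_ip": "192.0.2.10",  # custom field
}

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(json.dumps(message).encode(), ("graylog.example.org", 12201))
```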

https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/security_guide/sec-audit_record_types#sec-Audit_Record_Types

https://sysdig.com/blog/getting-started-writing-falco-rules/ https://falcosecurity.github.io/rules/ https://cilium.io/

https://docs.opensearch.org/latest/search-plugins/sql/ppl/index/

https://docs.opensearch.org/latest/search-plugins/sql/sql/index/
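
A hedged sketch of what PPL and SQL queries can look like against the plugins' REST endpoints (the index name and a security-disabled localhost instance are assumptions):

```python
import json
import urllib.request

def query(endpoint: str, statement: str) -> dict:
    """POST a query to a local OpenSearch with the security plugin disabled."""
    request = urllib.request.Request(
        f"http://localhost:9200/_plugins/{endpoint}",
        data=json.dumps({"query": statement}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# The same question asked in both query languages: server errors per host
print(query("_ppl", "source=logs | where status >= 500 | stats count() by host"))
print(query("_sql", "SELECT host, COUNT(*) FROM logs WHERE status >= 500 GROUP BY host"))
```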

https://opensearch.org/blog/semantic-search-solutions/

https://opensearch.org/docs/latest/search-plugins/conversational-search/

https://fluentbit.io/

https://www.elastic.co/docs/reference/beats/libbeat/community-beats

https://sigmahq.io/