Quick Forensics Analysis of Apache logs, (Fri, Mar 29th)

Sometimes, you have to quickly investigate web server logs for potential malicious activity. If you’re lucky, the logs are already indexed in real time in a log management solution and you can automatically launch some hunting queries. If that’s not the case, you can download all the logs to a local system or a cloud instance and index them manually. But that’s not always the easiest or fastest way, due to the amount of data to process.

These days, I always try to process data as close as possible to its source and only download the investigation results. This reduces bandwidth usage and local resource consumption (memory, CPU, …).

I had to analyze a huge set of Apache logs (the current one plus all the archived ones, covering one year) and used the following solution: mal2csv[1] (Malformed Access Logs to CSV). As the name says, the main purpose of this tool is to convert an Apache access log into a CSV file (easier to process in some cases), but it has two interesting extra features:

It deobfuscates encoding (common in web attacks) to human-readable text
It checks log entries against the PHPIDS[2] regex rules to identify known malicious requests.

Interesting log entries are stored in separate files for further review.

Docker was available on the web server. To perform my forensic analysis, I created a Docker image to avoid polluting the server with extra tools (and deleted it after processing). A simple Dockerfile:

FROM ubuntu:latest
LABEL maintainer="Xavier Mertens <[email protected]>"
RUN apt update && \
    apt install -y git python3
RUN git clone https://github.com/RandomRhythm/mal2csv.git /opt/mal2csv
WORKDIR /opt/mal2csv
ENTRYPOINT ["python3", "./mal2csv.py"]
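
Building the image and removing it afterwards can be wrapped in two small helpers. This is only a sketch: the function names build_image and remove_image are made up for illustration, and the tag mal2csv:1.0 is taken from the docker run command used in this diary. It assumes the Dockerfile sits in the current directory unless another build context is passed.

```shell
# Tag used by the "docker run" command in this diary.
IMAGE="mal2csv:1.0"

# Build the image from the Dockerfile in the given directory
# (defaults to the current directory).
build_image() {
  docker build -t "$IMAGE" "${1:-.}"
}

# Remove the image once the analysis is done, so nothing is left on the server.
remove_image() {
  docker image rm "$IMAGE"
}
```

Call build_image before processing the logs and remove_image once the results have been collected.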

Once the image is built, access log files can be analyzed like this (if they are located in a default location for Apache):

# mkdir /var/tmp/results
# for F in /var/log/apache2/access.log*
do
  zcat -f "$F" >/var/tmp/results/$(basename "$F").txt
  docker run -it --rm -v /var/tmp/results:/data mal2csv:1.0 -i /data/$(basename "$F").txt -o /data/$(basename "$F").txt -d -l -p -r -f
done

This loop processes all access.log files one by one (zcat -f handles both compressed and plain files) and writes the results to /var/tmp/results. For every log, three files are created. Example:

-rw-r--r-- 1 root root 20488876 Mar 28 15:33 access.log.txtLogOutput.Formatted
-rw-r--r-- 1 root root 880986 Mar 28 15:33 access.log.txtLogOutput.Formatted.IDS
-rw-r--r-- 1 root root 1418806 Mar 28 15:33 access.log.txtLogOutput.Formatted.interesting

The “Output.Formatted” file contains all events converted to CSV. The other two are more interesting:

The “Formatted.IDS” file will contain a listing of events that match PHPIDS rules:

"24","Detects basic obfuscated JavaScript script injections","GET /config/.env HTTP/1.1"
"35","Detects common comment types","GET /phpMyAdmin+++---/index.php HTTP/1.1"
"20","Detects JavaScript language constructs","GET /index.php?s=/Index/\think\app/invokefunction&function=call_user_func_array&vars[0]=md5&vars[1][]=HelloThinkPHP21 HTTP/1.1"
"8","Detects self-executing JavaScript functions","GET /?a=fetch&content=<php>die(@md5(HelloThinkCMF))</php> HTTP/1.1"

The “Formatted.interesting” file contains the original events that match a PHPIDS rule. Now you know where to focus your investigation.
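
To get a quick overview before reading individual events, the hits in an IDS file can be counted per rule. The helper below is a rough sketch, not part of mal2csv: the name summarize_ids is invented here, and it assumes every line follows the three-field layout shown in the sample above ("id","description","request").

```shell
# Count hits per PHPIDS rule in a mal2csv ".IDS" file, most frequent first.
# Assumption: each line looks like "id","description","request".
summarize_ids() {
  awk -F'","' '
    {
      sub(/^"/, "", $1)            # drop the leading quote of the rule id
      count[$1 " " $2]++           # key = "id description"
    }
    END {
      for (k in count) printf "%d\t%s\n", count[k], k
    }' "$1" | sort -rn             # highest count first
}
```

For example: summarize_ids /var/tmp/results/access.log.txtLogOutput.Formatted.IDS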

Pretty straightforward to perform a quick first analysis of your logs! Note that mal2csv can also process Microsoft IIS logs (use the “-m” command line switch) and the detection rules are located in two files:


They are easy to maintain, so you can add your own rules!

[1] https://github.com/RandomRhythm/mal2csv
[2] https://github.com/PHPIDS/PHPIDS

Xavier Mertens (@xme)
Senior ISC Handler – Freelance Cyber Security Consultant

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.