Richard Biever is Duke University’s chief information security officer.
Gagan Kaur is a data scientist at Duke University.

The problem

Security teams have the unenviable task of protecting large, complex, and diverse environments. Their focus typically centers on protecting the internal network and, today, cloud environments. Security infrastructure such as intrusion prevention/detection, firewalls, log analysis, and endpoint detection and response (EDR) are common tools used to protect organizations from active attacks originating from “outside” the network. These tools are often supplemented with threat intelligence, such as STINGAR, to identify malicious external entities engaged in exploiting vulnerabilities or launching malware or denial-of-service attacks.

This focus on “outside-in” attacks often misses another technique, one designed to compromise unsuspecting users as they access websites. These “drive-by” attacks occur when a user visits a website, often a legitimate one, that is serving malicious code through one of the many 3rd-party code integrations on the website (e.g., ad networks). The malicious code is designed to pop up warnings, prompt visitors to install software, or attempt to compromise the visitor’s machine through a vulnerable browser. These attacks are often difficult to protect against, even with enterprise solutions such as DNS block lists, web proxies, or secure web gateways. Additionally, attackers have developed tactics to launch these drive-by attacks against targeted populations, rather than relying on the older technique of attempting to compromise any site visitor.

As security teams take active measures to protect their enterprises from active attacks, we must also know and mitigate passive digital attacks originating from malicious code on websites. Likewise, the owners of these websites must take more active steps to protect visitors and remove malicious 3rd-party code.

Proxywar-e: detecting malicious code in websites

For the past two years, Duke faculty member David Hoffman, students, and the IT Security Office have partnered with Chris Olsen and the Proxywar-e team to develop a solution to “know and mitigate digital attacks targeting Duke; automatically stop attacks; inform digital entities enabling the attackers via 3rd-party code; understand and stop the digital attack vector’s impact on our network and endpoints; simplify attribution for digital attacks.”

The project originally began as a way to identify how attackers are using 3rd-party code on legitimate websites as infrastructure to scam or compromise unsuspecting users. The goal was to introduce the following policy recommendations:

Malicious activities delivered through 3rd-party code on legitimate websites are not victimless crimes
Website owners should be held responsible for allowing their web infrastructure to be misused for malicious means
Those targeted and victimized have a voice, a right to attribute the origin of attacks, and an expectation that the issues are addressed

The project also demonstrated how Proxywar-e can be a valuable tool for security teams to protect their users AND supply actionable information back to originating websites for remediation.

The solution uses proxy nodes running in virtual containers that browse selected websites several times an hour, mimicking the browsing habits of a “student” or “faculty” member (in the case of Duke). As the agent browses the website, it identifies “normal” or expected web interactions and the not-so-normal interactions (incidents) including:

Compromise attempt
The content delivered has been compromised because it is directly involved in delivering or injecting malicious activity including malicious redirections, popups, fake software updates, and exploit kits.

Click fraud
The content piggybacks extra code that acts in a fraudulent manner, including cookie stuffing, click fraud, and impression fraud.

Phishing
The content has been seen to auto-redirect to popups and browser hijacks in the form of fake surveys and may also result in redirections to other malicious content.

Scam
The content is believed to be scam-related content enticing users to enter in personal information for retargeting and reselling purposes and/or related to the selling of products which deliver false claims.

Cloaking
A delivery technique used to hide the true intentions of the content. The ad server will deliver a known malicious threat only when the targeted geography, browser, and/or device conditions are met. If the conditions are not met, the ad delivery will remain clean with no malicious delivery taking place. When the conditions are met, the user is typically redirected to fake websites and fake advertisements that entice people to click into them so the payload can be delivered (click-bait).

Malicious redirect
The content has been seen consistently redirecting to malicious content.

Software install prompt
The content leads to malicious and unwanted activity in the form of fake software updates which will then install unwanted programs such as toolbars, adware, or other forms of malware onto the user’s computer.

Suspicious
The content matches previous patterns and characteristics of known malicious actors. The flag is placed to discover and locate previous heuristics on new and modified versions of existing threats.
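To make the categories above concrete, here is a minimal sketch of how the incident types might be represented in code when incident data is consumed by other tooling. The class names, field names, and the parse_incident_type helper are hypothetical illustrations and do not describe Proxywar-e's actual data format.

```python
from dataclasses import dataclass
from enum import Enum


class IncidentType(Enum):
    """The eight incident categories described in the article."""
    COMPROMISE_ATTEMPT = "Compromise attempt"
    CLICK_FRAUD = "Click fraud"
    PHISHING = "Phishing"
    SCAM = "Scam"
    CLOAKING = "Cloaking"
    MALICIOUS_REDIRECT = "Malicious redirect"
    SOFTWARE_INSTALL_PROMPT = "Software install prompt"
    SUSPICIOUS = "Suspicious"


@dataclass(frozen=True)
class Incident:
    """Hypothetical record for one observed incident (field names are assumptions)."""
    visited_site: str        # site the agent was browsing
    third_party_host: str    # host that served the flagged code
    incident_type: IncidentType
    persona: str             # e.g. "student" or "faculty"


def parse_incident_type(label: str) -> IncidentType:
    """Map a free-text category label to an IncidentType, ignoring case and padding."""
    normalized = label.strip().lower()
    for incident_type in IncidentType:
        if incident_type.value.lower() == normalized:
            return incident_type
    raise ValueError(f"unknown incident category: {label!r}")


example = Incident(
    visited_site="news.example.com",      # placeholder domains, not real findings
    third_party_host="ads.example.net",
    incident_type=parse_incident_type("software install prompt"),
    persona="student",
)
print(example.incident_type.name)
```

Using an enumeration rather than free-text labels keeps downstream counts and filters consistent when the same category arrives with different capitalization.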

The data collected includes actionable information on the source of the malicious or suspicious 3rd-party code, as well as the actions that were attempted. The information can then be added to threat intelligence lists, DNS block lists, web application firewalls, SIEMs, or other endpoint-related security tools. Additionally, the data can be shared back with the website owner to address the cyber hygiene issue.
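As one illustration of feeding this data into a DNS block list, the sketch below turns observed incidents into a hosts-file style list. It is a simplified example under stated assumptions: the input is a list of (visited site, 3rd-party host) pairs, the function names are hypothetical, the domains are placeholders, and the sighting threshold is an arbitrary choice rather than anything prescribed by the project.

```python
from collections import Counter


def build_blocklist(incidents, min_sightings=2):
    """Return sorted 3rd-party hosts seen in at least `min_sightings` incidents.

    `incidents` is an iterable of (visited_site, third_party_host) pairs.
    Hosts are lowercased and stripped of a trailing dot so that
    "ADS.example.net." and "ads.example.net" count as the same host.
    """
    counts = Counter(
        host.strip().lower().rstrip(".") for _site, host in incidents
    )
    return sorted(host for host, seen in counts.items() if seen >= min_sightings)


def to_hosts_file(hosts):
    """Render hosts as hosts-file lines that resolve each name to 0.0.0.0."""
    return "\n".join(f"0.0.0.0 {host}" for host in hosts)


# Placeholder observations, not real findings.
observed = [
    ("news.example.com", "ads.example.net"),
    ("blog.example.org", "ADS.example.net."),
    ("news.example.com", "cdn.example.org"),
]
print(to_hosts_file(build_blocklist(observed)))
```

The threshold reflects one possible design choice: a host shared by many legitimate sites is not blocked on a single sighting. A team could just as reasonably block on first sight for high-severity categories.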

Case study

The MediaTrust Digital and Security Operations (DSO) team identified a campaign that “used advanced obfuscated code and delivery patterns to evade signature-based defenses used by publishers.” The campaign, named Ghostcat-3PC by the DSO, has resulted in over 1,100 outbreaks across Europe and the United States since 2019.

The campaign was detected on Duke’s Proxywar-e instances when browsing the website usatoday.com. Using obfuscated JavaScript, the code checked whether it was running in one of the targeted countries and on an actual web page (particularly in a mobile device browser) as opposed to a sandbox environment. It next checked for the presence of an adblocker (presumably to see if the adblocker was detecting the script) and displayed a malicious popup such as the one below. In addition to drive-by phishing attacks, the malware would also attempt to install a trojan named “Android.Xiny.5260” if the fingerprinted browser and device indicated a mobile Android device.

Data analysis

In the two years that Duke has worked with the Proxywar-e team, we have identified several interesting trends for the agents that have run on Duke IP addresses as well as those running at homes connected to commercial-grade ISPs such as Google, AT&T, and Spectrum.

The team captured a total of 25M scans and 50k unique incidents across Duke and at-home devices last year (Jan’21 to Dec’21), with a monthly average of 2.5M scans and 3,380 incidents. While total scans fell 36%, from an average of 4.5M in Dec’20 to 1.2M in Dec’21, average incidents remained roughly consistent throughout the year, with a particularly interesting spike of around 7x in July’21. We’ll dive deeper into the reason in the coming figures.

Total scans on left; total incidents on the right

Looking at the number of scans across different networks, Duke vs home devices connected to commercial networks (abbreviated WFH1, WFH2, WFH3), we can see that:

The number of scans across all networks decreases over time.
Monthly scan trends from all three home networks (WFHx) are correlated, hinting at related activities targeting home devices, whereas Duke scans do not follow the same trend. This suggests that malicious activities depend on the type of network.
The number of scans from Duke increased during March’21, in contrast to the decrease in scans across all home networks.

Increase in scans at Duke vs decrease in scans at home devices in March’21
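The claim that the home networks move together while Duke does not can be checked with a correlation coefficient. The sketch below computes the Pearson correlation between two monthly series. The numbers are invented for illustration only and are not the project's data, and the article does not state which correlation measure the team used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Invented monthly scan counts (in thousands), purely illustrative.
wfh1 = [100, 80, 60, 50]
wfh2 = [200, 160, 120, 100]   # exactly twice wfh1, so perfectly correlated
duke = [50, 60, 90, 40]       # follows a different pattern

print(round(pearson(wfh1, wfh2), 3))
print(round(pearson(wfh1, duke), 3))
```

A value near 1 means two networks rise and fall together; a value near 0 means their monthly movements are largely unrelated.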

The incidents can be further categorized by type of malicious activity, namely compromise attempts, scams, suspicious content, click fraud, software install prompts, malicious redirects, phishing, and cloaking, as described earlier. The figure below shows the top categories by total scans, with software install prompts (36%) and malicious redirects (27%) the most common types of observed malicious incidents.

Percentage of types of malicious incidents

Looking at the trend over time for the above types of activity reveals that scams exhibit the highest monthly increase in July 2021, which accounts for the spike we saw earlier in the total incidents chart.
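Attributing a spike to one category amounts to comparing each category's month-over-month change. A minimal sketch follows, with invented counts that only mimic the shape of the trend described here; the function name and data layout are assumptions.

```python
def largest_increase(counts_by_category, prev_month, month):
    """Return (category, delta) for the category whose count grew most between two months.

    `counts_by_category` maps a category name to a dict of month -> incident count.
    """
    deltas = {
        category: series[month] - series[prev_month]
        for category, series in counts_by_category.items()
    }
    top_category = max(deltas, key=deltas.get)
    return top_category, deltas[top_category]


# Invented counts for illustration; these are not the project's figures.
monthly_counts = {
    "Scam": {"2021-06": 400, "2021-07": 9000},
    "Malicious redirect": {"2021-06": 900, "2021-07": 950},
    "Software install prompt": {"2021-06": 1200, "2021-07": 1100},
}

category, delta = largest_increase(monthly_counts, "2021-06", "2021-07")
print(category, delta)
```

With real data, the same comparison run across every pair of consecutive months would show whether a spike is driven by one category or spread across several.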

Future opportunities

Proxywar-e represents an intriguing proposal for both enterprise security efforts and cybersecurity research. Proxywar-e agents masquerading as different types of users (young/old, men/women, geographic location, profession, etc.) can discover malicious or fraudulent activities delivered through 3rd-party code installed on websites. This information can be used to update traditional security platforms and threat intelligence lists, as well as to inform digital entities enabling attackers in near real time.
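One way to picture the range of user types is as a matrix of persona attributes, with one browsing agent per combination. The sketch below enumerates such a matrix. The Persona fields mirror the attributes listed above, but the class, the function, and the sample values are hypothetical and do not describe how Proxywar-e configures its agents.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Persona:
    """Hypothetical browsing persona; field names are illustrative assumptions."""
    age_group: str
    gender: str
    location: str
    profession: str


def build_personas(age_groups, genders, locations, professions):
    """Return one Persona for every combination of the supplied attribute values."""
    return [
        Persona(*combination)
        for combination in product(age_groups, genders, locations, professions)
    ]


personas = build_personas(
    age_groups=["young", "old"],
    genders=["man", "woman"],
    locations=["US", "EU"],
    professions=["student", "faculty"],
)
print(len(personas))
print(personas[0])
```

Because targeted campaigns deliver payloads only when specific conditions match, covering more combinations raises the chance that at least one agent observes the malicious delivery. The matrix grows multiplicatively, so in practice a team would likely sample it rather than run every combination.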

The data collected could also be a boon to security researchers to discover, categorize, and understand how attackers are making use of websites to target individuals and gain access to systems, accounts, or data. The data collected could be used by researchers testing new detection techniques and defenses to keep pace with attacker techniques.

Defending against malicious code embedded in legitimate websites
