
Magecart Classification Using a Deep Learning Approach

December 30th, 2021 | By Jscrambler | 3 min read

This post shares insights from the Jscrambler research team's work on a deep learning approach to Magecart classification.

Strict regulations such as GDPR make it imperative for web applications to be as safe as possible. Adopting safe procedures is advisable not only from a legal point of view but also from a business perspective, since having full control over your application can help make the business behind it more profitable.

When it comes to the biggest victims of client-side threats, which often result in data and customer loss, the Finance and E-commerce sectors are clearly strong candidates.

Regarding E-commerce, specifically, web skimmers (e.g., Magecart) are known to illegally capture information related to payments, such as credit card numbers and other critical data.

There are other examples of relevant threats, such as the injection of price comparison ads, but the bottom line is that both the E-commerce and Financial Services sectors are prime targets for attackers looking to steal personal user data and transaction-related information.

Leaving the client side unprotected can put a company in breach of these regulations, which often leads to huge penalties.

Jscrambler Webpage Integrity Against Client-Side Threats

Answering this key business need of ensuring that the client-side is secure, Jscrambler Webpage Integrity, or WPI, provides a holistic approach that aims to mitigate every client-side threat, including:

Supply chain attacks, such as Magecart and formjacking.
Man-in-the-Browser (MitB) trojans, bots, zero-day threats, and Advanced Persistent Threats (APTs).
Customer journey hijacking.

Magecart Detection Experiment

In line with this approach, our team conducted a thorough experiment, addressing the detection of Magecart specifically, under the initiative of the AppOwl project, a cooperation between Jscrambler and INESC TEC.

The project consists of a Magecart classification system trained on data generated in-house. To gather this data, our team relied on our WPI-embedded agent to track webpage changes and record, for each change, its details, the stack of function calls that triggered it, and the script responsible for it, among other attributes.
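To make the shape of this data more concrete, here is a minimal sketch of how one recorded change could be represented and turned into the integer sequences that neural network classifiers usually consume. It is an illustration only: the Occurrence fields, the token scheme, and the example URL are assumptions, not the actual format produced by the WPI agent.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

PAD_ID = 0      # reserved for padding short sessions
UNKNOWN_ID = 1  # reserved for changes never seen while building the vocabulary


@dataclass(frozen=True)
class Occurrence:
    """One recorded webpage change (all field names are hypothetical)."""
    change_type: str             # e.g. "script_injected", "listener_changed"
    target: str                  # element affected, e.g. "head", "button"
    script_url: str              # script that triggered the change
    call_stack: Tuple[str, ...]  # function calls that led to the change


def build_vocabulary(sessions: Sequence[Sequence[Occurrence]]) -> Dict[Tuple[str, str], int]:
    """Assign an integer id (starting at 2) to each distinct (change_type, target) pair."""
    vocab: Dict[Tuple[str, str], int] = {}
    for session in sessions:
        for occ in session:
            key = (occ.change_type, occ.target)
            if key not in vocab:
                vocab[key] = len(vocab) + 2
    return vocab


def encode_session(session: Sequence[Occurrence],
                   vocab: Dict[Tuple[str, str], int],
                   max_len: int) -> List[int]:
    """Encode a session as exactly max_len ids, truncating or right-padding as needed."""
    ids = [vocab.get((o.change_type, o.target), UNKNOWN_ID) for o in session[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Usage with two made-up occurrences from a single page visit.
session = [
    Occurrence("script_injected", "head", "https://example.test/s.js", ("appendChild",)),
    Occurrence("listener_changed", "button", "https://example.test/s.js", ("addEventListener",)),
]
vocab = build_vocabulary([session])
encoded = encode_session(session, vocab, max_len=4)  # [2, 3, 0, 0]
```

Only the change type and target are tokenized here to keep the sketch short; the call stack and script could be encoded as additional features in the same way.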

Our team used a browser plugin to trigger the download of a real Magecart script tailored for each webpage.

For each visit, browser automation software navigates through the set of webpages that a typical user would visit to sign in, check out, or pay for items, filling in the fields on those webpages.

When the Magecart script is inserted into the browser, it steals the data in those fields and sends it to a local server. During each webpage request, however, the Magecart script runs alongside the rest of the webpage's scripts, and depending on the webpage and the request, different scripts run and make changes to the webpage. Telling the skimmer's changes apart from the legitimate ones is where detection comes in.

Our team used two models for detecting Magecart: a dense neural network and a sequential LSTM (long short-term memory) model, both as per their default implementations in Keras-TensorFlow.

Each model was then trained for a different number of epochs over the training data, up to the point where an additional epoch would result in little improvement in performance.
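The "little improvement from one more epoch" criterion can be sketched as a simple plateau check. The post does not say how the stopping point was chosen in the experiment, so the function below, its min_delta and patience parameters, and the loss values are all assumptions used for illustration.

```python
from typing import List


def stopping_epoch(val_losses: List[float], min_delta: float = 1e-3, patience: int = 2) -> int:
    """Return the 1-based epoch at which training would stop.

    Training stops once `patience` consecutive epochs fail to improve the best
    validation loss by more than `min_delta`. If that never happens, the last
    epoch is returned.
    """
    best = float("inf")
    stale = 0  # consecutive epochs without a meaningful improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if best - loss > min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)


# Made-up validation losses: large gains at first, then a plateau.
losses = [0.90, 0.60, 0.45, 0.4495, 0.4493, 0.4492]
stop_at = stopping_epoch(losses)  # 5
```

Keras offers a comparable mechanism through its EarlyStopping callback, which also takes min_delta and patience arguments.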

Experiment Results

The results of the tests show that the classifiers correctly detect a specific set of sequences labeled as Magecart that we can safely interpret as Magecart behavior.

Those sequences include the injection of the script into the document’s head, its download and execution, followed by changes to (poisoning of) the “on click” event on different objects in the DOM (e.g., button, div), and finally the collection of user data and its exfiltration from the browser.

Our team’s classifier is also able to detect Magecart behavior even when some occurrences are missing because of data capture problems.
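The behavior described above is essentially an ordered sequence of stages that stays recognizable when a few stages are missing. The sketch below illustrates that idea with a hand-written rule based on the longest common subsequence. It is not the trained classifier from the experiment; the stage names and the 0.6 threshold are hypothetical.

```python
from typing import Sequence

# Hypothetical labels for the stages listed in the text, in their expected order.
MAGECART_STAGES = (
    "script_injected_in_head",
    "script_downloaded",
    "script_executed",
    "click_handler_changed",
    "data_collected",
    "data_exfiltrated",
)


def stage_coverage(events: Sequence[str], stages: Sequence[str] = MAGECART_STAGES) -> float:
    """Fraction of stages found in `events` in the expected relative order.

    Uses the longest common subsequence, so unrelated events between stages
    and stages lost during capture are both tolerated.
    """
    prev = [0] * (len(stages) + 1)  # LCS row for the events seen so far
    for event in events:
        curr = [0]
        for j, stage in enumerate(stages, start=1):
            if event == stage:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(stages)


def looks_like_magecart(events: Sequence[str], threshold: float = 0.6) -> bool:
    """Flag a session when enough stages appear in order (threshold is an assumption)."""
    return stage_coverage(events) >= threshold


# Two stages are missing, as if the capture had dropped them.
partial = ["script_injected_in_head", "ad_loaded", "script_executed",
           "click_handler_changed", "data_exfiltrated"]
benign = ["ad_loaded", "click_handler_changed"]
flag_partial = looks_like_magecart(partial)  # True: 4 of 6 stages in order
flag_benign = looks_like_magecart(benign)    # False: 1 of 6 stages
```

A learned model such as an LSTM can pick up this kind of tolerance from the training data instead of relying on a fixed stage list and threshold.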

After training a model on a given website, our team used it on another website targeted by the same Magecart script to understand if the model could be reused.

The results show that model reuse is feasible, although not every model trained on one website seems to generalize correctly to other websites.

The nature of the website and the sequences of occurrences that characterize non-Magecart behavior might influence the classification results when reusing models.
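One way to quantify this kind of reuse is to score a model trained on one website against labeled sessions from another, looking at recall on Magecart sessions and the false-positive rate on benign ones. The sketch below assumes sessions are already encoded as integer sequences; the stand-in predictor and the sample data are made up and do not come from the experiment.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[int], int]  # (encoded session, label), where label 1 means Magecart


def evaluate(predict: Callable[[Sequence[int]], int], samples: List[Sample]) -> Dict[str, float]:
    """Compute recall on Magecart samples and false-positive rate on benign samples."""
    tp = fp = fn = tn = 0
    for sequence, label in samples:
        predicted = predict(sequence)
        if label == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 1:
                fp += 1
            else:
                tn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return {"recall": recall, "false_positive_rate": false_positive_rate}


def model_from_site_a(sequence: Sequence[int]) -> int:
    """Stand-in for a model trained on site A: flags any session containing token 7."""
    return 1 if 7 in sequence else 0


# Made-up sessions from site B, where token 7 also occurs in benign traffic.
site_b = [([1, 7, 3], 1), ([7, 7], 1), ([2, 3], 1), ([1, 2], 0), ([7, 1], 0)]
report = evaluate(model_from_site_a, site_b)
```

In this toy data, site B produces token 7 during normal operation, so the reused model raises a false positive there. This mirrors the point that a website's non-Magecart behavior can affect results when a model is reused.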

Final Thoughts

Overall, the experiments conducted by Jscrambler’s research team show that Jscrambler Webpage Integrity can accurately detect Magecart behavior on websites, allowing companies to effectively block that behavior and prevent web skimming attacks regardless of the attack vector.

Interested in getting a copy of the full research? Please contact us.
