Using AI to find logic flaws

How an AI agent discovered a business logic flaw in a popular web application

Identifying and exploiting logic flaws in applications poses a unique challenge because it requires deep insight, intuition, and human ingenuity, qualities that automated vulnerability scanners lack. With the volume of web applications growing worldwide and skilled pentesters in short supply, detecting and exploiting these bugs before software is released is becoming increasingly difficult.

We’ve developed an AI penetration tester, called Shinobi, which augments human offensive security engineers with scalability, precision, and creative problem-solving in their web application testing engagements. This post explores how Shinobi skilfully uncovered and exploited a business logic flaw in a complex real-world application — a vulnerability that could be exploited to obtain personal information.

Shinobi’s Capabilities

Shinobi can Explore

Shinobi is designed to operate like a seasoned offensive security engineer, autonomously exploring and learning about its target. Unlike tools that rely solely on predefined rules or scans, it interacts with applications dynamically through its browser and analyzes their behavior to spot unusual patterns. Once it understands the application well enough, it sets high-level goals for itself and begins the process of vulnerability discovery.

Shinobi can Exploit

It emulates human pentesters: it formulates hypotheses about how systems might fail, explores attack vectors that could exploit those weaknesses, and tests its theories systematically. This approach allows it to go beyond surface-level issues, delving into complex logic paths and uncovering vulnerabilities that would typically require creative, in-depth thinking to identify. It doesn’t merely claim the application is vulnerable; it writes exploits to prove it.

The Target

Shinobi’s target was a complex web application with a rich user interface. Its purpose was to help users create/edit designs, share them anonymously, and download them when required. Aside from design tools, it had the usual features like login, my account, and search.

Spotting an opportunity

While exploring the application, Shinobi encountered a feature that allowed users to anonymously share their creative work via links.

At first, this appeared to be straightforward functionality, but from a privacy perspective it posed a key question: could these sharing links unintentionally expose the identity of those who created them? Shinobi systematically analyzed how the feature was implemented, explored the potential for information leakage, and finally set itself a mission to answer that question.

Shinobi stands out from other security scanners and testing tools because it understands an application's purpose and its unique risks. Traditional static code analysis tools and security scanners rely on hardcoded rules to identify vulnerabilities. However, every application is different—a banking app has distinct goals and risks compared to an e-commerce platform. Existing tools fail to recognize these differences because they lack the ability to reason.

From idea to exploit

1. Creating a shared link – Mission Start

Shinobi began its mission by analyzing a shared link, which led to two key observations:

The link was publicly accessible, requiring no login or session verification.
Opening the link in a new browser tab to check for sensitive data leakage revealed none.
However, the link structure exposed metadata, including the draft ID, an essential entry point for further investigation. Shinobi identified the ID as a UUID, which makes it resistant to guessing and brute-force attacks.
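To make the UUID observation concrete, here is a minimal Python sketch of pulling an ID out of a share link and checking what kind of UUID it is. The URL, its host, and the /share/ path are hypothetical placeholders, since the real application's link format is not reproduced in this post, and the function name is ours rather than anything from Shinobi.

```python
import uuid
from typing import Optional
from urllib.parse import urlparse


def extract_draft_id(share_url: str) -> Optional[uuid.UUID]:
    """Return the first URL path segment that parses as a UUID, or None."""
    for segment in urlparse(share_url).path.split("/"):
        try:
            return uuid.UUID(segment)
        except ValueError:
            # Not a UUID (for example "share" or an empty segment); keep looking.
            continue
    return None


# Hypothetical share link used purely for illustration.
EXAMPLE_SHARE_URL = "https://app.example.com/share/3f2b8c1e-9d4a-4b7e-8f3a-2c6d5e1a7b90"

draft_id = extract_draft_id(EXAMPLE_SHARE_URL)
if draft_id is not None:
    # A version of 4 means the ID was generated from random bits.
    print(draft_id, "version:", draft_id.version)
```

If the ID turns out to be a version 4 UUID, it carries 122 random bits, which is why guessing or enumerating IDs is not a practical route.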

2. Looking for alternatives to complete the mission

Since the shared link didn’t expose any sensitive data, Shinobi shifted its focus to other potential clues, starting with the application's create/edit design feature.

It created a new design project in the interactive editor and proceeded to download the design. On the checkout page, Shinobi noticed that the test account’s email address was displayed.

After experimenting with various inputs, Shinobi concluded that manipulating them didn’t reveal information from other accounts. Realizing this approach wouldn’t lead to further insights, it decided to try a different strategy.

3. Remembering and Joining the Dots

This is where Shinobi’s reasoning truly shines. It considers a new approach—if it can edit someone else’s design, it might be able to see their personal information on the checkout page.

But first, it needs access to another user’s design. That’s when it recalls a key detail: the ID in the shared link’s URL closely resembles the one in the create/edit page’s URL. If it can obtain an anonymous shared link, it can extract the ID and use it on the create/edit page to reveal who created the design, effectively deanonymizing the user.
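The idea of reusing the ID from a shared link on the create/edit page can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the /share/ and /editor/ path prefixes, the host name, and the function name are all hypothetical, because the real application's routes are not given in this post.

```python
from urllib.parse import urlparse, urlunparse

# Hypothetical path prefixes; the real application's routes are not given here.
SHARE_PREFIX = "/share/"
EDITOR_PREFIX = "/editor/"


def share_to_editor_url(share_url: str) -> str:
    """Build the editor URL that carries the same ID as the given share link."""
    parts = urlparse(share_url)
    if not parts.path.startswith(SHARE_PREFIX):
        raise ValueError("not a share link: " + share_url)
    design_id = parts.path[len(SHARE_PREFIX):].strip("/")
    # Keep scheme and host, swap the path, and drop any query string or fragment.
    return urlunparse(
        parts._replace(path=EDITOR_PREFIX + design_id, params="", query="", fragment="")
    )


editor_url = share_to_editor_url(
    "https://app.example.com/share/3f2b8c1e-9d4a-4b7e-8f3a-2c6d5e1a7b90?ref=mail"
)
print(editor_url)
```

Whether the editor actually honors an ID that belongs to another account is exactly what has to be tested next; building the URL proves nothing on its own.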

Shinobi’s ability to think and remember, much like a human expert, comes from its cognitive architecture. It stays focused on the task while tracking past actions and patterns. By dynamically pivoting between strategies and leveraging insights gathered throughout the test, it transforms security testing into a more intelligent and adaptive process—a true game changer.

4. Completing the mission – Exploiting the Flaw

At this stage, the only missing piece was an anonymous shared link from an account Shinobi didn’t have access to. So it sent us an email requesting one, and we happily obliged.

This highlights another key advantage of Shinobi: its human operator remains in full control, and it isn’t afraid to ask for help when needed. Unlike traditional “fire-and-forget” black-box scanners, Shinobi fosters a more collaborative approach, providing complete visibility into its actions throughout the test.

Once it receives the shared link, Shinobi extracts the ID from the URL and inserts it into its session on the create/edit page. Instantly, a design from the other account appears, confirming that access controls aren’t strictly enforced. But now comes the moment of truth—Shinobi clicks “Download” and, just like that... personal information from the other account is exposed.

Shinobi detects this immediately and declares the mission successful, proving that shared links can indeed be deanonymized.
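For readers who want to see this class of flaw in miniature, the following Python sketch models it with an in-memory store. It is a toy model under our own assumptions, not the target application's code: the Design record, the DESIGNS store, the email addresses, and both checkout functions are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Design:
    design_id: str
    owner_email: str


class PermissionDenied(Exception):
    """Raised when a session tries to act on a design it does not own."""


# Hypothetical in-memory store standing in for the application's database.
DESIGNS = {"d-123": Design("d-123", "owner@example.com")}


def checkout_vulnerable(design_id: str, session_email: str) -> dict:
    """Looks the design up by ID alone and never compares it to the session."""
    design = DESIGNS[design_id]
    return {"design_id": design.design_id, "email": design.owner_email}


def checkout_checked(design_id: str, session_email: str) -> dict:
    """Same lookup, but refuses to proceed unless the session owns the design."""
    design = DESIGNS[design_id]
    if design.owner_email != session_email:
        raise PermissionDenied("session does not own design " + design_id)
    return {"design_id": design.design_id, "email": design.owner_email}


# Any session that knows the ID sees the owner's email in the vulnerable version.
leaked = checkout_vulnerable("d-123", "attacker@example.com")
print(leaked["email"])
```

In this model the only difference between the two functions is a single ownership comparison, which is typical of business logic flaws: each step behaves as designed, and the problem only appears when the steps are combined.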

Although the vulnerability itself was not catastrophic, Shinobi’s ability to think outside the box and connect the separate pieces of information it gathered during exploration was a breath of fresh air compared with traditional vulnerability scanners, which rely solely on their rules rather than on ingenuity. Logic flaws are notoriously hard to locate. They remain the last bastion of vulnerabilities that threaten applications, and until now their discovery could not be automated.

Stay tuned: we’ll be sharing more of Shinobi’s adventures with you soon.
