
By the Knostic Research Team

The first stage of any attack is Recon. In this deep-dive, we'll show you how we used Shodan and some custom, AI-accelerated Python tools to hunt for exposed Model Context Protocol (MCP) servers. 

The Art of the Digital Fingerprint: How to Spot an MCP Server in the Wild

Instead of focusing on how malicious MCP servers could harm users, we wanted to find out whether legitimate but insecurely configured MCP servers might put their own creators at massive risk. 

To find these servers, we turned to Shodan. Our first major challenge was to teach Shodan what an MCP server looks like. A search for http will return billions of results, but a search for a highly specific, obscure server version might return zero. 

Our mcp_scanner.py script contains an exhaustive list of over 100 Shodan filters, developed through an iterative process of research and refinement. Our key fingerprinting categories, illustrated in the code sketch after this list, included:

  • Protocol Name & Markers: The most direct method involves searching for the literal string "Model Context Protocol", or key components of its language like "jsonrpc": "2.0" and "method": "initialize". These are strong indicators because they are part of the protocol's formal specification. Any compliant server should exhibit these characteristics.
  • Transport Layer Detection: A more subtle, but powerful, technique is to look for SSE headers. We covered this in our previous blog. The Content-Type header for SSE is text/event-stream. By itself, this filter is broad, but when you chain it with others—for example, http.html:"mcp" content:"text/event-stream"—you start to zero in on targets that not only use the right transport but also mention the protocol in their content.
  • Common Endpoint Paths: We leveraged Shodan's ability to index HTML content to search for links and forms pointing to common endpoints like /mcp, /messages, and /api/mcp. A server might have a generic homepage, but if that page contains a link to /mcp/sse, it's a huge clue.
  • Framework & Implementation Detection: This technique looks for the fingerprints of the tools used to build the MCP server itself. For example, a server that returns a Server: uvicorn header is almost certainly running a Python application using the FastAPI or Starlette frameworks. If that same server's banner also contains MCP keywords, we can be highly confident that we've found a Python-based MCP implementation. This helps us understand the underlying tech stack, which can hint at common misconfigurations or vulnerabilities associated with that framework.

The Moment of Truth: From "Maybe" to "Definitely"

The output from our Shodan scanner was a massive, noisy list of potential targets. The next, and most critical, phase of our research was to move from "maybe this is an MCP server" to "this is definitely an MCP server." 

This is where our vibe-coded tool, mcp_func_checker.py, came into play. Its job was singular: take the list of potential targets and attempt to have a polite, protocol-compliant conversation with each one to confirm its identity. We followed a strict, multi-step procedure, sketched in code after the steps below, designed to build an undeniable case without causing harm.

  1. The Handshake (The Front Door Approach): The script's first action is to send a legitimate initialize request to the server's root directory (/). This isn't just a random ping; it's the official MCP way of saying, "Hello, I'm an MCP client, are you an MCP server?" A proper MCP server is required by the protocol specification to understand and respond to this handshake. The request itself contains information about our client ("name": "mcp-scanner"), and a valid response should contain information about the server and its capabilities (e.g., what tools, resources, and protocol versions it supports). A correct, well-formed JSON-RPC response to this specific method is the strongest possible signal of a genuine MCP server.
  2. The Response Validation (Checking the ID): The script doesn't just check for a 200 OK HTTP status; it parses the JSON response and validates that it conforms to the MCP specification. This is a deep check. Does it have a "jsonrpc": "2.0" field? Does the id in the response match the id that we sent in the request? Is there a result object in the response, as opposed to an error object? Does that result object contain the expected fields like serverInfo? This level of strict validation weeds out generic web servers, misconfigured APIs, and other noise.
  3. Probing Endpoints (The "Trying Side Doors" Strategy): If the handshake at the root directory fails, we don't give up. Not every developer follows convention. Some might place their MCP listener at a different URL path for legacy reasons, because their framework defaults to it, or even as a weak attempt at "security through obscurity." Our script has a built-in list of common endpoints to try, such as /mcp, /api/mcp, /sse, or /v1/messages. It systematically works through this list, attempting the full handshake and validation process at each one. This makes our check far more robust and less likely to miss a non-standard deployment.
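
As a rough illustration of this flow, the sketch below sends the initialize handshake over a plain HTTP POST, applies the strict response validation, and falls back to the common side-door endpoints. The helper names, the protocolVersion value, and the assumption of a JSON body (rather than an SSE-framed response) are ours, not code from mcp_func_checker.py.

import requests

COMMON_PATHS = ["/", "/mcp", "/api/mcp", "/sse", "/v1/messages"]

INITIALIZE_REQUEST = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumption: a commonly advertised spec revision
        "capabilities": {},
        "clientInfo": {"name": "mcp-scanner", "version": "0.1"},
    },
}

HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}

def looks_like_mcp(payload) -> bool:
    """Strict validation: JSON-RPC envelope, matching id, and a result carrying serverInfo."""
    return (
        isinstance(payload, dict)
        and payload.get("jsonrpc") == "2.0"
        and payload.get("id") == INITIALIZE_REQUEST["id"]
        and isinstance(payload.get("result"), dict)
        and "serverInfo" in payload["result"]
    )

def confirm_mcp_server(base_url: str):
    """Try the handshake at the root, then fall back to common side-door endpoints."""
    for path in COMMON_PATHS:
        url = base_url.rstrip("/") + path
        try:
            resp = requests.post(url, json=INITIALIZE_REQUEST, headers=HEADERS, timeout=5)
            if resp.status_code == 200 and looks_like_mcp(resp.json()):
                return url  # confirmed MCP endpoint
        except (requests.RequestException, ValueError):
            continue  # connection errors and non-JSON bodies are just noise
    return None

In practice a server may answer the handshake with an SSE stream rather than a plain JSON body, which is why the Accept header advertises both; the sketch glosses over that framing for brevity.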

This rigorous, automated, and AI-accelerated verification process was the crucible of our research. It's what allowed us to sift through the mountain of noise from our initial Shodan sweep and distill it down to a clean, high-confidence list of confirmed, publicly exposed MCP servers. This list formed the foundation for the rest of our investigation into the real-world risks.

Knock, Knock... Any Tools Home? The Ethics of Poking Around

Our objective was never to cause harm, so we made absolutely sure we did none. We never triggered the actual functionality of any tool with a tools/call request; doing so could incur API costs for the owner or manipulate their data. Instead, we sent a simple, safe, read-only request: tools/list. This is the MCP equivalent of asking, "So, what can you do?" without actually asking it to do anything.

If the server responded with a list of available tools, we knew it was not only active but fully configured and ready to accept commands. 
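
For reference, a minimal version of that read-only probe might look like the following, assuming a confirmed endpoint URL from the verification step; it sends tools/list and merely reads back the advertised tools.

import requests

def list_tools(endpoint: str) -> list:
    """Ask a confirmed MCP endpoint what it can do, without asking it to do anything."""
    request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}
    resp = requests.post(
        endpoint,
        json=request,
        timeout=5,
        headers={"Content-Type": "application/json",
                 "Accept": "application/json, text/event-stream"},
    )
    resp.raise_for_status()
    payload = resp.json()
    # A populated "tools" array means the server is live, configured, and accepting commands.
    return payload.get("result", {}).get("tools", [])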

The Elephant in the Room: Why No Auth is a Ticking Time Bomb

This is where our findings shift from a technical curiosity to a critical security warning for anyone running an MCP server. The protocol specification, particularly in earlier versions, did not mandate an authorization mechanism. This has led to a proliferation of insecure deployments that can be exploited in numerous ways, putting the server owners at direct risk.

  • Data Exfiltration and Command Execution: An insecure server is a foothold into an organization's network. An attacker could use a poorly configured tool to read sensitive files (read_file('/etc/passwd')) or, in a worst-case scenario, achieve Remote Code Execution (RCE), giving them complete control of the server.

  • Cost Harvesting: Imagine a tool that connects to your company's AWS account. An attacker finds your open server and writes a simple loop asking the tool to spin up the largest, most expensive GPU instance available... a thousand times. By Monday morning, your cloud bill has a new comma in it, and your CFO is having a panic attack.

  • "Keys to the Kingdom": Attackers may find and extract OAuth tokens, API keys, and database credentials stored on the server, granting them access to all the other services the AI is connected to.

You can read more about our recommendations to secure MCP environments here. 

For every developer, architect, and organization building on or using MCP, this isn't a suggestion; it's an urgent call to action. The responsibility to secure this frontier lies squarely on our shoulders. Here is a detailed guide on how to move from a vulnerable to a hardened posture. 

Now that you can locate MCP endpoints, let’s unpack how they actually talk to GenAI models. Follow our next topic, “How Model Context Protocol (MCP) Servers Communicate.”

Missed the last installment? Catch up with What Is a “Model Context Protocol” Server in GenAI, our foundation for this series.
