
Why Legacy Data Classification Tools Don't Work Well in the Cloud (But DSPM Does)

September 7, 2023
5 Min Read
Data Security

Data security teams are always trying to understand where their sensitive data is. Yet this goal has remained out of reach for a number of reasons.

The main difficulty is creating a continuously updated data catalog of all production and cloud data. Creating this catalog would involve:

  1. Identifying everyone in the organization who knows about any data store and has visibility into its contents
  2. Connecting a data classification tool to these data stores
  3. Ensuring network connectivity by configuring network and security policies
  4. Confirming that business-critical production systems using each data source won’t suffer degraded performance or availability

A process this complex demands a major investment of resources and long workflows, and it will probably still not provide the full coverage organizations are looking for. Many implementations that are initially considered successful prove unreliable and too difficult to maintain after a short period of time.

Another pain point with legacy data classification solutions is accuracy. Data security professionals are all too aware of the problem of false positives (wrong classifications and data findings) and false negatives (sensitive data that goes unclassified and remains unknown). There are two main reasons for this.

  • Legacy classification solutions rely solely on patterns, such as regular expressions, to identify sensitive data, an approach that falls short for both unstructured and structured data.
  • These solutions don’t understand the business context around the data, such as how it is being used, by whom, for what purposes, and more.

Without the business context, security teams can’t get any actionable items to remove or protect sensitive data against data risks and security breaches.
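
As a rough illustration of the pattern-only problem described above, the sketch below uses a hypothetical regular expression for 16-digit card numbers. The pattern, the sample text, and the function names are assumptions made for this example, not part of any vendor's product. The regex alone flags an ordinary order ID as a card number, while a simple Luhn checksum (a standard validation step for card numbers) filters out that false positive.

```python
import re

# Hypothetical pattern: any 16-digit number, optionally grouped by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_candidates(text: str) -> list:
    """Return every substring that looks like a card number to the regex."""
    return [match.group(0) for match in CARD_PATTERN.finditer(text)]


sample_text = "order id 1234567812345678, test card 4111 1111 1111 1111"
candidates = find_card_candidates(sample_text)         # regex only: two findings
validated = [c for c in candidates if luhn_valid(c)]   # checksum removes the order ID
```

Even with the checksum, this remains a pattern-level check: it says nothing about who uses the data or why, which is the second gap listed above.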

Lastly, there are the high operational costs. Legacy data classification solutions were not built for the cloud, where every data read/write and network operation has a price tag.

The cloud also offers far more cost-efficient storage and advanced data services, which lead organizations to store much more data than they did before moving to the cloud. At the same time, public cloud providers offer a variety of cloud-native APIs and mechanisms that can greatly benefit a data classification and security solution, such as automated backups, cross-account federation, direct access to block storage, storage classes, compute instance types, and much more. Legacy data classification tools that were not built for the cloud ignore these benefits and differences entirely, which makes them an extremely expensive solution for cloud-native organizations.

DSPM: Built to Solve Data Classification in the Cloud 

These challenges have led to the growth of a new approach to securing cloud data: Data Security Posture Management, or DSPM. Sentra’s DSPM provides full coverage and an up-to-date data catalog with classification of sensitive data, without any complex deployment or operational work. This is achieved through a cloud-native, agentless architecture that uses cloud-native APIs and mechanisms.

A good example of this approach is how Sentra’s DSPM architecture leverages the public cloud mechanism of automated backups for compute instances, block storage, and more. This allows Sentra to securely run its robust discovery and classification technology from within the customer’s premises, in any VPC or subscription/account of the customer’s choice.

This offers a number of benefits:

  1. The organization does not need to change any existing infrastructure configuration, network policies, or security groups.
  2. There’s no need to provide individual credentials for each data source in order for Sentra to discover and scan it.
  3. There is never a performance impact on the compute-bound workloads, such as virtual machines, that run in production environments. In fact, Sentra’s scanning never connects to those data stores through the network or application layers.

Another benefit of a DSPM built for the cloud is classification accuracy. Sentra’s DSPM provides an unprecedented level of accuracy thanks to more modern, cloud-native capabilities. This starts with advanced statistical relevance for structured data, enabling our classification engine to understand with high confidence that sensitive data is found within a specific column or field, without scanning every row in a large table.
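
The idea of classifying a column without reading every row can be sketched with basic sampling statistics. The example below is a minimal illustration under stated assumptions, not a description of Sentra's engine: the email regex, the sample size of 500, the 0.8 decision threshold, and the function name `estimate_match_rate` are all hypothetical, and the interval is a plain normal-approximation (Wald) confidence interval.

```python
import math
import random
import re

# Hypothetical, deliberately loose email pattern used only for this sketch.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def estimate_match_rate(values, matcher, sample_size=500, seed=0):
    """Estimate the share of matching values from a random sample.

    Returns (rate, lower, upper), where lower/upper form an approximate
    95% confidence interval (normal approximation).
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)
    sample = values if len(values) <= sample_size else rng.sample(values, sample_size)
    n = len(sample)
    hits = sum(1 for value in sample if matcher(value))
    rate = hits / n
    margin = 1.96 * math.sqrt(rate * (1 - rate) / n)
    return rate, max(0.0, rate - margin), min(1.0, rate + margin)


# A synthetic 100,000-row column: 95% email addresses, 5% placeholders.
column = [f"user{i}@example.com" for i in range(95_000)] + ["n/a"] * 5_000

rate, low, high = estimate_match_rate(column, lambda v: bool(EMAIL.match(v)))

# Assumed decision rule: label the column as "email" only if even the
# lower bound of the interval clears the threshold.
is_email_column = low > 0.8
```

Here 500 rows are read instead of 100,000, and the confidence interval makes the uncertainty of that shortcut explicit.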

Sentra leverages even more advanced algorithms for key-value stores and document databases. For unstructured data, AI- and LLM-based algorithms unlock tremendous accuracy in detecting sensitive data types by understanding the context within the data itself. Lastly, combining data-centric and identity-centric security approaches provides greater context, so Sentra’s users know what actions to take to remediate the data risks that classification uncovers.

Here are two examples of how we apply this context:

1. Different Types of Databases

Personally Identifiable Information (PII) found in a database that only users from the Analytics team can access is often a privacy violation and a data risk. On the other hand, PII found in a database that only three production microservices can access is expected, but the data must still be isolated within a secure VPC.

2. Different Access Histories

Suppose 100 employees have access to a sensitive shadow data lake, but only 10 people have actually accessed it in the last year. In this case, the solution would be to reduce permissions and implement stricter access controls. We’d also want to ensure that the data has the right retention policy, to reduce both risks and storage costs. Sentra’s risk score prioritization engine takes multiple data layers into account, including data access permissions, activity, sensitivity, movement, and misconfigurations, giving enterprises greater visibility and control over their data risk management processes.
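
The access-history example can be expressed as a small check that compares granted permissions with actual usage. This is only a sketch: the `AccessGrant` record, the `stale_grants` function, and the 365-day idle window are assumptions chosen to mirror the 100-versus-10 scenario, not an actual Sentra data model.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class AccessGrant:
    """Hypothetical record: who holds access and when they last used it."""
    principal: str
    last_accessed: Optional[date]  # None means the grant was never used


def stale_grants(grants: List[AccessGrant], today: date, max_idle_days: int = 365) -> List[str]:
    """Return principals whose access was unused within the idle window."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        grant.principal
        for grant in grants
        if grant.last_accessed is None or grant.last_accessed < cutoff
    ]


today = date(2023, 9, 7)
grants = (
    [AccessGrant(f"active-{i}", today - timedelta(days=30)) for i in range(10)]
    + [AccessGrant(f"idle-{i}", None) for i in range(90)]
)

to_review = stale_grants(grants, today)
unused_share = len(to_review) / len(grants)  # 90 of 100 grants are candidates for removal
```

In practice the last-access dates would come from cloud audit logs, and the resulting list would feed a permission review rather than an automatic revocation.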

With regard to costs, Sentra’s Data Security Posture Management (DSPM) solution uses innovative features that make its scanning and classification about two to three orders of magnitude more cost-efficient than legacy solutions. The first is smart sampling: Sentra clusters multiple data units that share the same characteristics and then uses intelligent, statistically relevant sampling to understand what sensitive data exists within those automatically grouped assets. This is especially powerful for data lakes, which often hold dozens of petabytes, and it does not compromise coverage or accuracy.
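
A simplified way to picture cluster-based sampling is shown below. It is a sketch under stated assumptions: objects are grouped by parent prefix and file extension as a stand-in for "shared characteristics", and a handful of objects per group are chosen for scanning. The grouping key, the `per_cluster` value, and the function names are hypothetical, and a real system would likely use richer signals such as schema or content fingerprints.

```python
import random
from collections import defaultdict
from pathlib import PurePosixPath


def cluster_key(object_key: str):
    """Group objects that share a parent prefix and a file extension."""
    path = PurePosixPath(object_key)
    return (str(path.parent), path.suffix)


def plan_sampled_scan(object_keys, per_cluster=3, seed=0):
    """Pick a few representative objects from each cluster to scan."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for key in object_keys:
        clusters[cluster_key(key)].append(key)
    return {
        group: rng.sample(members, min(per_cluster, len(members)))
        for group, members in clusters.items()
    }


# Synthetic bucket listing: 10,000 similar log partitions and 200 CSV exports.
keys = (
    [f"logs/2023/09/part-{i:05d}.parquet" for i in range(10_000)]
    + [f"exports/customers/batch-{i}.csv" for i in range(200)]
)

plan = plan_sampled_scan(keys)
objects_to_read = sum(len(sampled) for sampled in plan.values())
```

In this toy listing, 6 objects are read instead of 10,200. The classification found in each sample would then be attributed to the whole cluster, which is where the cost saving comes from.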

Second, Sentra’s modern architecture leverages the benefits of cloud ephemeral resources, such as snapshotting and ephemeral compute workloads with a cloud-native orchestration technology that leverages the elasticity and the scale of the cloud. Sentra balances its resource utilization with the needs of the customer's business, providing advanced scan settings that are built and designed for the cloud. This allows teams to optimize cost according to their business needs, such as determining the frequency and sampling of scans, among more advanced features.

To summarize:

  1. Given the current macroeconomic climate, CISOs should see DSPMs like Sentra as an opportunity to increase their security and minimize their costs
  2. DSPM solutions like Sentra bring important context awareness to security teams and tools, allowing them to do better risk management and prioritization by focusing on what’s important
  3. Data is likely to remain the most important asset of every business as more organizations embrace the power of the cloud. Therefore, a DSPM will be a pivotal tool in realizing the true value of the data while ensuring it is always secured
  4. Accuracy is key, and AI is an enabler for a good data classification tool


Yair brings a wealth of experience in cybersecurity and data product management. In his previous role, Yair led product management at Microsoft and Datadog. With a background as a member of the IDF's Unit 8200 for five years, he possesses over 18 years of expertise in enterprise software, security, data, and cloud computing. Yair has held senior product management positions at Datadog, Digital Asset, and Microsoft Azure Protection.


Latest Blog Posts

Yair Cohen
February 5, 2026
3 Min Read

OpenClaw (MoltBot): The AI Agent Security Crisis Enterprises Must Address Now


OpenClaw, previously known as MoltBot, isn't just another cybersecurity story - it's a wake-up call for every organization. With over 150,000 GitHub stars and more than 300,000 users in just two months, OpenClaw’s popularity signals a huge change: autonomous AI agents are spreading quickly and dramatically broadening the attack surface in businesses. This is far beyond the risks of a typical ChatGPT plugin or a staff member pasting data into a chatbot. These agents live on user machines and servers with shell-level access, file system privileges, live memory control, and broad integration abilities, usually outside IT or security’s purview.

Older perimeter and endpoint security tools weren’t built to find or control agents that can learn, store information, and act independently in all kinds of environments. As organizations face this shadow AI risk, the need for real-time, data-level visibility becomes critical. Enter Data Security Posture Management (DSPM): a way for enterprises to understand, monitor, and respond to the unique threats that OpenClaw and its next-generation kin pose.

What makes OpenClaw different - and uniquely dangerous - for security teams?

OpenClaw runs by setting up a local HTTP server and agent gateway on endpoints. It provides shell access, automates browsers, and links with over 50 messaging platforms. But what really sets it apart is how it combines these features with persistent memory. That means agents can remember actions and data far better than any script or bot before. Palo Alto Networks describes this as the 'lethal trifecta' (direct access to private data, exposure to untrusted content, and communication outside the organization) compounded by persistent memory.

This risk isn't hypothetical. OpenClaw’s skill ecosystem functions like an unguarded software supply chain. Any third-party 'skill' a user adds to an agent can run with full privileges, opening doors to vulnerabilities that original developers can’t foresee. While earlier concerns focused on employees leaking information to public chatbots, tools like OpenClaw operate quietly at system level, often without IT noticing.

From theory to reality: OpenClaw exploitation is active and widespread

This threat is already real. OpenClaw’s design has exposed thousands of organizations to actual attacks. For instance, CVE-2026-25253 is a severe remote code execution flaw caused by a WebSocket validation error, with a CVSS score of 8.8. It lets attackers compromise an agent with a single click.

Attackers wasted no time. The ClawHavoc malware campaign, for example, distributed more than 341 malicious 'skills', using OpenClaw’s official marketplace to push info-stealers and RATs directly into vulnerable environments. Over 21,000 exposed OpenClaw instances have turned up on the public internet, often protected by nothing stronger than a weak password, or by no authentication at all. Researchers even found plaintext password storage in the code. The risk is both immediate and persistent.

The shadow AI dimension: why you’re likely exposed

One of the trickiest parts of OpenClaw and MoltBot is how easily they run outside official oversight. Research shows that more than 22% of enterprise customers have found MoltBot operating without IT approval. Agents connect with personal messaging apps, making it easy for employees to use them on devices IT doesn’t manage, creating blind spots in endpoint management.

This reflects a bigger shift: 68% of employees now access free AI tools using personal accounts, and 57% still paste sensitive data into these services. The risks tied to shadow AI keep rising, and so does the cost of breaches: incidents involving unsanctioned AI tools now cost on average $670,000 more than those without. No wonder experts at Palo Alto, Straiker, Google Cloud, and Intruder strongly advise enterprises to block or at least closely watch OpenClaw deployments.

Why classic security tools are defenseless - and why DSPM is essential

Despite many advances in endpoint, identity, and network defense, these tools fall short against AI agents such as OpenClaw. Agents often run code with system privileges and communicate independently, sometimes over encrypted or unfamiliar channels. This blinds existing security tools to what internal agent 'skills' are doing or what data they touch and process. The attack surface now includes prompt injection through emails and documents, poisoning of agent memory, delayed attacks, and natural language input that bypasses static scans.

The missing link is visibility: understanding what data any AI agent - sanctioned or shadow - can access, process, or send out. Data Security Posture Management (DSPM) responds to this by mapping what data AI agents can reach and tracing sensitive data to and from agents everywhere they run. Newer DSPM features such as real-time risk scoring, shadow AI discovery, and detailed flow tracking help organizations see and control risks from AI agents at the data layer.

Immediate enterprise action plan: detection, mapping, and control

Security teams need to move quickly. Start by scanning for OpenClaw, MoltBot, and other shadow AI agents across endpoints, networks, and SaaS apps. Once you know where agents are, check which sensitive data they can access by using DSPM tools with AI agent awareness, such as those from Sentra. Treat unauthorized installations as active security incidents: following expert recommendations, reset credentials, investigate activity, and prevent agents from running on your systems.

For long-term defense, add continuous shadow AI tracking to your operations. Let DSPM keep your data inventory current, trace possible leaks, and set the right controls for every workflow involving AI. Sentra gives you a single place to find all agent activity, see your actual AI data exposure, and take fast, business-aware action.

Conclusion

OpenClaw is simply the first sign of what will soon be a string of AI agent-driven security problems for enterprises. As companies use AI more to boost productivity and automate work, the chance of unsanctioned agents acting with growing privileges and integrations will continue to rise. Gartner expects that by 2028, one in four cyber incidents will stem from AI agent misuse - and attacks have already started to appear in the news.

Success with AI is no longer about whether you use agents like OpenClaw; it’s about controlling how far they reach and what they can do. Old-school defenses can’t keep up with how quickly shadow AI spreads. Only data-focused security, with total AI agent discovery, risk mapping, and ongoing monitoring, can provide the clarity and controls needed for this new world. Sentra's DSPM platform offers precisely that. Take the first steps now: identify your shadow AI risks, map out where your data can go, and make AI agent security a top priority.

David Stuart
Nikki Ralston
February 4, 2026
3 Min Read

DSPM Dirty Little Secrets: What Vendors Don’t Want You to Test


Discover What DSPM Vendors Try to Hide

Your goal in running a data security/DSPM POV is to evaluate all important performance and cost parameters so you can make the best decision and avoid unpleasant surprises. Vendors, on the other hand, are looking for a ‘quick win’ and will often suggest shortcuts like using a limited test data set and copying your data to their environment.

On the surface this might sound like a reasonable approach, but if you don’t test real data types and volumes in your own environment, the POV process may hide costly failures or compliance violations that will quickly become apparent in production. A recent evaluation of Sentra versus another top emerging DSPM exposed how the other solution’s performance dropped and costs skyrocketed when deployed at petabyte scale. Worse, the emerging DSPM removed data from the customer environment - a clear controls violation.

If you want to run a successful POV and avoid DSPM buyers' remorse you need to look out for these "dirty little secrets".

Dirty Little Secret #1:
‘Start small’ can mean ‘fails at scale’

The biggest 'dirty secret' is that scalability limits are hidden behind the 'start small' suggestion. Many DSPM platforms cannot scale to modern petabyte-sized data environments. Vendors try to conceal this architectural weakness by encouraging small, tightly scoped POVs that never stress the system and create false confidence. Upon broad deployment, this weakness is quickly exposed as scans slow and refresh cycles stretch, forcing teams to drastically reduce scope or frequency. The failure is fundamentally architectural: these platforms lack parallel orchestration and elastic execution, and the 'start small' advice serves to avoid exposing that bottleneck. In a recent POV, Sentra successfully scanned 10x more data than the alternative in approximately the same time.

Dirty Little Secret #2:
High cloud cost breaks continuous security

Another reason some vendors try to limit the scale of POVs is to hide the real cloud cost of running them in production. They often use brute-force scanning that reads excessive data, consumes massive compute resources, and is architecturally inefficient. This is easy to mask during short, limited POVs, but quickly drives up cloud bills in production. The resulting cost pressure forces organizations to reduce scan frequency and scope, quietly shifting the platform from continuous security control to periodic inventory. Ultimately, tools that cannot scale scanners efficiently on demand, or that scan infrequently, trade essential security for cost, which shows they are only affordable when they are not fully utilized. In a recent POV run on 100 petabytes of data, Sentra proved to be 10x more operationally cost-effective to run.

Dirty Little Secret #3:
‘Good enough’ accuracy degrades security

Accuracy is fundamental to Data Security Posture Management (DSPM) and should not be compromised. While a difference of a few points may not seem like a deal breaker, every percentage point of classification accuracy can dramatically affect all downstream security controls. Costs increase as manual intervention is required to address false positives. When organizations automate controls based on these inaccuracies, the DSPM platform becomes a source of risk, and confidence is lost. The secret is kept safe because the POV never validates the platform's accuracy against known sensitive data.

In a recent POV, Sentra demonstrated false positive and false negative rates of less than one percent.

DSPM POV Red Flags 

  • Copy data to the vendor environment for a “quick win”
  • Limit features or capabilities to simplify testing
  • Artificially reduce the size of scanned data
  • Restrict integrations to avoid “complications”
  • Limit or avoid API usage

These shortcuts don’t make a POV easier - they make it misleading.

Four DSPM POV Requirements That Expose the Truth

If you want a DSPM POV that reflects production reality, insist on these requirements:

1. Scalability

Run discovery and classification on at least 1 petabyte of real data, including unstructured object storage. Completion time must be measured in hours or days - not weeks.
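
To put "hours or days, not weeks" in perspective, a quick back-of-the-envelope calculation shows the sustained read throughput a full scan of 1 petabyte would need. The figures assume decimal units (1 PB = 10^15 bytes) and that every byte is read once, which is a simplifying assumption; sampling-based approaches would need to read far less.

```python
PETABYTE = 10**15  # bytes, decimal definition


def required_throughput_gb_per_s(total_bytes: int, hours: float) -> float:
    """Sustained throughput (GB/s) needed to read `total_bytes` in `hours`."""
    seconds = hours * 3600
    return total_bytes / seconds / 10**9


one_day = required_throughput_gb_per_s(PETABYTE, 24)        # roughly 11.6 GB/s
one_week = required_throughput_gb_per_s(PETABYTE, 24 * 7)   # roughly 1.65 GB/s
```

Throughput on this order generally requires many scanners running in parallel, which is why the scalability test needs to run at real volume.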

2. Cost Efficiency

Operate scans continuously at scale and measure actual cloud resource consumption. If cost forces reduced frequency or scope, the model is unsustainable.

3. Accuracy

Validate results against known sensitive data. Measure false positives and false negatives explicitly. Accuracy must be quantified and repeatable.
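
One straightforward way to quantify accuracy during a POV is to compare the tool's findings with a labeled ground-truth set. The sketch below assumes you have a list of all scanned items, the subset you know to be sensitive, and the subset the tool flagged; the file names and counts are synthetic and the function name is hypothetical.

```python
def error_rates(predicted: set, actual: set, universe: set) -> dict:
    """Compute false positive and false negative rates from labeled sets.

    predicted: items the tool flagged as sensitive
    actual:    items known to be sensitive (ground truth)
    universe:  every item that was scanned
    """
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(universe - predicted - actual)
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "counts": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
    }


# Synthetic example: 1,000 scanned files, 200 of them known to be sensitive.
universe = {f"f{i}" for i in range(1000)}
actual = {f"f{i}" for i in range(200)}
predicted = {f"f{i}" for i in range(2, 205)}  # misses 2, wrongly flags 5

rates = error_rates(predicted, actual, universe)
```

Running the same comparison on each scan of the same labeled data is a simple way to check that the measured accuracy is repeatable.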

4. Unstructured Data Depth

Test long-form, heterogeneous, real-world unstructured data including audio, video, etc. Classification must demonstrate contextual understanding, not just keyword matches.

A DSPM solution that only performs well in a limited POV will lead to painful, costly buyer’s regret. Once in production, the failures in scalability, cost efficiency, accuracy, and unstructured data depth quickly become apparent.

Getting ready to run a DSPM POV? Schedule a demo.

David Stuart
January 28, 2026
3 Min Read

Data Privacy Day: Why Discovery Isn’t Enough


Data Privacy Day is a good reminder for all of us in the tech world: finding sensitive data is only the first step. In today’s environment, data is constantly moving across cloud platforms, SaaS applications, and AI workflows. The challenge isn’t just knowing where your sensitive data lives; it’s also understanding who or what can touch it, whether that access is still appropriate, and how it changes as systems evolve.

I’ve seen firsthand that privacy breaks down not because organizations don’t care, but because access decisions are often disconnected from how data is actually being used. You can have the best policies on paper, but if they aren’t continuously enforced, they quickly become irrelevant.

Discovery is Just the Beginning

Most organizations start with data discovery. They run scans, identify sensitive files, and map out where data lives. That’s an important first step, and it’s necessary, but it’s far from sufficient. Data is not static. It moves, it gets copied, it’s accessed by humans and machines alike. Without continuously governing that access, all the discovery work in the world won’t stop privacy incidents from happening.

The next step, and the one that matters most today, is real-time governance. That means understanding and controlling access as it happens. 

Who can touch this data? Why do they have access? Is it still needed? And crucially, how do these permissions evolve as your environment changes?

Take, for example, a contractor who needs temporary access to sensitive customer data. Or an AI workflow that processes internal HR information. If those access rights aren’t continuously reviewed and enforced, a small oversight can quickly become a significant privacy risk.

Privacy in an AI and Automation Era

AI and automation are changing the way we work with data, but they also change the privacy equation. Automated processes can move and use data in ways that are difficult to monitor manually. AI models can generate insights using sensitive information without us even realizing it. This isn’t a hypothetical scenario; it’s happening right now in organizations of all sizes.

That’s why privacy cannot be treated as a once-a-year exercise or a checkbox in an audit report. It has to be embedded into daily operations, into the way data is accessed, used, and monitored. Organizations that get this right build systems that automatically enforce policies and flag unusual access - before it becomes a problem.

Beyond Compliance: Continuous Responsibility

The companies that succeed in protecting sensitive data are those that treat privacy as a continuous responsibility, not a regulatory obligation. They don’t wait for audits or compliance reviews to take action. Instead, they embed privacy into how data is accessed, shared, and used across the organization.

This approach delivers real results. It reduces risk by catching misconfigurations before they escalate. It allows teams to work confidently with data, knowing that sensitive information is protected. And it builds trust, both internally and with customers, because people know their data is being handled responsibly.

A New Mindset for Data Privacy Day

So this Data Privacy Day, I challenge organizations to think differently. The question is no longer “Do we know where our sensitive data is?” Instead, ask:

“Are we actively governing who can touch our data, every moment, everywhere it goes?”

In a world where cloud platforms, AI systems, and automated workflows touch nearly every piece of data, privacy isn’t a one-time project. It’s a continuous practice, a mindset, and a responsibility that needs to be enforced in real time.

Organizations that adopt this mindset don’t just meet compliance requirements; they gain a competitive advantage. They earn trust, strengthen security, and maintain a dynamic posture that adapts as systems and access needs evolve.

Because at the end of the day, true privacy isn’t something you achieve once a year. It’s something you maintain every day, in every process, with every decision. This Data Privacy Day, let’s commit to moving beyond discovery and audits, and make continuous data privacy the standard.
