
Why Legacy Data Classification Tools Don't Work Well in the Cloud (But DSPM Does)

September 7, 2023
5
Min Read
Data Security

Data security teams are always trying to understand where their sensitive data is. Yet this goal has remained out of reach for a number of reasons.

The main difficulty is creating a continuously updated data catalog of all production and cloud data. Creating this catalog would involve:

  1. Identifying everyone in the organization with knowledge of any data stores and visibility into their contents
  2. Connecting a data classification tool to these data stores
  3. Ensuring network connectivity by configuring network and security policies
  4. Confirming that business-critical production systems using each data source won’t suffer performance or availability degradation

A process this complex requires a major investment of resources and long workflows, and will still likely fall short of the full coverage organizations are looking for. Many seemingly successful implementations of such solutions prove unreliable and too difficult to maintain within a short period of time.

Another pain point with legacy data classification solutions is accuracy. Data security professionals are all too aware of the problem of false positives (i.e., wrong classifications and data findings) and false negatives (i.e., sensitive data that goes unclassified and remains unknown). There are two main reasons for this.

 

  • Legacy classification solutions rely solely on patterns, such as regular expressions, to identify sensitive data - an approach that falls short for both structured and unstructured data. 
  • These solutions don’t understand the business context around the data, such as how it is used, by whom, and for what purposes.

Without this business context, security teams get no actionable guidance for removing or protecting sensitive data against data risks and security breaches.
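To see why pattern-only matching breaks down, consider a toy sketch (purely illustrative, not any vendor's implementation): a naive SSN regex both flags a harmless order reference and misses a real SSN written in an unexpected format.

```python
import re

# A naive SSN pattern of the kind legacy tools rely on.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

records = [
    "Order ref 123-45-6789 shipped",          # product reference, not an SSN
    "Customer social security: 987 65 4321",  # real SSN, unexpected format
]

hits = [r for r in records if SSN_PATTERN.search(r)]
# The pattern flags the harmless order reference (false positive)
# and misses the actual SSN (false negative).
```

Without context, the tool has no way to know that the first match is an order number and the second line is the one that matters.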

Lastly, there are the high operational costs. Legacy data classification solutions were not built for the cloud, where every data read/write and network operation has a price tag.

The cloud also offers far more cost-efficient storage and advanced data services, which lead organizations to store much more data than they did before moving to the cloud. At the same time, public cloud providers offer a variety of cloud-native APIs and mechanisms that can greatly benefit a data classification and security solution: automated backups, cross-account federation, direct access to block storage, storage classes, compute instance types, and much more. Legacy data classification tools, which were not built for the cloud, ignore these capabilities entirely, making them an extremely expensive choice for cloud-native organizations.

DSPM: Built to Solve Data Classification in the Cloud 

These challenges have led to the growth of a new approach to securing cloud data: Data Security Posture Management, or DSPM. Sentra’s DSPM provides full coverage and an up-to-date data catalog with classification of sensitive data, without any complex deployment or operational work. This is achieved through a cloud-native, agentless architecture that uses cloud-native APIs and mechanisms.

A good example of this approach is how Sentra’s DSPM architecture leverages the public cloud mechanism of automated backups for compute instances, block storage, and more. This allows Sentra to securely run its robust discovery and classification technology from within the customer’s premises, in any VPC or subscription/account of the customer’s choice.

This offers a number of benefits:

  1. The organization does not need to change any existing infrastructure configuration, network policies, or security groups.
  2. There’s no need to provide individual credentials for each data source in order for Sentra to discover and scan it.
  3. There is never a performance impact on compute-bound production workloads, such as virtual machines. In fact, Sentra’s scanning never connects to those data stores via the network or application layers.

Another benefit of a DSPM built for the cloud is classification accuracy. Sentra’s DSPM provides an unprecedented level of accuracy thanks to modern, cloud-native capabilities. This starts with advanced statistical relevance for structured data, enabling our classification engine to determine with high confidence that sensitive data is present in a specific column or field, without scanning every row in a large table.
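The idea of statistical relevance can be sketched in a few lines (a simplified illustration with assumed names, not Sentra's engine): sample a subset of a column's values, and classify the whole column when enough samples match a detector.

```python
import random

def column_looks_sensitive(values, detector, sample_size=100, threshold=0.9):
    """Sample rows from a column; if enough sampled values match the
    detector, classify the whole column as sensitive - no full scan."""
    sample = random.sample(values, min(sample_size, len(values)))
    matches = sum(1 for v in sample if detector(v))
    return matches / len(sample) >= threshold

# Hypothetical detector: 16-digit strings treated as card numbers.
def looks_like_card(value):
    return value.isdigit() and len(value) == 16

column = ["4000123412341234"] * 100_000  # a large card-number column
column_looks_sensitive(column, looks_like_card)  # True from ~100 samples
```

Roughly 100 reads stand in for 100,000, which is where the cost savings on cloud read operations come from.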

Sentra leverages even more advanced algorithms for key-value stores and document databases. For unstructured data, AI- and LLM-based algorithms unlock tremendous accuracy in detecting sensitive data types by understanding the context within the data itself. Lastly, the combination of data-centric and identity-centric security approaches provides the greater context that lets Sentra’s users know what actions to take to remediate data risks.

Here are two examples of how we apply this context:

1. Different Types of Databases

Personally Identifiable Information (PII) found in a database to which only users from the Analytics team have access is often a privacy violation and a data risk. On the other hand, PII found in a database that only three production microservices can access is expected, but requires the data to be isolated within a secure VPC.

2. Different Access Histories

Suppose 100 employees have access to a sensitive shadow data lake, but only 10 have actually accessed it in the last year. In this case, the solution would be to reduce permissions and implement stricter access controls. We’d also want to ensure the data has the right retention policy, to reduce both risk and storage costs. Sentra’s risk score prioritization engine takes multiple data layers into account - including access permissions, activity, sensitivity, movement, and misconfigurations - giving enterprises greater visibility and control over their data risk management processes.
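A minimal sketch of this kind of access analysis (hypothetical function and data names, not Sentra's implementation): compare the set of granted identities against those seen in recent access logs to find grants that are candidates for revocation.

```python
from datetime import datetime, timedelta

def stale_grants(granted, access_log, window_days=365):
    """Identities that hold access but have not used it in the window."""
    cutoff = datetime.now() - timedelta(days=window_days)
    active = {who for who, when in access_log if when >= cutoff}
    return granted - active

granted = {f"user{i}" for i in range(100)}              # 100 employees with access
recent = datetime.now() - timedelta(days=30)
access_log = [(f"user{i}", recent) for i in range(10)]  # only 10 used it
len(stale_grants(granted, access_log))  # 90 grants to consider revoking
```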

With regard to costs, Sentra’s Data Security Posture Management (DSPM) solution utilizes innovative features that make its scanning and classification roughly two to three orders of magnitude more cost-efficient than legacy solutions. The first is smart sampling: Sentra clusters multiple data units that share the same characteristics and, using intelligent sampling with statistical relevance, determines what sensitive data exists within these automatically grouped data assets. This is especially powerful for data lakes that often span dozens of petabytes, and it does not compromise coverage or accuracy.
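A simplified sketch of the clustering idea (the signature function is an assumption for illustration, not Sentra's algorithm): group objects that share the same characteristics, then scan only a small sample from each group.

```python
from collections import defaultdict
import random

def cluster_and_sample(objects, signature, per_cluster=2):
    """Group data units that share the same characteristics, then scan
    only a small sample from each cluster instead of every object."""
    clusters = defaultdict(list)
    for obj in objects:
        clusters[signature(obj)].append(obj)
    return {sig: random.sample(objs, min(per_cluster, len(objs)))
            for sig, objs in clusters.items()}

# Hypothetical signature: file extension plus a coarse size bucket (MB).
def sig(obj):
    name, size = obj
    return (name.rsplit(".", 1)[-1], size // 1_000_000)

objs = [(f"logs/part-{i}.parquet", 5_000_000) for i in range(100_000)]
samples = cluster_and_sample(objs, sig)
# 100,000 similar objects collapse into one cluster; only 2 get scanned.
```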

Second, Sentra’s modern architecture uses ephemeral cloud resources, such as snapshots and ephemeral compute workloads, with cloud-native orchestration that takes advantage of the elasticity and scale of the cloud. Sentra balances its resource utilization with the needs of the customer's business, providing advanced scan settings designed for the cloud. This allows teams to optimize cost according to their business needs, such as by setting the frequency and sampling of scans, among other advanced features.

To summarize:

  1. Given the current macroeconomic climate, CISOs should view DSPMs like Sentra as an opportunity to increase their security while minimizing their costs
  2. DSPM solutions like Sentra bring important context-awareness to security teams and tools, enabling better risk management and prioritization by focusing on what’s important
  3. Data is likely to remain the most important asset of every business as more organizations embrace the power of the cloud; a DSPM will therefore be a pivotal tool in realizing the true value of data while keeping it secure
  4. Accuracy is key, and AI is an enabler of a good data classification tool

<blogcta-big>

Yair brings a wealth of experience in cybersecurity and data product management. In his previous role, Yair led product management at Microsoft and Datadog. With a background as a member of the IDF's Unit 8200 for five years, he possesses over 18 years of expertise in enterprise software, security, data, and cloud computing. Yair has held senior product management positions at Datadog, Digital Asset, and Microsoft Azure Protection.


Latest Blog Posts

Ofir Yehoshua
November 17, 2025
4
Min Read

How to Gain Visibility and Control in Petabyte-Scale Data Scanning

Every organization today is drowning in data - millions of assets spread across cloud platforms, on-premises systems, and an ever-expanding landscape of SaaS tools. Each asset carries value, but also risk. For security and compliance teams, the mandate is clear: sensitive data must be inventoried, managed and protected.

Scanning every asset for security and compliance is no longer optional; it’s the line between trust and exposure, between resilience and chaos.

Many data security tools promise to scan and classify sensitive information across environments. In practice, doing this effectively at scale demands more than raw ‘brute force’ scanning power. It requires robust visibility and management capabilities: a cockpit view that lets teams monitor coverage, prioritize intelligently, and strike the right balance between scan speed, cost, and accuracy.

Why Scan Tracking Is Crucial

Scanning is not instantaneous. Depending on the size and complexity of your environment, it can take days, sometimes even weeks, to complete. Meanwhile, new data is constantly being created or modified, adding to the challenge.

Without clear visibility into the scanning process, organizations face several critical obstacles:

  • Unclear progress: It’s often difficult to know what has already been scanned, what is currently in progress, and what remains pending. This lack of clarity creates blind spots that undermine confidence in coverage.

  • Time estimation gaps: In large environments, it’s hard to know how long scans will take because so many factors come into play: the number of assets, their size, the type of data (structured, semi-structured, or unstructured), and how much scanner capacity is available. As a result, predicting when you’ll reach full coverage is tricky. This becomes especially stressful when scans need to be completed before a fixed deadline, like a compliance audit.

    “With Sentra’s Scan Dashboard, we were able to quickly scale up our scanners to meet a tight audit deadline, finish on time, and then scale back down to save costs. The visibility and control it gave us made the whole process seamless,” said the CISO of a large retailer.
  • Poor prioritization: Not all environments or assets carry the same importance. Yet without visibility into scan status, teams struggle to balance historical scans of existing assets with the ongoing influx of newly created data, making it nearly impossible to prioritize effectively based on risk or business value.
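The time-estimation factors above reduce to simple arithmetic for a rough forecast; here is a sketch with illustrative numbers (not Sentra's actual model, which accounts for data type and asset mix as well).

```python
def scan_eta_days(total_bytes, scanned_bytes, scanners, per_scanner_daily_bytes):
    """Rough completion estimate: remaining volume over daily throughput."""
    remaining = total_bytes - scanned_bytes
    return remaining / (scanners * per_scanner_daily_bytes)

# 2 PB estate, 0.5 PB already scanned, 10 scanners at 25 TB/day each.
PB, TB = 10**15, 10**12
eta = scan_eta_days(2 * PB, 0.5 * PB, 10, 25 * TB)  # 6.0 days remaining
```

The same arithmetic run backwards answers the audit-deadline question: given a due date, how many scanners do you need to add?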

Sentra’s End-to-End Scanning Workflow

Managing scans at petabyte scale is complex. Sentra streamlines the process with a workflow built for scale, clarity, and control that features:

1. Comprehensive Asset Discovery

Before scanning even begins, Sentra automatically discovers assets across cloud platforms, on-premises systems, and SaaS applications. This ensures teams have a complete, up-to-date inventory and visual map of their data landscape, so no environment or data store is overlooked.

Example: New S3 buckets, a freshly deployed BigQuery dataset, or a newly connected SharePoint site are automatically identified and added to the inventory.

Comprehensive Asset Discovery with Sentra

2. Configurable Scan Management

Administrators can fine-tune how scans are executed to meet their organization’s needs. With flexible configuration options - such as the number of scanners, sampling rates, and prioritization rules - teams can strike the right balance between scan speed, coverage, and cost control.

For instance, compliance-critical assets can be scanned at full depth immediately, while less critical environments can run at reduced sampling to save on compute consumption and costs.

3. Real-Time Scan Dashboard

Sentra’s unified Scan Dashboard provides a cockpit view into scanning operations, so teams always know where they stand. Key features include:

  • Daily scan throughput correlated with the number of active scanners, helping teams understand efficiency and predict completion times.
  • Coverage tracking that visualizes overall progress and highlights which assets remain unscanned.
  • Decision-making tools that allow teams to dynamically adjust, whether by adding scanner capacity, changing sampling rates, or reordering priorities when new high-risk assets appear.

Real-Time Scan Dashboard with Sentra

Handling Data Changes

The challenge doesn’t end once the initial scans are complete. Data is dynamic, new files are added daily, existing records are updated, and sensitive information shifts locations. Sentra’s activity feeds give teams the visibility they need to understand how their data landscape is evolving and adapt their data security strategies in real time.


Conclusion

Tracking scan status at scale is complex but critical to any data security strategy. Sentra provides an end-to-end view and unmatched scan control, helping organizations move from uncertainty to confidence with clear prediction of scan timelines, faster troubleshooting, audit-ready compliance, and smarter, cost-efficient decisions for securing data.

<blogcta-big>

Ward Balcerzak
November 12, 2025
4
Min Read
Data Security

Best DSPM Tools: Top 9 Vendors Compared

Enhanced DSPM Adoption Is the Most Important Data Security Trend of 2026

Over the past few years, organizations have realized that traditional security tools can’t keep pace with how data moves and grows today. Volumes of sensitive data are exploding across multi-cloud environments, SaaS platforms, and AI systems, often without full visibility for the teams responsible for securing it. Unstructured data presents the greatest risk, representing over 80% of corporate data.

That’s why Data Security Posture Management (DSPM) has become a critical part of the modern security stack. DSPM tools help organizations automatically discover, classify, monitor, and protect sensitive data - no matter where it lives or travels.

But in 2026, the data security game is changing. Many DSPMs can tell you what your data is, but more is needed. Leading DSPM platforms are going beyond visibility: they’re delivering real-time, AI-enhanced contextual business insights, automated remediation, and AI-aware, accurate protection that scales with your dynamic data.

AI-enhanced DSPM Capabilities in 2026

Not all DSPM tools are built the same. The top platforms share a few key traits that define the next generation of data security posture management:

  • Continuous discovery and classification at scale: Real-time visibility into all sensitive data across cloud, SaaS, and on-prem systems, with petabyte-scale efficiency that allows a scanning frequency commensurate with business risk.
  • Contextual risk analysis: Understanding what data is sensitive, who can access it, and how it’s being used - the business context around data - so that appropriate actions can be taken.
  • Automated remediation: Native capabilities and integrations with systems that correct risky configurations or excessive access automatically.
  • Integration and scalability: Seamless connections to CSPM, SIEM, IAM, ITSM, and SOAR tools to unify data risk management and streamline workflows.
  • AI and model governance: Capabilities to secure data used in GenAI agents, copilot assistants, and pipelines.

Top DSPM Tools to Watch in 2026

Based on recent analyst coverage, market growth, and innovation across the industry, here are the top DSPM platforms to watch this year, each contributing to how data security is evolving.

1. Sentra

As a cloud-native DSPM platform, Sentra focuses on continuous data protection, not just visibility. It discovers and accurately classifies sensitive data in real time across all cloud environments, while automatically remediating risks through policy-driven automation.

What sets Sentra apart:

  • Continuous, automated discovery and classification across your entire data estate - cloud, SaaS, and on-premises.
  • Business-contextual insights that understand the purpose of data, accurately linking data, identity, and risk.
  • Automatic learning to discern customer-unique data types and continuously improve labeling over time.
  • Petabyte scalability and low compute consumption for 10x cost efficiency.
  • Automated remediation workflows and integrations to fix issues instantly.
  • Built-in coverage for data flowing through AI and SaaS ecosystems.

Ideal for: Security teams looking for a cloud-native DSPM platform built for scalability in the AI era with automation at its core.

2. BigID

A pioneer in data discovery and classification, BigID bridges DSPM and privacy governance, making it a good choice for compliance-heavy sectors.


Ideal for: Organizations prioritizing data privacy, governance, and audit readiness.

3. Prisma Cloud (Palo Alto Networks)

Prisma’s DSPM offering integrates closely with CSPM and CNAPP components, giving security teams a single pane of glass for infrastructure and data risk.


Ideal for: Enterprises with hybrid or multi-cloud infrastructures already using Palo Alto tools.

4. Microsoft Purview / Defender DSPM

Microsoft continues to invest heavily in DSPM through Purview, offering rich integration with Microsoft 365 and Azure ecosystems. Note: Sentra integrates with Microsoft Purview Information Protection (MPIP) labeling and DLP policies.

Ideal for: Microsoft-centric organizations seeking native data visibility and compliance automation.

5. Securiti.ai

Positioned as a “Data Command Center,” Securiti unifies DSPM, privacy, and governance. Its strength lies in automation and compliance visibility and SaaS coverage.


Ideal for: Enterprises looking for an all-in-one governance and DSPM solution.

6. Cyera

Cyera has gained attention for serving the SMB segment with its DSPM approach. It uses LLMs for data context, supplementing other classification methods, and provides integrations to IAM and other workflow tools.


Ideal for: Small/medium growing companies that need basic DSPM functionality.

7. Wiz

Wiz continues to lead in cloud security, having added DSPM capabilities into its CNAPP platform. They’re known for deep multi-cloud visibility and infrastructure misconfiguration detection.

Ideal for: Enterprises running complex cloud environments looking for infrastructure vulnerability and misconfiguration management.

8. Varonis

Varonis remains a strong player for hybrid and on-prem data security, with deep expertise in permissions and access analytics and focus on SaaS/unstructured data.


Ideal for: Enterprises with legacy file systems or mixed cloud/on-prem architectures.

9. Netwrix

Netwrix’s platform incorporates DSPM-related features into its auditing and access control suite.

Ideal for: Mid-sized organizations seeking DSPM as part of a broader compliance solution.

Emerging DSPM Trends to Watch in 2026

  1. AI Data Security: As enterprises adopt GenAI, DSPM tools are evolving to secure data used in training and inference.

  2. Identity-Centric Risk: Understanding and controlling both human and machine identities is now central to data posture.

  3. Automation-Driven Security: Remediation workflows are becoming the differentiator between “good” and “great.”

  4. Market Consolidation: Expect to see CNAPP, legacy security, and cloud vendors acquiring DSPM startups to strengthen their coverage.

How to Choose the Right DSPM Tool

When evaluating a DSPM solution, align your choice with your data landscape and goals:

  • Cloud-Native Company: Choose tools designed for cloud-first environments (like Sentra, Securiti, Wiz).
  • Compliance Priority: Platforms like Sentra, BigID, or Securiti excel in privacy and governance.
  • Microsoft-Heavy Stack: Purview and Sentra DSPM offer native integration.
  • Hybrid Environment: Consider Varonis, Prisma Cloud, or Sentra for extended visibility.
  • Enterprise Scalability: Evaluate deployment ease, petabyte scalability, cloud resource consumption, scanning efficiency, and more (Sentra excels here).

Pro Tip: Run a proof of concept (PoC) across multiple environments to test scalability, accuracy, and operational cost-effectiveness before full deployment.

Final Thoughts: DSPM Is About Action

The best DSPM tools in 2026 share one core principle: they help organizations move from visibility to action.

At Sentra, we believe that the future of DSPM lies in continuous, automated data protection:

  • Real-time discovery of sensitive data at scale
  • Context-aware prioritization for business insight
  • Automated remediation that reduces risk instantly

As data continues to power AI, analytics, and innovation, DSPM ensures that innovation never comes at the cost of security. See how Sentra helps leading enterprises protect data across multi-cloud and SaaS environments.

<blogcta-big>

Gilad Golani
November 6, 2025
4
Min Read

How SLMs (Small Language Models) Make Sentra’s AI Faster and More Accurate

The LLM Hype, and What’s Missing

Over the past few years, large language models (LLMs) have dominated the AI conversation. From writing essays to generating code, LLMs like GPT-4 and Claude have proven that massive models can produce human-like language and reasoning at scale.

But here's the catch: not every task needs a 70-billion-parameter model. Parameters are computationally expensive - they require both memory and processing time.

At Sentra, we discovered early on that the work our customers rely on - accurate, scalable classification of massive data flows - isn’t about writing essays or generating text. It’s about making decisions quickly, reliably, and cost-effectively across dynamic, real-world data environments. While large language models (LLMs) are excellent general problem solvers, they create a lot of unnecessary computational overhead for this task.

That’s why we’ve shifted our focus toward Small Language Models (SLMs): compact, specialized models purpose-built for a single task - understanding and classifying data efficiently. By running hundreds of SLMs in parallel on regular CPUs, Sentra delivers faster insights, stronger data privacy, and a dramatically lower total cost of AI-based classification - one that scales with customers’ businesses, not their cloud bills.

What Is an SLM?

An SLM is a smaller, domain-specific version of a language model. Instead of trying to understand and generate any kind of text, an SLM is trained to excel at a particular task, such as identifying the topic of a document (what the document is about or what type of document it is), or detecting sensitive entities within documents, such as passwords, social security numbers, or other forms of PII.

In other words: if an LLM is a generalist, an SLM is a specialist. At Sentra, we use SLMs that are tuned and optimized for security data classification, allowing them to process high volumes of content with remarkable speed, consistency, and precision. These SLMs are based on standard open-source models, but trained on data curated by Sentra to achieve the level of accuracy that only Sentra can guarantee.

From LLMs to SLMs: A Strategic Evolution

Like many in the industry, we started by testing LLMs to see how well they could classify and label data. They were powerful, but also slow, expensive, and difficult to scale. Over time, it became clear that LLMs are too big and too expensive to run on customer data for Sentra to remain a viable, cost-effective solution for data classification.

Each SLM handles a focused part of the process: initial categorization, text extraction from documents and images, and sensitive-entity classification. The SLMs are not only accurate - even more accurate than LLMs classifying via prompts - they also run efficiently on standard CPUs, inside the customer’s environment, as part of Sentra’s scanners.

The Benefits of SLMs for Customers

a. Speed and Efficiency

SLMs process data faster because they’re lean by design. They don’t waste cycles generating full sentences or reasoning across irrelevant contexts. This means real-time or near-real-time classification, even across millions of data points.

b. Accuracy and Adaptability

SLMs are pre-trained, “zero-shot” language models that can categorize and classify generically, without being trained on a specific task in advance. “Zero-shot” means that regardless of the data a model was trained on, it can classify an arbitrary set of entities and document labels without training on each one specifically. This is possible because modern language models capture deep natural-language understanding during training.
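As a toy illustration of the zero-shot idea (word overlap standing in for the learned dense embeddings real models use): rank candidate labels by similarity between the document and each label's description, with no per-label training.

```python
from collections import Counter
import math

def embed(text):
    """Toy bag-of-words 'embedding'; real SLMs use learned dense vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm

def zero_shot(document, label_descriptions):
    """Pick the label whose description is most similar to the document."""
    doc_vec = embed(document)
    return max(label_descriptions,
               key=lambda label: cosine(doc_vec, embed(label_descriptions[label])))

labels = {
    "invoice": "invoice payment amount due billing",
    "resume": "resume work experience education skills",
}
zero_shot("please find attached the invoice with the amount due", labels)  # "invoice"
```

Swapping in a new label is just adding a new description - no retraining - which is what makes the zero-shot property useful for arbitrary customer data types.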

Even so, Sentra fine-tunes these models to further increase classification accuracy, curating a very large set of tagged data that resembles the data our customers typically encounter.

Our feedback loops ensure that model performance only gets better over time - a direct reflection of our customers’ evolving environments.

c. Cost and Sustainability

Because SLMs are compact, they require less compute power, which means lower operational costs and a smaller carbon footprint. This efficiency allows us to deliver powerful AI capabilities to customers without passing on the heavy infrastructure costs of running massive models.

d. Security and Control

Unlike LLMs hosted on external APIs, SLMs can be run within Sentra’s secure environment, preserving data privacy and regulatory compliance. Customers maintain full control over their sensitive information - a critical requirement in enterprise data security.

A Quick Comparison: SLMs vs. LLMs

The difference between SLMs and LLMs becomes clear when you look at their performance across key dimensions:

  • Speed: SLMs are fast and optimized for classification throughput; LLMs are slower and more compute-intensive for large-scale inference.
  • Cost: SLMs are cost-efficient; LLMs are expensive to run at scale.
  • Accuracy (for simple tasks): SLMs are optimized for classification; LLMs are comparable, with unnecessary overhead.
  • Deployment: SLMs are lightweight and easy to integrate; LLMs are complex and resource-heavy.
  • Adaptability (with feedback): SLMs are continuously fine-tuned, with the ability to fine-tune per customer; LLMs are harder to customize, and fine-tuning is costly.
  • Best use case: SLMs for classification, tagging, and filtering; LLMs for reasoning, analysis, generation, and synthesis.

Continuous Learning: How Sentra’s SLMs Grow

One of the most powerful aspects of our SLM approach is continuous learning. Each Sentra customer project contributes valuable insights, from new data patterns to evolving classification needs. These learnings feed back into our training workflows, helping us refine and expand our models over time.

While not every model retrains automatically, the system is built to support iterative optimization: as our team analyzes feedback and performance, models can be fine-tuned or extended to handle new categories and contexts.

The result is an adaptive ecosystem of SLMs that becomes more effective as our customer base and data diversity grow, ensuring Sentra’s AI remains aligned with real-world use cases.

Sentra’s Multi-SLM Architecture

Sentra’s scanning technology doesn’t rely on a single model. We run many SLMs in parallel, each specializing in a distinct layer of classification:

  1. Embedding models that convert data into meaningful vector representations
  2. Entity Classification models that label sensitive entities
  3. Document Classification models that label documents by type
  4. Image-to-text and speech-to-text models that are able to process non-textual data into textual data
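The layered approach above can be sketched as a composition of stages, with stubs standing in for the real models (all names are illustrative; the embedding stage is elided for brevity).

```python
# Stub stages standing in for the real specialized models.
def extract_text(doc):
    return doc["content"]  # image-to-text / speech-to-text stage

def classify_document(text):
    return "financial" if "invoice" in text else "other"  # document-type SLM

def classify_entities(text):
    # sensitive-entity SLM: here, 9-digit tokens as SSN-like entities
    return [w for w in text.split() if w.isdigit() and len(w) == 9]

def classify(document):
    """Compose the stages: extract text, label the document, find entities."""
    text = extract_text(document)
    return {"label": classify_document(text),
            "entities": classify_entities(text)}

classify({"content": "invoice for 123456789"})
# {'label': 'financial', 'entities': ['123456789']}
```

Because each stage is independent, many such pipelines can run in parallel across documents, which is what makes the multi-SLM design cheap to scale on CPUs.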

This layered approach allows us to operate at scale - quickly, cheaply, and with great results. In practice, that means faster insights, fewer errors, and a more responsive platform for every customer.

The Future of AI Is Specialized

We believe the next frontier of AI isn’t about who can build the biggest model, it’s about who can build the most efficient, adaptive, and secure ones.

By embracing SLMs, Sentra is pioneering a future where AI systems are purpose-built, transparent, and sustainable. Our approach aligns with a broader industry shift toward task-optimized intelligence - models that do one thing extremely well and can learn continuously over time.

Conclusion: The Power of Small

At Sentra, we’ve learned that in AI, bigger isn’t always better. Our commitment to SLMs reflects our belief that efficiency, adaptability, and precision matter most for customers. By running many small, smart models rather than a single massive one, we’re able to classify data faster, cheaper, and with greater accuracy - all while ensuring customer privacy and control.

In short: Sentra’s SLMs represent the power of small, and the future of intelligent classification.

<blogcta-big>
