
How Contextual Data Classification Complements Your Existing DLP

August 12, 2024
3 Min Read
Data Security

Using data loss prevention (DLP) technology is a given for many organizations. Because these solutions have historically been the best way to prevent data exposure, many organizations already have DLP solutions deeply entrenched within their infrastructure and security systems to assist with data discovery and classification.

However, as we discussed in a previous blog post about embracing cloud DLP and DSPM, traditional DLP often struggles to keep up with disparate cloud environments and the sheer volume of data that comes with them. As a result, many teams experience false alarms and alert fatigue — not to mention extensive manual tuning — as they try to get their DLP solutions to work with their cloud-based or hybrid data ecosystems. However, simply ripping out and replacing these solutions isn’t an option for most organizations, as they are costly and play such a significant role in security programs.

 

Many organizations need a complementary solution instead of a replacement for their DLP — something that will improve the effectiveness and accuracy of their existing data discovery and “border control” security technologies.

Contextual data classification can play this role. Its cloud-aware functionality discovers all data, identifies what data is at risk, gauges the actions cloud users take, and differentiates between routine activities and anomalies that could indicate actual threats. These insights can then be used to harden the policies and controls governing data movement.

Why Cloud Data Security Requires More than DLP

While traditional data loss prevention (DLP) technology plays an integral role in many businesses’ data security approaches, it can start to falter when used within a cloud environment. Why? DLP uses pre-defined patterns to detect suspicious activity. Often, this doesn’t work in the context of regular cloud activities. Here are the two main ways that DLP conflicts with the cloud:

Perimeter-Based Security Controls

DLP was originally created for on-premises environments with a clearly defensible perimeter. A DLP solution sees only general patterns, such as a file being sent, shared, or copied, and cannot capture nuance beyond this. So it often flags routine activities (e.g., sharing data with third-party applications) as suspicious during data discovery. When the DLP blocks these everyday actions, it impedes business velocity and alerts the security team needlessly.

In modern cloud-first organizations, data needs to move freely to and from the cloud to meet dynamic business demands. DLP is often too restrictive (or, conversely, too permissive) because it lacks a fundamental understanding of data sensitivity and only sees data when it moves, missing the opportunity to protect data at rest. If too restrictive, it can disrupt business. If too permissive, it can miss numerous insider, supply chain, or other threats that look like authorized activity to the DLP.

Limited Classification Engines

The classification engines built into traditional DLPs are limited to common data types, such as Social Security or credit card numbers. As a result, they can miss nuanced sensitive data, which is more common in a cloud ecosystem. For example, passport numbers stored alongside the passport holders’ names could pose a risk if exposed, while either the names or the numbers on their own are not. DLP solutions can also miss intellectual property or trade secrets - forms of data that weren’t even stored online twenty years ago but are now prevalent in cloud environments.
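To make the co-occurrence point concrete, here is a minimal, hypothetical sketch of proximity-based detection: a passport-like number is only flagged when a person’s name appears nearby. The patterns and window size are illustrative assumptions, far simpler than any production classifier.

```python
import re

# Illustrative patterns only - real classifiers are far more robust.
PASSPORT_RE = re.compile(r"\b[A-Z]{1,2}\d{6,8}\b")    # passport-like token
NAME_RE = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")  # naive "First Last" name

def co_occurrence_risk(text: str, window: int = 120) -> bool:
    """Flag text only when a passport-like number appears near a name."""
    for match in PASSPORT_RE.finditer(text):
        start = max(0, match.start() - window)
        if NAME_RE.search(text[start:match.end() + window]):
            return True   # number and name together: sensitive
    return False          # number alone (or name alone): not flagged

print(co_occurrence_risk("Jane Smith, passport AB1234567"))  # True
print(co_occurrence_risk("Ref code AB1234567"))              # False
```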

Data unique to an industry or a specific business may also be missed if proper classifiers don’t detect it. The ability to tailor classifiers for these proprietary data types is very important, but often absent in commercial DLP offerings.

Because of these limitations, many businesses see a gap between traditional DLP solutions' discovery and classification patterns and the realities of a multi-cloud and/or hybrid data estate.

Existing DLP solutions ultimately can’t comprehend what’s going on within a cloud environment because they don’t understand the following pieces of information:

  • Where sensitive data exists, whether within structured or unstructured data. 
  • Who uses it and how they use it in an everyday business context. 
  • Which data is likely sensitive because of its origins, neighboring data, or other unique characteristics.

Without this information, the DLP technology will likely flag non-risky actions as suspicious (e.g., blocking services in IaaS/PaaS environments) and overlook legitimate threats (e.g., exfiltration of unstructured sensitive data). 
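To picture what this missing context looks like in practice, here is a minimal sketch of a context record covering the three dimensions above. The field names are our own illustrative assumptions, not any vendor’s schema.

```python
from dataclasses import dataclass, field

@dataclass
class DataContext:
    """Illustrative context record for a single data asset."""
    location: str                 # where the data lives (bucket, share, table)
    structured: bool              # structured vs. unstructured store
    owners: list[str] = field(default_factory=list)     # who uses it day to day
    typical_access: str = "internal"                    # everyday usage pattern
    origin: str | None = None                           # e.g., copied from production
    neighbors: list[str] = field(default_factory=list)  # co-located sensitive types

ctx = DataContext(
    location="s3://finance-reports/q3/",
    structured=False,
    owners=["finance-team"],
    origin="copied-from-production",
    neighbors=["passport_number", "full_name"],
)
```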

Improve Data Security with Sentra’s Contextual Data Classification

Adding contextual data classification to your DLP can provide this much-needed context. Sentra’s DSPM solution offers data classification functionality that can work alongside or feed your existing DLP technology. We leverage LLM-based algorithms to accurately understand the context of where and how data is used, then detect when any sensitive data is misplaced or misused. Applicable sensitivity tags can be sent via API directly to the DLP solution for actioning.

When you integrate Sentra with your existing DLP solution, our classification engine tags and labels files, adding this rich contextual information as metadata.
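As a concrete example of metadata labeling, on AWS a sensitivity tag could be attached to an S3 object for a downstream DLP policy to read and enforce. This is a minimal sketch assuming configured boto3 credentials; the bucket, key, and tag names are illustrative, not Sentra’s actual schema.

```python
import boto3

s3 = boto3.client("s3")

# Attach illustrative sensitivity labels as S3 object tags that a
# downstream DLP policy can read and enforce.
s3.put_object_tagging(
    Bucket="finance-reports",
    Key="q3/forecast.xlsx",
    Tagging={
        "TagSet": [
            {"Key": "sensitivity", "Value": "confidential"},
            {"Key": "data-class", "Value": "financial-pii"},
        ]
    },
)
```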

 

Here are some examples of how our technology complements and extends the abilities of DLP solutions:

  1. Sentra can discover nuanced proprietary, sensitive data and detect new identifiers such as “transaction ID” or “intellectual property.” 
  2. Sentra can use exact data matching to detect whether data was partially copied from production and flag it as sensitive (a simplified sketch follows this list).
  3. Sentra can detect when a given file likely contains business context because of its owner, location, etc. For example, a file taken from the CEO’s Google Drive or from a customer’s data lake can be assumed to be sensitive.  
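To illustrate item 2, exact data matching can be approximated by fingerprinting production values and checking files for matching hashes, so even partial copies are caught. A minimal sketch, with an illustrative hashing scheme:

```python
import hashlib

def fingerprint(value: str) -> str:
    """Hash a normalized value so raw production data never leaves the source."""
    return hashlib.sha256(value.strip().lower().encode()).hexdigest()

# Fingerprints built once from the production dataset.
production_index = {fingerprint(v) for v in ["4111-1111-1111-1111", "jane@example.com"]}

def partially_copied(tokens: list[str], threshold: int = 1) -> bool:
    """Flag a file whose tokens match production fingerprints, even partially."""
    hits = sum(fingerprint(t) in production_index for t in tokens)
    return hits >= threshold

print(partially_copied(["jane@example.com", "meeting notes"]))  # True: partial copy
```

Because only hashes are compared, the check never requires the raw production values to be present where the scan runs.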

In addition, we offer a simple, agentless deployment and prioritize the security of your data by keeping it all within your environment during scanning.

Watch a one-minute video to learn more about how Sentra discovers and classifies nuanced, sensitive data in a cloud environment.


Roy Levine is the VP of R&D at Sentra. He brings nearly 20 years of experience in engineering, data, and AI, along with a strong background in senior management across startups and enterprises.


Latest Blog Posts

Ofir Yehoshua
November 17, 2025
4 Min Read

How to Gain Visibility and Control in Petabyte-Scale Data Scanning

Every organization today is drowning in data - millions of assets spread across cloud platforms, on-premises systems, and an ever-expanding landscape of SaaS tools. Each asset carries value, but also risk. For security and compliance teams, the mandate is clear: sensitive data must be inventoried, managed, and protected.

Scanning every asset for security and compliance is no longer optional; it’s the line between trust and exposure, between resilience and chaos.

Many data security tools promise to scan and classify sensitive information across environments. In practice, doing this effectively and at scale demands more than raw ‘brute force’ scanning power. It requires robust visibility and management capabilities: a cockpit view that lets teams monitor coverage, prioritize intelligently, and strike the right balance between scan speed, cost, and accuracy.

Why Scan Tracking Is Crucial

Scanning is not instantaneous. Depending on the size and complexity of your environment, it can take days - sometimes even weeks - to complete. Meanwhile, new data is constantly being created or modified, adding to the challenge.

Without clear visibility into the scanning process, organizations face several critical obstacles:

  • Unclear progress: It’s often difficult to know what has already been scanned, what is currently in progress, and what remains pending. This lack of clarity creates blind spots that undermine confidence in coverage.

  • Time estimation gaps: In large environments, it’s hard to know how long scans will take because so many factors come into play - the number of assets, their size, the type of data (structured, semi-structured, or unstructured), and how much scanner capacity is available. As a result, predicting when you’ll reach full coverage is tricky. This becomes especially stressful when scans must finish before a fixed deadline, such as a compliance audit (a rough estimation sketch follows this list).

    "With Sentra’s Scan Dashboard, we were able to quickly scale up our scanners to meet a tight audit deadline, finish on time, and then scale back down to save costs. The visibility and control it gave us made the whole process seamless”, said CISO of Large Retailer.
  • Poor prioritization: Not all environments or assets carry the same importance. Yet without visibility into scan status, teams struggle to balance historical scans of existing assets with the ongoing influx of newly created data, making it nearly impossible to prioritize effectively based on risk or business value.
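To see why estimation is tricky, a rough model is: time to full coverage equals remaining data volume divided by net scan rate (aggregate scanner throughput minus the rate of new data). A minimal sketch with hypothetical numbers:

```python
def scan_eta_days(remaining_tb: float, scanners: int,
                  tb_per_scanner_per_day: float,
                  new_data_tb_per_day: float = 0.0) -> float:
    """Rough ETA: remaining volume over net throughput (scan rate minus growth)."""
    net_rate = scanners * tb_per_scanner_per_day - new_data_tb_per_day
    if net_rate <= 0:
        raise ValueError("Data is growing faster than it can be scanned.")
    return remaining_tb / net_rate

# Hypothetical: 400 TB left, 8 scanners at 5 TB/day each, 10 TB/day of new data.
print(f"{scan_eta_days(400, 8, 5, 10):.1f} days")  # 13.3 days
```

Doubling to 16 scanners in this toy example cuts the estimate to under 6 days - exactly the scale-up lever the retailer quote above describes.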

Sentra’s End-to-End Scanning Workflow

Managing scans at petabyte scale is complex. Sentra streamlines the process with a workflow built for scale, clarity, and control that features:

1. Comprehensive Asset Discovery

Before scanning even begins, Sentra automatically discovers assets across cloud platforms, on-premises systems, and SaaS applications. This ensures teams have a complete, up-to-date inventory and visual map of their data landscape, so no environment or data store is overlooked.

Example: New S3 buckets, a freshly deployed BigQuery dataset, or a newly connected SharePoint site are automatically identified and added to the inventory.

Comprehensive Asset Discovery with Sentra

2. Configurable Scan Management

Administrators can fine-tune how scans are executed to meet their organization’s needs. With flexible configuration options - such as the number of scanners, sampling rates, and prioritization rules - teams can strike the right balance between scan speed, coverage, and cost control.

For instance, compliance-critical assets can be scanned at full depth immediately, while less critical environments can run at reduced sampling to save on compute consumption and costs.
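As a sketch of what such a configuration might look like, expressed as a simple Python structure (the option names are illustrative assumptions, not Sentra’s actual settings):

```python
from dataclasses import dataclass

@dataclass
class ScanPolicy:
    """Illustrative per-environment scan settings."""
    scanners: int          # parallel scanner instances
    sampling_rate: float   # fraction of each asset sampled (1.0 = full depth)
    priority: int          # lower number scans first

scan_config = {
    "prod-pii-buckets": ScanPolicy(scanners=8, sampling_rate=1.0, priority=1),  # compliance-critical
    "dev-sandboxes":    ScanPolicy(scanners=2, sampling_rate=0.1, priority=3),  # sampled to cut cost
}
```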

3. Real-Time Scan Dashboard

Sentra’s unified Scan Dashboard provides a cockpit view into scanning operations, so teams always know where they stand. Key features include:

  • Daily scan throughput correlated with the number of active scanners, helping teams understand efficiency and predict completion times.
  • Coverage tracking that visualizes overall progress and highlights which assets remain unscanned.
  • Decision-making tools that allow teams to dynamically adjust, whether by adding scanner capacity, changing sampling rates, or reordering priorities when new high-risk assets appear.
Real-Time Scan Dashboard with Sentra

Handling Data Changes

The challenge doesn’t end once the initial scans are complete. Data is dynamic: new files are added daily, existing records are updated, and sensitive information shifts locations. Sentra’s activity feeds give teams the visibility they need to understand how their data landscape is evolving and adapt their data security strategies in real time.
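A common way to track such changes is incremental listing: only objects modified since the last scan checkpoint get rescanned. A minimal sketch against S3 with boto3 (the bucket name and checkpoint are hypothetical):

```python
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")
last_scan = datetime(2025, 11, 1, tzinfo=timezone.utc)  # hypothetical checkpoint

# Collect only objects modified since the last scan checkpoint.
changed = []
for page in s3.get_paginator("list_objects_v2").paginate(Bucket="finance-reports"):
    for obj in page.get("Contents", []):
        if obj["LastModified"] > last_scan:
            changed.append(obj["Key"])

print(f"{len(changed)} objects need rescanning")
```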


Conclusion

Tracking scan status at scale is complex but critical to any data security strategy. Sentra provides an end-to-end view and unmatched scan control, helping organizations move from uncertainty to confidence with clear prediction of scan timelines, faster troubleshooting, audit-ready compliance, and smarter, cost-efficient decisions for securing data.


Ward Balcerzak
November 12, 2025
4 Min Read
Data Security

Best DSPM Tools: Top 9 Vendors Compared

Enhanced DSPM Adoption Is the Most Important Data Security Trend of 2026

Over the past few years, organizations have realized that traditional security tools can’t keep pace with how data moves and grows today. Sensitive data now sprawls across multi-cloud environments, SaaS platforms, and AI systems, often without full visibility for the teams responsible for securing it. Unstructured data presents the greatest risk, representing over 80% of corporate data.

That’s why Data Security Posture Management (DSPM) has become a critical part of the modern security stack. DSPM tools help organizations automatically discover, classify, monitor, and protect sensitive data - no matter where it lives or travels.

But in 2026, the data security game is changing. Many DSPMs can tell you what your data is, but more is needed. Leading DSPM platforms are going beyond visibility: they’re delivering real-time, AI-enhanced contextual business insights, automated remediation, and accurate, AI-aware protection that scales with your dynamic data.

AI-enhanced DSPM Capabilities in 2026

Not all DSPM tools are built the same. The top platforms share a few key traits that define the next generation of data security posture management:

  • Continuous discovery and classification at scale: Real-time visibility into all sensitive data across cloud, SaaS, and on-prem systems, with petabyte-scale efficiency that allows scanning frequency commensurate with business risk.
  • Contextual risk analysis: Understanding what data is sensitive, who can access it, how it’s being used, and the business context around it, so that appropriate actions can be taken.
  • Automated remediation: Native capabilities and integrations with other systems that correct risky configurations or excessive access automatically.
  • Integration and scalability: Seamless connections to CSPM, SIEM, IAM, ITSM, and SOAR tools to unify data risk management and streamline workflows.
  • AI and model governance: Capabilities to secure data used in GenAI agents, copilot assistants, and pipelines.

Top DSPM Tools to Watch in 2026

Based on recent analyst coverage, market growth, and innovation across the industry, here are the top DSPM platforms to watch this year, each contributing to how data security is evolving.

1. Sentra

As a cloud-native DSPM platform, Sentra focuses on continuous data protection, not just visibility. It discovers and accurately classifies sensitive data in real time across all cloud environments, while automatically remediating risks through policy-driven automation.

What sets Sentra apart:

  • Continuous, automated discovery and classification across your entire data estate - cloud, SaaS, and on-premises.
  • Business-contextual insights that understand the purpose of data, accurately linking data, identity, and risk.
  • Automatic learning that discerns customer-unique data types and continuously improves labeling over time.
  • Petabyte-scale operation with low compute consumption for 10X cost efficiency.
  • Automated remediation workflows and integrations to fix issues instantly.
  • Built-in coverage for data flowing through AI and SaaS ecosystems.

Ideal for: Security teams looking for a cloud-native DSPM platform built for scalability in the AI era with automation at its core.

2. BigID

A pioneer in data discovery and classification, BigID bridges DSPM and privacy governance, making it a good choice for compliance-heavy sectors.


Ideal for: Organizations prioritizing data privacy, governance, and audit readiness.

3. Prisma Cloud (Palo Alto Networks)

Prisma’s DSPM offering integrates closely with CSPM and CNAPP components, giving security teams a single pane of glass for infrastructure and data risk.


Ideal for: Enterprises with hybrid or multi-cloud infrastructures already using Palo Alto tools.

4. Microsoft Purview / Defender DSPM

Microsoft continues to invest heavily in DSPM through Purview, offering rich integration with Microsoft 365 and Azure ecosystems. Note: Sentra integrates with Microsoft Purview Information Protection (MPIP) labeling and DLP policies.

Ideal for: Microsoft-centric organizations seeking native data visibility and compliance automation.

5. Securiti.ai

Positioned as a “Data Command Center,” Securiti unifies DSPM, privacy, and governance. Its strength lies in automation, compliance visibility, and SaaS coverage.


Ideal for: Enterprises looking for an all-in-one governance and DSPM solution.

6. Cyera

Cyera has gained attention for serving the SMB segment with its DSPM approach. It uses LLMs for data context, supplementing other classification methods, and provides integrations to IAM and other workflow tools.


Ideal for: Small/medium growing companies that need basic DSPM functionality.

7. Wiz

Wiz continues to lead in cloud security, having added DSPM capabilities to its CNAPP platform. It is known for deep multi-cloud visibility and infrastructure misconfiguration detection.

Ideal for: Enterprises running complex cloud environments looking for infrastructure vulnerability and misconfiguration management.

8. Varonis

Varonis remains a strong player for hybrid and on-prem data security, with deep expertise in permissions and access analytics and a focus on SaaS and unstructured data.


Ideal for: Enterprises with legacy file systems or mixed cloud/on-prem architectures.

9. Netwrix

Netwrix’s platform incorporates DSPM-related features into its auditing and access control suite.

Ideal for: Mid-sized organizations seeking DSPM as part of a broader compliance solution.

Emerging DSPM Trends to Watch in 2026

  1. AI Data Security: As enterprises adopt GenAI, DSPM tools are evolving to secure data used in training and inference.

  2. Identity-Centric Risk: Understanding and controlling both human and machine identities is now central to data posture.

  3. Automation-Driven Security: Remediation workflows are becoming the differentiator between “good” and “great.”

  4. Market Consolidation: Expect to see CNAPP, legacy security, and cloud vendors acquiring DSPM startups to strengthen their coverage.

How to Choose the Right DSPM Tool

When evaluating a DSPM solution, align your choice with your data landscape and goals:

  • Cloud-Native Company: Choose tools designed for cloud-first environments (like Sentra, Securiti, Wiz).
  • Compliance Priority: Platforms like Sentra, BigID, or Securiti excel in privacy and governance.
  • Microsoft-Heavy Stack: Purview and Sentra DSPM offer native integration.
  • Hybrid Environment: Consider Varonis, Prisma Cloud, or Sentra for extended visibility.
  • Enterprise Scalability: Evaluate deployment ease, petabyte scalability, cloud resource consumption, scanning efficiency, etc. (Sentra excels here.)

Pro Tip: Run a proof of concept (POC) across multiple environments to test scalability, accuracy, and operational cost-effectiveness before full deployment.

Final Thoughts: DSPM Is About Action

The best DSPM tools in 2026 share one core principle: they help organizations move from visibility to action.

At Sentra, we believe that the future of DSPM lies in continuous, automated data protection:

  • Real-time discovery of sensitive data at scale
  • Context-aware prioritization for business insight
  • Automated remediation that reduces risk instantly

As data continues to power AI, analytics, and innovation, DSPM ensures that innovation never comes at the cost of security. See how Sentra helps leading enterprises protect data across multi-cloud and SaaS environments.


Gilad Golani
November 6, 2025
4 Min Read

How SLMs (Small Language Models) Make Sentra’s AI Faster and More Accurate

The LLM Hype, and What’s Missing

Over the past few years, large language models (LLMs) have dominated the AI conversation. From writing essays to generating code, LLMs like GPT-4 and Claude have proven that massive models can produce human-like language and reasoning at scale.

But here's the catch: not every task needs a 70-billion-parameter model. Parameters are computationally expensive - they require both memory and processing time.

At Sentra, we discovered early on that the work our customers rely on - accurate, scalable classification of massive data flows - isn’t about writing essays or generating text. It’s about making decisions fast, reliably, and cost-effectively across dynamic, real-world data environments. While LLMs are excellent at solving general problems, they create a lot of unnecessary computational overhead for this kind of task.

That’s why we’ve shifted our focus toward Small Language Models (SLMs) - compact, specialized models purpose-built for a single task: understanding and classifying data efficiently. By running hundreds of SLMs in parallel on regular CPUs, Sentra delivers faster insights, stronger data privacy, and a dramatically lower total cost of AI-based classification - one that scales with our customers’ business, not their cloud bill.

What Is an SLM?

An SLM is a smaller, domain-specific version of a language model. Instead of trying to understand and generate any kind of text, an SLM is trained to excel at a particular task, such as identifying the topic or type of a document, or detecting sensitive entities within documents, such as passwords, Social Security numbers, or other forms of PII.

In other words: if an LLM is a generalist, an SLM is a specialist. At Sentra, we use SLMs that are tuned and optimized for security data classification, allowing them to process high volumes of content with remarkable speed, consistency, and precision. These SLMs are based on standard open-source models but trained on data curated by Sentra, to achieve the level of accuracy that only Sentra can guarantee.

From LLMs to SLMs: A Strategic Evolution

Like many in the industry, we started by testing LLMs to see how well they could classify and label data. They were powerful, but also slow, expensive, and difficult to scale. Over time, it became clear that LLMs are too big and too expensive to run on customer data for Sentra to remain a viable, cost-effective solution for data classification.

Instead, we broke the problem into a pipeline of specialized models. Each SLM handles a focused part of the process: initial categorization, text extraction from documents and images, and sensitive entity classification. The SLMs are not only accurate - even more accurate than LLMs classifying via prompts - they also run efficiently on standard CPUs, inside the customer’s environment, as part of Sentra’s scanners.

The Benefits of SLMs for Customers

a. Speed and Efficiency

SLMs process data faster because they’re lean by design. They don’t waste cycles generating full sentences or reasoning across irrelevant contexts. This means real-time or near-real-time classification, even across millions of data points.

b. Accuracy and Adaptability

SLMs are pre-trained “zero-shot” language models that can categorize and classify generically, without training on a specific task in advance. That is what “zero-shot” means: regardless of the data the model was trained on, it can classify an arbitrary set of entities and document labels without being trained on each one specifically. This is possible because modern language models capture deep natural-language understanding at the training stage.
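To make “zero-shot” concrete, here is a minimal example using the open-source Hugging Face transformers library with a generic public model - not Sentra’s tuned SLMs - where the candidate labels are chosen at inference time:

```python
from transformers import pipeline

# Generic zero-shot classifier: the label set is chosen at inference time,
# with no task-specific training beforehand.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "Wire transfer of $25,000 approved for account 4432-19.",
    candidate_labels=["financial record", "medical record", "source code"],
)
print(result["labels"][0])  # highest-scoring label, e.g., "financial record"
```

Sentra’s production models are fine-tuned well beyond this generic baseline, as described next.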

Even so, Sentra fine-tunes these models to further increase classification accuracy, curating a very large set of tagged data that resembles the data our customers typically encounter.

Our feedback loops ensure that model performance only gets better over time - a direct reflection of our customers’ evolving environments.

c. Cost and Sustainability

Because SLMs are compact, they require less compute power, which means lower operational costs and a smaller carbon footprint. This efficiency allows us to deliver powerful AI capabilities to customers without passing on the heavy infrastructure costs of running massive models.

d. Security and Control

Unlike LLMs hosted on external APIs, SLMs can be run within Sentra’s secure environment, preserving data privacy and regulatory compliance. Customers maintain full control over their sensitive information - a critical requirement in enterprise data security.

A Quick Comparison: SLMs vs. LLMs

The difference between SLMs and LLMs becomes clear when you look at their performance across key dimensions:

  • Speed: SLMs are fast and optimized for classification throughput; LLMs are slower and more compute-intensive for large-scale inference.
  • Cost: SLMs are cost-efficient; LLMs are expensive to run at scale.
  • Accuracy (for simple tasks): SLMs are optimized for classification; LLMs are comparable but carry unnecessary overhead.
  • Deployment: SLMs are lightweight and easy to integrate; LLMs are complex and resource-heavy.
  • Adaptability (with feedback): SLMs can be continuously fine-tuned, even per customer; LLMs are harder to customize, and fine-tuning is costly.
  • Best use case: SLMs excel at classification, tagging, and filtering; LLMs at reasoning, analysis, generation, and synthesis.

Continuous Learning: How Sentra’s SLMs Grow

One of the most powerful aspects of our SLM approach is continuous learning. Each Sentra customer project contributes valuable insights, from new data patterns to evolving classification needs. These learnings feed back into our training workflows, helping us refine and expand our models over time.

While not every model retrains automatically, the system is built to support iterative optimization: as our team analyzes feedback and performance, models can be fine-tuned or extended to handle new categories and contexts.

The result is an adaptive ecosystem of SLMs that becomes more effective as our customer base and data diversity grow, ensuring Sentra’s AI remains aligned with real-world use cases.

Sentra’s Multi-SLM Architecture

Sentra’s scanning technology doesn’t rely on a single model. We run many SLMs in parallel, each specializing in a distinct layer of classification (a minimal orchestration sketch follows the list):

  1. Embedding models that convert data into meaningful vector representations
  2. Entity Classification models that label sensitive entities
  3. Document Classification models that label documents by type
  4. Image-to-text and speech-to-text models that convert non-textual data into text
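A minimal sketch of how such layers might be orchestrated in parallel on ordinary CPUs; the model functions here are illustrative stand-ins, not Sentra’s actual pipeline:

```python
from concurrent.futures import ProcessPoolExecutor

# Illustrative stand-ins for the specialized layers above - not Sentra's models.
def embed(doc: str) -> list[float]:
    return [0.1, 0.2]                # embedding model: text -> vector

def classify_entities(doc: str) -> list[str]:
    return ["password"]              # entity model: sensitive entities found

def classify_document(doc: str) -> str:
    return "invoice"                 # document model: type label

def scan(doc: str) -> dict:
    """Run every specialized layer over one document."""
    return {
        "embedding": embed(doc),
        "entities": classify_entities(doc),
        "doc_type": classify_document(doc),
    }

if __name__ == "__main__":
    docs = ["Invoice #42 ...", "Patient intake form ..."]
    # One worker per CPU core: many small models scale out on commodity hardware.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(scan, docs))
    print(results[0]["doc_type"])    # "invoice"
```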

This layered approach allows us to operate at scale - quickly, cheaply, and with great results. In practice, that means faster insights, fewer errors, and a more responsive platform for every customer.

The Future of AI Is Specialized

We believe the next frontier of AI isn’t about who can build the biggest model; it’s about who can build the most efficient, adaptive, and secure ones.

By embracing SLMs, Sentra is pioneering a future where AI systems are purpose-built, transparent, and sustainable. Our approach aligns with a broader industry shift toward task-optimized intelligence - models that do one thing extremely well and can learn continuously over time.

Conclusion: The Power of Small

At Sentra, we’ve learned that in AI, bigger isn’t always better. Our commitment to SLMs reflects our belief that efficiency, adaptability, and precision matter most for customers. By running thousands of small, smart models rather than a single massive one, we’re able to classify data faster, cheaper, and with greater accuracy - all while ensuring customer privacy and control.

In short: Sentra’s SLMs represent the power of small, and the future of intelligent classification.

