
Why Legacy Data Classification Tools Don't Work Well in the Cloud (But DSPM Does)

September 7, 2023
5 Min Read
Data Security

Data security teams are always trying to understand where their sensitive data is. Yet this goal has remained out of reach for a number of reasons.

The main difficulty is creating a continuously updated data catalog of all production and cloud data. Creating this catalog would involve:

  1. Identifying everyone in the organization who knows of a data store and has visibility into its contents
  2. Connecting a data classification tool to each of these data stores
  3. Ensuring network connectivity by configuring network and security policies
  4. Confirming that business-critical production systems using each data source won't suffer damage to performance or availability

A process this complex requires a major investment of resources and long workflows, and it will still likely fall short of the full coverage organizations are looking for. Many seemingly successful implementations prove unreliable and too difficult to maintain within a short period of time.

Another pain point with legacy data classification solutions is accuracy. Data security professionals are all too aware of the problem of false positives (i.e., wrong classifications and data findings) and false negatives (i.e., sensitive data that goes unclassified and remains unknown). There are two main reasons for this.

  • Legacy classification solutions rely solely on patterns, such as regular expressions, to identify sensitive data, which falls short for both structured and unstructured data.
  • These solutions don’t understand the business context around the data, such as how it is being used, by whom, for what purposes and more.

Without this business context, security teams can't derive actionable steps to remove sensitive data or protect it against data risks and security breaches.

Lastly, there are the high operational costs. Legacy data classification solutions were not built for the cloud, where each data read/write and network operation has a price tag.

The cloud also offers much more cost-efficient data storage and advanced data services, which lead organizations to store far more data than they did before moving to the cloud. At the same time, public cloud providers offer a variety of cloud-native APIs and mechanisms that can greatly benefit a data classification and security solution: automated backups, cross-account federation, direct access to block storage, storage classes, compute instance types, and much more. Legacy data classification tools, which were not built for the cloud, ignore these capabilities entirely, making them an extremely expensive option for cloud-native organizations.

DSPM: Built to Solve Data Classification in the Cloud 

These challenges have led to the growth of a new approach to securing cloud data: Data Security Posture Management, or DSPM. Sentra's DSPM provides full coverage and an up-to-date data catalog with classification of sensitive data, without any complex deployment or operational work. This is achieved through a cloud-native, agentless architecture that uses cloud-native APIs and mechanisms.

A good example of this approach is how Sentra's DSPM architecture leverages the public cloud mechanism of automated backups for compute instances, block storage, and more. This allows Sentra to securely run its robust discovery and classification technology from within the customer's own environment, in any VPC or subscription/account of the customer's choice.

This offers a number of benefits:

  1. The organization does not need to change any existing infrastructure configuration, network policies, or security groups.
  2. There's no need to provide individual credentials for each data source in order for Sentra to discover and scan it.
  3. There is never a performance impact on compute-bound production workloads, such as virtual machines. In fact, Sentra's scanning never connects to those data stores via the network or application layers.

Another benefit of a DSPM built for the cloud is classification accuracy. Sentra's DSPM provides an unprecedented level of accuracy thanks to modern, cloud-native capabilities. This starts with advanced statistical relevance for structured data, enabling our classification engine to determine with high confidence that sensitive data is present in a specific column or field, without scanning every row of a large table.
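As a rough illustration of the statistical idea (this is not Sentra's actual algorithm; the pattern, sample size, and threshold below are hypothetical), a classifier can sample values from a column and flag it as sensitive once the match rate clears a confidence threshold, rather than reading every row:

```python
import random
import re

# Hypothetical detector: a simple US SSN-style pattern for illustration.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def column_contains_pii(values, sample_size=100, threshold=0.8, seed=0):
    """Sample up to `sample_size` values from the column and flag it as
    PII if the fraction of pattern matches meets `threshold`."""
    rng = random.Random(seed)
    sample = rng.sample(values, min(sample_size, len(values)))
    hits = sum(1 for v in sample if SSN_PATTERN.match(v))
    return hits / len(sample) >= threshold

ssn_column = [f"{i % 1000:03d}-45-6789" for i in range(10_000)]
name_column = ["Alice", "Bob"] * 5_000

print(column_contains_pii(ssn_column))   # True  (100 sampled, not 10,000 scanned)
print(column_contains_pii(name_column))  # False
```

The sampling-vs-threshold trade-off is the key design choice: a larger sample raises confidence, while the threshold absorbs the occasional malformed value.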

Sentra leverages even more advanced algorithms for key-value stores and document databases. For unstructured data, AI- and LLM-based algorithms unlock tremendous accuracy in detecting sensitive data types by understanding the context within the data itself. Lastly, the combination of data-centric and identity-centric security approaches provides greater context, allowing Sentra's users to know which actions to take to remediate classification-related data risks.

Here are two examples of how we apply this context:

1. Different Types of Databases

Personally Identifiable Information (PII) found in a database to which only users from the Analytics team have access is often a privacy violation and a data risk. On the other hand, PII found in a database that only three production microservices can access is expected, but requires the data to be isolated within a secure VPC.

2. Different Access Histories

Suppose 100 employees have access to a sensitive shadow data lake, but only 10 have actually accessed it in the last year. In this case, the solution would be to reduce permissions and implement stricter access controls. We'd also want to ensure the data has the right retention policy, to reduce both risk and storage costs. Sentra's risk score prioritization engine takes multiple data layers into account, including data access permissions, activity, sensitivity, movement, and misconfigurations, giving enterprises greater visibility and control over their data risk management processes.
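A toy sketch of this kind of prioritization (the weights and factors below are illustrative, not Sentra's scoring model) might combine sensitivity, unused permissions, and misconfigurations into a single score:

```python
# Hypothetical risk-prioritization sketch; weights are invented for illustration.
def risk_score(sensitivity, granted_users, active_users, misconfigurations):
    """Higher scores mean higher remediation priority.
    `sensitivity` is a value in [0, 1]; granted-but-inactive users
    (over-provisioned access) raise the score."""
    unused_ratio = (granted_users - active_users) / max(granted_users, 1)
    return round(
        0.5 * sensitivity                          # how sensitive the data is
        + 0.3 * unused_ratio                       # over-provisioned access
        + 0.2 * min(misconfigurations / 5, 1.0),   # posture issues, capped
        3,
    )

# The shadow data lake above: 100 granted, 10 active, highly sensitive data.
print(risk_score(sensitivity=0.9, granted_users=100, active_users=10,
                 misconfigurations=2))  # 0.8
```

A real engine would also fold in data movement and activity trends, but even this simple form shows why the 100-granted/10-active data lake floats to the top of the queue.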

With regard to costs, Sentra's Data Security Posture Management (DSPM) solution uses innovative features that make its scanning and classification roughly two to three orders of magnitude more cost-efficient than legacy solutions. The first is smart sampling: Sentra clusters data units that share the same characteristics and, using intelligent sampling with statistical relevance, determines what sensitive data exists within these automatically grouped data assets. This is especially powerful for data lakes that often span dozens of petabytes, and it does not compromise coverage or accuracy.
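The grouping half of that idea can be sketched in a few lines (a simplification with invented inputs; Sentra's clustering is far more sophisticated): objects sharing the same schema signature form one cluster, and only a representative sample of each cluster needs to be classified:

```python
from collections import defaultdict

# Hypothetical sketch: cluster objects by schema signature so only one
# representative per cluster needs a full classification scan.
def group_by_signature(objects):
    """objects: iterable of (name, column_names) pairs."""
    groups = defaultdict(list)
    for name, columns in objects:
        groups[tuple(sorted(columns))].append(name)
    return groups

files = [
    ("events/2023/01.parquet", ["user_id", "email", "ts"]),
    ("events/2023/02.parquet", ["user_id", "email", "ts"]),
    ("logs/app.json", ["level", "msg"]),
]
clusters = group_by_signature(files)
print(len(clusters))  # 2 clusters -> 2 representative scans instead of 3
```

In a partitioned data lake where thousands of files share one schema, this collapses thousands of scans into a handful, which is where the order-of-magnitude cost savings come from.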

Second, Sentra's modern architecture leverages ephemeral cloud resources, such as snapshots and ephemeral compute workloads, with cloud-native orchestration that takes advantage of the cloud's elasticity and scale. Sentra balances its resource utilization with the needs of the customer's business, providing advanced scan settings designed for the cloud. This allows teams to optimize cost according to their business needs, such as determining the frequency and sampling of scans, among other advanced features.

To summarize:

  1. Given the current macroeconomic climate, CISOs should view DSPMs like Sentra as an opportunity to increase security while minimizing costs
  2. DSPM solutions like Sentra bring important context-awareness to security teams and tools, allowing them to do better risk management and prioritization by focusing on what's important
  3. Data is likely to remain the most important asset of every business as more organizations embrace the power of the cloud. A DSPM will therefore be a pivotal tool in realizing the true value of data while ensuring it is always secure
  4. Accuracy is key, and AI is an enabler of a good data classification tool

<blogcta-big>

Yair brings a wealth of experience in cybersecurity and data product management. In his previous role, Yair led product management at Microsoft and Datadog. With a background as a member of the IDF's Unit 8200 for five years, he possesses over 18 years of expertise in enterprise software, security, data, and cloud computing. Yair has held senior product management positions at Datadog, Digital Asset, and Microsoft Azure Protection.


Latest Blog Posts

Kristin Grimes
David Stuart
March 5, 2026
3 Min Read

Meet Sentra at RSAC 2026: AI Data Readiness, Continuous Compliance, and Modern DLP in Action

RSAC 2026 is shaping up to be one of the most important RSA Conferences to date, especially for security teams navigating AI adoption, Copilot readiness, and large-scale data governance. At RSA Conference 2026 in San Francisco, Sentra is bringing together security leaders from SoFi, Nestlé, and PennyMac to discuss how modern enterprises are preparing their data for AI, strengthening governance, and rethinking DLP in an AI-driven world.

If you’re attending RSAC 2026, here’s where to find us, and why it matters.

CISO AI Copilot Readiness Roundtables at RSAC 2026

March 23–26 | W Hotel | Steps from Moscone

AI assistants like Microsoft Copilot and Google Gemini are transforming how employees access enterprise data. What used to require manual searching across drives, mailboxes, and SaaS applications can now be surfaced instantly.

That shift is powerful, but it also forces CISOs to confront a difficult question: is our data actually AI-ready?

During RSAC 2026, Sentra is hosting closed-door CISO AI Copilot Readiness Roundtables featuring security leaders from SoFi, Nestlé, and PennyMac. These sessions are intentionally intimate, and designed for candid discussion rather than vendor presentations.

No slides. No marketing decks. Just real-world insights on what's working and what isn't as organizations operationalize AI securely. Register for a Roundtable.

AI Data Readiness for 70+ PB: SoFi at RSA Conference 2026

March 24 | 7:45 AM – 9:00 AM

Preparing data for AI at scale is not theoretical, especially when you’re dealing with more than 70 petabytes.

Join SoFi’s former Director of Product Security, Pritam Mungse, as he shares how SoFi approached AI data readiness using Sentra. The session will explore how large financial institutions can gain visibility into massive data environments, reduce exposure risk, and enable Copilot and ML adoption without compromising governance.

If you’re managing AI adoption in a complex, high-scale environment, this RSAC 2026 session offers practical lessons grounded in real-world execution. Register for the SoFi Session.

Continuous Compliance with AI Visibility: PennyMac at RSAC 2026

March 25 | 12:00 PM – 1:00 PM

For a $500B U.S. mortgage lender like PennyMac, compliance is not a one-time event; it's a continuous obligation.

In this RSA Conference 2026 session, CISO Cyrus Tibbs will share how PennyMac uses Sentra to gain visibility into sensitive data, automate Jira masking workflows, and transform compliance from a reactive burden into a proactive advantage.

As regulatory expectations increase around AI systems and data governance, continuous compliance becomes a strategic capability rather than an audit checkbox. Register for the PennyMac Session.

Nestlé’s Blueprint for Modern DLP Compliance at RSAC 2026

Global enterprises face an even more complex challenge: governing data consistently across Azure, Snowflake, Microsoft 365, and Purview, while planning for AI and Copilot integration. At RSAC 2026, Nestlé’s Dean Rossouw and Manuel Garcia will share how they built a governance framework that integrates large data catalogs with modern DLP controls. The session explores how traditional policy-based DLP can evolve into a model that combines deep data intelligence with enforcement aligned to business context.

For organizations operating across regions and platforms, this blueprint offers a practical path forward. Register for the Nestlé Session.

Visit Sentra at Booth #N4607 at RSA Conference 2026

If you’re walking the floor at RSAC 2026, stop by Booth N4607 to explore how Sentra enables AI-ready data security.

Our team will be showcasing how organizations can:

  • Eliminate risk from AI agents and ML model adoption
  • Discover unknown sensitive data exposures
  • Add AI-powered intelligence to improve DLP precision

Rather than simply layering new policies on top of old systems, we’ll demonstrate how DSPM and DLP can work together in a unified architecture. Book a Demo at Booth N4607.

Executive Briefings at RSAC 2026

For security leaders looking to go deeper, Sentra is offering private briefings during RSA Conference 2026. These sessions provide the opportunity to discuss real-world data security challenges, proven best practices, and lessons learned from enterprise deployments.

Each discussion is tailored to your environment, whether your focus is AI readiness, exposure reduction, or continuous compliance. Schedule a Personal Briefing.

Special Events During RSAC 2026

The Women in Security Documentary

March 24 & 25 | AMC Metreon 16

Just steps from Moscone Center, join us for a special screening celebrating women redefining leadership in cybersecurity. The red carpet begins at 4:00 PM, with the screening starting at 4:45 PM.

Register Now

Sentra + Defensive Networks RSA Dinner

March 25 | 7:00 PM | The Tavern, San Francisco

We’re hosting an intimate, relationship-centered dinner for security leaders navigating today’s most pressing AI and data security challenges. Designed for meaningful dialogue and peer exchange, this event offers space for authentic conversation beyond the conference floor.

Why AI Data Security Defines RSAC 2026

The defining theme of RSA Conference 2026 is clear: AI has changed the security equation. AI systems do not create new data, but they dramatically increase its discoverability, accessibility, and movement. That reality exposes gaps between visibility and enforcement that many organizations have tolerated for years. To secure AI adoption, organizations need more than isolated tools. They need continuous data intelligence, context-aware enforcement, and feedback between the two. That is the architecture Sentra is bringing to RSAC 2026.

See You at RSA Conference 2026

If you’re attending RSAC 2026 in San Francisco, we’d love to connect.

📍 Booth N4607
📅 March 23–26, 2026
📍 Moscone Center

Join us to explore how AI-ready data security becomes practical, measurable, and operational, not just theoretical.

<blogcta-big>

David Stuart
March 4, 2026
4 Min Read

Microsoft Copilot Chat Incident: A Wake-Up Call for AI Assistant Security in Microsoft 365

The recent Microsoft Copilot Chat incident, in which enterprise users reportedly saw AI-generated summaries that included confidential content from Drafts and Sent Items despite sensitivity labels and DLP policies, has reignited a critical conversation about AI assistant security.

Microsoft clarified that Copilot did not bypass underlying access controls. But that explanation only addresses part of the problem. The real issue isn’t whether Microsoft Copilot broke security controls. It's that Copilot inherits user permissions, and can apply its extensive abilities to uncover data the user may have long forgotten (or never properly secured in the first place).

Copilot didn't create new risks; it surfaced existing exposure, instantly, at scale, and in a way that made it visible. For organizations deploying Microsoft Copilot, that distinction matters.

Why the Microsoft Copilot Incident Matters More Than It Appears

Microsoft Copilot operates within the permissions of the signed-in user. On paper, that sounds safe. In reality, it means Copilot can access everything the user can access - across years of accumulated data.

In a typical Microsoft 365 environment, that includes:

  • Emails stretching back years
  • Linked SharePoint Online documents
  • OneDrive folders shared broadly across teams
  • External guest-accessible sites
  • Archived projects no one has reviewed in years

When Copilot summarizes a mailbox, it can follow embedded links into SharePoint and OneDrive. If those linked files contain overshared financials, HR investigations, contracts, or regulated data, Copilot can surface insights from them in seconds.

Previously, this data exposure existed quietly in the background. AI assistants remove friction:

  • No need to manually search multiple systems
  • No need to remember file locations
  • No need to understand organizational silos

A single natural-language prompt can traverse it all.

That is the shift. And that is the risk.

AI Assistants Change the Data Risk Model

Traditional enterprise security assumes that risk is constrained by human effort. Data may technically be accessible, but if it requires time, institutional knowledge, or manual searching, exposure is limited.

AI assistants like Microsoft Copilot eliminate those barriers.

Instead of asking, “Who has access to this file?” organizations must now ask:

What can an AI assistant synthesize from everything a user can access?

This is a fundamentally different security model.

The Microsoft Copilot Chat incident demonstrated that even when sensitivity labels and DLP policies are in place, unexpected AI-generated outputs can undermine confidence. The concern is not only regulatory exposure; it's reputational, operational, and a matter of executive trust in AI initiatives.

Why Sensitivity Labels and DLP Are Not Sufficient for Copilot Security

Many organizations rely on Microsoft Purview, sensitivity labels, and Data Loss Prevention (DLP) policies to control how information is handled in Microsoft 365.

Those tools are essential, but they are not enough on their own.

In real-world environments:

  • Labels are inconsistently applied
  • Legacy data predates modern classification policies
  • SharePoint sites remain broadly accessible long after projects end
  • OneDrive folders accumulate stale and redundant files
  • Linked documents inherit exposure from misconfigured parent sites

AI assistants operate on access reality, not policy intention. If sensitive data is accessible, even unintentionally, Copilot can surface it. The Copilot Chat incident did not reveal a failure of AI. It revealed a failure of data posture alignment.
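To make "access reality" concrete, here is a minimal, hypothetical model (the resource names, groups, and ACLs are invented for illustration): what an assistant can surface is simply the union of everything the user's principals can reach, regardless of what policy intended:

```python
# Hypothetical sketch of "access reality": the assistant's reachable set
# is everything any of the user's identities is permitted to read.
def reachable(user, memberships, acls):
    """acls maps resource -> set of principals allowed to read it."""
    principals = {user} | set(memberships.get(user, ()))
    return {res for res, allowed in acls.items() if principals & allowed}

acls = {
    "sharepoint:finance-2019": {"grp:all-staff"},   # overshared legacy site
    "onedrive:hr-investigation": {"grp:hr"},
    "mail:drafts": {"alice"},
}
memberships = {"alice": ["grp:all-staff"]}

print(sorted(reachable("alice", memberships, acls)))
# ['mail:drafts', 'sharepoint:finance-2019']
```

Alice never asked for the 2019 finance site, but because a broad group grant makes it reachable, an assistant acting as Alice can summarize it. Shrinking this set, not blocking the assistant, is the posture fix.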

Microsoft Copilot Requires AI Data Readiness

Before enabling Copilot broadly across Microsoft 365, organizations need what can be described as AI Data Readiness.

AI Data Readiness means achieving continuous visibility into:

  • Where sensitive data lives
  • How it is shared internally and externally
  • Which SharePoint and OneDrive assets are overshared
  • Whether classification matches actual content
  • What historical data remains unnecessarily accessible

Without this foundation, Copilot becomes a force multiplier for hidden exposure.

With it, Copilot becomes a productivity accelerator.

DSPM: The Missing Layer in Secure Microsoft Copilot Deployment

Data Security Posture Management (DSPM) provides the continuous, data-centric visibility required for secure AI adoption.

Rather than focusing solely on permissions or labels, DSPM answers deeper questions:

  • What sensitive and regulated data exists across Microsoft 365?
  • Where is it exposed?
  • What is its purpose? 
  • Who can access it?
  • How does it move?
  • Is it properly classified and governed?

Sentra’s DSPM-driven approach continuously discovers and classifies sensitive data across SharePoint Online, OneDrive, cloud storage, and SaaS platforms. Using AI-enhanced classification, it differentiates routine collaboration documents from high-risk assets such as HR investigations, financial statements, intellectual property, and regulated PII or PHI.

This creates a unified, context-rich map of enterprise data exposure, the exact context Copilot relies on when generating responses.

From Visibility to Remediation

Once visibility exists, security teams can act with precision.

Instead of broadly restricting Copilot access, which reduces productivity, organizations can surgically reduce risk by:

  • Identifying overexposed SharePoint sites containing sensitive data
  • Detecting OneDrive folders shared with large groups or external guests
  • Removing stale, redundant, and “ghost” data
  • Reconciling missing or misaligned sensitivity labels
  • Aligning MPIP and DLP controls with actual content reality

The result is not AI avoidance. It is controlled AI expansion.

The Strategic Shift: Treat Copilot Security as a Data Problem

The Microsoft Copilot Chat incident should not trigger panic. It should trigger maturity.

AI assistants reflect the state of your data. If your Microsoft 365 environment contains overshared, misclassified, or stale sensitive information, AI will surface it.

Organizations that succeed with Microsoft Copilot will be those that:

  • Audit their Microsoft 365 data exposure continuously
  • Reduce unnecessary access before enabling AI at scale
  • Align labels, policies, and actual content
  • Limit AI blast radius through data posture improvements
  • Treat AI adoption as a data governance transformation

The conversation should move from “Is Copilot safe?” to:

Is our data posture ready for Copilot?

When DSPM underpins AI adoption, Copilot shifts from potential liability to competitive advantage.

Final Thought: AI Assistants Don’t Create Risk - They Reveal It

The Microsoft Copilot incident is not an isolated anomaly. It is an early indicator of how AI assistants will reshape enterprise security assumptions. Copilot can only summarize what users already have access to. If access is overly broad, outdated, or misconfigured, AI will expose that reality faster than any audit ever could.

Organizations that invest in AI Data Readiness today will not only prevent future incidents, they will accelerate secure AI transformation across Microsoft 365.

<blogcta-big>

Nikki Ralston
February 25, 2026
3 Min Read

SOC 2 Without the Spreadsheet Chaos: Automating Evidence for Regulated Data Controls

SOC 2 has become table stakes for cloud-native and SaaS organizations. But for many security and GRC teams, each SOC 2 cycle still feels like starting from scratch: hunting for the latest access reviews, exporting encryption settings from multiple consoles, proving backups and logs exist, per data set, per environment. If your SOC 2 evidence process is still a patchwork of spreadsheets and screenshots, you're not alone. The missing piece is a data-centric view of your controls, especially around regulated data.

Why SOC 2 Evidence Is So Hard in Cloud and SaaS Environments

Under SOC 2, trust service criteria like Security, Availability, and Confidentiality translate into specific expectations around data:

Is sensitive or regulated data discovered and classified consistently?

Are core controls (encryption, backup, access, logging) actually in place where that data lives?

Can you show continuous monitoring instead of point‑in‑time screenshots?

In a typical multi‑cloud/SaaS environment:

  • Sensitive data is scattered across S3, databases, Snowflake, M365/Google Workspace, Salesforce, and more.
  • Different teams own pieces of the puzzle (infra, security, data, app owners).
  • Legacy tools are siloed by layer (CSPM for infra, DLP for traffic, privacy catalog for RoPA).

So when SOC 2 comes around, you spend weeks assembling a story instead of being able to show a trusted, provable compliance posture at the data layer.

The Data‑First Approach to SOC 2 Evidence

Instead of treating SOC 2 as a separate project, leading teams are aligning it with their data security posture management (DSPM) strategy:

  1. Start from the data, not from the infrastructure
  • Build a unified inventory of sensitive and regulated data across IaaS, PaaS, SaaS, and on‑prem.
  • Enrich each store with sensitivity, residency, and business context.

  2. Attach control posture to each data store
  • For each regulated data store, track encryption status, backup configuration, access model, and logging/monitoring coverage as posture attributes.

  3. Generate SOC‑aligned evidence from the same system
  • Use the regulated‑data inventory plus posture engine to produce SOC 2‑friendly reports and CSVs, rather than collecting evidence manually for each audit cycle.

This is exactly the pattern that modern data security platforms like Sentra are implementing.
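As a sketch of what "posture attributes per data store" could look like (the field names and stores below are illustrative, not Sentra's actual schema), each regulated store carries its control state and can be exported directly as audit-friendly evidence:

```python
import csv
import io
from dataclasses import dataclass, asdict

# Hypothetical posture record for a regulated data store; fields invented
# for illustration.
@dataclass
class DataStorePosture:
    name: str
    sensitivity: str
    encrypted_at_rest: bool
    backups_enabled: bool
    access_logging: bool

    def gaps(self):
        """Controls that are missing for this store."""
        return [c for c in ("encrypted_at_rest", "backups_enabled",
                            "access_logging") if not getattr(self, c)]

stores = [
    DataStorePosture("payments-db", "PCI", True, True, True),
    DataStorePosture("hr-bucket", "PII", True, False, True),
]

# Export the inventory as a CSV an auditor can consume directly.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(asdict(stores[0])))
writer.writeheader()
for s in stores:
    writer.writerow(asdict(s))

print([s.name for s in stores if s.gaps()])  # ['hr-bucket']
```

Because the evidence report and the day-to-day posture view read from the same records, each audit cycle becomes a query over existing data rather than a fresh collection project.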

How Sentra Helps Security and GRC Teams Automate SOC 2 Evidence

Sentra sits across your data estate and focuses on regulated data, with capabilities that map directly onto SOC 2 evidence needs:

Comprehensive data‑store discovery and classification
Agentless discovery of data stores (managed and unmanaged) across multi‑cloud and on‑prem, combined with high‑accuracy classification for regulated and business‑critical data.

Data‑centric security posture
For each store, Sentra tracks security properties, including encryption, backup, logging, and access configuration, and surfaces gaps where sensitive data is insufficiently protected.

Framework‑aligned reporting
SOC 2 and other frameworks can be represented as report templates that pull directly from Sentra’s inventory and posture attributes, giving GRC teams “audit‑ready” exports without rebuilding evidence from scratch.

The result: you can prove control over regulated data, for SOC 2 and beyond, with far less manual overhead.

Mapping SOC 2 Criteria to Data‑Level Evidence

Here’s how a data‑first posture shows up in SOC 2:

CC6.x (Logical and Physical Access Controls)

Evidence: Identity‑to‑data mapping showing which users/roles can access which sensitive datasets across cloud and SaaS.

CC7.x (Change Management / Monitoring)

Evidence: Data Detection & Response (DDR) signals and anomaly analytics around access to crown‑jewel data; logs that tie back to sensitive data stores.

CC8.x (Risk Mitigation)

Evidence: Risk‑prioritized view of data stores based on sensitivity and missing controls, plus remediation workflows or automatic labeling/tagging to tighten upstream policies.

When this data‑level view is in place, SOC 2 becomes evidence selection rather than evidence construction.

A Repeatable SOC 2 Playbook for Security, GRC, and Privacy

To operationalize this approach, many teams follow a recurring pattern:

  1. Define a “regulated data perimeter” for SOC 2: Identify which clouds, SaaS platforms, and on‑prem stores contain in‑scope data (PII, PHI, PCI, financial records).

  2. Instrument with DSPM: Deploy a data security platform like Sentra to discover, classify, and map access to that data perimeter.

  3. Connect GRC to the same source of truth: Have GRC and privacy teams pull their SOC 2 evidence from the same inventory and posture views Security uses for day‑to‑day risk management.

  4. Continuously refine controls: Use posture and DDR insights to reduce exposure, close misconfigurations, and improve your next SOC 2 cycle before it starts.

The more you lean on a shared, data‑centric foundation, the easier it becomes to maintain a trusted, provable compliance posture across frameworks, not just SOC 2.

Turning SOC 2 From a Project Into a Capability

Ultimately, the goal is to stop treating SOC 2 as a once-a-year project and start treating it as an ongoing capability embedded into how your organization operates. Security, GRC, and privacy teams should work from a single, unified view of regulated data and controls. Evidence should always be a few clicks away - not the result of a month-long scramble. And every audit should strengthen your data security posture, not distract from it. If you’re still managing compliance in spreadsheets, it’s worth asking what it would take to make your SOC 2 posture something you can prove on demand.

Ready to end the fire drills and move to continuous compliance? Book a Demo 

<blogcta-big>
