How Sentra Built a Data Security Platform for the AI Era

October 21, 2024
5 Min Read
Data Sprawl

In just three years, Sentra has witnessed the rapid evolution of the data security landscape. What began with traditional on-premises Data Loss Prevention (DLP) solutions has shifted to a cloud-native focus with Data Security Posture Management (DSPM). This marked a major leap in how organizations protect their data, but the evolution didn’t stop there.

The next wave introduced new capabilities like Data Detection and Response (DDR) and Data Access Governance (DAG), pushing the boundaries of what DSPM could offer. Now, we’re entering an era where SaaS Security Posture Management (SSPM) and Artificial Intelligence Security Posture Management (AI-SPM) are becoming increasingly important.

 

These shifts are redefining what we’ve traditionally called Data Security Platform (DSP) solutions, marking a significant transformation in the industry. The speed of this evolution speaks to the growing complexity of data security needs and the innovation required to meet them.

The Evolution of Data Security

What Is Driving The Evolution of Data Security?

The evolution of the data security market is being driven by several key macro trends:

  • Digital Transformation and Data Democratization: Organizations are increasingly embracing digital transformation, making data more accessible to various teams and users.
  • Rapid Cloud Adoption: Businesses are moving to the cloud at an unprecedented pace to enhance agility and responsiveness.
  • Explosion of Siloed Data Stores: The growing number of siloed data stores, diverse data technologies, and an expanding user base is complicating data management.
  • Increased Innovation Pace: The rise of artificial intelligence (AI) is accelerating the pace of innovation, creating new opportunities and challenges in data security.
  • Resource Shortages: As organizations grow, the need for automation to keep up with increasing demands has never been more critical.
  • Stricter Data Privacy Regulations: Heightened data privacy laws and stricter breach disclosure requirements are adding to the urgency for robust data protection measures.

The roles involved in managing, governing, and protecting data have evolved in a similar way. These roles are increasingly intertwined and interdependent, as described in our recent blog post “Data: The Unifying Force Behind Disparate GRC Functions”. Today, each function operates within its own domain yet shares ownership of the data at its core. As this interdependence grows, so does the need for a unifying platform approach to data security.

Sentra has adapted to these changes to align our messaging with industry expectations, buyer requirements, and product/technology advancements.

A Data Security Platform for the AI Era

Sentra is setting the standard with the leading Data Security Platform for the AI Era.

With its cloud-native design, Sentra seamlessly integrates powerful capabilities like Data Discovery and Classification, Data Security Posture Management (DSPM), Data Access Governance (DAG), and Data Detection and Response (DDR) into a comprehensive solution. This allows our customers to achieve enterprise-scale data protection while addressing critical questions about their data.

Figure: The data security cycle (visibility, context, access, risks, threats).

What sets Sentra apart is its connector-less, cloud-native architecture, which effortlessly scales to accommodate multi-petabyte, multi-cloud environments without the administrative burdens typical of connector-based legacy systems. These more labor-intensive approaches often struggle to keep pace and frequently overlook shadow data.

Moreover, Sentra harnesses the power of AI and machine learning to accurately interpret data context and classify data. This not only enhances data security but also ensures the privacy and integrity of data used in GenAI applications. We recognized the critical need for accurate and automated Data Discovery and Classification, along with Data Security Posture Management (DSPM), to address the risks associated with data proliferation in a multi-cloud landscape. Based on our customers' evolving needs, we expanded our capabilities to include DAG and DDR. These tools are essential for managing data access, detecting emerging threats, and improving risk mitigation and data loss prevention.

DAG maps the relationships between cloud identities, roles, permissions, data stores, and sensitive data classes. This provides a complete view of which identities and data stores in the cloud may be overprivileged. Meanwhile, DDR offers continuous threat monitoring for suspicious data access activity, providing early warnings of potential breaches.
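As a rough illustration of the mapping described above, the sketch below resolves identities to roles to data stores and flags sensitive stores an identity can reach but has not been observed using. Everything here is hypothetical: the dictionaries, the sample identities, and the `find_overprivileged` function are invented for illustration and are not Sentra's data model or API.

```python
from collections import defaultdict

# Hypothetical sample data: roles held by each identity, stores each role
# can read, and the sensitive data classes found in each store.
IDENTITY_ROLES = {
    "alice": {"analyst"},
    "ci-bot": {"deployer", "analyst"},
}
ROLE_GRANTS = {
    "analyst": {"s3://sales-reports", "s3://customer-exports"},
    "deployer": {"s3://build-artifacts"},
}
STORE_CLASSES = {
    "s3://sales-reports": {"financial"},
    "s3://customer-exports": {"pii", "financial"},
    "s3://build-artifacts": set(),
}
# Stores each identity was actually seen accessing (e.g. from audit logs).
OBSERVED_ACCESS = {
    "alice": {"s3://sales-reports", "s3://customer-exports"},
    "ci-bot": {"s3://build-artifacts"},
}


def reachable_stores(identity):
    """Resolve identity -> roles -> data stores."""
    stores = set()
    for role in IDENTITY_ROLES.get(identity, set()):
        stores |= ROLE_GRANTS.get(role, set())
    return stores


def find_overprivileged():
    """Return {identity: {store: classes}} for sensitive stores an identity
    can reach but has not been observed using."""
    findings = defaultdict(dict)
    for identity in IDENTITY_ROLES:
        unused = reachable_stores(identity) - OBSERVED_ACCESS.get(identity, set())
        for store in sorted(unused):
            classes = STORE_CLASSES.get(store, set())
            if classes:  # only report stores that hold sensitive data
                findings[identity][store] = classes
    return dict(findings)


if __name__ == "__main__":
    for identity, stores in find_overprivileged().items():
        for store, classes in stores.items():
            print(f"{identity} has unused access to {store} ({sorted(classes)})")
```

In this toy data set, only `ci-bot` is reported, because it inherits analyst access to two sensitive stores it never touches. A real access graph would also need to account for groups, resource policies, and cross-account trust, which this sketch ignores.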

We also expanded support to SaaS data repositories, including Microsoft 365 (SharePoint, OneDrive, Teams, etc.) and Google Workspace, formerly G Suite (Google Drive), and leveraged AI/ML to accurately classify data hidden within unstructured data stores.

Sentra’s accurate data sensitivity tagging and granular contextual details allow organizations to enhance the effectiveness of their existing tools, streamline workflows, and automate remediation processes. Additionally, Sentra offers pre-built integrations with various analysis and response tools used across the enterprise, including data catalogs, incident response (IR) platforms, IT service management (ITSM) systems, DLPs, CSPMs, CNAPPs, IAM, and compliance management solutions.

How Sentra Redefines Enterprise Data Security Across Clouds

Sentra has architected a solution that can deliver enterprise-scale data security without the traditional constraints and administrative headaches. Sentra’s cloud-native design easily scales to petabyte data volumes across multi-cloud and on-premises environments. 

The Sentra platform incorporates a few major differentiators that distinguish it from other solutions, including:

  • Novel Scanning Technology: Sentra uses inventory files and advanced automatic grouping to create a new entity called a “Data Asset”: a group of files that share the same structure, security posture, and business function. Sentra continuously reduces billions of files to thousands of data assets, each representing a different type of data. This enables 100% coverage of petabytes of cloud data while cutting the number of files that need to be scanned to just several hundred thousand (5-6 orders of magnitude less scanning). Because no random sampling is involved, all types of data are fully scanned, with differential scans run daily. Sentra supports all leading IaaS, PaaS, SaaS, and on-premises stores.
  • AI-powered Autonomous Classification: Sentra’s AI-powered classification provides approximately 97% classification accuracy for data within unstructured documents and structured data. Additionally, Sentra provides rich data context (distinct from data class or type) about multiple aspects of files, such as data subject residency, business impact, synthetic or real data, and more. Further, Sentra’s classification uses LLMs (inside the customer environment) to automatically learn and adapt based on the unique business context and on user feedback about false positives. It also allows users to add AI-based classifiers using natural language (powered by LLMs). This autonomous learning means users don’t have to customize the system themselves, saving time and helping to keep pace with dynamic data.
  • Data Perimeters / Movement: Sentra DataTreks™ provides the ability to understand data perimeters automatically and detect when data is moving (e.g. copied partially or fully) to a different perimeter. For example, it can detect data similarity/movement from a well-protected production environment to a less-protected development environment. This is important for highly dynamic cloud environments and for promoting secure data democratization.
  • Data Detection and Response (DDR): Sentra’s DDR module highlights anomalies such as unauthorized data access or unusual data movements in near real-time, integrating alerts into existing tools like ServiceNow or JIRA for quick mitigation.
  • Easy Customization: In addition to ‘learning’ a customer's unique data types, Sentra makes it easy to create new classifiers, modify policies, and apply custom tagging labels.
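To make the grouping idea in the first differentiator more concrete, here is a minimal sketch that buckets the keys from a storage inventory by a coarse signature, so that only one file per bucket needs a deep scan. The signature rule (digits normalized to `#`, extension kept) and the function names are assumptions made for illustration; the article does not describe how Sentra actually derives a Data Asset.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath


def asset_signature(key):
    """Collapse an object key into a coarse signature. Runs of digits become
    '#', so date- or ID-partitioned files land in the same group."""
    path = PurePosixPath(key)
    normalized_dir = re.sub(r"\d+", "#", str(path.parent))
    normalized_name = re.sub(r"\d+", "#", path.stem)
    return f"{normalized_dir}/{normalized_name}{path.suffix.lower()}"


def group_into_assets(inventory):
    """Group inventory keys by signature; returns {signature: [keys]}."""
    assets = defaultdict(list)
    for key in inventory:
        assets[asset_signature(key)].append(key)
    return dict(assets)


def files_to_scan(inventory):
    """Pick one deterministic representative per group (no random sampling)."""
    return sorted(min(keys) for keys in group_into_assets(inventory).values())


if __name__ == "__main__":
    inventory = [
        "logs/2024/01/events-001.json",
        "logs/2024/02/events-002.json",
        "exports/customers.csv",
    ]
    print(files_to_scan(inventory))
```

Path-based grouping is the simplest possible heuristic. Grouping on structure, security posture, and business function, as the bullet describes, would need schema and permission metadata in addition to names.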

As AI reshapes the digital landscape, it also creates new vulnerabilities, such as the risk of data exposure through AI training processes. The Sentra platform addresses these AI-specific challenges, while continuing to tackle the persistent security issues from the cloud era, providing an integrated solution that ensures data security remains resilient and adaptive.

Use Cases: Solving Complex Problems with Unique Solutions

Sentra’s unique capabilities allow it to serve a broad spectrum of challenging data security, governance and compliance use cases. Two frequently cited DSPM use cases are preventing data breaches and facilitating GenAI technology deployments. With the addition of data privacy compliance, these represent the top three.  

Let's dive deeper into how Sentra's platform addresses specific challenges:

Data Risk Visibility

Sentra’s Data Security Platform enables continuous analysis of your security posture and automates risk assessments across your entire data landscape. It identifies data vulnerabilities across cloud-native and unmanaged databases, data lakes, and metadata catalogs. By automating the discovery and classification of sensitive data, teams can prioritize actions based on the sensitivity and policy guidelines related to each asset. This automation not only saves time but also enhances accuracy, especially when leveraging large language models (LLMs) for detailed data classification.

Security and Compliance Audit

Sentra’s Data Security Platform automates the identification of regulatory violations and ensures adherence to both custom and pre-built policies, including policies that map to common compliance frameworks. It helps keep sensitive data in the right environments, preventing it from moving to locations that violate retention policies or lack encryption. Unlike manual policy implementation, which is prone to errors, Sentra’s automated approach significantly reduces the risk of misconfiguration and helps ensure that teams don’t miss critical activities.
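A policy check of this kind can be pictured as a small rule evaluation over a data store inventory. The sketch below is only an illustration under stated assumptions: the `DataStore` record, the `ALLOWED_REGIONS` policy table, and the region names are hypothetical and do not reflect Sentra's policy format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    name: str
    region: str
    encrypted: bool
    data_classes: frozenset


# Hypothetical policy: regions in which each sensitive class may reside.
ALLOWED_REGIONS = {
    "eu_pii": {"eu-west-1", "eu-central-1"},
    "phi": {"us-east-1"},
}


def find_violations(stores):
    """Return (store name, reason) tuples for every policy breach found."""
    violations = []
    for store in stores:
        sensitive = [c for c in sorted(store.data_classes) if c in ALLOWED_REGIONS]
        if sensitive and not store.encrypted:
            violations.append((store.name, "sensitive data stored unencrypted"))
        for data_class in sensitive:
            if store.region not in ALLOWED_REGIONS[data_class]:
                violations.append(
                    (store.name, f"{data_class} not allowed in {store.region}")
                )
    return violations


if __name__ == "__main__":
    inventory = [
        DataStore("crm-backup", "us-east-1", True, frozenset({"eu_pii"})),
        DataStore("lab-results", "us-east-1", False, frozenset({"phi"})),
    ]
    for name, reason in find_violations(inventory):
        print(f"{name}: {reason}")
```

Expressing the policy as data rather than code is what makes it practical to ship pre-built rule sets for common frameworks alongside custom ones.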

Data Access Governance

Sentra enhances data access governance (DAG) by enforcing appropriate permissions for all users and applications within an organization. By automating the monitoring of access permissions, Sentra mitigates risks such as excessive permissions and unauthorized access. This ensures that teams can maintain least privilege access control, which is essential in a growing data ecosystem.

Minimizing Data and Attack Surface

The platform’s capabilities also extend to detecting unmanaged sensitive data, such as shadow or duplicate assets. By automatically finding and classifying these unknown data points, Sentra minimizes the attack surface, controls data sprawl, and enhances overall data protection.
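One simple way to surface duplicate assets is to group objects by a content hash. The sketch below assumes the object contents are already available as bytes, and the `find_duplicate_content` name is hypothetical. It catches only byte-for-byte copies; partial copies would need a similarity technique that is not shown here.

```python
import hashlib
from collections import defaultdict


def find_duplicate_content(objects):
    """objects: {location: bytes}. Returns groups of locations whose content
    is byte-for-byte identical."""
    by_digest = defaultdict(list)
    for location, content in objects.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(location)
    return [sorted(locations) for locations in by_digest.values() if len(locations) > 1]


if __name__ == "__main__":
    sample = {
        "prod/customers.csv": b"id,email\n1,jane@example.com\n",
        "dev/tmp/customers_copy.csv": b"id,email\n1,jane@example.com\n",
        "dev/readme.txt": b"nothing sensitive here\n",
    }
    for group in find_duplicate_content(sample):
        print("duplicate content:", group)
```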

Secure and Responsible AI

As organizations build new Generative AI applications, Sentra extends its protection to LLM applications, treating them as part of the data attack surface. This proactive management, alongside monitoring of prompts and outputs, addresses data privacy and integrity concerns, ensuring that organizations are prepared for the future of AI technologies.

Insider Risk Management

Sentra effectively detects insider risks by monitoring user access to sensitive information across various platforms. Its Data Detection and Response (DDR) capabilities provide near real-time threat detection, analyzing user activity and audit logs to identify unusual patterns.
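To give a feel for what "unusual patterns" in audit logs can mean, the sketch below compares each user's count of sensitive-object reads today against their own historical baseline. It is a deliberately simple batch heuristic (mean plus a multiple of the standard deviation), not a description of how Sentra's DDR works; the function name, thresholds, and input shapes are all assumptions.

```python
from statistics import mean, pstdev


def flag_unusual_access(history, today, threshold=3.0, min_days=5):
    """history: {user: [daily read counts]}, today: {user: count}.
    Flags users whose count today exceeds mean + threshold * stddev."""
    flagged = []
    for user, count in today.items():
        past = history.get(user, [])
        if len(past) < min_days:
            continue  # not enough baseline to judge this user
        baseline = mean(past)
        spread = pstdev(past) or 1.0  # avoid a zero-width band for flat baselines
        if count > baseline + threshold * spread:
            flagged.append(user)
    return sorted(flagged)


if __name__ == "__main__":
    history = {"alice": [10, 12, 11, 9, 10], "bob": [5, 5, 5, 5, 5]}
    today = {"alice": 11, "bob": 40}
    print(flag_unusual_access(history, today))
```

Users without enough history are skipped rather than flagged, which trades some coverage for fewer false alarms on new accounts.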

Data Loss Prevention (DLP)

The platform integrates seamlessly with endpoint DLP solutions to monitor all access activities related to sensitive data. By detecting unauthorized access attempts from external networks, Sentra can prevent data breaches before they escalate, all while maintaining a positive user experience.

Sentra’s robust Data Security Platform offers solutions for these use cases and more, empowering organizations to navigate the complexities of data security with confidence. With a comprehensive approach that combines visibility, governance, and protection, Sentra helps businesses secure their data effectively in today’s dynamic digital environment.

From DSPM to a Comprehensive Data Security Platform

Sentra has evolved beyond being the leading Data Security Posture Management (DSPM) solution; we are now a Cloud-native Data Security Platform (DSP). Today, we offer holistic solutions that empower organizations to locate, secure, and monitor their data against emerging threats. Our mission is to help businesses move faster and thrive in today’s digital landscape.

What sets the Sentra DSP apart is its unique layer of protection, distinct from traditional infrastructure-dependent solutions. It enables organizations to scale their data protection across ever-expanding multi-cloud environments, meeting enterprise demands while adapting to ever-changing business needs—all without placing undue burdens on the teams managing it.

And we continue to progress. In a world rapidly evolving with advancements in AI, the Sentra Data Security Platform stands as the most comprehensive and effective solution to keep pace with the challenges of the AI age. We are committed to developing our platform to ensure that your data security remains robust and adaptive.

 Sentra's Cloud-Native Data Security Platform provides comprehensive data protection for the entire data estate.

David Stuart is Senior Director of Product Marketing for Sentra, a leading cloud-native data security platform provider, where he is responsible for product and launch planning, content creation, and analyst relations. Dave is a 20+ year security industry veteran having held product and marketing management positions at industry luminary companies such as Symantec, Sourcefire, Cisco, Tenable, and ZeroFox. Dave holds a BSEE/CS from University of Illinois, and an MBA from Northwestern Kellogg Graduate School of Management.


Latest Blog Posts

Ron Reiter
May 1, 2026
3 Min Read

Source Code Secrets Scanning: The Missing Half of Your Cloud Data Security Strategy

Key takeaway: Scanning only your Git repositories for secrets misses the majority of exposures. API keys, credentials, and private keys routinely escape into cloud storage, laptops, and CI pipelines — where no SCM scanner can find them. Comprehensive source code secrets scanning must cover your entire cloud estate, not just version control.

If you look at the root cause of most modern breaches, a depressingly common pattern appears: someone left a secret where it didn’t belong. An API key in a script. A database password in a config file. An SSH private key in a shared folder. We’ve all seen it, and we all know better — but knowing and seeing are two very different things.

Why Repository-Level Scanning Is Not Enough

The uncomfortable reality is that source code secrets scanning is still treated as a repository problem in most organizations. You wire up scanners to GitHub or GitLab, plug something into the CI pipeline, and feel like you’re covered. But that’s not where the real blind spot is.

Code spreads. Secrets spread with it.

Developers clone repos to laptops. They sync whole project directories — including .env files you carefully excluded from version control — to Box, Google Drive, or OneDrive. They drop configuration bundles into S3 for deployment scripts. They zip up “old” services and park them in cold storage “just in case.” None of your branch protection rules or repository‑level scanners apply to those copies anymore.

What Comprehensive Cloud-Wide Secrets Scanning Looks Like

That’s the gap we designed Sentra to close. Our DSPM platform doesn’t limit itself to SCMs; it treats code, configs, and secrets as data spread across your cloud estate. We natively support 600+ source file extensions across mainstream and niche languages — Python, JavaScript/TypeScript, Java, Go, C/C++, C#, Rust, Ruby, PHP, Swift, Kotlin, Scala, R, MATLAB, and hundreds more — because secrets don’t care what language you wrote them in.  We read those files with smart encoding detection and process them entirely in memory so scanning doesn’t create new copies of the very content you’re trying to protect.

We also go after the places secrets are supposed to live and still end up exposed. Environment files like .env, .prod, .dev, .qa are intentionally dense collections of connection strings, API keys, OAuth tokens, and cloud credentials.  They’re also routinely copied into CI buckets, checked into repos “temporarily,” synced from laptops to personal cloud storage, and left behind in old deployment folders. Sentra parses these as structured key–value stores and treats every value as a potential secret, not just as generic text.
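As a sketch of what treating an environment file as a key-value store might look like, the code below parses `KEY=VALUE` lines and flags entries whose name hints at a secret or whose value looks random. The hint list, the entropy threshold, and the function names are assumptions chosen for illustration, not Sentra's detection logic.

```python
import math
from collections import Counter

# Hypothetical name fragments that suggest a value is a credential.
SENSITIVE_HINTS = ("KEY", "SECRET", "TOKEN", "PASSWORD", "PASSWD", "CREDENTIAL")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip().strip("'\"")
    return pairs


def shannon_entropy(value):
    """Bits per character; random tokens score higher than ordinary words."""
    if not value:
        return 0.0
    counts = Counter(value)
    return -sum((n / len(value)) * math.log2(n / len(value)) for n in counts.values())


def candidate_secrets(text, min_entropy=3.5):
    """Return keys whose name hints at a secret or whose value looks random."""
    hits = []
    for key, value in parse_env(text).items():
        if not value:
            continue
        named = any(hint in key.upper() for hint in SENSITIVE_HINTS)
        random_looking = len(value) >= 16 and shannon_entropy(value) >= min_entropy
        if named or random_looking:
            hits.append(key)
    return hits


if __name__ == "__main__":
    sample = "DEBUG=true\nexport DB_PASSWORD='hunter2'\nAPP_NAME=billing\n"
    print(candidate_secrets(sample))
```

Checking every value, rather than only those with suspicious names, is what catches secrets stored under innocuous keys.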

On the higher‑impact end of the spectrum, we identify cryptographic keys and certificates — .pem, .ppk, .crt, .id_rsa, Java KeyStores, and more — wherever they show up in your cloud.  A single private key on a shared file system can be the difference between a contained incident and full cluster compromise; pretending those files don’t exist outside your “keys” repo is wishful thinking.
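Private keys stored in PEM form are relatively easy to spot because they carry a standard header line. The following sketch checks text content for the common PEM private-key labels regardless of file extension. The `scan_for_private_keys` helper and its input shape are hypothetical, and binary formats such as Java KeyStores or .ppk files would need separate handling that is not shown.

```python
import re

# Matches PEM headers such as "-----BEGIN RSA PRIVATE KEY-----" or the
# unlabelled PKCS#8 form "-----BEGIN PRIVATE KEY-----".
PRIVATE_KEY_RE = re.compile(
    r"-----BEGIN (?:(?:RSA|EC|DSA|OPENSSH|ENCRYPTED) )?PRIVATE KEY-----"
)


def contains_private_key(text):
    """True if the text includes a PEM private-key header."""
    return bool(PRIVATE_KEY_RE.search(text))


def scan_for_private_keys(files):
    """files: {path: text}. Returns the paths that contain a private key."""
    return sorted(path for path, text in files.items() if contains_private_key(text))


if __name__ == "__main__":
    files = {
        "backup/notes.txt": "-----BEGIN OPENSSH PRIVATE KEY-----\n(truncated)\n",
        "certs/site.crt": "-----BEGIN CERTIFICATE-----\n(truncated)\n",
    }
    print(scan_for_private_keys(files))
```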

We apply the same lens to infrastructure‑as‑code and config files: Terraform (.tf, .hcl), Kubernetes YAML manifests, Helm charts, Dockerfiles, .config, .conf, .ini, .cfg. Those are exactly the artifacts that get copied into S3 for ops, packaged into artifacts, or left in CI logs. They frequently embed credentials, service account tokens, and internal endpoints.

Even “documentation” isn’t off the hook. I’ve lost count of README files with “example” API keys that turned out to be real, markdown runbooks with production connection strings, or onboarding guides that still contain “temporary” passwords issued months ago. Sentra scans these right alongside code, because attackers don’t care whether a secret lives in .py or .md.

And it’s not just secrets. Source trees are full of embedded PII and regulated data: test data seeded with real customer records, SQL seed scripts with actual phone numbers and SSNs, debug dumps committed alongside the code that created them.  Sentra’s classifiers treat this like any other data source and flag those exposures so compliance teams can act.

Secrets Scanning and Compliance: SOC 2, ISO 27001, and Supply Chain Security

Frameworks like SOC 2 and ISO 27001 already expect you to have serious secrets management; supply‑chain security expectations are pushing in the same direction.  But you can’t manage what you can’t see. There’s a huge difference between “we scan our main repos” and “we know where every secret lives across our cloud.” That gap — all the code, configs, and keys that leaked into storage outside of Git — is where real breaches happen.

If you want to see what comprehensive source code secrets scanning looks like when it’s treated as part of data security, not just DevSecOps hygiene, you can request a demo or explore our DSPM overview at sentra.io.

Ron Reiter
May 1, 2026
3 Min Read

Jupyter Notebook Scanning: The Data Science Blind Spot Leaking Your Sensitive Data

Key takeaway: Jupyter notebooks silently embed query results, PII, credentials, and model training data directly into .ipynb files — making them a high-risk, largely invisible data exposure vector that traditional DSPM tools miss entirely.

As a CTO, I love what Jupyter notebooks have done for data science. They made experimentation faster and more accessible. But they also created a data security problem almost nobody in the industry wanted to talk about — and one that most DSPM platforms still don’t address.

Why Jupyter Notebooks Are a Hidden Data Security Risk

A notebook is not just “some JSON.” It’s a living environment where data scientists write code, run queries against production systems, visualize results, and document what they did — all in a single .ipynb file. Crucially, notebooks persist their outputs. Every DataFrame you print, every SQL query you run, every chart you render is embedded back into the notebook and travels with it when you commit to Git, upload to S3, or share it through JupyterHub.

That means a quick “SELECT * FROM customers LIMIT 1000” during an exploration session can turn into a permanent snapshot of real customer data — names, emails, addresses, account IDs — now stored in a file that’s often outside your formal data governance boundary. Multiply that by thousands of notebooks spread across repos and buckets, and you get a very large, largely invisible problem.

Why Traditional Data Security Scanning Misses Notebook Content

Traditional scanning approaches don’t help much here. If you treat notebooks as raw JSON and run regexes over them, you’ll drown in false positives from code syntax and structural noise, while still missing sensitive data rendered as HTML tables, base64‑encoded images, or attachments in cell outputs.  Effective Jupyter notebook scanning for data security has to understand the format and the different kinds of content it holds.

How Sentra Scans Jupyter Notebooks for Sensitive Data

In Sentra, we built a dedicated Jupyter reader that decomposes notebooks into code cells, markdown cells, and outputs, then processes each with the right extraction strategy.  Code cells are analyzed as text so we can detect hard‑coded database credentials, API keys, cloud tokens, and connection strings — all the “just for testing” shortcuts that never got cleaned up.  Markdown cells go through a markdown‑aware reader, because they often contain commentary about datasets, customers, or experiments that’s sensitive in its own right.

Most importantly, we treat cell outputs as a first‑class data source. We scan text and HTML outputs for PII, PHI, and financial data; we decode embedded images and run them through OCR to catch sensitive content in charts and screenshots; and we extract and analyze any attachments sitting inside outputs using the full Sentra parsing stack.  Everything is done in memory, and we support both v3 and v4 notebook formats so legacy notebooks aren’t exempt.
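The decomposition described above can be approximated with the standard library alone. The sketch below splits a v4 notebook into code, markdown, and output text, then looks for email addresses in the outputs. It is an illustration only: the function names and the email regex are assumptions, v3 notebooks (which nest cells under worksheets) are not handled, and image OCR and attachment extraction are omitted.

```python
import json
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def _as_text(source):
    """Notebook text fields are stored as a string or a list of lines."""
    return "".join(source) if isinstance(source, list) else (source or "")


def split_notebook(raw_json):
    """Split a v4 notebook into code, markdown, and output text."""
    notebook = json.loads(raw_json)
    parts = {"code": [], "markdown": [], "outputs": []}
    for cell in notebook.get("cells", []):
        kind = cell.get("cell_type")
        if kind == "markdown":
            parts["markdown"].append(_as_text(cell.get("source")))
        elif kind == "code":
            parts["code"].append(_as_text(cell.get("source")))
            for output in cell.get("outputs", []):
                output_type = output.get("output_type")
                if output_type == "stream":
                    parts["outputs"].append(_as_text(output.get("text")))
                elif output_type in ("execute_result", "display_data"):
                    data = output.get("data", {})
                    for mime in ("text/plain", "text/html"):
                        if mime in data:
                            parts["outputs"].append(_as_text(data[mime]))
    return parts


def emails_in_outputs(raw_json):
    """Return the sorted set of email addresses found in cell outputs."""
    found = set()
    for text in split_notebook(raw_json)["outputs"]:
        found.update(EMAIL_RE.findall(text))
    return sorted(found)
```

Separating outputs from code matters because a regex run over the raw JSON would also match structural noise, while the rendered outputs are where query results actually end up.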

Jupyter Notebooks, AI Governance, and Compliance Risk

This isn’t just a nice‑to‑have. Notebooks are often the only place where you can see which data was used to train a model, how it was accessed, and what transformations were applied. As AI governance and regulations tighten, having a way to systematically scan and catalog notebook content becomes a prerequisite for answering basic questions about your ML pipelines.  From a compliance perspective, notebooks that contain EU customer data and end up in a US‑hosted Git repo can also create data residency problems you’ll never spot without automated discovery.

At the end of the day, the Jupyter notebook problem is a visibility problem. Security teams can’t protect data they can’t see, and notebooks have historically been invisible to DSPM tools.  Our goal with Sentra is to make notebooks as governable as any other data store — so your data scientists don’t have to choose between moving fast and staying compliant. You can see how this fits into our broader AI data readiness story at sentra.io.

Ron Reiter
May 1, 2026
3 Min Read

Email DLP Beyond the Gateway: Why Email Archive Scanning Has to Be Part of Your DSPM

Key takeaway: Gateway DLP only inspects email at send time. MSG, PST, EML, and OST archives — stored on file shares, desktops, and cloud storage — contain years of PII, PHI, and financial data that most DSPM tools never scan. Email archive scanning is a required component of any complete data security posture management strategy.

If you walk into most security teams today and ask how they “protect email,” you’ll hear a familiar story: secure gateway, phishing filters, transport DLP, maybe some sandboxing. All of that matters. But it’s solving the wrong half of the problem.

The real risk is not email in transit. It’s email at rest.

The Email Data Security Gap: What Lives in PST, MSG, and EML Archives

Every organization I’ve worked with has the same pattern: MSG files saved to desktops, PST archives dumped onto file shares, EML files zipped and uploaded to cloud storage. Those archives contain years of attachments, forwarded threads, and exported mailboxes. They also contain some of the densest concentrations of PII, PHI, financial data, and confidential conversations anywhere in the company — and for most data security tools, they’re completely invisible.

Gateway DLP inspects a message once, at send time. It has no idea what happens when that message is saved, exported, forwarded, archived, or bundled into a PST file on someone’s last day at the company.  If your data security posture management (DSPM) strategy doesn’t include deep, format‑aware email archive scanning, you’re blind to where email data actually lives.

How Sentra Scans Email Archives: MSG, EML, PST, and OST

At Sentra, we treat MSG, EML, PST, and OST as composite data stores that deserve the same depth of analysis as a database or a data lake table. Our extraction engine understands Outlook message files, standard RFC 822 emails, and full mailbox data files. We pull out headers, HTML and plain‑text bodies, and every attachment, then recursively follow the chain as far as it goes — attached emails, nested ZIPs, the spreadsheets and PDFs hiding inside those ZIPs, and so on.  All of that processing happens in memory, so we’re not creating new, unmanaged copies of sensitive content while we scan.
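For the EML case, Python's standard `email` and `zipfile` modules are enough to sketch the recursive walk: message parts, attached emails, and files inside ZIP attachments, all processed in memory. The function names below are hypothetical, and MSG, PST, and OST files are out of scope because they require dedicated parsers that the standard library does not provide.

```python
import io
import zipfile
from email import message_from_bytes, policy


def list_contents(raw_bytes, max_zip_depth=3):
    """Return (content_type, name) pairs for every leaf part of an RFC 822
    message, descending into attached emails and ZIP attachments."""
    message = message_from_bytes(raw_bytes, policy=policy.default)
    found = []
    for part in message.walk():
        if part.is_multipart():
            continue  # containers, including attached emails, are walked into
        name = part.get_filename() or "(body)"
        found.append((part.get_content_type(), name))
        if name.lower().endswith(".zip"):
            payload = part.get_payload(decode=True)
            found.extend(_list_zip(payload, name, max_zip_depth))
    return found


def _list_zip(data, prefix, depth):
    """List files inside a ZIP held in memory, following nested ZIPs."""
    entries = []
    if depth <= 0:
        return entries  # stop on deeply nested archives
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}!{info.filename}"
            entries.append(("zip-entry", path))
            if info.filename.lower().endswith(".zip"):
                entries.extend(_list_zip(archive.read(info), path, depth - 1))
    return entries
```

The depth limit is a guard against archives nested many levels deep. Each discovered item would then be handed to a format-specific reader for classification, which this sketch leaves out.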

Three Risks That Email Archive Scanning Directly Addresses

From a risk perspective, this matters in three concrete ways. First, insider exfiltration doesn’t always look like a big transfer to an external file‑sharing service. More often, it looks like months of forwarding sensitive files to a personal account, followed by a mailbox export to PST. That one file now contains everything they walked out with, in a format most tools can’t inspect.  Second, accidental exposure is endemic: people send spreadsheets with customer PII, lab results, or financial reports to the wrong recipients all the time. Those messages live in archives long after anyone remembers they exist.  Third, every major privacy and sectoral framework — GDPR, HIPAA, SEC/FINRA rules — assumes you can actually find personal and regulated data in email when you need to respond to a deletion request, an investigation, or legal discovery.

Email archives are one of the largest ungoverned data lakes in most enterprises. Treating them as “solved” because you have a good gateway is how you end up explaining to regulators why a PST on a public share contained ten years of customer attachments. Deep email archive scanning is exactly the kind of capability we built Sentra’s DSPM platform to deliver. If you’re serious about closing real‑world data gaps, you have to go where the data actually lives — and a staggering amount of it still lives in email.

Learn more about how Sentra discovers and classifies sensitive data across your cloud — including inside email archives — at sentra.io.
