Minimizing your Data Attack Surface in the Cloud
The cloud is one of the most important developments in the history of information technology. It drives innovation and speed for companies, giving engineers instant access to virtually any type of workload with unlimited scale.
But this opportunity comes at a price - moving at these speeds increases the risk that data ends up in places that are not monitored for governance, risk, and compliance issues. That raises the likelihood of a data breach, but it’s not the only reason we’re seeing so many breaches in the cloud era. Other reasons include:
- Systems are being built quickly for business units without adequate regard for security
- More data is moving through the company as teams use and mine data more efficiently using tools such as cloud data warehouses, BI, and big data analytics
- New roles are being created constantly for people who need to gain access to organizational data
- New technologies are being adopted for business growth which require access to vast amounts of data - such as deep learning, novel language models, and new processors in the cloud
- Anonymous cryptocurrencies have made data leaks lucrative
- Nation-state actors are increasing cyber attacks due to new conflicts
Ultimately, there are only two ways to mitigate the risk of cloud data leaks - better protecting your cloud infrastructure, and minimizing your data attack surface.
Protecting Cloud Infrastructure
Companies such as Wiz, Orca Security and Palo Alto Networks provide great cloud security solutions, the most important of which is a Cloud Security Posture Management (CSPM) tool. CSPM tools help security teams understand and remediate infrastructure-related cloud security risks, which mostly stem from misconfigurations, lateral movement by attackers, and vulnerable software that needs to be patched.
However, these tools cannot mitigate all attacks. Insider threats, careless handling of data, and malicious attackers will always find ways to get hold of organizational data, whether it is in the cloud, in different SaaS services, or on employee workstations. Even the most protected infrastructure cannot withstand social engineering attacks or accidental mishandling of sensitive data. The best way to mitigate the risk of sensitive data leaks is by minimizing the “data attack surface” of the cloud.
What is the "Data Attack Surface"?
Data attack surface is a term that describes the potential exposure of an organization’s sensitive data in the event of a data breach. If a traditional attack surface is the sum of all an organization’s vulnerabilities, a data attack surface is the sum of all sensitive data that isn’t secured properly.
The larger the data attack surface - that is, the more inadequately secured sensitive data you have - the higher the chance that a breach will expose sensitive data.
There are several ways to reduce the chances of a data breach:
- Reduce access to sensitive data
- Reduce the number of systems that process sensitive data
- Reduce the number of outputs that data processing systems write
- Address misconfigurations of the infrastructure which holds sensitive data
- Isolate infrastructure which holds sensitive data
- Tokenize data
- Encrypt data at rest
- Encrypt data in transit
- Use proxies that limit and govern engineers’ access to sensitive data
Reduce Your Data Attack Surface by using a Least Privilege Approach
The fewer people and systems that have access to sensitive data, the lower the chance that a misconfiguration or an insider will cause a data breach.
The most effective way to reduce access to data is the least privilege approach: only grant access to entities that need the data. The type of access is also important - if read-only access is enough, then it’s important to make sure that write access or administrative access is not accidentally granted.
To know which entities need what access, engineering teams need to be responsible for mapping all systems in the organization and ensuring that no data stores are accessible to entities which do not need access.
Engineers can get started by analyzing the actual use of the data using cloud audit tools such as AWS CloudTrail. Once there’s an understanding of which users and services access infrastructure with sensitive data, the actual permissions to the data stores should be reviewed and matched against usage data. If a subset of the granted permissions is enough to keep operations running, then the unused permissions can be removed from the existing roles.
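The comparison of granted permissions against observed usage can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: it assumes the audit log has already been reduced to (principal, action) pairs, and the principal names and sample data are hypothetical.

```python
from collections import defaultdict


def unused_permissions(granted, events):
    """Return, per principal, the granted actions never seen in usage events.

    granted: dict mapping principal -> set of granted actions
    events:  iterable of (principal, action) pairs taken from audit logs
    """
    used = defaultdict(set)
    for principal, action in events:
        used[principal].add(action)
    return {
        principal: sorted(actions - used[principal])
        for principal, actions in granted.items()
        if actions - used[principal]
    }


# Hypothetical sample data, for illustration only.
granted = {
    "etl-service": {"s3:GetObject", "s3:PutObject", "s3:DeleteObject"},
    "analyst-role": {"s3:GetObject"},
}
events = [
    ("etl-service", "s3:GetObject"),
    ("etl-service", "s3:PutObject"),
    ("analyst-role", "s3:GetObject"),
]

candidates = unused_permissions(granted, events)
print(candidates)  # {'etl-service': ['s3:DeleteObject']}
```

Permissions that show up in the result are only candidates for removal. An action that is used once a quarter may not appear in a short log window, so the review period matters.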
Reducing Your Data Attack Surface by Tokenizing Your Sensitive Data
Tokenization is a powerful way to protect your data, but it is hard to deploy and requires significant effort from engineers.
Tokenization is the act of replacing sensitive data such as email addresses and credit card information with tokens, which correspond to the actual data. These tokens can reside in databases and logs throughout your cloud environment without any concern, since exposing them does not reveal the actual data but only a reference to the data.
When the data actually needs to be used (e.g. when emailing the customer or making a transaction with their credit card) the token can be used to access a vault which holds the sensitive information. This vault is highly secured using throttling limits, strong encryption, very strict access limits, and even hardware-based methods to provide adequate protection.
This method also provides a simple way to purge sensitive customer data, since the tokens that represent the sensitive data are meaningless if the data was purged from the sensitive data vault.
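The tokenize, look up, and purge flow described above might look roughly like the following sketch. The `TokenVault` class is a hypothetical in-memory stand-in: a real vault would add the encryption, throttling, and strict access controls mentioned earlier.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a hardened token vault (illustration only)."""

    def __init__(self):
        self._values = {}  # token -> sensitive value

    def tokenize(self, value):
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_hex(16)
        self._values[token] = value
        return token

    def detokenize(self, token):
        # Returns None if the token is unknown or its data was purged.
        return self._values.get(token)

    def purge(self, token):
        # After purging, every stored copy of the token is meaningless.
        self._values.pop(token, None)


vault = TokenVault()
token = vault.tokenize("alice@example.com")  # store `token` in databases and logs
email = vault.detokenize(token)              # only when the value is truly needed
vault.purge(token)                           # e.g. on a customer deletion request
```

Because the token is random rather than derived from the value, the same email address tokenized twice yields two different tokens in this sketch. Systems that need to join on tokens would need a deterministic scheme instead, which is a different trade-off.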
Reducing Your Data Attack Surface by Encrypting Your Sensitive Data
Encryption is an important technique which should almost always be used to protect sensitive data. There are two approaches: letting the infrastructure or platform you are using encrypt and decrypt the data, or encrypting it on your own. In most cases, it’s more convenient to encrypt your data using the platform because it is simply a configuration change. It also lets you ensure that only the people who need access to the data can decrypt it, because decryption depends on access to the encryption keys. In Amazon Web Services, for example, only principals with permission to use the relevant KMS key will be able to decrypt objects in an S3 bucket with KMS encryption enabled.
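As a rough illustration of how small the platform-side change is, the sketch below builds the arguments for an S3 upload that requests KMS encryption. `ServerSideEncryption` and `SSEKMSKeyId` are parameters of the S3 PutObject call in boto3; the helper function, bucket name, and key ARN are hypothetical, and the actual upload is left as a comment because it needs boto3 and AWS credentials.

```python
def sse_kms_put_args(bucket, key, body, kms_key_id):
    """Build keyword arguments for an S3 PutObject call that uses SSE-KMS."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # ask S3 to encrypt with a KMS key
        "SSEKMSKeyId": kms_key_id,          # which KMS key S3 should use
    }


args = sse_kms_put_args(
    bucket="example-sensitive-bucket",  # hypothetical bucket name
    key="customers/export.csv",
    body=b"id,email\n1,alice@example.com\n",
    # Hypothetical key ARN, for illustration only.
    kms_key_id="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)

# With boto3 installed and AWS credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**args)
```

A principal that can read the bucket but has no decrypt permission on that KMS key would not be able to read the object, which is the property described above.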
It is also possible to encrypt the data with a key that you manage and store yourself, outside the platform, which has its advantages and disadvantages. (In AWS terminology this is closer to a customer-provided key than to a KMS “customer managed key”, which is still stored in KMS.) The advantage is that it’s harder for a misconfiguration to accidentally allow access to the encryption keys, and that you don’t have to rely on the platform you are using to store them. However, managing the keys yourself means you need to send them more frequently to the systems which encrypt and decrypt the data, which increases the chance of a key being exposed.
Reducing Your Data Attack Surface by using Privileged Access Management Solutions
There are many tools that centrally manage access to databases. In general, they are divided into two categories: Zero-Trust Privileged Access Management solutions, and Database Governance proxies. Both provide protection against data leaks in different ways.
Zero-Trust Privileged Access Management solutions replace traditional database connectivity with stronger authentication methods combined with network access controls. Tools such as StrongDM and Teleport (open-source) allow developers to connect to production databases by authenticating with the corporate identity provider.
Database Governance proxies such as Satori and Immuta control how developers interact with sensitive data in production databases. These proxies control not only who can access sensitive data, but how they access the data. Because every request passes through the proxy, access to sensitive data can be tracked, and the proxy can enforce that developers do not retrieve sensitive information they are not entitled to see. When sensitive data is queried, these proxies can either mask the sensitive information or block the request, ensuring that sensitive data doesn’t leave the database.
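The masking behavior can be illustrated with a small sketch. This is not how Satori or Immuta are implemented or configured; it only shows the idea of rewriting query results based on who is asking. The column classification, role names, and function are all hypothetical.

```python
SENSITIVE_COLUMNS = {"email", "credit_card"}  # hypothetical data classification
UNMASKED_ROLES = frozenset({"privacy-officer"})  # hypothetical privileged role


def apply_masking(rows, role):
    """Return query result rows with sensitive columns masked for most roles."""
    if role in UNMASKED_ROLES:
        return [dict(row) for row in rows]
    return [
        {col: ("****" if col in SENSITIVE_COLUMNS else val) for col, val in row.items()}
        for row in rows
    ]


rows = [{"id": 1, "email": "alice@example.com", "plan": "pro"}]

print(apply_masking(rows, role="developer"))
# [{'id': 1, 'email': '****', 'plan': 'pro'}]
```

A developer debugging production still sees the row and its non-sensitive columns, while the sensitive values never leave the proxy.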
Reducing the data attack surface reflects the reality of the attacker’s mindset. Attackers are not trying to get into your infrastructure for the sake of breaching the network. They’re doing it to find the sensitive data. By ensuring that sensitive data is always secured, tokenized, encrypted, and accessible only under least privilege, there’ll be nothing valuable for an attacker to find - even in the event of a breach.