Data Masking for PostgreSQL Databases
Data masking is a widely used approach to safeguarding sensitive data such as credit card details, Social Security Numbers (SSNs), and addresses. Sometimes masking is about more than keeping your and your customers' data secure: in some cases it is required by law, with GDPR being the best-known example.
Various methods of data masking, such as substitution, shuffling, and redaction, exist and serve different purposes. Masking sensitive data enables organizations to reduce the likelihood of data breaches and unauthorized access, while still maintaining the ability to work with realistic data for tasks like development, testing, and analytics.
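To make the three methods concrete, here is a minimal Python sketch. It is not taken from PostgreSQL Anonymizer or Bytebase; the function names (`substitute`, `redact`, `shuffle_column`) and their parameters are hypothetical and chosen only for illustration.

```python
import hashlib
import random
from typing import List, Optional


def substitute(value: str, pool: List[str]) -> str:
    """Substitution: replace the real value with a stand-in from a pool.

    The same input always maps to the same stand-in, so joins and
    repeated lookups stay consistent.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return pool[digest[0] % len(pool)]


def redact(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Redaction: hide everything except the last `visible` characters."""
    if visible <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


def shuffle_column(values: List[str], seed: Optional[int] = None) -> List[str]:
    """Shuffling: keep the same set of values but break the link to each row."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```

Substitution and shuffling keep the data realistic for development and testing, while redaction trades realism for a stronger guarantee that the hidden part is never shown.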
PostgreSQL Anonymizer is a community extension that adds data masking capabilities to PostgreSQL, with a range of masking options and methods.
It stores its masking configuration in PostgreSQL security labels.
Dynamic Masking works by declaring a role as "MASKED" and defining the masking rules. Users with a "MASKED" role cannot access the original data, while other roles still can. Various masking functions are available, and you can even write your own rules.
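The idea behind role-based dynamic masking can be sketched in a few lines of Python. This is a conceptual model only, not the extension's implementation or API: the rule table, the `analyst` role, and the `read_row` helper are all hypothetical.

```python
from typing import Callable, Dict


def partial_email(email: str) -> str:
    """Keep the first character of the local part and the full domain."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


# Hypothetical masking rules: column name -> masking function.
MASKING_RULES: Dict[str, Callable[[str], str]] = {
    "email": partial_email,
    "ssn": lambda _value: "XXX-XX-XXXX",
}

# Hypothetical set of roles declared as masked.
MASKED_ROLES = {"analyst"}


def read_row(row: dict, role: str) -> dict:
    """Return the row as the given role would see it.

    Masked roles get rule-transformed values; every other role gets the
    original data. The stored row itself is never modified.
    """
    if role not in MASKED_ROLES:
        return dict(row)
    return {
        column: MASKING_RULES[column](value) if column in MASKING_RULES else value
        for column, value in row.items()
    }
```

The key property is that masking happens at read time, so the original values remain in place for roles that are allowed to see them.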
There are certain limitations to this method. For example, as mentioned in the project's docs, GUI clients such as DBeaver or pgAdmin can cause issues, and Dynamic Masking can be very slow with certain queries. Additionally, different views are needed for different masking variations, which quickly becomes unmanageable as roles change or as the underlying tables and variations increase.
PostgreSQL Anonymizer also supports Static Masking, which transforms the original dataset directly. You can replace original data with fake values, add noise, or shuffle data to hide sensitive information.
Note that this method destroys the original data and is a slow process, so think twice before you use static masking. The principle of static masking is to update all rows of all tables containing at least one masked column. This basically means that PostgreSQL will rewrite all the data on disk.
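The following sketch illustrates why static masking is destructive and touches every row. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the `customer` table, its columns, and the `mask_statically` helper are assumptions made for the example rather than anything PostgreSQL Anonymizer provides.

```python
import sqlite3


def build_demo() -> sqlite3.Connection:
    """Create an in-memory table with a couple of sensitive rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (name, phone) VALUES (?, ?)",
        [("Alice Martin", "555-0100"), ("Bob Chen", "555-0199")],
    )
    conn.commit()
    return conn


def mask_statically(conn: sqlite3.Connection) -> int:
    """Overwrite the masked columns in place and return the rows updated.

    There is no WHERE clause: every row of the table is rewritten, and the
    original values cannot be recovered afterwards.
    """
    cursor = conn.execute(
        "UPDATE customer "
        "SET name = 'Customer ' || id, "
        "    phone = 'XXX-' || substr(phone, -4)"
    )
    conn.commit()
    return cursor.rowcount
```

After `mask_statically` runs, querying the table returns only the masked values, which is why a backup or a separate copy of the database is the usual precondition for this approach.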
Bytebase Dynamic Data Masking
Bytebase Dynamic Data Masking doesn't depend on PostgreSQL views and users. It manages the masking policies and grants inside Bytebase. The masking policy is applied when a user queries from the SQL Editor.
Bytebase Dynamic Data Masking consists of the following components:
- Global Masking Rule: DBA can apply masking levels in batch, e.g. all columns named "email" are masked at the "Partial" masking level. You can also change the masking policy without reapplying it to thousands of columns, and you avoid the hassle of maintaining views.
- Column Masking Rule: DBA can set different masking levels on table columns. The column masking rule takes precedence over the global masking rule.
- Access Unmasked Data: for masked content, DBA can grant specific users permission to access unmasked data.

Workspace Admin and DBA here are roles in Bytebase.
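How the three components interact can be sketched as a small resolution function. This is an illustrative model, not Bytebase's implementation: the `MaskingPolicy` class, the `level_for` and `apply_level` helpers, and the "Full" and "None" level names are hypothetical (only "Partial" appears in the description above).

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class MaskingPolicy:
    # Global rules: column name -> masking level, applied across all tables.
    global_rules: Dict[str, str] = field(default_factory=dict)
    # Column rules: (table, column) -> masking level for one specific column.
    column_rules: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # Grants: (user, table, column) combinations allowed to see unmasked data.
    unmasked_grants: Set[Tuple[str, str, str]] = field(default_factory=set)

    def level_for(self, user: str, table: str, column: str) -> str:
        """Resolve the masking level a user sees for a column."""
        if (user, table, column) in self.unmasked_grants:
            return "None"
        # A column rule takes precedence over a global rule.
        if (table, column) in self.column_rules:
            return self.column_rules[(table, column)]
        return self.global_rules.get(column, "None")


def apply_level(value: str, level: str) -> str:
    """Apply a masking level to a value (hypothetical level semantics)."""
    if level == "Full":
        return "******"
    if level == "Partial":
        return value[:2] + "*" * max(len(value) - 2, 0)
    return value
```

Because the global rule is keyed by column name alone, changing one entry changes the effective level for every matching column that has no column rule of its own.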
| |PostgreSQL Anonymizer|Bytebase Dynamic Data Masking|
|---|---|---|
|Compatibility|Requires |All PostgreSQL distributions ⭐️|
|Enforced at|Database itself ⭐️|Bytebase SQL Editor|
|Features|Basic|Advanced with granular masking policy and access grants ⭐️|
PostgreSQL Anonymizer's advantage is that it is implemented directly in the database itself, so data masking rules are enforced regardless of how queries are sent to the database. With Bytebase Dynamic Data Masking, queries must go through the SQL Editor for masking to be enforced.
The advantage of Bytebase Dynamic Data Masking is its compatibility with all PostgreSQL distributions and its feature-rich masking policies and access grants. As long as the team can be required to query databases via the Bytebase SQL Editor (which is desirable from a management perspective), Bytebase Dynamic Data Masking is a perfect choice.