Data Masking for PostgreSQL Databases

Mila

Data masking is a widely used approach to safeguarding sensitive data such as credit card details, Social Security Numbers (SSNs), and addresses. Sometimes masking is about more than keeping your data and your customers' data secure: in some cases it is required by law, with GDPR being the best-known example.

Various methods of data masking, such as substitution, shuffling, and redaction, exist and serve different purposes. Masking sensitive data enables organizations to reduce the likelihood of data breaches and unauthorized access, while still maintaining the ability to work with realistic data for tasks like development, testing, and analytics.
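To make the three methods concrete, here is a minimal Python sketch. It is not tied to any particular tool; the function names (`substitute`, `shuffle_column`, `redact`) and their parameters are made up for illustration.

```python
import random


def substitute(value: str, fake_values: list[str], rng: random.Random) -> str:
    """Substitution: replace the real value with a realistic-looking fake one."""
    return rng.choice(fake_values)


def shuffle_column(values: list[str], rng: random.Random) -> list[str]:
    """Shuffling: keep the same set of values but break the link to their rows."""
    shuffled = values[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def redact(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Redaction: hide everything except the last `keep_last` characters."""
    hidden = max(len(value) - max(keep_last, 0), 0)
    return mask_char * hidden + value[hidden:]


rng = random.Random(42)  # seeded only to keep the demo repeatable
fake_name = substitute("Alice Smith", ["Jane Doe", "John Roe"], rng)
shuffled_cities = shuffle_column(["Paris", "Oslo", "Lima"], rng)
masked_card = redact("4111111111111111")
```

Substitution and shuffling keep the data realistic enough for testing and analytics, while redaction trades realism for a stronger guarantee that the hidden part cannot be read.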

PostgreSQL Anonymizer

PostgreSQL Anonymizer is a community extension that adds data masking capabilities to PostgreSQL, with a range of masking options and methods.

It stores the masking configuration in PostgreSQL security labels.

Dynamic Masking

Dynamic Masking works by declaring the masking rules and marking a role as "MASKED". Users granted the "MASKED" role cannot access the original data, while other roles still can. Various masking functions are available, and you can also write your own rules.
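The idea can be modeled in a few lines of Python. This is a conceptual sketch only, not the extension's implementation or API: the rule table, the role names, and the `partial` helper are all hypothetical.

```python
from typing import Callable

Row = dict[str, str]
MaskFn = Callable[[str], str]


def partial(value: str, prefix: int = 2, suffix: int = 2) -> str:
    """Keep a few leading and trailing characters and hide the rest."""
    if len(value) <= prefix + suffix:
        return "*" * len(value)
    hidden = len(value) - prefix - suffix
    return value[:prefix] + "*" * hidden + value[len(value) - suffix:]


# Hypothetical masking rules: column name -> masking function.
MASKING_RULES: dict[str, MaskFn] = {
    "phone": partial,
    "ssn": lambda _value: "CONFIDENTIAL",
}

# Hypothetical set of roles that have been declared as masked.
MASKED_ROLES = {"analyst"}


def read_row(row: Row, role: str) -> Row:
    """Return the row as `role` would see it; the stored row is never changed."""
    if role not in MASKED_ROLES:
        return dict(row)
    return {
        column: MASKING_RULES[column](value) if column in MASKING_RULES else value
        for column, value in row.items()
    }


stored = {"name": "Ada", "phone": "0609110911", "ssn": "123-45-6789"}
seen_by_dba = read_row(stored, "dba")          # original values
seen_by_analyst = read_row(stored, "analyst")  # masked values
```

The key property is that masking happens at read time: the stored data stays intact, and what a query returns depends on who is asking.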

This method has limitations. For example, as the PostgreSQL Anonymizer documentation mentions, there can be issues with GUI tools such as DBeaver or pgAdmin, and Dynamic Masking can be very slow with certain queries. Additionally, each masking variation needs its own view, which quickly becomes unmanageable as roles change or as the number of underlying tables and variations grows.

Static Masking

PostgreSQL Anonymizer also supports Static Masking, which transforms the original dataset directly. You can replace original values with fake ones, add noise, or shuffle values to hide sensitive data.

Note that this method destroys the original data and is slow, so think twice before you use static masking. The principle of static masking is to update every row of every table that contains at least one masked column. This basically means that PostgreSQL will rewrite all the data in those tables on disk.
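The effect of an in-place rewrite can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a self-contained stand-in for PostgreSQL, and the `people` table and `mask_phone` rule are invented for the example; this is not how the extension is invoked.

```python
import sqlite3
from typing import Optional


def mask_phone(phone: Optional[str]) -> Optional[str]:
    """Illustrative rule: keep the first two characters, hide the rest."""
    if phone is None:
        return None
    return phone[:2] + "*" * max(len(phone) - 2, 0)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO people (name, phone) VALUES (?, ?)",
    [("Ada", "0609110911"), ("Bob", "0712345678")],
)

# Expose the Python function to SQL so the UPDATE below can call it.
conn.create_function("mask_phone", 1, mask_phone)

# Static masking: every row of the table is rewritten in place.
cursor = conn.execute("UPDATE people SET phone = mask_phone(phone)")
conn.commit()
rows_rewritten = cursor.rowcount

# The original phone numbers are gone; only the masked values remain.
phones = [row[0] for row in conn.execute("SELECT phone FROM people ORDER BY id")]
```

Because the `UPDATE` touches every row, the cost grows with table size, and there is no way back to the original values unless a separate backup exists.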

Bytebase Dynamic Data Masking

Bytebase Dynamic Data Masking does not depend on PostgreSQL views or users. It manages the masking policies and grants inside Bytebase. The masking policy is applied when a user runs queries from the SQL Editor.

Bytebase Dynamic Data Masking consists of the following components:

  1. Global Masking Rule: Workspace Admin and DBA can apply masking levels in batch, for example masking all columns named "email" at the "Partial" masking level. You can change a masking policy without reapplying it to thousands of columns, and there are no views to maintain.

  2. Column Masking Rule: Workspace Admin and DBA can set a masking level on individual table columns. A column masking rule takes precedence over the global masking rule.

  3. Access Unmasked Data: Workspace Admin and DBA can grant specific users permission to access the unmasked data of masked columns.

Workspace Admin and DBA here are roles in Bytebase.
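How the three components interact can be sketched as a small resolution function. This is a hypothetical model and not Bytebase's API or internal logic: the class and field names are invented, only the "Partial" level is named in the text above (the "None" and "Full" levels are assumed), and the sketch assumes that an unmasked-access grant wins over both kinds of rule.

```python
from dataclasses import dataclass, field

# Masking levels; only "Partial" appears in the text, the others are assumed.
NONE, PARTIAL, FULL = "None", "Partial", "Full"


@dataclass(frozen=True)
class Column:
    table: str
    name: str


@dataclass
class MaskingPolicy:
    # Global rules: column name -> level, applied to every matching column.
    global_rules: dict[str, str] = field(default_factory=dict)
    # Column rules: one specific column -> level; these override global rules.
    column_rules: dict[Column, str] = field(default_factory=dict)
    # Grants: (user, column) pairs that are allowed to see unmasked data.
    unmasked_grants: set[tuple[str, Column]] = field(default_factory=set)

    def effective_level(self, user: str, column: Column) -> str:
        """Resolve the level for one user and column (assumed precedence)."""
        if (user, column) in self.unmasked_grants:
            return NONE
        if column in self.column_rules:
            return self.column_rules[column]
        return self.global_rules.get(column.name, NONE)


employee_email = Column("employee", "email")
customer_email = Column("customer", "email")

policy = MaskingPolicy(
    global_rules={"email": PARTIAL},
    column_rules={employee_email: FULL},
    unmasked_grants={("alice", employee_email)},
)

level_global = policy.effective_level("bob", customer_email)    # global rule applies
level_column = policy.effective_level("bob", employee_email)    # column rule overrides
level_granted = policy.effective_level("alice", employee_email) # grant lifts masking
```

Changing the single `"email"` entry in `global_rules` changes the outcome for every matching column at once, which is the batch behavior described in the first item of the list.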

Comparison Table

|               | PostgreSQL Anonymizer                   | Bytebase Dynamic Data Masking                                 |
| Compatibility | Requires postgresql_anonymizer extension | All PostgreSQL distributions ⭐️                                |
| Enforced at   | Database itself ⭐️                       | Bytebase SQL Editor                                           |
| Features      | Basic                                   | Advanced with granular masking policy and access grants ⭐️     |
| Price         | Free ⭐️                                  | Paid                                                          |

PostgreSQL Anonymizer's advantage is that it is implemented directly in the database itself, so data masking rules are enforced regardless of how queries are sent to the database. With Bytebase Dynamic Data Masking, queries must go through the SQL Editor for masking to be enforced.

The advantage of Bytebase Dynamic Data Masking is its compatibility with all PostgreSQL distributions and its feature-rich masking policies and access grants. As long as the team can be required to query databases through the Bytebase SQL Editor (which is desirable from a management perspective), Bytebase Dynamic Data Masking is a perfect choice.


You can try Bytebase Dynamic Data Masking by following this tutorial. If you run into any issues or need a helping hand, feel free to join our Discord channel!
