BigQuery Dynamic Data Masking

Tianzhou · May 24, 2026

Sensitive columns — SSNs, credit cards, emails, addresses — must stay queryable for support, analytics, and development. Broad cleartext access is not the answer. Data masking is. For some workloads (GDPR, HIPAA, PCI), masking is also a legal requirement.

BigQuery ships dynamic data masking as part of column-level security, built on policy tags managed in Dataplex Universal Catalog. Bytebase Dynamic Data Masking sits in front of the warehouse, on the query path. One policy model across every BigQuery dataset and every other engine in the fleet. Request. Review. Approve. This post compares the two.

BigQuery Dynamic Data Masking

Dynamic data masking is GA. It extends column-level security and carries no separate license charge — every BigQuery project has it. (Contrast Snowflake, which gates masking behind an Enterprise upgrade.) One caveat: masking may not apply under reservations created with certain BigQuery editions, so verify against your slot configuration.

Masking is configured in three steps:

  1. Create a taxonomy of policy tags.
  2. Attach a policy tag to a column.
  3. Define a data policy that binds a masking rule and a set of principals to that tag.

At read time, BigQuery rewrites the column for principals who hold the Masked Reader role. The stored data is unchanged.

BigQuery ships a set of built-in masking rules, plus custom routines for anything they don't cover. The same three steps in commands, followed by the role grant that makes the policy apply to a principal:

# 1. Create a taxonomy and a policy tag (Dataplex Universal Catalog).
gcloud data-catalog taxonomies create \
  --location=us --display-name=pii

# $T is the ID of the taxonomy created above; $P (used below) is the ID of the policy tag created here.
gcloud data-catalog taxonomies policy-tags create \
  --location=us --taxonomy=$T --display-name=ssn

# 2. Attach the policy tag to a column through the table schema.
#    schema.json:
#    [{ "name": "ssn", "type": "STRING",
#       "policyTags": { "names": ["projects/$PROJECT/locations/us/taxonomies/$T/policyTags/$P"] } }]
bq update --schema schema.json $PROJECT:sales.customers

# 3. Create a data policy that binds a masking rule to the policy tag.
#    There is no SQL DDL or bq command for this — use the Data Policy API.
curl -X POST \
  "https://bigquerydatapolicy.googleapis.com/v1/projects/$PROJECT/locations/us/dataPolicies" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "dataPolicyId": "ssn_mask",
    "dataPolicyType": "DATA_MASKING_POLICY",
    "policyTag": "projects/'"$PROJECT"'/locations/us/taxonomies/'"$T"'/policyTags/'"$P"'",
    "dataMaskingPolicy": { "predefinedExpression": "LAST_FOUR_CHARACTERS" }
  }'

# 4. Grant the principal the Masked Reader role.
gcloud projects add-iam-policy-binding $PROJECT \
  --member="user:analyst@example.com" \
  --role="roles/bigquerydatapolicy.maskedReader"

The console drives the same flow under Manage Data Policies; for policy-as-code, the Terraform google_bigquery_datapolicy_data_policy resource wraps step 3.
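
The quote splicing in the step 3 request body is easy to get wrong. A minimal sketch of an alternative, assuming the same fields as the request above: build the body with printf and pipe it to curl. The helper name data_policy_payload is hypothetical and is not part of gcloud or bq.

```shell
# Hypothetical helper: prints the step 3 request body as one JSON line.
# Args: project, location, taxonomy ID, policy tag ID, data policy ID, masking rule.
data_policy_payload() {
  local project="$1" location="$2" taxonomy="$3" tag="$4" policy_id="$5" rule="$6"
  printf '{"dataPolicyId":"%s","dataPolicyType":"DATA_MASKING_POLICY","policyTag":"projects/%s/locations/%s/taxonomies/%s/policyTags/%s","dataMaskingPolicy":{"predefinedExpression":"%s"}}\n' \
    "$policy_id" "$project" "$location" "$taxonomy" "$tag" "$rule"
}

# Usage (needs network access and credentials, so it is left commented out):
# data_policy_payload "$PROJECT" us "$T" "$P" ssn_mask LAST_FOUR_CHARACTERS |
#   curl -X POST \
#     "https://bigquerydatapolicy.googleapis.com/v1/projects/$PROJECT/locations/us/dataPolicies" \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     -H "Content-Type: application/json" \
#     -d @-
```

The helper does no JSON escaping, so it only suits simple identifiers like the ones in this post.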

Permissions: three states

Access to a policy-tagged column resolves to one of three states, decided by the IAM role the principal holds:

  • Fine-Grained Reader (datacatalog.categoryFineGrainedReader) — sees cleartext. The role sits above masking. Grant it sparingly and audit changes.
  • Masked Reader (bigquerydatapolicy.maskedReader) — sees the masked value the data policy defines.
  • Neither — the query is denied. Column-level security blocks the read outright.
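
The three states can be written as a small decision function. This is an illustrative sketch of the rule described above, not how BigQuery implements it; resolve_access is a hypothetical name, and the role strings are the two IAM roles from the list with their roles/ prefix.

```shell
# Hypothetical helper: maps the roles a principal holds to one access state.
# Fine-Grained Reader sits above masking, so it wins whenever it is present.
resolve_access() {
  local state="denied" role
  for role in "$@"; do
    case "$role" in
      roles/datacatalog.categoryFineGrainedReader)
        echo "cleartext"
        return 0
        ;;
      roles/bigquerydatapolicy.maskedReader)
        state="masked"
        ;;
    esac
  done
  echo "$state"
}

# Example: a principal holding only Masked Reader.
resolve_access roles/bigquerydatapolicy.maskedReader
```

Any role other than these two leaves the state at denied, which matches the "Neither" case.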

A column carries one policy tag, and that tag maps to a single masking rule. Masking is per column, per tag — there is no second rule to layer on top.

What BigQuery data masking does not do

  • Stop Fine-Grained Readers. The role returns unmasked values. It is a plain IAM grant, so limit who holds it and audit the grants.
  • Apply more than one rule per column. One policy tag per column; the tag's data policy selects one masking rule. No composition.
  • Subject direct or service access to review. BI tools, scheduled jobs, and application service accounts query the warehouse with whatever role they carry. A service account granted Fine-Grained Reader reads cleartext, with no request or review.
  • Provide an approval workflow or an unmask audit trail. Granting Fine-Grained Reader is an IAM edit. There is no Request–Review–Approve path, and the grant is not a first-class, audited masking event.
  • Work with legacy SQL, wildcard (*) table queries, or copy jobs. Masking is incompatible with all three.
  • Mask partitioning or clustering columns via custom routines. A custom masking routine does not apply to a column used for partitioning or clustering.
  • Filter rows. Masking is column-level. For row-level control, use row-level security (row access policies).
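
For the row-level case, a row access policy is created with a DDL statement. A minimal sketch, assuming the sales.customers table from the earlier commands; the policy name, the grantee, and the region column are illustrative, and row_policy_sql is a hypothetical helper that only prints the statement.

```shell
# Hypothetical helper: prints a CREATE ROW ACCESS POLICY statement.
# Args: policy name, table, grantee, filter expression.
row_policy_sql() {
  local name="$1" table="$2" grantee="$3" filter="$4"
  printf 'CREATE ROW ACCESS POLICY %s ON %s GRANT TO ("%s") FILTER USING (%s);\n' \
    "$name" "$table" "$grantee" "$filter"
}

# Usage (needs credentials, so it is left commented out):
# bq query --use_legacy_sql=false \
#   "$(row_policy_sql us_only sales.customers user:analyst@example.com 'region = "US"')"
```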

Bytebase Dynamic Data Masking

Native masking has a documented gap. Fine-Grained Reader grants return cleartext, and they are plain IAM edits — no request, no review, no audited unmask event. Granting access and proving who saw what are separate, manual steps. The cause is structural: masking rewrites the column inside the warehouse, but the role grant that bypasses it lives upstream, in IAM. Closing the gap requires governing the query itself — not just the column.

Bytebase Dynamic Data Masking governs the query. Queries route through Bytebase's SQL Editor. Bytebase masks results before they leave the editor. A Fine-Grained Reader grant on the backing project does not bypass the policy. Unmasking becomes an access decision: granting Query rights runs through a built-in workflow — Request. Review. Approve. — every step audited.

Policies compose from three layers, evaluated in fixed precedence: Masking Exemption > Global Masking Rule > Column Masking.

  1. Global Masking Rule. Workspace-level. Rules evaluate top-down. First match wins. Match conditions span environment, project, database, and data classification. Each match applies a Semantic Type, which selects a masking algorithm — full, partial, MD5, range, or custom.

  2. Column Masking. Project-level override on a specific column when the global rule does not apply.

  3. Masking Exemption. Named users receive time-bound Query or Export exemptions to specific databases or tables. Service accounts are not eligible. Every grant logged. Every access logged.
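
The precedence across the three layers can be sketched as a short function. This is an illustration of the stated order (Masking Exemption, then Global Masking Rule, then Column Masking), not Bytebase code. The function name, its arguments, and the "unmasked" result label are assumptions made for the example.

```shell
# Hypothetical sketch of the documented precedence.
# Args: 1 = "yes" if the user holds an active exemption for the column,
#       2 = Semantic Type from the first matching global rule (empty if none),
#       3 = Semantic Type set by column masking (empty if none).
effective_masking() {
  local exempt="$1" global_type="$2" column_type="$3"
  if [ "$exempt" = "yes" ]; then
    echo "unmasked"          # an exemption outranks every masking rule
  elif [ -n "$global_type" ]; then
    echo "$global_type"      # the global rule outranks column masking
  elif [ -n "$column_type" ]; then
    echo "$column_type"      # column masking applies when no global rule matched
  else
    echo "unmasked"          # no layer applies to this column
  fi
}

# Example: no exemption, a global rule matched with the "partial" type.
effective_masking no partial full
```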

Masking propagates. When a column is masked, the policy extends to every view and derived structure that depends on it. Expressions over masked columns stay masked.

Policies can also be codified via GitOps.

Masking decisions are recorded in the audit log. Every SQL execution entry carries per-column masking metadata — masked columns, Semantic Type, matching rule — alongside user, source IP, statement, and row count. Granted exemptions, used exemptions, and policy edits are first-class audit events. The access decision and the proof of enforcement share the same record.

Enforcement boundary: Bytebase masks queries routed through the SQL Editor. Traffic that hits BigQuery directly (BI tools, scheduled jobs, service accounts) bypasses it and is covered by native data masking and IAM instead. The two are complementary: native masking at the warehouse; Bytebase on the human query path, where approval and audit matter. One policy applies across BigQuery and the Postgres, MySQL, SQL Server, Oracle, and Snowflake instances next to it.

Comparison

| | BigQuery Dynamic Data Masking | Bytebase Dynamic Data Masking |
| --- | --- | --- |
| Compatibility | BigQuery only | All engines including BigQuery ⭐️ |
| Mechanism | Policy tag + data policy per column ⭐️ | Policy in Bytebase, applied at SQL Editor |
| Enforced at | Warehouse, every read path ⭐️ | SQL Editor |
| Masking rules | Built-in rules + custom routine ⭐️ | Full, partial, MD5, range, custom |
| Policy mgmt | Console / Data Policy API / Terraform | Centralized UI, grants, audit log ⭐️ |
| Permission scope | Column (one policy tag) | Project, database, table, column ⭐️ |
| Workflow | IAM grant only | Request. Review. Approve. ⭐️ |
| Row-level filter | No (pair with row access policies) | No (pair with access policy) |
| Cost | Included with BigQuery ⭐️ | Bytebase Enterprise |

Picking one

  • Single BigQuery estate. Masking must hold across every client. Use native data masking. In-warehouse, every read path — including the BI tools, scheduled jobs, and service accounts that connect directly. Pair with row access policies for row-level control. Keep Fine-Grained Reader grants few and logged.
  • You need an approval workflow and an unmask audit trail. Native masking grants are plain IAM edits. Use Bytebase on the human query path — Request. Review. Approve. — with every exemption and access logged.
  • Mixed fleet — BigQuery alongside Postgres, MySQL, SQL Server, Snowflake. Use Bytebase. One policy model. Every engine. Audited grants for every unmask, recorded in the same place as your access logs.
  • Both. Native masking at the warehouse — direct connections, BI tools, service accounts. Bytebase governs the human query path through the SQL Editor with approval and audit. They compose.

Try Bytebase Dynamic Data Masking with this tutorial.
