# ERROR 22001: String Data Right Truncation in Postgres

Source: https://www.bytebase.com/reference/postgres/error/22001-string-data-right-truncation/

---

## Error Message

```sql
ERROR: value too long for type character varying(50)
SQLSTATE: 22001
```

You may also see the type name spelled differently depending on the column definition:

```
ERROR: value too long for type character(10)
ERROR: value too long for type bit varying(8)
ERROR: value too long for type bit(8)
```

The underlying SQLSTATE is always `22001` ("string_data_right_truncation"), regardless of which length-constrained type was the target.

## What Triggers This Error

22001 fires whenever PostgreSQL is asked to store a string longer than the column's declared maximum length. Unlike MySQL in non-strict mode, PostgreSQL does not silently truncate on assignment: it raises an error, with one narrow exception (excess characters that are all spaces are trimmed without complaint). An explicit cast such as `value::varchar(50)` does truncate silently; only the implicit assignment cast errors. The most common triggers, with a minimal reproduction after the list:

- **Application writes a string longer than the column allows** — the simplest case, usually from an unvalidated form field or an upstream system loosening its own validation
- **`VARCHAR(n)` chosen without thinking about real-world data length** — `VARCHAR(255)` is a MySQL convention, not a PostgreSQL one; in Postgres `TEXT` carries no storage or performance penalty over `VARCHAR(n)`
- **ORM round-trip over a `CHAR(n)` column** — PostgreSQL stores `CHAR(n)` values space-padded to the column width; some ORMs/drivers trim trailing spaces on read, so the round-tripped value plus any application-side concatenation can exceed the column on write
- **`COPY` or bulk-load from CSV with an unexpectedly long value** — fails the entire batch, not just the bad row
- **Migration adds a `NOT NULL DEFAULT` with a value longer than the column** — the implicit backfill of existing rows fails
- **String concatenation produces a result that exceeds the target column** — `UPDATE t SET name = name || '_archived'` fails when `name` is already near the limit
- **Normalization or templating increases the character count** — trimming + prefix/suffix insertion, transliteration that adds combining marks, or generated labels can turn a previously valid value into one that exceeds `VARCHAR(n)` or `CHAR(n)`
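A minimal reproduction (the table is just for the demo):

```sql
CREATE TABLE t22001_demo (code VARCHAR(5));

INSERT INTO t22001_demo VALUES ('abcdef');
-- ERROR:  value too long for type character varying(5)

-- The one exception: excess trailing spaces are trimmed silently
INSERT INTO t22001_demo VALUES ('abcde   ');  -- succeeds, stores 'abcde'
```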

## Fix by Scenario

### Application writes a string longer than the column allows

Validate at the application layer before the insert. If the input genuinely can be long, widen the column instead of truncating in code:

```sql
-- Widen from VARCHAR(50) to VARCHAR(255) — no rewrite needed
ALTER TABLE users ALTER COLUMN bio TYPE VARCHAR(255);

-- Or remove the limit entirely
ALTER TABLE users ALTER COLUMN bio TYPE TEXT;
```

In PostgreSQL, `TEXT` and `VARCHAR` without a length specifier have **identical storage cost** and **identical performance**. The only reason to use `VARCHAR(n)` over `TEXT` is to enforce a length constraint at the database layer.
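If you want to verify the storage claim, `pg_column_size()` reports the bytes needed to store a value; `TEXT` and unconstrained `VARCHAR` come out identical, while `CHAR(n)` is larger because of the padding:

```sql
SELECT pg_column_size('same payload'::text)     AS as_text,      -- same as varchar
       pg_column_size('same payload'::varchar)  AS as_varchar,   -- same as text
       pg_column_size('same payload'::char(50)) AS as_char_50;   -- larger: padded to 50 chars
```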

If you want length enforcement without picking an arbitrary number, use a `CHECK` constraint:

```sql
ALTER TABLE users
ALTER COLUMN bio TYPE TEXT,
ADD CONSTRAINT bio_length_check CHECK (length(bio) <= 280);
```

### VARCHAR(255) inherited from MySQL conventions

If you ported a schema from MySQL, every `VARCHAR(255)` is suspect — it almost always reflects old MySQL limits (the 255-character `VARCHAR` cap and index key length limits), not a real business rule. PostgreSQL has no equivalent constraint.

Audit which columns actually need length limits and convert the rest to `TEXT`. Bytebase or any schema-diff tool will flag the change as low-risk because the on-disk format is identical.
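A starting point for the audit is `information_schema.columns`; this sketch lists every `VARCHAR(255)` column outside the system schemas (adjust the schema filter to your environment):

```sql
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE data_type = 'character varying'
  AND character_maximum_length = 255
  AND table_schema NOT IN ('pg_catalog', 'information_schema')
ORDER BY table_schema, table_name, column_name;
```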

### ORM round-trip over CHAR(n)

`CHAR(n)` pads stored values with spaces to width `n`, but most ORMs and drivers trim trailing spaces on read. If application code then concatenates onto the trimmed value before the next save, the write-back can exceed `n`.

```sql
-- Country label stored as CHAR(10): 'US' is space-padded to 10 characters on disk
-- The ORM trims the padding on read; the application appends ' (legacy)'
-- Writing back 'US (legacy)' (11 characters) fails with 22001
```

Fix: convert `CHAR(n)` to `VARCHAR(n)` or `TEXT`. `CHAR(n)` is rarely the right choice in modern schemas — it costs more on disk than `VARCHAR(n)` because of the space padding, and its trailing-space behavior on read and comparison is surprising.
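The conversion is a plain column type change; casting `CHAR` to `VARCHAR` or `TEXT` drops the trailing padding (table and column names here are hypothetical):

```sql
-- Keep a length limit but lose the padding semantics
ALTER TABLE countries ALTER COLUMN label TYPE VARCHAR(10);

-- Or drop the limit entirely
ALTER TABLE countries ALTER COLUMN label TYPE TEXT;
```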

### COPY or bulk-load from CSV

A single overlong value fails the whole batch:

```sql
COPY orders FROM '/tmp/orders.csv' CSV HEADER;
-- ERROR: value too long for type character varying(20)
-- CONTEXT: COPY orders, line 4827, column reference
```

Find the bad row before re-running. PostgreSQL tells you the line number, so the quickest path is to look at it directly:

```bash
sed -n '4827p' /tmp/orders.csv
```

If the data is genuinely correct and the column is too narrow, widen the column once and re-run COPY. If the data is wrong (truncated upstream, encoding bug), fix at the source.

For ongoing bulk loads where occasional bad rows are acceptable, stage to a wider table first, validate, then insert into the production table:

```sql
CREATE TEMP TABLE orders_staging (LIKE orders INCLUDING ALL);
ALTER TABLE orders_staging ALTER COLUMN reference TYPE TEXT;
COPY orders_staging FROM '/tmp/orders.csv' CSV HEADER;

-- Validate, then insert only good rows
INSERT INTO orders
SELECT * FROM orders_staging WHERE length(reference) <= 20;
```

### Migration adds NOT NULL DEFAULT longer than the column

A migration like this fails because the default value itself is longer than the column allows; PostgreSQL coerces the default to the column type when it adds the column, so the statement fails even on an empty table:

```sql
ALTER TABLE users
ADD COLUMN status VARCHAR(10) NOT NULL DEFAULT 'pending_activation';
-- ERROR: value too long for type character varying(10)
```

Fix the default value or widen the column before adding the constraint:

```sql
ALTER TABLE users ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'pending_activation';
```

Schema-review tooling catches this class of failure before the migration runs — [Bytebase's SQL Review](https://docs.bytebase.com/sql-review/review-rules/) compares default-value length against column length during change review.

### String concatenation exceeds the column

```sql
UPDATE products SET sku = sku || '_archived' WHERE archived = true;
-- ERROR: value too long for type character varying(20)
```

Either widen the column or split the data into two columns (`sku` + `status`) so the SKU itself doesn't need the suffix:

```sql
ALTER TABLE products ADD COLUMN status VARCHAR(20) DEFAULT 'active';
UPDATE products SET status = 'archived' WHERE archived = true;
```

This is also the more queryable design — `WHERE status = 'archived'` beats `WHERE sku LIKE '%_archived'` (where the unescaped `_` is itself a single-character wildcard, another subtle trap).

### Normalization or templating quietly increases the character count

`VARCHAR(n)` in PostgreSQL limits **characters**, not bytes — but application logic that rewrites the value before insert can push a previously-valid string past `n`. Common patterns:

- **Unicode normalization** — converting NFC ↔ NFD changes how combining marks are counted. `café` in NFC is 4 characters; in NFD it's 5 (`c`, `a`, `f`, `e`, ◌́).
- **Transliteration that adds characters** — `straße` (6) → `strasse` (7) when the German `ß` is expanded.
- **Templated identifiers** — `SET sku = 'PROD-' || sku` adds 5 characters to every row; any row with fewer than 5 characters of headroom fires 22001 on the update.
- **Trim-then-pad cycles in ETL pipelines** — trimming leading/trailing whitespace and then re-padding with a different convention can change the length non-obviously.

Fix: identify the transformation step that's widening the value, then either widen the column (`ALTER COLUMN x TYPE VARCHAR(n)` with a larger `n`, or switch to `TEXT`) or change the transformation to preserve length.

```sql
-- Find rows that would overflow before running the update
SELECT id, length(sku), length('PROD-' || sku) AS new_length
FROM products
WHERE length('PROD-' || sku) > 20
ORDER BY new_length DESC;
```

## Prevention

- Default to `TEXT` over `VARCHAR(n)`. Use `VARCHAR(n)` only when there is a real business constraint on length (country code = 2, US ZIP = 5, ISO currency = 3). Pick the number from the constraint, not from MySQL muscle memory.
- Enforce length with `CHECK` constraints when the real rule is a maximum that might change (`length(bio) <= 280`) — these are easier to relax than column-type changes (see the sketch after this list).
- Validate input at the application layer before the insert. By the time PostgreSQL is rejecting the write, you've already paid for the round-trip.
- Run schema review against the target column lengths during change review — Bytebase, sqlfluff, and similar tools flag default values that exceed declared column lengths.
- For bulk loads, stage to a wider table, validate, then insert into production. Don't bet a whole COPY on the cleanest row of the input file.
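Relaxing a `CHECK`-based limit later is a constraint swap rather than a type change (constraint and column names follow the earlier `bio_length_check` example):

```sql
BEGIN;
ALTER TABLE users DROP CONSTRAINT bio_length_check;
ALTER TABLE users ADD CONSTRAINT bio_length_check CHECK (length(bio) <= 500);
COMMIT;
```

Adding the new constraint scans existing rows to validate them but does not rewrite the table; if even the scan is too disruptive, add it `NOT VALID` and run `VALIDATE CONSTRAINT` later.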

> **Note:** [Bytebase's SQL Review](https://docs.bytebase.com/sql-review/review-rules/) flags `ADD COLUMN ... NOT NULL DEFAULT` migrations whose default values exceed the column length before the change runs, catching the most common 22001 source during change review. See also [ERROR 22P02: Invalid Input Syntax](/reference/postgres/error/22p02-invalid-input-syntax-postgres) for the closely-related type-conversion failure.