Naming, types, money, identifiers, soft-delete, denormalisation — the schema-shaping decisions that get expensive to reverse. The defaults the AI should treat as constraints.

Data modelling

A sourdough starter for the data-shape decisions that bind everything downstream. The AI's default behaviour is to invent per-table patterns; this document is what makes them consistent.

Naming

Thing	Convention	Example
Tables	snake_case, plural	`users`, `tool_runs`, `audit_log_entries`
Columns	snake_case	`created_at`, `email_verified_at`
Primary keys	`id` (UUID by default)	`id uuid primary key default gen_random_uuid()`
Foreign keys	`<singular_referenced>_id`	`user_id`, `project_id`
Booleans	positive verb, no negation	`is_active` (not `is_disabled`), `is_admin` (not `not_admin`)
Timestamps	`<event>_at`, always with timezone	`created_at timestamptz`, `published_at timestamptz`
Enums	snake_case values	`status: 'pending'
Indexes	`<table>_<columns>_idx`	`tool_runs_user_id_created_at_idx`

Plural vs singular for table names is the only debate worth having (some teams prefer singular: user, tool_run). Pick one, lock it. Mixing is the bug.

Money

Rule: integers, base unit (pence / cents), never floats.

-- Right
price_pence integer not null default 0
-- Wrong (floating point arithmetic is lossy on money)
price_pounds numeric(10, 2)

Why: float arithmetic introduces rounding errors that compound. 0.1 + 0.2 ≠ 0.3 in IEEE 754 floats. Money in floats is a known bug class.

At the display boundary: convert pence → display. Never store the display value.

function formatPrice(pence: number): string {
  return `£${(pence / 100).toFixed(2)}`;
}

Currency: if you're multi-currency, store amount_minor (integer) + currency (ISO 4217 code), and never mix. Conversion happens at presentation time, with FX rates frozen at transaction time.

Identifiers

UUIDs by default (uuid type, gen_random_uuid() default). Don't expose sequential integers as primary keys to the public — they leak order-of-creation, count, and let attackers enumerate.
External IDs (e.g. Stripe customer ID, GitHub user ID) stored as columns alongside, never as the primary key. The external system can churn; your primary key shouldn't.
Slug columns for user-facing URL identifiers. Indexed unique; lowercase, kebab-case; immutable once published (or with explicit redirect handling on change).

create table tools (
  id uuid primary key default gen_random_uuid(),
  slug text not null unique,
  stripe_product_id text unique,
  created_at timestamptz not null default now()
);

Timestamps

Every table has created_at. Most tables have updated_at. Both timestamptz (not timestamp).

create table whatever (
  -- ...
  created_at timestamptz not null default now(),
  updated_at timestamptz not null default now()
);

-- Trigger to maintain updated_at:
create trigger whatever_updated_at
  before update on whatever
  for each row execute function moddatetime('updated_at');

Deletion: soft-delete with deleted_at timestamptz instead of hard DELETE for any table where the row matters historically (orders, audit logs, user-generated content). Hard-delete for ephemeral tables (sessions, idempotency keys, cache).

RLS posture (Supabase / Postgres)

This belongs in security-posture primarily, but it shapes every table you write.

RLS enabled by default on every table. Even tables not yet user-facing.
Scope by auth.uid() for user-owned rows. auth.role() = 'authenticated' is NOT scoping — it just checks the user is logged in.
WITH CHECK on INSERT + UPDATE. Without it, a row can be reassigned to another user (the row-reassignment bug).
Tables without an obvious owner (lookups, public catalogues) get using (true) for reads — explicit, not by omission.

rls_checker covers 8 misconfiguration patterns.

Denormalisation

Default: don't. Normal forms exist for a reason.

When to break the rule:

Computed fields used in 90%+ of reads (e.g. comment_count on a posts table). Maintain via triggers, document the staleness window.
Pre-joined data for performance (e.g. embedding the author's display name on a posts row to avoid the JOIN). Worth it when read latency is the bottleneck and the duplicated field is rarely changed.
Audit-trail tables that capture state-at-time (e.g. order_line_items.product_name_at_purchase — the product can change, the historical purchase shouldn't).

If you do denormalise, document why in a comment on the table (or in an ADR). The next session needs to know it's intentional.

JSON columns

jsonb (not json) when you use them. They're useful for:

Per-row variable shape (e.g. event payloads, tool outputs, config blobs).
Append-only logs where the shape evolves over time.

They're not useful for:

Anything you'll query/filter regularly. Use real columns; you get indexes and type safety.
Anything joined to other tables. Foreign keys can't reference JSON.

Rule: if you query a JSON field with a WHERE clause more than once, promote it to its own column.

Migrations

Numbered, sequential, never reused. 001_initial_schema.sql, 002_add_user_avatar.sql. Same convention as ADRs.
One concern per migration. Don't bundle "add column + rename column + drop table" into one file.
Reversible where possible. Down-migrations are good hygiene; some changes are one-way (dropping columns with data, splitting tables) and that's fine — just be explicit.
Idempotent where possible. create table if not exists, add column if not exists. Makes re-running a migration safe.
No data migrations mixed with schema migrations. Schema goes in supabase/migrations/. Data backfills go in scripts (scripts/backfill-X.ts) that you run once, version, and commit.

How to feed this starter

Add to it when:

You introduce a new pattern (first auditable table, first JSON-heavy table, first FX-converted money table). Document the choice.
You hit a data-modelling bug worth recording. ("We used text for currency codes, then someone stored 'GBP ' with trailing space — switched to a CHECK constraint.")

Remove from it when:

A convention turns out wrong. Better to delete than to have a dead rule.

Companion starters

security-posture — the RLS layer that sits on every table
adr — schema-shaping decisions are exactly what ADRs are for
project-starter — parent shape