Multi-Tenant Data Isolation

Spotlight is a multi-tenant platform where each tenant's data must be completely isolated from all other tenants. This is achieved through DynamoDB partition key isolation, tenant-scoped queries at the application layer, and infrastructure-level controls.

Isolation Model

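Every tenant-owned DynamoDB table uses tenant_id as all or part of its partition key (the two deliberate exceptions, Platform.Memberships and Events.Outbox, are described below). Each tenant's items therefore sit under partition key values that no other tenant shares, so a key-based read or write addresses exactly one tenant's data. This is logical separation by key; DynamoDB may still place several key values on the same physical storage partition: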

+-----------------------------------+
|  Spotlight.Tours.Definitions      |
|                                   |
|  Partition: tenant_abc            |
|    tour_001, tour_002, tour_003   |
|                                   |
|  Partition: tenant_xyz            |
|    tour_100, tour_101             |
|                                   |
|  (separate partitions, no overlap)|
+-----------------------------------+

Partition Key Design

Direct Tenant Partitioning

Tables where tenant_id is the partition key:

Table	PK	SK
`Tenants.Config`	`tenant_id`	--
`Tours.Definitions`	`tenant_id`	`tour_id`
`Content.Definitions`	`tenant_id`	`content_id`
`Content.Checklists`	`tenant_id`	`checklist_id`
`Audiences.Rules`	`tenant_id`	`audience_id`
`Themes.Definitions`	`tenant_id`	`theme_id`
`Surveys.Definitions`	`tenant_id`	`survey_id`
`Admin.Users`	`tenant_id`	`user_id`
`Audit.AdminActions`	`tenant_id`	`timestamp_event_id`
`Tenants.ApiKeys`	`tenant_id`	`api_key_prefix`

With tenant_id as the partition key, a DynamoDB Query operation on one tenant cannot return results from another tenant. Query requires an equality condition on the partition key, so DynamoDB itself returns only items stored under the requested tenant_id.

Composite Key Partitioning

Tables that embed tenant_id in a composite partition key:

Table	PK Format	SK
`Tours.Versions`	`tenant_abc#tour_456`	`version` (N)
`Progress.UserState`	`tenant_abc#user_123`	`content_id`
`Events.Interactions`	`tenant_abc#tour_456`	`timestamp#event_id`
`Events.Aggregates`	`tenant_abc#tour_456`	`date#metric`
`Surveys.Responses`	`tenant_abc#survey_789`	`user_timestamp`
`Activity.UserEvents`	`tenant_abc#user_123`	`timestamp_event_id`

The composite key format {tenant_id}#{entity_id} ensures that queries are always scoped to a single tenant. Even if an attacker discovers another tenant's entity_id, they cannot access it because the partition key would resolve to {their_tenant_id}#{entity_id} -- a different partition entirely.
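The key construction described above can be sketched as a small helper. This is a minimal illustration, not the project's actual code: the name composite_pk is hypothetical, and the delimiter check assumes that neither tenant IDs nor entity IDs may contain "#".

```python
def composite_pk(tenant_id: str, entity_id: str) -> str:
    """Build a '{tenant_id}#{entity_id}' partition key value."""
    for part in (tenant_id, entity_id):
        # Reject empty parts and the delimiter itself, so a crafted ID
        # cannot shift the boundary between tenant and entity.
        if not part or "#" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return f"{tenant_id}#{entity_id}"


# The tenant part always comes from the authenticated context, so a caller
# from tenant_xyz who learns "tour_456" still builds a different key.
owner_key = composite_pk("tenant_abc", "tour_456")
attacker_key = composite_pk("tenant_xyz", "tour_456")
```

Rejecting the delimiter matters because "a#b" + "c" and "a" + "b#c" would otherwise produce the same partition key string.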

API Keys

The Tenants.ApiKeys table is partitioned on tenant_id with api_key_prefix as the sort key. The full plaintext key format is sk_<env>_<32-hex-tenant-uuid>_<random>, so the validation path extracts the tenant UUID from the key itself and issues a direct GetItem against (tenant_id, api_key_prefix). No cross-tenant scan is ever required, and a bearer presenting a key can only ever touch their own partition.
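The parsing step of that validation path might look like the sketch below. The function and type names are hypothetical, and the regular expression assumes a lowercase alphabetic environment label and lowercase hex digits. How the tenant UUID maps to the stored tenant_id, and how api_key_prefix is derived from the key, are not specified here, so the sketch stops at parsing.

```python
import re
from typing import NamedTuple

# Assumed shape: sk_<env>_<32 lowercase hex chars>_<random>
_KEY_RE = re.compile(r"sk_([a-z]+)_([0-9a-f]{32})_(.+)")


class ParsedKey(NamedTuple):
    env: str
    tenant_uuid_hex: str
    secret: str


def parse_api_key(plaintext: str) -> ParsedKey:
    """Split a plaintext key into its parts, or raise ValueError."""
    match = _KEY_RE.fullmatch(plaintext)
    if match is None:
        # Malformed keys are rejected before any DynamoDB call is made.
        raise ValueError("malformed API key")
    return ParsedKey(*match.groups())
```

Because the tenant identifier is read from the key itself, the subsequent lookup can be a single GetItem rather than a search across tenants.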

Cross-tenant exception: Platform.Memberships

Spotlight.Platform.Memberships is one of two tables whose partition key does not contain tenant_id (the other is Events.Outbox, covered under Event Isolation). It maps a Clerk platform user (platform_user_id) to the tenants they can switch into, and is read before a tenant is chosen, so partitioning by tenant_id would make the "which tenants am I on?" lookup impossible.

PK: platform_user_id
SK: tenant_id
Attributes: role, created_at

The trust model is:

Only /v1/platform/tenants reads this table, and only with a validated platform JWT. The caller's sub claim is pinned as the partition key — a user can only ever list their own memberships.
Once a tenant is picked, the regular AuthContext.tenant_id takes over and every downstream repo call is tenant-scoped as usual.
Row creation happens only through super-admin tenant-create, invite-accept, or the seed script. There is no user-facing endpoint that can mint a membership outside those flows.

This table intentionally sits outside the tenant-isolation invariant because it's the pivot that makes multi-tenant admin possible in the first place. Treat any new code that reads it with the same care as authentication itself.
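As a rough illustration of pinning the sub claim, the query arguments can be derived from the validated JWT claims and nothing else. The helper name is hypothetical, the claims are assumed to have been verified already, and the table name shown uses the local prefix.

```python
from typing import Any, Mapping

MEMBERSHIPS_TABLE = "Spotlight.Platform.Memberships"  # prefix varies by environment


def membership_query(claims: Mapping[str, Any]) -> dict[str, Any]:
    """Build Query arguments that list only the caller's own memberships."""
    sub = claims.get("sub")
    if not isinstance(sub, str) or not sub:
        raise PermissionError("platform JWT has no usable sub claim")
    return {
        "TableName": MEMBERSHIPS_TABLE,
        # The partition key is taken from the token, never from the request.
        "KeyConditionExpression": "platform_user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": sub}},
    }
```

The function takes no user-supplied identifier at all, which keeps the "own memberships only" rule visible in the signature.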

Application-Layer Enforcement

Tenant Context Injection

After authentication, the tenant ID is injected into every route handler via the AuthContext dependency:

python

@router.get("/{tour_id}")
async def get_tour(
    tour_id: str,
    auth: AuthContext = Depends(require_admin),
    tour_repo: TourRepository = Depends(get_tour_repo),
):
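    # auth.tenant_id is set by the authentication middleware, not by the caller.
    # The repository uses it to scope the query to the caller's tenant.
    tour = await tour_repo.get(auth.tenant_id, tour_id)
    if tour is None:
        # Assumes TourRepository.get returns None when nothing matches, as the
        # content repository does. A tour owned by another tenant is then
        # indistinguishable from a missing one. HTTPException is fastapi.HTTPException.
        raise HTTPException(status_code=404, detail="Tour not found")
    return tour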

The tenant_id comes from the authenticated API key -- never from user input. There is no X-Tenant-Id header or query parameter that allows callers to specify a different tenant.
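One way to picture this rule is a context object that is built only from the validated key record. The sketch below uses a hypothetical stand-in class, not the real AuthContext, and assumes the key record exposes tenant_id and api_key_prefix as plain strings.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SketchAuthContext:
    """Hypothetical stand-in for AuthContext; frozen so handlers cannot edit it."""
    tenant_id: str
    api_key_prefix: str


def build_auth_context(
    key_record: Mapping[str, str],
    request_headers: Mapping[str, str],
) -> SketchAuthContext:
    # request_headers is accepted only to show that it is never consulted:
    # the tenant comes from the stored key record, whatever the caller sends.
    return SketchAuthContext(
        tenant_id=key_record["tenant_id"],
        api_key_prefix=key_record["api_key_prefix"],
    )
```

Making the context immutable means a handler cannot reassign the tenant after authentication, even by mistake.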

Repository Pattern

All repository methods require tenant_id as the first parameter and construct DynamoDB keys using it:

python

import asyncio


class DynamoDBContentRepository:
    async def get(self, tenant_id: str, content_id: str) -> Content | None:
        resp = await asyncio.to_thread(
            self._client.get_item,
            TableName=self._table,
            Key={
                "tenant_id": {"S": tenant_id},
                "content_id": {"S": content_id},
            },
        )
        item = resp.get("Item")
        # ...

    async def list_by_tenant(self, tenant_id: str, limit: int = 20) -> list[Content]:
        resp = await asyncio.to_thread(
            self._client.query,
            TableName=self._table,
            KeyConditionExpression="tenant_id = :tid",
            ExpressionAttributeValues={
                ":tid": {"S": tenant_id},
            },
            Limit=limit,
        )
        # ...

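For tenant-owned tables, this design rules out cross-tenant access by construction: no repository method can build a DynamoDB key without a tenant_id. The two documented exceptions, Platform.Memberships and Events.Outbox, are handled separately.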

No Cross-Tenant Queries

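Outside the Events.Outbox delivery worker, which drains pending events for all tenants through gsi-status, the system never performs table scans or queries that span multiple tenants. Every DynamoDB operation on a tenant-owned table is one of: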

GetItem with a key containing tenant_id.
Query with tenant_id as the partition key expression.
PutItem / UpdateItem / DeleteItem with a key containing tenant_id.
TransactWriteItems where all items include tenant_id.

No table scans

DynamoDB Scan operations are prohibited in production code. A scan would read all items across all tenants, violating data isolation. The only exception is the local development seed script.
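The prohibition can also be backed by a thin guard around the client, so that an accidental scan fails loudly instead of reading every tenant's items. This is only a sketch: NoScanClient and ScanForbiddenError are hypothetical names, and nothing here implies the codebase wraps its client this way.

```python
class ScanForbiddenError(RuntimeError):
    """Raised when production code tries to call scan()."""


class NoScanClient:
    """Forwards every attribute to the wrapped client except scan."""

    def __init__(self, client):
        self._client = client

    def __getattr__(self, name):
        # __getattr__ runs only for attributes not found on the wrapper,
        # which includes every DynamoDB operation name.
        if name == "scan":
            raise ScanForbiddenError("Scan is prohibited in production code")
        return getattr(self._client, name)
```

A repository constructed with NoScanClient(boto_client) keeps working for get_item and query, while any call to scan raises before a request is sent.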

Global Secondary Index Isolation

GSIs follow the same tenant-scoped pattern:

hcl

# Tours.Definitions: query by status within a tenant
global_secondary_index {
  name     = "gsi-tenant-status"
  hash_key = "tenant_status"     # "tenant_abc#published"
  range_key = "updated_at"
}

The GSI partition key is a composite of tenant_id and status, so querying tenant_abc#published cannot return results from tenant_xyz.

python

# Safe: tenant-scoped GSI query
resp = client.query(
    TableName="Spotlight.Tours.Definitions",
    IndexName="gsi-tenant-status",
    KeyConditionExpression="tenant_status = :ts",
    ExpressionAttributeValues={
        ":ts": {"S": f"{tenant_id}#published"},
    },
)

The same pattern applies to every GSI on tenant-owned tables (the gsi-status index on Events.Outbox is the deliberate exception):

GSI	PK Format	Tenant Scope
`gsi-tenant-status`	`{tenant_id}#{status}`	Isolated
`gsi-content-users`	`{tenant_id}#{content_id}`	Isolated
`gsi-content`	`{tenant_id}#{content_id}`	Isolated
`gsi-session`	`{tenant_id}#{session_id}`	Isolated
`gsi-actor`	`{tenant_id}#{actor_id}`	Isolated
`gsi-entity`	`{tenant_id}#{entity_type}#{entity_id}`	Isolated
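A GSI key such as tenant_status is an ordinary item attribute, so it has to be written together with the item. The sketch below shows one way to derive it on the write path; the helper name is hypothetical, and it assumes the base item stores its status in a string attribute named status.

```python
def with_tenant_status(item: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return a copy of a DynamoDB-typed item with the GSI key attribute added."""
    tenant_id = item["tenant_id"]["S"]
    status = item["status"]["S"]  # assumed attribute name
    # Recomputing the composite on every write keeps the index key in step
    # with the item's real tenant and status.
    return {**item, "tenant_status": {"S": f"{tenant_id}#{status}"}}


tour_item = with_tenant_status({
    "tenant_id": {"S": "tenant_abc"},
    "tour_id": {"S": "tour_456"},
    "status": {"S": "published"},
})
```

Deriving the value from the item's own tenant_id, rather than accepting it as a parameter, avoids ever indexing an item under another tenant's prefix.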

Event Isolation

Domain events include tenant_id in the event payload. Event handlers use this to route data to the correct tenant partitions:

json

{
  "event_type": "TourCompleted",
  "tenant_id": "tenant_abc",
  "data": {
    "tour_id": "tour_456",
    "user_id": "user_123"
  }
}

When the analytics handler processes this event, it writes to Events.Aggregates with a partition key of tenant_abc#tour_456 -- scoped to the originating tenant.
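The routing step can be sketched as a pure function from event to key. The attribute names pk and sk, the function name, and the metric name are assumptions for illustration; only the {tenant_id}#{tour_id} and date#metric formats come from the tables above.

```python
from typing import Any


def aggregate_key(event: dict[str, Any], date: str, metric: str) -> dict[str, dict[str, str]]:
    """Derive the Events.Aggregates key for an event, scoped to its tenant."""
    tenant_id = event["tenant_id"]        # taken from the event payload
    tour_id = event["data"]["tour_id"]
    return {
        "pk": {"S": f"{tenant_id}#{tour_id}"},  # assumed attribute name
        "sk": {"S": f"{date}#{metric}"},        # assumed attribute name
    }


event = {
    "event_type": "TourCompleted",
    "tenant_id": "tenant_abc",
    "data": {"tour_id": "tour_456", "user_id": "user_123"},
}
key = aggregate_key(event, "2024-01-15", "completions")
```

Since the tenant prefix is read from the event itself, the handler needs no other source of tenant information to stay inside the originating tenant's partition.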

Outbox Table

Events.Outbox is the second deliberate exception to the tenant-partitioning invariant (the first being Platform.Memberships). Partitioning on tenant_id would force the delivery worker to know every active tenant's id ahead of time and round-robin queries; partitioning on event_id lets a single sparse gsi-status GSI (status HASH, timestamp RANGE) drain every tenant's pending events in chronological order.

PK: event_id
SK: timestamp
GSI gsi-status: sparse — only PENDING events are indexed, so the GSI stays small no matter how much history accrues.

Tenant isolation here is enforced at the application layer: every event payload carries its tenant_id, and downstream handlers route into tenant-scoped tables (which DO partition on tenant_id). The outbox itself sees the cross-tenant stream because it has to.
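The drain-then-route flow might be sketched as follows. Function names are hypothetical, the string value "PENDING" for the status attribute is an assumption, and events are shown as already-deserialized payloads. Both status and timestamp are DynamoDB reserved words, so the status attribute is referenced through an expression attribute name.

```python
from collections import defaultdict
from typing import Any


def pending_events_query(table_name: str, limit: int = 100) -> dict[str, Any]:
    """Build Query arguments for the sparse gsi-status index."""
    return {
        "TableName": table_name,
        "IndexName": "gsi-status",
        "KeyConditionExpression": "#s = :pending",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {":pending": {"S": "PENDING"}},
        "ScanIndexForward": True,  # ascending range key: oldest events first
        "Limit": limit,
    }


def group_by_tenant(events: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Split a cross-tenant batch so each handler call sees one tenant only."""
    grouped: dict[str, list[dict[str, Any]]] = defaultdict(list)
    for event in events:
        tenant_id = event.get("tenant_id")
        if not tenant_id:
            # An event without a tenant cannot be routed safely.
            raise ValueError("outbox event is missing tenant_id")
        grouped[tenant_id].append(event)
    return dict(grouped)
```

Failing on a missing tenant_id, rather than skipping or defaulting, keeps an unroutable event from being written into the wrong tenant's tables.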

Infrastructure-Level Controls

DynamoDB Encryption

All tables have server-side encryption enabled:

hcl

server_side_encryption {
  enabled = true  # AWS-managed KMS key
}

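This protects data at rest. Where customer-managed keys are required, a dedicated KMS key can be specified per table through the block's kms_key_arn argument. Because tables are shared across tenants, such a key covers every tenant in that table rather than a single tenant.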

Point-in-Time Recovery

PITR is enabled in production to protect against accidental data loss:

hcl

point_in_time_recovery {
  enabled = var.enable_pitr  # true in production
}

IAM Least Privilege

Lambda functions are granted access only to the specific DynamoDB tables they need:

hcl

Action = [
  "dynamodb:GetItem", "dynamodb:PutItem",
  "dynamodb:Query", "dynamodb:BatchGetItem",
  "dynamodb:BatchWriteItem", "dynamodb:TransactWriteItems",
  "dynamodb:DeleteItem"
]
Resource = concat(
  values(var.table_arns),
  [for arn in values(var.table_arns) : "${arn}/index/*"]
)

No Lambda function has dynamodb:* permissions. Table-level access is explicitly enumerated.
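A rule like this is easy to check mechanically, for example in CI against the rendered policy JSON. The helper below is a hypothetical sketch of such a check and operates on a single policy statement represented as a Python dict.

```python
def wildcard_actions(statement: dict) -> list[str]:
    """Return every action in an IAM statement that contains a wildcard."""
    actions = statement.get("Action", [])
    if isinstance(actions, str):  # IAM allows a single string or a list
        actions = [actions]
    return [action for action in actions if "*" in action]


statement = {
    "Effect": "Allow",
    "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:DeleteItem"],
}
offending = wildcard_actions(statement)
```

An empty result means the statement enumerates its actions explicitly; anything else lists the entries that would need to be narrowed.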

Isolation Verification

What prevents cross-tenant access?

Layer	Control	Mechanism
Authentication	API key resolves `tenant_id`	Key embeds the tenant UUID and is validated against that tenant's partition
Application	`AuthContext.tenant_id` injected from API key	Cannot be overridden by user input
Repository	All queries use `tenant_id` in key	DynamoDB enforces partition boundary
DynamoDB	Partition key isolation	Key-based reads return only items under the requested partition key
IAM	Least-privilege Lambda roles	No wildcard permissions

Threat scenarios

Threat	Mitigation
Attacker sends another tenant's API key	Only the key holder's tenant is accessible
Attacker modifies `tenant_id` in request	Not possible -- `tenant_id` comes from API key validation, not user input
Attacker guesses another tenant's `tour_id`	The query uses `{attacker_tenant_id}#{tour_id}`, which is a different partition
Attacker performs a DynamoDB scan	The application never performs scans in production
Insider accesses DynamoDB directly	Audit trail in `Audit.AdminActions`, IAM logging, DynamoDB encryption
API key compromise	Revoke the key, issue a new one. The compromised key only accesses one tenant's data

Table Naming Convention

All tables follow the naming pattern {prefix}.{feature}.{sub_table}:

python

def table_name(self, feature: str, sub_table: str) -> str:
    return f"{self.dynamodb_table_prefix}.{feature}.{sub_table}"

# Examples:
# "Spotlight.Tours.Definitions"
# "Spotlight.Events.Aggregates"
# "Spotlight.Admin.Users"

The prefix is configured via DYNAMODB_TABLE_PREFIX and varies by environment:

Local: Spotlight
Dev: Spotlight.Dev
Production: Spotlight.Prod

This prevents accidental cross-environment access while keeping the table structure consistent.
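To make the effect of the prefix concrete, the sketch below wraps the same table_name logic in a small settings object. The Settings class here is a hypothetical stand-in; only the method body and the prefix values come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings holder; the prefix would come from DYNAMODB_TABLE_PREFIX."""
    dynamodb_table_prefix: str

    def table_name(self, feature: str, sub_table: str) -> str:
        return f"{self.dynamodb_table_prefix}.{feature}.{sub_table}"


local = Settings("Spotlight")
prod = Settings("Spotlight.Prod")

local_table = local.table_name("Tours", "Definitions")
prod_table = prod.table_name("Tours", "Definitions")
```

The same feature and sub-table pair resolves to a different physical table in each environment, so code pointed at one prefix cannot address another environment's data by name.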
