Skip to content

DynamoDB Single-Table Pattern: Analytics Events

A naive analytics schema with PK = SOURCE#my-site funnels every event through one partition. That partition maxes out at ~1,000 writes/second. A moderately trafficked website generates that in minutes.

The solution is write-sharding - distributing events across multiple partition keys and merging them at read time. This pattern handles event ingestion, session tracking, and pre-computed aggregations for dashboards.

Access patterns

#Access PatternOperationNotes
AP1Ingest an eventPutItemHigh throughput, write-sharded
AP2Get events for a sessionQuery across shardsDebug / session replay
AP3Get session detailsGetItemSession overview
AP4List recent sessions for a sourceQueryLive dashboard
AP5List sessions for a userQuery (GSI1)User journey
AP6Get aggregations for a sourceQueryDashboard charts
AP7Get aggregations for a date rangeQueryReporting

Seven access patterns. Three entity types. One table, one GSI.

Entities

Three entity types share one table:

  • Event: a single analytics event, write-sharded across 100 partition keys to absorb high ingest throughput
  • Session: a user visit, updated atomically as events arrive
  • Aggregation: pre-computed hourly and daily summaries, the source of truth for dashboards
Analytics Events entity diagram showing Event, Session, Aggregation relationships
3 entities · 1 GSI · raw events sharded for ingest, aggregations for reads

Table design

Write-sharded events

The Event entity uses a sharded partition key:

PK = SOURCE#my-site#SHARD#42
SK = EVENT#01HW9A2B3C...  (ULID)

When writing an event, assign a random shard number (0–99):

const SHARD_COUNT = 100;

async function ingestEvent(sourceId: string, event: EventInput) {
  const shard = Math.floor(Math.random() * SHARD_COUNT);
  const eventId = ulid();
  
  await EventEntity.put({
    sourceId,
    shard,
    eventId,
    ...event,
    ttl: Math.floor(Date.now() / 1000) + (7 * 24 * 60 * 60),  // 7-day TTL
  }).go();
}

100 shards × 1,000 writes/second per partition = 100,000 writes/second theoretical throughput. More than enough for most analytics workloads.

The tradeoff: reading events for a session requires fan-out across all shards. This is acceptable because raw event reads are infrequent (debug/replay only) while writes are constant.

Sessions and aggregations are not sharded

Sessions and aggregations have much lower write throughput - one session update per user action, one aggregation update per time bucket. They use the unsharded SOURCE#<sourceId> partition key.

EntityPKSKTTLPurpose
EventSOURCE#<id>#SHARD#<n>EVENT#<ulid>7 daysRaw events
SessionSOURCE#<id>SESSION#<sessionId>90 daysSession summaries
AggregationSOURCE#<id>AGG#<period>#<timestamp>-Dashboard data
Analytics Events DynamoDB schema: Entity, PK, SK, TTL columns for all three entity types
Sharded events, unsharded sessions and aggregations — one table, bounded storage

Reading events across shards

For session replay or debugging, fan out across all shards:

async function getEventsForSession(sourceId: string, sessionId: string) {
  const results = await Promise.all(
    Array.from({ length: SHARD_COUNT }, (_, shard) =>
      EventEntity.query.primary({ sourceId, shard })
        .where(({ sessionId: sid }, op) => op.eq(sid, sessionId))
        .go()
    )
  );
  
  return results
    .flatMap(r => r.data)
    .sort((a, b) => a.eventId.localeCompare(b.eventId));  // ULID sort = chronological
}

100 parallel queries, merged client-side. This is expensive but infrequent. Dashboards use aggregations instead.

Computing aggregations

A DynamoDB Streams-triggered Lambda processes events and updates aggregations:

// Stream handler: for each new event
async function handleEventStream(event: DDBStreamEvent) {
  for (const record of event.Records) {
    if (record.eventName !== "INSERT") continue;
    const item = unmarshall(record.dynamodb.NewImage);
    
    const hourStart = item.createdAt.substring(0, 13) + ":00:00Z";
    const dayStart = item.createdAt.substring(0, 10);
    
    // Update hourly aggregation
    await AggregationEntity.update({ sourceId: item.sourceId, period: "hourly", timestamp: hourStart })
      .add({ events: 1, pageViews: item.type === "page_view" ? 1 : 0 })
      .go();
    
    // Update daily aggregation
    await AggregationEntity.update({ sourceId: item.sourceId, period: "daily", timestamp: dayStart })
      .add({ events: 1, pageViews: item.type === "page_view" ? 1 : 0 })
      .go();
    
    // Update session
    await SessionEntity.update({ sourceId: item.sourceId, sessionId: item.sessionId })
      .add({ eventCount: 1 })
      .set({ lastSeenAt: item.createdAt, exitUrl: item.url })
      .go();
  }
}

Dashboards query aggregations - never raw events. “Show me page views for the last 7 days” is a single Query on AGG#daily# with a date range.

Sample data

pkskgsi1pkgsi1skttlEntity Data
SOURCE#my-site#SHARD#42EVENT#01HW...--1747…{ type: "page_view", sessionId: "01HW...", url: "/blog" }
SOURCE#my-site#SHARD#07EVENT#01HW...--1747…{ type: "click", sessionId: "01HW...", properties: { buttonId: "cta" } }
SOURCE#my-siteSESSION#01HW...USER#u_01SESSION#01HW...1755…{ eventCount: 12, entryUrl: "/", exitUrl: "/pricing", duration: 340 }
SOURCE#my-siteAGG#hourly#2026-05-12T14:00:00Z---{ pageViews: 1240, sessions: 380, events: 3100 }
SOURCE#my-siteAGG#daily#2026-05-12---{ pageViews: 15600, sessions: 4200, events: 41000 }

ElectroDB entity definitions

export const EventEntity = new Entity({
  model: { entity: "event", version: "1", service: "analytics" },
  attributes: {
    sourceId:   { type: "string", required: true },
    shard:      { type: "number", required: true },  // 0-99
    eventId:    { type: "string", required: true },   // ULID
    type:       { type: "string", required: true },   // page_view, click, signup, etc.
    sessionId:  { type: "string", required: true },
    userId:     { type: "string" },                   // null for anonymous
    url:        { type: "string" },
    referrer:   { type: "string" },
    properties: { type: "any" },                      // custom event data
    userAgent:  { type: "string" },
    ttl:        { type: "number" },                   // auto-expire raw events
    createdAt: {
      type: "string", required: true,
      default: () => new Date().toISOString(), readOnly: true,
    },
  },
  indexes: {
    primary: {
      pk: { field: "pk", composite: ["sourceId", "shard"], template: "SOURCE#${sourceId}#SHARD#${shard}" },
      sk: { field: "sk", composite: ["eventId"],           template: "EVENT#${eventId}" },
    },
  },
}, { client, table });

Why this design

Write-sharding is the core technique here. Without it, a single source’s events all hit one partition, a hard ceiling at ~1,000 writes/second. With 100 shards, that ceiling is 100,000 writes/second. Adjust SHARD_COUNT based on your peak traffic.

Raw events expire via TTL after 7 days, which is enough for debugging and session replay. Aggregations persist indefinitely and are what dashboards actually query. This keeps storage costs bounded regardless of traffic volume. TTL plus tight projection is one of the biggest cost optimization levers in DynamoDB.

Sessions bridge raw events and aggregations. A session groups a user’s events into a single visit. The session record is updated atomically on each event (via Streams), giving you entry URL, exit URL, event count, and duration without scanning raw events.

This is a purpose-built analytics store, not a replacement for a data warehouse. For complex analytical queries (funnel analysis, cohort retention, custom segmentation), export events to S3 via DynamoDB Streams and query with Athena. The DynamoDB schema handles real-time ingestion and dashboard queries. The data warehouse handles ad-hoc analysis.

Unique user counting is approximated at the aggregation level. Exact unique counts across time periods require HyperLogLog or similar probabilistic data structures, which you’d compute in the aggregation Lambda and store as a binary attribute.


Pattern #10 of 10 in the SingleTable pattern library. It brings together techniques from across the library: write-sharding from the E-Commerce status GSI, TTL from IoT Time-Series, and Streams-based aggregation from the same. I’m building singletable.dev to make these patterns visual.

Get the patterns →

Tejovanth N

These patterns come from real apps - rasika.life, rekha.app, rrmstays - all running single-table DynamoDB with ElectroDB.

LinkedIn codeculturecob.com

Related

Schema review

Want a second pair of eyes before you ship?

Async DynamoDB schema review. PK/SK design, GSI strategy, ElectroDB entity code. Fixed price, 5 business days.