Snowplow + Snowflake: The Stack Is Fine, The People Are The Problem

2026-05-24 · by Matthew van Bird

The setup

We spent a few weeks wiring up what is, on paper, one of the strongest behavioural data stacks you can put together. Snowplow for collection. Snowflake as the warehouse. dbt in the middle. Core system data (orders, accounts, subscriptions) landing alongside every click, scroll and route change a user has ever produced.

Architecture: web, mobile and backend send events through Snowplow trackers into Snowflake (raw, staged, marts via dbt), joined with CDC of core system data; bad events go to a dead-letter stream; the warehouse feeds BI and notebooks.

On a good day this stack answers questions the business has been guessing at for years. On a normal day it sits there, mostly unused, while someone in a meeting asks for "a single signup number".

That gap is the article.

What Snowplow gives you

A contract. Every event is a self-describing JSON document with a versioned schema and contexts (user, session, device) attached.

{
  "schema": "iglu:com.acme/order_completed/jsonschema/1-2-0",
  "data": { "order_id": "ORD-918273", "currency": "GBP", "subtotal": 4250 }
}

Two things follow from that. Events that fail validation against their schema don't pollute the good ones; they go to a dead-letter stream. And you don't need to pre-decide what's interesting; you instrument behaviour and decide what's interesting in the warehouse, after the fact. The events are facts. The interpretations are SQL.
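A minimal sketch of that routing, to make the contract concrete. This is not how Snowplow implements validation; the in-memory SCHEMAS registry, the validate and route helpers and the field types are all assumptions made up for illustration, and only field presence and basic types are checked rather than a full JSON Schema.

```python
import json

# Hypothetical in-memory schema registry keyed by Iglu-style URI.
# ASSUMPTION: only required field names and Python types are checked;
# a real pipeline would validate against the full JSON Schema.
SCHEMAS = {
    "iglu:com.acme/order_completed/jsonschema/1-2-0": {
        "order_id": str,
        "currency": str,
        "subtotal": int,
    },
}


def validate(event):
    """Return a list of problems; an empty list means the event is valid."""
    if not isinstance(event, dict):
        return ["event is not a JSON object"]
    schema = event.get("schema")
    spec = SCHEMAS.get(schema) if isinstance(schema, str) else None
    if spec is None:
        return ["unknown schema"]
    data = event.get("data")
    if not isinstance(data, dict):
        return ["data is not a JSON object"]
    problems = []
    for field, expected_type in spec.items():
        if field not in data:
            problems.append("missing field: " + field)
        elif not isinstance(data[field], expected_type):
            problems.append("wrong type for field: " + field)
    return problems


def route(raw_lines):
    """Split raw JSON lines into (good events, dead-letter records)."""
    good, dead_letter = [], []
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            dead_letter.append({"raw": line, "errors": ["not valid JSON"]})
            continue
        errors = validate(event)
        if errors:
            # Keep the raw payload so the event can be inspected or replayed.
            dead_letter.append({"raw": line, "errors": errors})
        else:
            good.append(event)
    return good, dead_letter
```

The point of the design is that the dead-letter record keeps the raw payload and the reason it failed, so a rejected event is still evidence rather than a silent gap.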

If you've grown up on Google Analytics or anything that ships pre-aggregated reports, that inversion is disorienting. It is also where the people problem starts.

What Snowflake adds

Joining behavioural data to core system data becomes one query instead of four pipelines:

-- Every event a user produced in the 30 days before they downgraded.
select s.user_id, s.event_name, s.event_timestamp
from analytics.fct_downgrades d
join analytics.events s
  on s.user_id = d.user_id
 and s.event_timestamp between d.downgraded_at - interval '30 days' and d.downgraded_at;

Ninety seconds to write. It would have taken four meetings six weeks ago.

The second thing Snowflake adds is subtler: the political cost of asking a question drops to roughly zero. That sounds like an unalloyed good, until you realise the political cost of asking a question was the only thing keeping the bad questions out.

Anecdote: "Can you just give me the signup event?"

A PM on our team, sharp and well-intentioned, asked us in earnest, "can you just give me the signup event?"

We had spent the previous two weeks instrumenting signup. Page views, form focuses, an account_created event when the API returned 201, an email_verified event when the link was clicked, a first_session_started event when they came back. Maybe forty events across the funnel, each with a schema.

She wanted a Boolean. Did this user sign up?

We didn't have a signups table. We had every attempt, success, abandon and resume. Signup wasn't a row; it was a definition. Did account_created count if they never verified? Did social signups count?

The right answer is that signup is a SQL definition the team writes down once, in dbt, and everyone uses that one. The wrong answer, the one we ended up with, is that two weeks later there was a hand-rolled signed_up backend event firing on a hand-crafted condition, and it was the only one anyone referenced in the all-hands.

Two truths now. The granular events, correct but requiring SQL. The milestone event, approximate but pre-aggregated. Guess which one the slide deck used.
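For contrast, here is roughly what "signup is a definition" looks like when it is written down. This is a sketch, not our production model: it runs against an in-memory SQLite database rather than Snowflake, the events table layout and the signups view are hypothetical, and the rule itself (an account_created event plus an email_verified event) is an assumption picked for illustration. The whole argument is that the team has to choose that rule on purpose.

```python
import sqlite3

# ASSUMPTION: one possible definition, chosen only for illustration.
# A user counts as signed up once they have both an account_created
# and an email_verified event. An agreed team definition may differ.
SIGNUP_DEFINITION = """
create view signups as
select
    user_id,
    min(case when event_name = 'account_created' then event_timestamp end)
        as account_created_at,
    min(case when event_name = 'email_verified' then event_timestamp end)
        as email_verified_at
from events
group by user_id
having count(case when event_name = 'account_created' then 1 end) > 0
   and count(case when event_name = 'email_verified' then 1 end) > 0
"""

# Made-up demo events: u1 finishes the funnel, u2 never verifies,
# u3 only browses.
DEMO_EVENTS = [
    ("u1", "account_created", "2026-05-01T10:00:00"),
    ("u1", "email_verified", "2026-05-01T10:05:00"),
    ("u1", "first_session_started", "2026-05-02T09:00:00"),
    ("u2", "account_created", "2026-05-03T12:00:00"),
    ("u3", "page_view", "2026-05-04T08:00:00"),
]


def build_demo_db():
    """Create an in-memory events table and the signups view on top of it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table events (user_id text, event_name text, event_timestamp text)"
    )
    conn.executemany("insert into events values (?, ?, ?)", DEMO_EVENTS)
    conn.execute(SIGNUP_DEFINITION)
    return conn


def signed_up_users(conn):
    """The Boolean the PM asked for, derived from the granular events."""
    rows = conn.execute("select user_id from signups order by user_id").fetchall()
    return [row[0] for row in rows]
```

Changing the definition means changing one view in one place, and the granular events underneath stay untouched. That is the property the hand-rolled signed_up event throws away.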

Anecdote: three definitions of activation

Same shape, different week. Leadership wanted to know "how many users are activated this month". Nobody had defined activation, so an analyst wrote one in Looker and shipped a dashboard.

Three weeks later there were three forks:

Marketing: viewed three pages and stayed >90 seconds.
Product: completed onboarding step 4 and returned within 7 days.
Finance: made a purchase or hit the paywall twice.

All defensible. None agreed. All three referenced in slide decks as if they were the same metric.

The behavioural data was perfect. It had been perfect the whole time. It was the only thing in the entire architecture that could tell the truth, because it was the substrate from which any honest definition would be derived.

Nobody wanted to spend an afternoon agreeing the definition. So we shipped three.
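To see how quickly the three forks diverge, here is a small sketch that applies all three rules to the same users. Everything in it is illustrative: the UserSummary fields and the demo users are made up, and I am assuming "three pages" and "twice" both mean "at least".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserSummary:
    """Hypothetical per-user rollup derived from the granular events."""
    user_id: str
    pages_viewed: int
    seconds_on_site: int
    completed_step_4: bool
    days_to_return: Optional[int]  # None means the user never came back
    purchases: int
    paywall_hits: int


def marketing(u):
    # Viewed three pages and stayed more than 90 seconds.
    return u.pages_viewed >= 3 and u.seconds_on_site > 90


def product(u):
    # Completed onboarding step 4 and returned within 7 days.
    return (
        u.completed_step_4
        and u.days_to_return is not None
        and u.days_to_return <= 7
    )


def finance(u):
    # Made a purchase or hit the paywall twice.
    return u.purchases >= 1 or u.paywall_hits >= 2


DEFINITIONS = {"marketing": marketing, "product": product, "finance": finance}

DEMO_USERS = [
    UserSummary("u1", 5, 120, False, None, 0, 0),
    UserSummary("u2", 3, 95, False, None, 0, 1),
    UserSummary("u3", 2, 60, True, 3, 0, 0),
    UserSummary("u4", 4, 200, True, 2, 1, 0),
]


def activated_counts(users):
    """Count activated users under each team's definition."""
    return {
        name: sum(1 for u in users if rule(u))
        for name, rule in DEFINITIONS.items()
    }
```

On these four made-up users the three rules report three, two and one activated users respectively. Same events, three numbers, and each one is a faithful answer to a different question.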

Why this is a people problem

The architecture rewards organisations that can hold a definition steady and punishes ones that can't. That is the game.

Most non-technical stakeholders I've worked with have been trained by twenty years of pre-aggregated dashboards to think of data as a thing you look up. What was revenue last month? There's a number. How many signups? There's a number. The model is data-is-the-answer, not data-is-the-substrate-from-which-an-answer-is-derived.

Snowplow + Snowflake inverts that. Events are evidence. The answer requires a definition, and the definition is the politically charged bit. Once you've agreed what a "signed-up user" is, the SQL is twenty minutes. Until you've agreed, the SQL is impossible. And agreeing is hard, because somebody's preferred number gets smaller.

I've watched capable, well-paid people look at a query result that disagreed with a number on a slide and conclude the query was wrong, because the slide had been signed off by a senior person. The events were correct. The slide was a definition nobody had written down. We built a new milestone event so the slide could be reproduced.

The stack got more complex. The truth got further away.

The failure mode: the canonical dbt model is skipped because it is "too much work", and three hand-rolled milestone events are written instead, each one feeding a separate dashboard for marketing, product and finance.

Three sources of truth, none of them the events that were already correct.

What works

A few things, none technical:

Canonical definitions live in dbt, with an owner and tests. If activation isn't a model in dbt, activation doesn't exist. People can fork it for ad-hoc work; they can't fork the canonical one without a PR.
Teach the vocabulary, not the SQL. The win condition isn't that everyone writes SQL. It's that everyone in a meeting can use the words event, definition and metric correctly. A surprising amount of dysfunction is people saying "the signup data is wrong" when they mean "we haven't agreed what a signup is".
Refuse to build milestone events as a first response. When someone asks for account_completed_onboarding, the first answer is "let's write the SQL definition first". If it gets used three times, then materialise it as a dbt model. Never as a hand-rolled backend event.
Make the canonical query embarrassingly easy to run. Pre-bake the joins as dbt models (fct_user_session, fct_purchase_funnel) and expose them through the BI tool. That removes the technical excuse for not engaging with the definitions.
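On "with an owner and tests": dbt ships generic tests such as unique and not_null, and a test passes when its query returns zero failing rows. The sketch below imitates that convention against an in-memory SQLite table so it can run anywhere. It is not dbt itself, and the check_primary_key helper and the session_id and user_id columns on fct_user_session are assumptions for illustration.

```python
import sqlite3


def failing_rows(conn, sql):
    """Run a test query; any returned row counts as a failure."""
    return conn.execute(sql).fetchall()


def check_primary_key(conn, table, key):
    """Imitate not_null and unique checks on one key column.

    Identifiers are interpolated directly into the SQL, so only pass
    trusted table and column names.
    """
    null_keys = failing_rows(conn, f"select * from {table} where {key} is null")
    duplicate_keys = failing_rows(
        conn,
        f"select {key} from {table} where {key} is not null "
        f"group by {key} having count(*) > 1",
    )
    return {"not_null": not null_keys, "unique": not duplicate_keys}


def demo_model(rows):
    """Build a hypothetical fct_user_session table from (session_id, user_id) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table fct_user_session (session_id text, user_id text)")
    conn.executemany("insert into fct_user_session values (?, ?)", rows)
    return conn
```

A canonical model that fails either check should block the PR that changed it. That is the mechanical half of "you can't fork the canonical one without a PR"; the other half is still a conversation.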

Closing

Snowplow + Snowflake is one of the most capable data stacks you can ship without losing your mind. It does not abstract away the hard part. It makes it visible. The hard part is getting an organisation to agree on what a user is, what a conversion is, what an activated account is, and to write those agreements down somewhere that can be audited.

Events are a substrate. Metrics are a contract. The contract is a political artefact.

Every team I've watched succeed with this stack treats the contract as the deliverable and the SQL as plumbing. Every team I've watched struggle falls back to the same defensive pattern: build a milestone, plot it, refuse to look at the granular data, hope nobody asks the awkward question.

The stack is fine. The stack has been fine. The people are the problem, and the people have always been the problem, and the people will continue to be the problem in any architecture you replace this one with.

Build the stack anyway. Go in clear-eyed about which battle is the real one.