Part One

Strategy & Foundations

Before any tool is chosen or any pipeline is built, a data programme needs a reason to exist, a few principles it will live by, a strategy that fits on one page, and an operating model that makes ownership visible. This part covers all four, in that order.

1.1 Why data matters: the business case

Most data programmes are sold on tools. A new warehouse, a streaming engine, a catalogue. A purchase order goes through, dashboards multiply, and a year later leadership asks the uncomfortable question: where is the value?

The honest answer is almost always that the buyer skipped what sits beneath the waterline. Tools are the visible tip of the iceberg. Strategy is the mass underneath: the problem worth solving, the value proposition, the people who will do the work, the systems that will support them, the cost it takes, and the outcomes that justify it.

Tools are the tip. Strategy is the mass beneath the waterline.

A useful test before any procurement decision: can you describe the elements below the waterline for the use case in front of you? If you cannot, the tool is not the right next step.

The Pitfall

The data team built a sophisticated platform nobody asked for. When the CFO asked "what did this get us?", the only answer was a list of pipelines and tools. The budget was cut the next quarter.

The Practice

Lead with the business problem the data solves. Tie every layer of investment to a visible outcome someone in the business actually cares about.

The Decision Test

Tools live above the waterline. Strategy lives beneath it. If you cannot name the problem, the people, and the outcome, the tool will not save you.

1.2 Principles: what "good" looks like

Principles are how a programme stays coherent when the people running it change. They are tool-agnostic. They give engineers, analysts, and leaders a shared answer to "why are we doing it this way?"

Print them. Put them on the wall of the room where data decisions get made.
Reference them by number in design reviews. "This proposal conflicts with #3 and #7" is a faster conversation than re-arguing first principles every time.
Revisit them annually. Two or three will need refining as the programme matures; the rest will hold.

The Pitfall

A company adopted ten data principles from a consulting deck. They were printed, framed, and ignored, because nobody could say what any of them changed about a Tuesday.

The Practice

Adopt only the principles you will actually enforce. A principle that does not change a decision is decoration. Cut it.

1.3 Your data strategy on one page

Long strategy documents tend to die in shared drives. A one-page version survives, and more importantly, it can be reviewed. A few notes on the sequence:

The first steps are diagnostic. Skipping them is the single most common reason data programmes lose momentum in year two.
The first real value lands on a use case where the decision is obvious and the data is roughly available, not the hardest problem in the company.
Governing data nobody uses yet is theatre. Govern after the first win, not before it.

The Decision Test

Goal first, tools later. Start tiny. Share results in plain language. Repeat what works; drop what does not.

1.4 The operating framework

Strategy says what to do. The operating framework says who decides, who does, and who is accountable when things go wrong. Reading the layers from the centre outward:

Data Foundation: the raw material. Source systems, master data, transactions, events. Most companies inherit this layer; few control it as well as they think.
Data Management: what turns raw data into something usable. Quality, metadata, lineage, cataloguing. This is where governance does its day-to-day work.
Decision Authority: who has the right to define a metric, change a definition, or override a control. The layer most often missing, and most often the cause of conflict.
Analytics & AI: the visible output. Dashboards, advanced analytics, AI models.

Apply Part One

Reading is half of it. Now test your own ground.

You have the principles. See where your programme actually sits, and put a real request to the test.


Part Two

Governance & Accountability

Governance is what turns a strategy into something that survives change of personnel, change of tools, and change of mood. Done well, it is invisible: decisions happen quickly because everyone knows who decides. Done badly, it becomes a folder of policies nobody reads and an inbox of approval requests nobody acts on. This part covers the four things to get right: the implementation sequence, the three roles that get confused, the people who actually do the work, and the multi-year roadmap.

2.1 An eight-step implementation playbook

Governance programmes fail in predictable ways. The most common failure is starting with policy instead of with the business case. The second most common is starting with the framework instead of with the sponsor. The eight steps below sequence the work so that each one earns the right to the next.

Steps 1 and 2 are sales: first to the wider business, then to a single named executive. If you cannot complete step 2, do not proceed to step 3. Without a sponsor, the programme will be killed by the first conflict.
Step 4, benchmarking maturity, is also a politically useful artefact. It gives leadership a tangible "where we are now" and "where we want to be in 18 months" without anyone losing face. We return to maturity in Part 8.
Step 7 is where most programmes go wrong. Guardrails should empower decisions, not block them. If your governance documents are a list of things you can't do, you'll be ignored within a quarter. Write what you can do, and what the path is to do something new.

The Pitfall

A team spent six months writing a beautiful governance policy. It lives on SharePoint. Nobody has opened it since the launch email, and the same quality issues keep recurring, because no policy ever changed what anyone actually does.

The Practice

Start with one painful, visible problem. Fix it with a lightweight rule that changes a real decision. Point to that win when you ask for the mandate to do more.

The Decision Test

Govern at the level of decisions, not at the level of documents. If a policy doesn't change a decision, it doesn't exist.

2.2 Three roles that keep getting confused

Most governance friction comes from a single confusion. The words governance, ownership, and stewardship are used as if they were synonyms, and the result is owners who don't have authority, stewards who are expected to decide, and governance that has no accountability. The fix is a shared definition of which is which.

The mental model that works in practice: a steward catches a quality issue this morning. They follow up with the affected consumers and document what happened. If the issue requires a policy change (a new threshold, a different SLA), they take it to the owner of that domain. If the change affects more than one domain, the owner takes it to the governance forum. Governance writes the new rule. The owner accepts it. The steward enforces it.
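The escalation path above can be sketched as a small routing function. This is a minimal illustration, not a prescribed implementation; the `Issue` fields and the `route` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """A quality issue as a steward might log it (fields are illustrative)."""
    description: str
    needs_policy_change: bool  # e.g. a new threshold or a different SLA
    domains_affected: int      # how many data domains the change touches


def route(issue: Issue) -> str:
    """Return who handles the issue next, following the escalation path."""
    if not issue.needs_policy_change:
        # Pure execution: the steward follows up and documents it.
        return "steward"
    if issue.domains_affected > 1:
        # Cross-domain rule changes go to the governance forum.
        return "governance forum"
    # A policy change inside one domain is the owner's call.
    return "owner"
```

The point of writing it down, even this crudely, is that the routing question has exactly one answer for any given issue.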

Typical symptoms when the three are blurred:
Stewards being held responsible for decisions they have no authority to make.
Owners signing off on decisions but not knowing who actually executes.
Governance meetings that last 90 minutes and produce no decision.
Three different definitions of "active customer" in the same business.

The Pitfall

A steward gets blamed in a steering meeting for a wrong number. But she was never given authority to change the definition; that sits with an owner who's been in back-to-back meetings for three weeks. Nothing gets fixed, and everyone quietly concludes the data team has failed again.

The Practice

Separate the three roles explicitly. The steward maintains and escalates, the owner decides, governance sets the rule. When a number is wrong, everyone already knows whose call it is to fix it.

The Decision Test

Governance = who decides. Ownership = who is accountable. Stewardship = who executes. Mix them up and you get endless meetings.

2.3 Owners, custodians, and stewards in practice

Once governance, ownership, and stewardship are separate concepts, the next step is naming the practical roles inside ownership. The three that matter most are the data owner, the data custodian, and the data steward. All three are required. They have different decision rights, different backgrounds, and different failure modes.

Three roles, three kinds of authority. Name a specific human to each.

When a quality issue lands, ask: is this a steward problem (execution), a custodian problem (technical reliability), or an owner problem (decision-making)? The answer routes the work.
When hiring or assigning, pay attention to the typical background row. Putting an IT background into a data owner role usually fails: the role needs business authority, not technical depth.
The failure-impact row is also useful for risk reporting. It maps directly to the kind of business consequence each role's failure produces, which makes risk conversations more concrete than the usual "data quality is bad."
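Naming a specific human to each role is easy to check mechanically. The sketch below assumes a simple registry keyed by dataset; the `missing_roles` function and the registry shape are hypothetical, introduced only for illustration.

```python
ROLES = ("owner", "custodian", "steward")


def missing_roles(assignments: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Return, per dataset, the roles that have no named person."""
    gaps: dict[str, list[str]] = {}
    for dataset, people in assignments.items():
        # A blank or whitespace-only name counts as unassigned.
        missing = [role for role in ROLES if not people.get(role, "").strip()]
        if missing:
            gaps[dataset] = missing
    return gaps
```

Run against the list of critical datasets, an empty result means every role has a name attached; anything else is a list of roles that are still theoretical.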

The Pitfall

An IT manager was made "data owner" for the customer domain because he ran the database. Six months on, marketing and sales are still arguing about what "active customer" means, because the owner has all the technical authority and none of the business authority to settle it.

The Practice

Put a business leader in the owner seat and IT in the custodian seat. The owner decides what the data means; the custodian keeps it safe and available. Match the role to the kind of authority it actually needs.

The Decision Test

Owner decides. Custodian protects. Steward maintains. Name a specific human to each, for each critical dataset. Without a named human, the role is theoretical.

2.4 A 24-month governance roadmap

Eight steps tell you what to do in what order. A roadmap tells you how long it should take, and what "good" looks like at each milestone. The phases below are drawn from regulated industry practice, financial services in particular, but the shape applies more broadly. Two-year programmes are typical; faster is possible but rare; slower usually means the sponsor has lost interest.

Phase 1 (0–6 months) optimises for clarity. Who owns what, what's risky, what the operating model looks like. The deliverables are mostly documents, but the substance is naming people.
Phase 2 (6–12 months) optimises for embedded controls. The goal is moving governance from a folder of policies to lineage, contracts, and quality checks that live in the platform.
Phase 3 (12–18 months) optimises for continuous assurance. The shift is from "we reviewed it last quarter" to "we know now whether it's working."
Phase 4 (18–24 months) optimises for invisibility. Machine-readable policies, embedded guardrails, executive dashboards. At this point governance has become part of the architecture rather than a function bolted on top.

The Pitfall

Leadership was promised "enterprise data governance" in six months. At month five, with a tool bought, a policy written, and nothing actually embedded, the sponsor lost patience and quietly redirected the budget. The programme died not because it failed, but because it promised the wrong timeline.

The Practice

Phase it over 18–24 months with a visible win at each stage. Anchor ownership first, embed controls second, automate last. Show the sponsor progress they can see before asking them to keep believing.

The Decision Test

Governance maturity is a 24-month story. Programmes that promise it in 6 are selling you a document. Programmes that take 5 years have lost their sponsor.

Apply Part Two

Governance is who decides. Now name yours.

See where your accountability actually sits, then put a real request to the test.


Part Three

Architecture & Engineering

Architecture is where strategy meets reality. The right architecture is not the one with the most components or the trendiest names; it is the one that matches the bottleneck you actually have. This part covers the four architectural decisions that matter most: the layered platform model, the difference between a stack and a platform, the governed view of what sits between source data and the board deck, and how to choose between mesh, fabric, and lakehouse when the question is which paradigm to invest in.

3.1 The layered model of a modern data platform

Almost every modern data platform, regardless of vendor, cloud, or scale, has the same seven layers. The names vary; the function does not. Understanding the layers is useful because it gives a vocabulary for diagnosing problems ("the issue is in curation, not in storage") and because it makes clear which layers are optional and which are not.

Seven layers, each serving the one above; cross-cutting capabilities run the full height.

The top layer (Experience & Consumption) is where the business sees value. Every layer below it exists to serve that one. If a programme cannot draw a line from a layer-7 investment to layer-1 outcomes, the investment is wrong-sized.
Layer 7 (Source Systems) is usually inherited. You don't control it, but you inherit all its problems. A meaningful share of data quality issues are actually source system issues. Naming this honestly saves arguments.
Layer 4 (Processing & Orchestration) is where most modernisation projects get stuck. The pipelines work in dev, but the schedules drift, the dependencies are unclear, and error handling becomes an inbox of alerts nobody acts on.
The cross-cutting capabilities on the right (governance, metadata, DataOps) are not add-ons. Treating them as Phase Two work is the single most reliable way to produce an unusable platform.

The Pitfall

The team poured a year into a slick consumption layer (dashboards, a semantic model, the lot) on top of ingestion that silently dropped rows. The dashboards were beautiful, and wrong.

The Practice

Build the lower layers before the visible ones. A flawless consumption layer sitting on broken ingestion is just a confident way to be wrong.

The Decision Test

Every modern data platform has these layers. The difference between a good one and a swamp is whether someone owns each layer, or just hopes it works.

3.2 Stack vs platform

Two companies can buy exactly the same tools and end up in very different places. One has a reliable data platform people trust. The other has a stack: a chain of tools held together with glue and weekend work. The difference is not the tools. It is what was designed around them.

Same tools, different outcome. A platform is what you engineer around the stack.

A modern data stack is a flexible, fast way to get started. Pick a tool for ingestion, one for storage, one for transformation, one for analytics. Each tool does one job well. New teams can be productive within weeks. This is genuinely valuable; most companies should start as a stack.

What changes is what the stack becomes after a year. Once there are 30 pipelines, 200 tables, and 50 consumers, the gaps that didn't matter at the start begin to matter a lot. Who owns the customer table? Why is the dashboard wrong? How much is this costing us? A stack does not answer those questions. A platform does.

Typical moments when those gaps surface:
A senior leader stops trusting a dashboard because they spotted a discrepancy.
Cloud spend grows faster than business value and someone in finance starts asking questions.
A compliance event forces a question of who has access to what, and the answer is unclear.
A new analyst joins and can't find the data they need; what worked for 10 people doesn't scale to 50.

The Pitfall

They bought best-in-class tools for every job. Eighteen months later, three engineers spent most of their week as human glue, nobody could say which numbers were trusted, and the cloud bill arrived as a surprise every month.

The Practice

Design how the tools run together (ownership, contracts, lineage, cost visibility), not just which tools you own. A stack becomes a platform when you engineer the seams.

The Decision Test

A stack is what you build with. A platform is what makes it reliable at scale. Don't just buy tools; design how they run.

3.3 From source system to board deck

The seven-layer platform model is engineering-focused. The view that follows is the same system seen through a governance lens: six layers between the raw data leadership inherits and the numbers leadership puts in board decks. Each layer is a place where ownership can be clear or unclear. Most organisations stop being explicit about ownership at the third layer up. They then wonder why the numbers at the top don't reconcile.

Six layers through a governance lens. Can you trace any number at the top back to its source?

The diagnostic question that comes out of this view is simple: can you trace any number at the top all the way back to its source? Not in theory but in practice: within the next thirty minutes, with documented links, named owners, and a quality check at each transition.

Source Systems exist whether you admit it or not. The ERP doesn't care about your data programme.
Lineage & Transparency is where many organisations stop and think they're governed. They have a lineage tool, therefore they have lineage. The tool is necessary; on its own it is not sufficient.
Data Quality is what makes change safe. Without it, every schema change in the source becomes a fire drill downstream. With it, broken contracts trigger alerts before broken dashboards.
Definitions & Ownership is the governance layer. Business definitions, metric logic, named owners, escalation paths. It is an operating model, not a policy folder.
Trust at Scale is what leadership thinks they're paying for when they buy a platform. They're not wrong to want it; they're wrong to think it can exist without the layers underneath.
AI, Analytics & Automation is only possible when every layer below exists. Putting an LLM on top of layer-1 data is a known way to fail expensively.
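The traceability question can be made concrete with a small lineage walk. The sketch below assumes lineage is available as a simple mapping from each dataset to the one it is derived from; the `Node` fields and the `trace` function are hypothetical, and a real lineage graph would usually have several upstream inputs per node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    owner: Optional[str]            # named human or team, None if unknown
    quality_checked: bool           # is there a check at this transition?
    upstream: Optional[str] = None  # name of the node this one is derived from


def trace(nodes: dict[str, Node], start: str) -> tuple[list[str], list[str]]:
    """Walk from a reported number back to its source, collecting gaps."""
    path: list[str] = []
    gaps: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = start
    while current is not None:
        if current in seen:
            gaps.append(f"{current}: cycle in lineage")
            break
        seen.add(current)
        node = nodes.get(current)
        if node is None:
            gaps.append(f"{current}: no documented link")
            break
        path.append(current)
        if not node.owner:
            gaps.append(f"{current}: no named owner")
        if not node.quality_checked:
            gaps.append(f"{current}: no quality check")
        current = node.upstream
    return path, gaps
```

An empty `gaps` list is the thirty-minute answer; anything in it is a place where the number cannot yet be defended.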

The Pitfall

Asked to trace a board-deck number back to its source, the team needed four days and three Slack threads. The number had been wrong for two quarters, and nobody could prove where it broke.

The Practice

Make every number traceable from board deck to source, with named owners and a quality check at each layer. If you can't trace it, you can't defend it.

The Decision Test

Every company has this stack. The difference is whether someone owns each layer, or just hopes it works.

3.4 Mesh, fabric, or lakehouse: pick by bottleneck

Three architectural paradigms get talked about as if they were competitors. They are not. Each answers a different question, and the right one for an organisation depends on which question is actually being asked. The framing that works: start with the bottleneck, not with the brand.

Each paradigm changes a different dimension. Data Mesh is an organisational shift: it changes who is accountable. Data Fabric is an architectural shift: it changes how you find and access data without moving it. Data Lakehouse is a technological shift: it changes the underlying storage model. They can coexist. A company can run a lakehouse, expose it through a fabric, and govern it as a mesh. What matters is which constraint you are actually trying to relieve.

If the central data team is the bottleneck (work piles up, domains can't move), the answer is organisational. Mesh.
If integration and discovery are the bottleneck (data is everywhere but hard to find or join), the answer is architectural. Fabric.
If storage cost, duplication, or the BI-vs-ML split is the bottleneck (multiple copies, slow analytics on lake data, expensive warehouse), the answer is technological. Lakehouse.
If you cannot name the bottleneck specifically, the answer is not any of them. The answer is to fix governance and quality first. None of these paradigms will save a programme that hasn't done the work in Parts 1 and 2.
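The four rules above amount to a lookup with a deliberate fallback. A minimal sketch, with a hypothetical `Bottleneck` enum and `recommend` function introduced for illustration:

```python
from enum import Enum
from typing import Optional


class Bottleneck(Enum):
    CENTRAL_TEAM = "central data team"
    INTEGRATION_DISCOVERY = "integration and discovery"
    STORAGE_COST_DUPLICATION = "storage cost, duplication, or BI-vs-ML split"


PARADIGM = {
    Bottleneck.CENTRAL_TEAM: "mesh (organisational)",
    Bottleneck.INTEGRATION_DISCOVERY: "fabric (architectural)",
    Bottleneck.STORAGE_COST_DUPLICATION: "lakehouse (technological)",
}


def recommend(bottleneck: Optional[Bottleneck]) -> str:
    """Map a named bottleneck to a paradigm; no bottleneck means no paradigm."""
    if bottleneck is None:
        # Nobody could name the constraint, so none of the three applies yet.
        return "fix governance and quality first"
    return PARADIGM[bottleneck]
```

The function is trivial on purpose: the hard part is producing the input, not the mapping.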

The Pitfall

Leadership read an article and mandated a data mesh. Eighteen months on, the central bottleneck was untouched: they had reorganised the org chart but never fixed the actual constraint, which was data quality.

The Practice

Name the real bottleneck first, then pick the paradigm that relieves it. Mesh, fabric, and lakehouse each fix a different problem; buying the wrong one is expensive.

The Decision Test

If your strategy starts with "which tool," it's already wrong. Start with which bottleneck. The paradigm follows.

Apply Part Three

A platform is engineered, not assembled. Check yours.

See how your architecture scores, and pressure-test a request before you build it.


Part Four

Quality, Trust & Observability

A data platform that is well-architected but full of unreliable data is worse than no platform at all: it gives leadership the confidence to act on numbers they should be questioning. This part covers what "good" looks like in practice: the five pillars every dataset has to hold up across, the framework that turns those pillars into a measurable system, the observability loop that catches and prevents incidents, and the data contracts that make change safe instead of scary.

4.1 The five pillars of data quality

Data quality is one of those phrases that means everything and therefore nothing. The five pillars below are the shared vocabulary that makes the phrase actionable. Every dataset the business depends on must hold up across all five, and when it doesn't, the pillar tells you what kind of problem you have and what kind of check would have caught it.

Five pillars, one foundation. When data is bad, name the pillar and the fix follows.

When someone says "the data is bad," make them name a pillar. "The numbers don't match Salesforce" is an accuracy problem. "Half the rows are missing" is completeness. "The dashboard says 24 hours stale" is timeliness. Naming the pillar narrows the fix.
Treat the five as a set, not a ranking. A table can be perfectly accurate and useless because it's two weeks stale. A real-time table is useless if it conforms to a schema that nobody understands.
Each pillar has a typical technical check that catches its failure mode. Those checks are not optional; they are how a programme moves from manual eyeballing to automated trust.
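As an illustration of what a pillar-level check can look like, here is a minimal sketch for two of the pillars named above, completeness and timeliness. The function names, the row format (a list of dicts), and the thresholds are assumptions made for the example, not a standard.

```python
from datetime import datetime, timedelta


def completeness(rows: list[dict], required: list[str]) -> float:
    """Share of rows in which every required field is present and not None."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(field) is not None for field in required) for row in rows
    )
    return complete / len(rows)


def is_timely(last_loaded: datetime, max_age: timedelta, now: datetime) -> bool:
    """True if the latest load is no older than the agreed maximum age."""
    return now - last_loaded <= max_age
```

A completeness score below an agreed threshold, or `is_timely` returning False, tells you which pillar failed before anyone has to say "the data is bad".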

The Pitfall

"The data is bad" had been the standing complaint for a year. Nobody could say which way it was bad, so every fix was a guess, and none of them stuck.

The Practice

Name the pillar. "The numbers don't match Salesforce" is accuracy; "half the rows are missing" is completeness. Naming the failure mode points straight at the fix.

The Decision Test

All five pillars rest on the same foundation: trust. The day stakeholders stop trusting a dashboard, every layer underneath becomes wasted work.

4.2 The modern data quality framework

The pillars tell you what to check. The framework that follows tells you how to organise the checking so it becomes a system. At the centre is data trust, the actual goal. Around it are four quadrants: what the business gets when trust is high, what is checked technically, what is measured as a programme, and what governance capabilities the system depends on.

Business value is the outcome you are paying for. High adoption, confident decisions, reduced risk, trusted AI. These are the things leadership signed up for.
Technical audits are what your platform does automatically. Freshness checks, volume anomaly detection, schema drift, null rates. The cost of running them is trivial; the cost of not running them shows up in incidents.
Quality KPIs are how you measure your own programme. Data SLAs, time-to-detect, time-to-resolve, issue volume. Without these, you can't tell whether you're getting better.
Governance stack is what makes all the above possible: catalog, lineage, contracts, stewardship. Without these layers, technical audits are noise and KPIs are guesses.

The arrows in the framework all point inward because the relationship is causal, not incidental. Business value is produced by the other three working together. Removing any one collapses the model.
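Two of the technical audits listed above, volume anomaly detection and schema drift, can be sketched in a few lines. This is an illustrative version under simple assumptions: row counts are compared with a z-score against recent history, and schemas are plain column-to-type mappings. The function names and the threshold of 3.0 are hypothetical choices.

```python
from statistics import mean, stdev


def volume_anomaly(history: list[int], today: int, z_threshold: float = 3.0) -> bool:
    """Flag today's row count if it sits far outside the recent history."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold


def schema_drift(expected: dict[str, str], actual: dict[str, str]) -> dict[str, list[str]]:
    """Compare column -> type mappings and report what changed."""
    shared = set(expected) & set(actual)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "unexpected": sorted(set(actual) - set(expected)),
        "type_changed": sorted(c for c in shared if expected[c] != actual[c]),
    }
```

Checks like these cost little to run on every load, which is the sense in which the cost of running them is trivial.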

The Pitfall

Quality was "owned" by everyone and therefore no one. Issues were caught by whoever happened to notice, fixed in a panic, and forgotten, until the same thing broke again.

The Practice

Run quality as a system: automated audits, a measured KPI set, and a governance stack underneath. Trust is produced by the machine, not by vigilance.

The Decision Test

Data Pipelines × Governance → Reliability → Trust → Adoption → Value. Each arrow is earned, not assumed.

4.3 The observability cycle

Every incident is data. Caught early, it is a learning event. Caught late, it is a credibility event. The observability cycle is the loop that turns incidents into permanent improvements instead of recurring fires.

A loop that turns incidents into improvements. The CEO should never be your monitoring system.

Detect: the goal is shorter time-to-detect. "We found out when the CEO emailed" is the failure mode. Real-time alerts on freshness, volume, schema, and quality thresholds are the floor, not the ceiling.
Diagnose: the goal is shorter time-to-resolve (MTTR). Trace upstream through lineage, identify which job, source, or contract broke. Without lineage this step takes hours; with lineage it takes minutes.
Prevent: the goal is non-recurrence. After every incident, ask: what test, contract, or guardrail would have caught this earlier? Add it. The same incident happening twice is a process failure, not a data failure.

The metrics that matter (time-to-detect, time-to-resolve, incident recurrence) should appear in your quality KPI set. They are leading indicators of trust. If they're moving in the right direction, trust will follow. If they're flat or worsening, no amount of communication will fix the perception, because the perception is correct.
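The three metrics can be computed from a plain incident log. The sketch below is one possible definition, stated as assumptions: time-to-detect runs from occurrence to detection, time-to-resolve from detection to resolution, and recurrence is the share of incidents whose root cause had already appeared. The `Incident` fields and the `kpis` function are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    root_cause: str
    occurred: datetime
    detected: datetime
    resolved: datetime


def kpis(incidents: list[Incident]) -> dict:
    """Mean time-to-detect, mean time-to-resolve, and recurrence rate."""
    n = len(incidents)
    if n == 0:
        return {"mean_time_to_detect": None, "mean_time_to_resolve": None,
                "recurrence_rate": None}
    ttd = sum((i.detected - i.occurred for i in incidents), timedelta()) / n
    ttr = sum((i.resolved - i.detected for i in incidents), timedelta()) / n
    # Every occurrence of a root cause after the first counts as a repeat.
    repeats = sum(c - 1 for c in Counter(i.root_cause for i in incidents).values())
    return {"mean_time_to_detect": ttd, "mean_time_to_resolve": ttr,
            "recurrence_rate": repeats / n}
```

Tracked quarter over quarter, these three numbers show whether the loop is actually closing.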

The Pitfall

The team found out the pipeline had been broken for nine days when the CEO emailed asking why his dashboard still showed last week's numbers.

The Practice

Detect with real-time alerts, diagnose through lineage, prevent with a test after every incident. The CEO should never be your monitoring system.

The Decision Test

The same incident twice is a process failure, not a data failure. Every fire should leave behind a test or a contract that prevents the next one.

4.4 Data contracts

A data contract is a written agreement between the team that produces a dataset and the teams that consume it. It looks like an API contract because it functions like one. The producer commits to a schema, an SLA, and a meaning. The consumer commits to depending only on what the contract guarantees. When either side wants to change the contract, that conversation happens before the change ships, not after a dashboard breaks.

Schema: field names, types, nullability. Strict enough to fail loudly when a producer tries to ship a breaking change.
SLAs: freshness, latency, availability. "Updated by 6am, less than two hours behind source, 99.5% uptime."
Semantics: what each field means in business terms. The field is called customer_id; the contract says which system of record it points to and what "customer" means here.
Quality: allowed null rates, valid value ranges, referential rules. These are the technical audits from section 4.2 codified.
Ownership: the producer team, the on-call rota, the escalation path. Named humans.
Versioning: how breaking changes are signalled, how consumers are notified, how the deprecation window works.

Contracts work because they are enforced in CI/CD, not in PDF files. A schema drift breaks the build, not production. An SLA violation triggers an incident workflow, not a Slack argument. This is the difference between a contract that lives in the platform and a contract that lives in a folder.
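A schema check of this kind is small enough to sketch. The example below covers only the schema part of a contract and assumes schemas are available as field-name-to-definition mappings; the `Field` class, the `breaking_changes` function, and the type names are hypothetical, not the format of any particular contract tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    dtype: str
    nullable: bool


def breaking_changes(contract: dict[str, Field], proposed: dict[str, Field]) -> list[str]:
    """List the ways a proposed schema breaks what the contract guarantees."""
    problems: list[str] = []
    for name, agreed in contract.items():
        new = proposed.get(name)
        if new is None:
            problems.append(f"{name}: removed or renamed")
        elif new.dtype != agreed.dtype:
            problems.append(f"{name}: type {agreed.dtype} -> {new.dtype}")
        elif new.nullable and not agreed.nullable:
            problems.append(f"{name}: became nullable")
    # Fields that exist only in `proposed` are additions and are not flagged.
    return problems
```

In a CI job, a non-empty result would exit with a non-zero status, so the Friday column rename fails the producer's build instead of reaching Monday's dashboards.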

The Pitfall

An upstream team renamed a column on a Friday. By Monday four dashboards were silently wrong, and the analytics team lost a week tracing which change broke what.

The Practice

Put a contract between producer and consumer, enforced in CI. A breaking change should fail the build, not the board report.

The Decision Test

A contract is what turns "who broke the dashboard?" into "the producer's CI caught it before deployment." Enforce in code, not in conversation.

Apply Part Four

Trust is earned by what you measure. Measure it.

See where your data quality stands, then test whether the next request earns its place.


Part Five

Metadata, Privacy & Lifecycle

Data without context is noise. Metadata is the context that makes data findable, understandable, and trustworthy. Privacy is the discipline that ensures context survives contact with regulation. Lifecycle is the recognition that data does not live forever, and that managing what happens at the ends is as important as managing what happens in the middle. This part covers the four anchors: the lifecycle every dataset moves through, the layered metadata that gives data meaning, the classification model that drives privacy controls, and the GDPR principles every privacy programme has to satisfy.

5.1 The data lifecycle

Every dataset moves through six stages, whether you manage them deliberately or not. Programmes typically focus their attention on Create and Use: getting data in, then getting value out. The expensive failures happen at the stages that get less attention: Share (who got what), Archive (where did it go), and Delete (is it actually gone).

Six stages. Classify at Create, watch Share, and prove you can actually Delete.

Classification happens at Create. Trying to classify retroactively is expensive and unreliable. Build classification into the ingestion process so every new dataset starts with a known tier.
Share is where most privacy incidents originate. Data leaves the boundary of one team and ends up somewhere it shouldn't be. Contracts (Part 4) and audit trails are the controls that matter here.
Delete is the stage where regulators ask hard questions. Right-to-erasure under GDPR is not a feature; it is a process that has to demonstrate what was deleted, when, and from how many systems. If you can't answer those questions in writing, you don't actually have a Delete stage.
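One way to make the Delete stage answerable in writing is to keep an erasure log and reconcile it against the systems known to hold a subject's data. The sketch below is illustrative only: the `ErasureRecord` fields and the `erasure_report` function are hypothetical, and it is not a statement of what any regulator requires as evidence.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ErasureRecord:
    subject_id: str
    system: str
    deleted_at: datetime


def erasure_report(subject_id: str, systems_holding_data: list[str],
                   records: list[ErasureRecord]) -> dict:
    """Summarise what was deleted, when, and which systems are still outstanding."""
    deleted = {r.system: r.deleted_at for r in records if r.subject_id == subject_id}
    outstanding = [s for s in systems_holding_data if s not in deleted]
    return {
        "subject_id": subject_id,
        "deleted": deleted,
        "outstanding": outstanding,
        "complete": not outstanding,
    }
```

The report is only as good as the list of systems passed in, which is one more reason lineage and the catalogue matter for privacy.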

The Decision Test

Manage all six stages or hope the regulators don't ask about the three you ignored.

5.2 The metadata stack

Metadata is often discussed as a single thing. It isn't. There are at least four kinds, each valuable for different reasons, and most data catalogs ship the foundational layer and call the job done. The value compounds upward, and the higher layers are what turn a catalog from a directory into a tool people actually use.

Four layers. Technical is the floor; business and social are where adoption lives.

Technical metadata is the foundation: schemas, types, file formats, lineage edges. This is what most catalogs auto-ingest. Without it, nothing above it is possible.
Business metadata is what makes the data understandable to anyone outside the engineering team. Definitions, glossaries, metric logic, named owners. This is the layer that takes deliberate human effort and that delivers the most value when it's well-maintained.
Operational metadata is the evidence that the platform is alive. Pipeline runs, refresh times, job status, cost per query. Most teams have this data somewhere; surfacing it in the catalog is what turns it into a trust signal.
Social metadata is the most undervalued layer. Who endorsed this dataset? Who reported an issue last week? Who actually uses it? In practice this is the metadata consumers care most about, because it's the closest thing to a recommendation engine.
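The four layers can be pictured as four sections of a single catalog entry. This is a minimal sketch with a hypothetical `CatalogEntry` class; real catalogs model these layers in their own ways, and the keys shown in the usage lines are made up for illustration.

```python
from dataclasses import dataclass, field

LAYERS = ("technical", "business", "operational", "social")


@dataclass
class CatalogEntry:
    name: str
    technical: dict = field(default_factory=dict)    # schema, types, lineage edges
    business: dict = field(default_factory=dict)     # definitions, metric logic, owner
    operational: dict = field(default_factory=dict)  # refresh times, job status, cost
    social: dict = field(default_factory=dict)       # endorsements, issues, usage

    def empty_layers(self) -> list[str]:
        """Return the layers that hold no metadata yet."""
        return [layer for layer in LAYERS if not getattr(self, layer)]


entry = CatalogEntry(
    name="customer",
    technical={"columns": {"customer_id": "string"}},
    operational={"last_refresh": "2024-05-01T06:00"},
)
print(entry.empty_layers())
```

An entry that is auto-ingested and never touched again will report the business and social layers as empty, which is the gap this section describes.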

The Decision Test

Most catalogs ship technical metadata and call it a day. The investment that matters is in the business and social layers; that's where adoption lives.

5.3 Privacy classification

Privacy controls cannot be uniform. Treating every dataset like it contains PII is expensive and wasteful; treating no datasets like they contain PII is reckless. The fix is classification: every dataset belongs to exactly one tier, and the tier drives the controls.

Classify at point of creation. Retroactive classification of a large estate is a thankless multi-quarter project. New data gets tagged correctly from day one; legacy data gets cleaned up opportunistically.
Unclassified data defaults to the most restrictive tier. This protects against the default failure mode of "we hadn't decided yet, so we let everyone see it."
Classification is a property of the dataset, not the system. Restricted data in a public BI tool is still Restricted. Internal data in an external API is still Internal. The controls follow the tier across boundaries.
Re-classification needs a process. Data that was Internal last year may be Confidential this year because the business has changed. Build in a review cadence: annual at minimum for Confidential and above.
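
The rules above can be sketched in a few lines. This is a minimal illustration, assuming four tiers ordered from least to most restrictive; the Public tier and the exact ordering are assumptions, since the text names only Internal, Confidential, and Restricted. The function names are hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    """Privacy tiers, ordered from least to most restrictive (assumed ordering)."""
    PUBLIC = 1        # assumed tier; not named in the text
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


MOST_RESTRICTIVE = max(Tier)


def effective_tier(tag: Optional[str]) -> Tier:
    """Return the tier for a dataset tag; missing or unknown tags fail closed."""
    if tag is None:
        return MOST_RESTRICTIVE
    try:
        return Tier[tag.strip().upper()]
    except KeyError:
        return MOST_RESTRICTIVE


def needs_annual_review(tier: Tier) -> bool:
    """Confidential and above are reviewed at least annually."""
    return tier >= Tier.CONFIDENTIAL


print(effective_tier("internal").name)   # INTERNAL
print(effective_tier(None).name)         # RESTRICTED
print(needs_annual_review(Tier.INTERNAL))  # False
```

Because the tier is attached to the dataset rather than to the system holding it, the same lookup can be applied wherever the data travels.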

The Decision Test

Classification is the input to every other privacy control. Get it wrong and every downstream control fires the wrong way.

5.4GDPR's seven principles

The European Union's General Data Protection Regulation has set the global default for how personal data should be handled. Even organisations outside the EU find themselves answering to it, through customers, partners, regulators, or successor laws modelled on it. The seven principles below are the foundation. They are not optional, and they are not bureaucratic; they are a coherent design philosophy for how a programme treats people whose data it holds.

Six principles, held up by a seventh: accountability is the duty to prove the rest.

Lawfulness, fairness, transparency is the foundation. If a data subject would be surprised to learn you held this data, you have a problem.
Purpose limitation prevents scope creep. Data collected to fulfil an order should not silently become marketing fuel.
Data minimisation is the discipline of asking "do we actually need this field?" before adding it to a form. The cheapest field to protect is the one you didn't collect.
Accuracy is shared with Part 4's quality pillars. Wrong data about a person can cause real-world harm in credit decisions, healthcare, and employment.
Storage limitation is the policy hook for the Delete stage in section 5.1. If you cannot delete, you cannot comply.
Integrity and confidentiality is where information security meets data governance. Encryption, access control, breach response.
Accountability is the meta-principle. Demonstrating compliance is itself a deliverable: documentation, audit trails, decisions on the record.

The Decision Test

Privacy is not about avoiding fines. It is about earning the right to keep operating with data that belongs, ultimately, to other people.

Apply Part Five

Context and lifecycle decide whether data is safe. Check both.

See how your metadata and privacy score, and test a request before it creates risk.


Part Six

AI & Analytics Enablement

This is the part of the playbook leadership is most excited about, and the part that depends most on everything before it being in place. Generative AI has produced more demand for trustworthy data in the last two years than the previous two decades combined. The companies that can supply it have an advantage that compounds. The companies that cannot are quietly being lapped. This part covers the four anchors: where an organisation sits on the AI maturity curve, the governance landscape every AI programme has to navigate, the lifecycle that turns a model into a product, and the BI journey that turns raw data into a decision worth making.

6.1The AI maturity ladder

Most organisations describe themselves as "AI-ready." Very few are. The maturity ladder below is a more honest diagnostic: five levels from firefighting chaos to genuine AI competitiveness, with a sharp threshold in the middle that very few cross.

Five levels. Level 3 is the tipping point, and very few organisations cross it.

Level 1, Chaos & Firefighting. Excel-driven analytics. Different teams report different numbers for the same metric. A senior analyst leaves and three months of institutional knowledge leaves with them.
Level 2, Defined Foundations. Standards are documented but enforcement is uneven. Owners are named on paper. Data quality fixes are ongoing, and ongoing, and ongoing.
Level 3, Governed & Trusted. This is the tipping point. Trusted data products exist. Metadata and lineage are live. Stewards are active. The business has stopped treating data as a problem to be managed and started treating it as a capability to be invested in.
Level 4, Scalable & Automated. Self-service analytics is the norm. Data contracts are enforced in CI. New use cases ship in weeks rather than quarters.
Level 5, Strategic & AI-Ready. LLMs and RAG run reliably on production data. Autonomous agents act on trusted information. The company has a measurable advantage that competitors find hard to copy.

The Pitfall

The board wanted an AI strategy, so the team shipped a chatbot on top of data three departments didn't trust. It confidently made things up, and the one demo that mattered went badly.

The Practice

Be honest about your maturity level. Fix trusted data first; the AI that runs on it will be the part that works. Level 5 ambitions on Level 1 data fail expensively.

The Decision Test

Don't build Level 5 models on Level 1 data. Most failed AI programmes are not AI failures; they are foundation failures. Fix the foundation first.

6.2The AI governance landscape

AI governance has stopped being optional. Three regimes (one a regulation, one a voluntary framework, one a certifiable management system) between them shape how serious AI work has to be governed today. They are not competitors. The EU AI Act says what is required by law. NIST AI RMF says how to organise the work. ISO 42001 provides a way to demonstrate it externally. Most programmes operating at any scale will need to satisfy all three.

The EU AI Act applies extraterritorially. If you serve EU customers, or your model does, the risk classification applies. Unacceptable-risk uses are prohibited outright. High-risk uses come with substantial documentation, conformity assessment, and ongoing monitoring obligations.
NIST AI RMF is voluntary in the US, but its Govern–Map–Measure–Manage structure has become a de facto international vocabulary. Many organisations use it as their internal operating model regardless of jurisdiction.
ISO/IEC 42001 is the management-system standard: Plan-Do-Check-Act for AI. It is certifiable, which matters for customers, regulators, and procurement teams who want assurance beyond a self-attestation.
Other regimes worth knowing about: OECD AI Principles (high-level, influential), UNESCO AI Ethics Recommendation (rights-based), NIST AI 600-1 (generative AI specifically). Most of these reinforce rather than contradict the three above.

The Pitfall

A model went live scoring loan applications. Nobody had classified its risk, documented the training data, or checked for bias, until a regulator asked, and there were no answers.

The Practice

Classify risk before building. Map the use case to the EU AI Act tier and the NIST functions early, so governance is designed in, not reconstructed under audit.

The Decision Test

AI governance is not red tape. It is the operating model that lets a company deploy AI without exposing itself to consequences it has not thought through.

6.3The AI / model lifecycle

Models are not products you ship and forget. They are systems that drift, decay, and need active management. The lifecycle below applies as much to a fraud-detection model as to a RAG-powered customer service bot. The stages are not optional; what changes between use cases is the rigour applied at each, which is in turn determined by the risk classification set at the Design stage.

A model is a system, not a shipment. Monitor for drift and retrain on a cadence.

Skipping Design and its risk classification puts the work on the wrong footing from the start. You spend validation effort on the wrong things, and you discover at deployment that the use case was actually High Risk under the EU AI Act.
Rushing Collect leads to bias problems that surface in production. "We trained on the data we had" is the most expensive sentence in machine learning.
Cutting corners on Validate is how models with strong test-set accuracy fail in deployment. Fairness, robustness, and explainability tests are not optional; they are how you discover the problems your accuracy metric doesn't show.
Skipping Monitor is the most common failure. A model performing well on day one is not the same as a model performing well on day 90. Drift is real, and unmonitored drift is the source of most quiet AI failures.
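
As an illustration of what the Monitor stage can look like in code, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), which compares the distribution of a feature or score at training time with a recent sample. PSI is a swapped-in example rather than a method this playbook prescribes, and the 0.25 alert threshold is a widely used rule of thumb rather than a standard; both are assumptions, as are the function name and sample data.

```python
import math


def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to 1.0 if the range is zero.
    width = (hi - lo) / bins or 1.0

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the end bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


ALERT_THRESHOLD = 0.25  # rule-of-thumb value; tune per model and risk tier

baseline = [i / 100 for i in range(100)]   # made-up scores at launch
recent = [v + 0.5 for v in baseline]       # made-up scores after a shift

print(psi(baseline, baseline))                 # 0.0 for identical samples
print(psi(baseline, recent) > ALERT_THRESHOLD)  # True
```

A check like this only detects that inputs or scores have moved; deciding whether to retrain still depends on the validation steps earlier in the lifecycle.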

The Pitfall

A fraud model launched with 95% test accuracy and was never looked at again. A year later it was quietly missing half the fraud, because the patterns had shifted and nobody was watching.

The Practice

Treat the model as a system with a full lifecycle. Monitor for drift, retrain on a cadence, validate every redeploy. Shipping is the start, not the finish.

The Decision Test

"Just put GPT on it" is not a lifecycle. Treat every model, including LLM-based applications, as a system that needs design, validation, deployment, and monitoring.

6.4The BI journey: raw to actionable

Not every analytics workload is AI. Most of the value most organisations get from their data comes from older, less glamorous disciplines: descriptive analytics, dashboards, reporting, and the slow work of helping people make better decisions. The journey below is the path data takes from arrival to action. Each stage is a discipline of its own, and the difference between a high-functioning BI capability and a low-functioning one is whether all six stages are taken seriously.

Stages 1–4 are the technical journey. Most BI teams handle these well. The data lands, gets sorted, gets arranged into a model, gets shown on a dashboard.
Stage 5, Story, is where most programmes underinvest. A dashboard without a narrative is a wall of numbers. The analyst's job is not just to show the data but to explain what it means, what changed, and what action follows. Data storytelling is a real skill, and it is teachable.
Stage 6, Actionable, is the measure of whether any of this matters. If the dashboard does not change a decision, the dashboard is decoration. The most useful question a BI team can ask its consumers: "what would you do differently if this number changed by 20%?" If the answer is "nothing," the report is on its way to being retired.
Self-service does not skip these stages; it distributes them. The platform supplies stages 1–4 as infrastructure; the consumer supplies stages 5–6 themselves. This works only if the consumer has the literacy to use the platform well. That is the subject of Part 7.

The Pitfall

The team shipped a 30-tab dashboard and called it self-service. Nobody used it: there was no story and no obvious action, just a wall of numbers people quietly went back to ignoring.

The Practice

Carry the data all the way to a decision: a clear story and an obvious next action. If a dashboard doesn't change what someone does, it's decoration.

The Decision Test

If a dashboard does not change a decision, the dashboard is decoration. Measure your BI programme by decisions affected, not reports produced.

Apply Part Six

AI is only as good as the ground beneath it. Test the ground.

See where your AI readiness sits, then put a request through the Decision Test.


Part Seven

People, Culture & Adoption

Everything in the first six parts can be built correctly and still fail, because data work is, in the end, done by people for people. A perfect platform that nobody trusts is worthless. A flawless model that nobody acts on is a science project. This part is about the human side: how literacy is built, where a culture sits on the spectrum from instinct to evidence, the two opposite ways data teams approach their work, and how to communicate so the work actually lands.

7.1The data literacy ladder

Data literacy is not a switch that flips. It is a ladder people climb, and the organisation's job is to help them climb it. The four rungs below describe a progression from simply knowing the data exists to being able to answer your own questions without help. Most self-service initiatives fail because they hand Level-2 people Level-4 tools and wonder why adoption stalls.

Four rungs. Only Fluent people genuinely self-serve, so match the tool to the rung.

Assess where the bulk of your organisation actually sits. Be honest: most people in most companies are at Aware or Skilled, not Fluent.
Match the intervention to the rung. Awareness campaigns move people from unaware to Aware. Tool training moves Aware to Skilled. Critical-thinking and interpretation coaching moves Skilled to Fluent. Only Fluent people genuinely self-serve.
Don't confuse tool access with literacy. Giving everyone a BI licence does not make them literate any more than giving everyone a word processor makes them writers.
The concrete test questions for each rung are deliberately practical. "Can a marketer answer a new question without filing a ticket?" is a better measure of literacy than any survey.

The Decision Test

Self-serve analytics works only when people are Fluent. Tools don't create literacy; training and culture do.

7.2The data culture scale

Every person relates to evidence in a particular way, and so does every organisation. The scale below runs from the gut-feeler who decides on instinct and uses data only to justify decisions already made, to the scientist who treats their own beliefs as hypotheses to be tested. Most people, and most companies, sit somewhere in the middle.

A spectrum from instinct to evidence. Most people sit in the middle.

Diagnosis. Where does your leadership team actually sit? A data programme reporting to a gut-feeler CEO faces a different challenge than one reporting to a scientist.
Realistic targets. The goal is not to turn everyone into a scientist; that's neither achievable nor desirable. The goal is to move the median one step to the right. A company of reporters who become analysts is transformed.
Spotting the trap. The sceptic is not the enemy; healthy scepticism is part of good analysis. The trap is when scepticism becomes a permanent excuse to ignore inconvenient evidence.
Self-awareness. The most useful application is personal. Where do you sit? When was the last time data changed your mind about something you believed?

The Decision Test

The goal is not to make everyone a scientist. It is to move the median one step to the right.

7.3Two ways to build a data team

Give two leaders the same headcount, the same budget, and the same tools, and you can still get opposite outcomes. The difference is mindset. The comparison below contrasts the "dashboard factory builder", who optimises for output and treats governance as an obstacle, with the "modern data leader" who optimises for impact and treats governance as an enabler.

Same budget, same tools, opposite outcomes. The difference is mindset.

The dashboard factory measures itself by what it produces. The modern data leader measures by what changes. A team can ship a hundred dashboards and change zero decisions, and the factory builder will call that a good quarter.
The factory builder treats governance as red tape that slows them down. The modern leader treats governance as the thing that lets users find data, trust it, and act on it without friction. Same word, opposite meaning.
The most telling row is the last one. When friction appears, the factory builder blames the culture, the stakeholders, the lack of resources. The modern leader takes control, delivers value first, earns trust, and uses that trust to scale.
This is not about talent. The factory builder is often more technically skilled. It is about where they point that skill.

The Decision Test

Culture is not built by declaring it. It is built by delivering value before demanding literacy. Quick wins first, then the right to ask for more.

7.4Communication patterns by audience

The same finding needs four different deliveries depending on who is listening. This is the skill that most distinguishes analysts whose work gets used from analysts whose work gets filed. Most analysis dies in translation, not in calculation: the number was right, but the delivery didn't fit the listener.

One finding, four deliveries. The framing is the job.

Executives want the decision. Lead with the recommendation, support it with the one number that matters, and leave the method in the appendix. A single slide beats a dashboard.
Peers and analysts want the method. They will and should challenge the approach, so show the assumptions, make it reproducible, and state the confidence levels honestly.
Business users want to know what to do next. Plain language, tied to their actual workflow, with the next action obvious. Jargon is where you lose them.
The wider organisation wants to know why it matters. This is the domain of narrative: a story with a before and after, carried by visuals, repeated across channels until it sticks.

Notice that the analytical work is identical across all four. What changes is the framing, and the framing is the job. An analyst who can do this well multiplies the value of every other capability in this playbook.

The Decision Test

The finding is the same. The framing is the job. Most analysis dies in translation, not in calculation.

Apply Part Seven

Tools used beat tools bought. See if yours are.

See where your culture and adoption score, and test whether a request will actually land.


Part Eight

Measurement & Maturity

The final discipline is knowing whether any of it is working. A data programme that cannot measure itself cannot improve itself, cannot defend its budget, and cannot tell its sponsor a credible story about progress. This part covers the four anchors of measurement: the maturity model that tracks where the programme is, the metrics that fit each level of maturity, the KPI hierarchy that serves different audiences, and the hard discipline of proving return on investment.

8.1The maturity pyramid

Maturity models are useful for one reason above all: they let you have an honest, non-personal conversation about where the programme actually is. "We're at Level 2" is a more productive sentence than "the data is bad." The five-level pyramid below runs from a programme that exists on paper to one that delivers measurable strategic value.

Five cumulative levels. You climb through them; you cannot buy your way to the top.

Assess honestly. Most programmes overestimate their level. Having owners named on paper (Level 1–2) is not the same as having active, accountable stewards (Level 3+).
Don't skip levels. A programme at Level 2 cannot leap to Level 5 by buying an AI tool. The levels are cumulative; each depends on the one below being solid.
Match metrics to level. This is the most practical use of the pyramid, and the subject of the next section. Measuring strategic-value metrics at a Level-1 programme produces nothing but demoralisation.
Use it for the sponsor conversation. "We were at Level 2 a year ago, we're at Level 3 now, here's what Level 4 requires" is exactly the kind of narrative that keeps a programme funded.

The Pitfall

A brand-new programme set itself a target of "reduce time-to-insight by 30%." With no stable foundation to move the number, the metric bounced around meaninglessly, and the team felt like failures by month three.

The Practice

Assess your maturity honestly and measure what's real for your level. A Level-1 programme counts owners and complaints, and that's exactly the right thing to count.

The Decision Test

Maturity is cumulative. You cannot buy your way to Level 5; you have to climb through the levels below it.

8.2Metrics by maturity level

The single most common measurement mistake is using metrics that don't fit your maturity. A Level-1 programme measuring "reduction in time-to-insight for analytics teams" will report noise, because there isn't yet a stable enough foundation to move that number. The matrix below matches three kinds of metric (activity, quality output, and business impact) to each maturity level.

Read across a row to get a coherent metric set for your current level. At Level 1, you measure whether assets have owners, count quality complaints, and track manual fixes. That's honest and it's achievable.
Read up a column to see how the bar rises. The activity metric goes from "% of assets with an owner" (L1) to "data product reuse and adoption rate" (L5). Same dimension, vastly more sophisticated question.
Notice the shift in emphasis. Lower levels are dominated by activity metrics: things you do. Upper levels are dominated by impact metrics: outcomes you produce. The transition happens around Level 3, the governance threshold from Part 6.
Set targets one level up, not five. The goal of a Level-2 programme is to hit Level-3 metrics, not Level-5 ones. Ambition is good; measuring against an unreachable bar is not.

The Pitfall

The dashboard tracked twenty metrics because twenty looked thorough. Half couldn't be moved at the team's maturity, so nobody acted on any of them, and the dashboard became wallpaper.

The Practice

Pick the row that matches your level. A coherent handful of metrics you can actually move beats a wall of numbers that only looks comprehensive.

The Decision Test

Measure what's real for your level. Activity metrics at the bottom, business impact at the top, and a deliberate climb between them.

8.3The KPI hierarchy

Different audiences need different metrics. The board does not want to hear about pipeline success rates; the data team cannot meaningfully report on revenue influenced. The three-tier hierarchy below sorts metrics by the question they answer and the audience they serve.

Three tiers of metric. Match the tier to the audience, never the other way round.

Operational metrics answer "is the machine running?": pipeline success, freshness, quality pass rates, incident volume. These are for the data team, reviewed daily or weekly.
Tactical metrics answer "is the programme working?": SLA compliance, resolution time, adoption, certified products. These are for data leadership, reviewed monthly.
Strategic metrics answer "why does this programme exist?": revenue influenced, risk reduced, cost avoided, decisions accelerated. These are for the board, reviewed quarterly.
The cardinal sin is reporting the wrong tier to the wrong audience. A board deck full of pipeline success rates signals that the programme doesn't understand its own purpose. A data-team standup focused on revenue attribution wastes everyone's time.
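
The three tiers can be written down as a small lookup, which also makes the "wrong tier, wrong audience" check mechanical. This is a sketch only: the dictionary layout, the lower-cased metric names, and the `misplaced_metrics` helper are assumptions made for illustration.

```python
KPI_TIERS = {
    "operational": {
        "audience": "data team",
        "cadence": "daily or weekly",
        "metrics": {"pipeline success", "freshness", "quality pass rate", "incident volume"},
    },
    "tactical": {
        "audience": "data leadership",
        "cadence": "monthly",
        "metrics": {"sla compliance", "resolution time", "adoption", "certified products"},
    },
    "strategic": {
        "audience": "board",
        "cadence": "quarterly",
        "metrics": {"revenue influenced", "risk reduced", "cost avoided", "decisions accelerated"},
    },
}


def misplaced_metrics(audience: str, metrics: list[str]) -> list[str]:
    """Return the metrics in a report that belong to a different audience's tier."""
    allowed: set[str] = set()
    for tier in KPI_TIERS.values():
        if tier["audience"] == audience:
            allowed |= tier["metrics"]
    return [m for m in metrics if m.lower() not in allowed]


# A hypothetical board deck that mixes tiers.
board_deck = ["Revenue influenced", "Pipeline success", "Freshness"]
print(misplaced_metrics("board", board_deck))  # ['Pipeline success', 'Freshness']
```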

The Pitfall

The board deck opened with pipeline success rates and refresh times. The directors' eyes glazed; one asked what any of it had to do with the business. The programme looked like it didn't understand its own purpose.

The Practice

Match the metric tier to the audience. Operational for the team, tactical for leadership, strategic for the board. Show directors revenue, risk, and decisions, not refresh times.

The Decision Test

Match the tier to the audience. Operational for the team, tactical for leadership, strategic for the board.

8.4Proving return on investment

The hardest question a data programme faces is also the most important: was it worth it? The value chain below traces the path from spend to outcome through four links. The uncomfortable truth is that most programmes can prove the first link (they spent the money) and struggle with everything after it.

Investment is trivially measurable. You know what you spent on platform, people, governance, and tooling.
Capability is measurable with some effort. Trusted data, faster access, and working self-service all show up in the tactical metrics from section 8.3.
Behaviour is where attribution gets hard. Are more decisions actually being made with evidence? This requires instrumenting how decisions get made, which most organisations don't do. It is also where the value actually begins.
Outcome is where the value lands and where attribution is hardest. Revenue went up, but was it the data programme, the new product, the market, or the sales team? The honest answer is usually "some combination," and the discipline is to claim a defensible share rather than the whole thing or nothing.

The practical move is to instrument behaviour change, not just spend. Track decisions that data influenced. Capture the before-and-after of specific use cases. Build a portfolio of documented wins (each one a small, defensible story) rather than reaching for a single grand ROI number that no one believes. A credible collection of "this decision was made differently because of this data, and here's what happened" beats a spuriously precise figure every time.
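
A portfolio of documented wins can be as simple as a list of records, each carrying its own before-and-after and the share of the change the programme is prepared to defend. The sketch below is illustrative: the class, its fields, and every number in it are made up, and the claimed share is a judgement the team has to justify, not something the code can derive.

```python
from dataclasses import dataclass


@dataclass
class DocumentedWin:
    """One decision made differently because of data (hypothetical structure)."""
    decision: str
    before: float          # measured value before the change
    after: float           # measured value after the change
    claimed_share: float   # fraction of the change attributed to the programme

    def claimed_value(self) -> float:
        """The defensible share of the measured change."""
        if not 0.0 <= self.claimed_share <= 1.0:
            raise ValueError("claimed_share must be between 0 and 1")
        return (self.after - self.before) * self.claimed_share


# Illustrative figures only.
portfolio = [
    DocumentedWin("Repriced a slow-moving product line", before=100.0, after=140.0, claimed_share=0.25),
    DocumentedWin("Retargeted a retention campaign", before=50.0, after=60.0, claimed_share=0.5),
]

for win in portfolio:
    print(f"{win.decision}: {win.claimed_value()}")
print("Portfolio total:", sum(w.claimed_value() for w in portfolio))  # 15.0
```

Keeping the share explicit on each record means the total can be challenged one story at a time, which is what makes it defensible.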

The Pitfall

Asked to prove ROI, the team reached for one grand number: "£4M of value created." Finance picked it apart in ten minutes, and the credibility of the whole programme went with it.

The Practice

Build a portfolio of documented wins: specific decisions made differently, with measured before-and-after. A defensible collection of small stories beats one spurious big number.

The Decision Test

Attribution gets harder the closer you get to value. Instrument behaviour change, build a portfolio of documented wins, and claim a defensible share: not the whole thing, and not nothing.

Apply Part Eight

If you cannot measure it, you cannot prove it. Start here.

See your overall maturity across all eight domains, then test your next request.
