Engineering Metrics · Knowledge Graph Infographic

Why We Measure Tickets

A deep exploration of why engineering teams default to ticket counts, how Goodhart's Law corrupts the signal, and how to move toward meaningful measurement.

4
DORA Metrics that actually predict performance
10
Defined terms in this knowledge graph
7
Steps to transition beyond ticket metrics
12
FAQ items covering the full picture
1
Law that explains why every metric eventually fails
5
Ways engineers will game whatever you measure

Why Tickets?

Software engineering work is invisible until it ships. Issue trackers were invented to make that invisible work legible to teams and managers — a coordination tool that gradually became a performance measurement instrument.

👁️
The Visibility Problem

Unlike manufacturing, engineering work-in-progress is invisible. Managers can't walk the factory floor and see what's happening. Issue trackers emerged as a way to make the work legible — a reasonable solution to a real problem.

📊
Legibility ≠ Measurement

Making work visible is not the same as measuring productivity. Ticket data is operationally useful for coordination and bottleneck detection. The mistake is treating ticket throughput as a signal of engineering productivity.

🔄
Accountability Proxies

In many organizations, ticket metrics serve a social function: signaling to management that engineers are working and money is being spent productively. This social role makes them sticky even when everyone acknowledges they're imperfect.

🔧
Cheap to Collect

Ticket data is always available and auto-generated. It requires no instrumentation investment. It produces daily updates. Alternative metrics — DORA, outcome signals — take time and infrastructure to build. Convenience wins by default.

Goodhart's Law at Work

The moment ticket count becomes a performance target, Goodhart's Law kicks in and the metric stops measuring what it was designed to measure.

"When a measure becomes a target, it ceases to be a good measure."
— Charles Goodhart, economist · Applied to every engineering velocity dashboard since 2005

The Gaming Behaviors That Follow

Ticket Splitting
One large ticket becomes ten small ones to inflate throughput numbers.
Premature Closure
Tickets are marked done before the work is truly complete to hit the weekly count.
Complexity Avoidance
Engineers avoid hard problems that take weeks to resolve, because long-running work depresses velocity.
Estimate Inflation
Story points are inflated to keep velocity stable when work gets harder.
Shadow Work
Architectural improvements, documentation, and testing disappear into invisible effort.
The net result: high-velocity dashboards that signal busyness but deliver diminishing user value.

Output vs. Outcome

The root confusion underlying ticket metrics: conflating what engineers produce (output) with what users and the business experience as a result (outcome). Organizations that optimize for output systematically underinvest in quality.

⚠️ Output Metrics (what we count)
Tickets closed per sprint — number of work items resolved
Story points velocity — estimated effort units completed
Lines of code — raw volume of code produced
Features shipped — count of feature releases per period
PR count / merge rate — pull request volume and merge frequency
✅ Outcome Metrics (what we change)
Feature adoption rate — % of users actively using a shipped feature
Error/incident rate — frequency and severity of production failures
User retention — whether users return and engage over time
Page load / latency — real-world performance experienced by users
Revenue / conversion impact — measurable business value delivered
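
To make the difference concrete, here is a minimal Python sketch using hypothetical event shapes (not a real tracker or analytics API). The output metric counts activity inside the team's tooling; the outcome metric only moves if user behavior actually changes.

```python
from dataclasses import dataclass

@dataclass
class TrackerEvent:
    ticket_id: str
    status: str          # e.g. "closed"

@dataclass
class UsageEvent:
    user_id: str
    feature: str

def tickets_closed(events: list[TrackerEvent]) -> int:
    # Output metric: easy to inflate by splitting tickets or closing early.
    return sum(1 for e in events if e.status == "closed")

def adoption_rate(events: list[UsageEvent], feature: str, active_users: int) -> float:
    # Outcome metric: fraction of active users who actually used the feature.
    users = {e.user_id for e in events if e.feature == feature}
    return len(users) / active_users if active_users else 0.0
```

Nothing an engineer does inside the tracker can move `adoption_rate`; only shipped value can.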

DORA Metrics

The four DORA (DevOps Research and Assessment) metrics are empirically validated predictors of software delivery performance and organizational outcomes — developed through multi-year research by Google. Unlike ticket counts, they measure the full delivery pipeline and are hard to game because they reflect real system behavior.

D1
Deployment Frequency
How often does your organization successfully release to production? Elite teams deploy on demand — multiple times per day. This measures batch size and delivery pipeline maturity.
D2
Lead Time for Changes
How long does it take to go from code committed to code running in production? Lead time captures the full delivery pipeline, not just execution speed.
D3
Change Failure Rate
What percentage of deployments cause a failure that requires rollback, hotfix, or incident response? Elite teams have failure rates of 0–15%. This balances speed against stability.
D4
MTTR
Mean Time to Recovery — how long does it take to restore service after a production incident? This measures system resilience, observability maturity, and team effectiveness under pressure.
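
As a concrete illustration, the sketch below computes all four metrics from flat logs of deployments and incidents. The `Deploy` and `Incident` record shapes, field names, and the `dora_metrics` helper are assumptions for this example rather than a standard schema; a real pipeline would derive them from CI/CD and incident-tracking data, as described in step 3 below.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Deploy:
    committed_at: datetime   # timestamp of the deployed commit
    deployed_at: datetime    # when it reached production
    failed: bool             # required rollback, hotfix, or incident response

@dataclass
class Incident:
    started_at: datetime
    resolved_at: datetime

def dora_metrics(deploys: list[Deploy], incidents: list[Incident], window_days: int) -> dict:
    # Assumes non-empty inputs over a single reporting window.
    lead_times = sorted(d.deployed_at - d.committed_at for d in deploys)
    recoveries = [i.resolved_at - i.started_at for i in incidents]
    return {
        "deployment_frequency": len(deploys) / window_days,                    # D1: deploys per day
        "lead_time_for_changes": lead_times[len(lead_times) // 2],             # D2: median commit-to-prod
        "change_failure_rate": sum(d.failed for d in deploys) / len(deploys),  # D3: share of bad deploys
        "mttr": sum(recoveries, timedelta()) / len(recoveries),                # D4: mean recovery time
    }
```

Because every number is derived from timestamps of real system events, inflating a metric requires actually deploying more often, failing less, or recovering faster, which is the point.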

Engineering Culture and Metrics

Metric choices powerfully shape engineering culture. What you measure is what you value. What you value is what your team optimizes for.

🚨 Ticket-Metric Culture
Rewards busyness over impact
Discourages complex, high-value work
Creates incentives to hide difficulty
Punishes honest estimates of hard problems
Underinvests in quality and architecture
Generates velocity theater, not delivery
Erodes trust between engineering and product
✅ Outcome-Metric Culture
Rewards impact on users and business
Encourages tackling root causes
Creates psychological safety for honest reporting
Values quality, stability, and craftsmanship
Connects engineering work to user experience
Builds ownership and engineering accountability
Supports continuous learning from failure

7 Steps to Move Beyond Ticket Metrics

Transitioning to better measurement is an organizational change project, not just a tooling change. These steps are designed to be sequential — each builds on the trust established by the previous one.

1
Audit What You Currently Measure and Why

Inventory all metrics your team tracks in your issue tracker. For each metric, ask: what decision does this inform? What behavior does it incentivize? This audit surfaces which metrics coordinate work and which substitute for trust.
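
The audit itself can be as simple as a table. A sketch with hypothetical entries follows; the columns matter more than the rows.

```python
# Hypothetical audit entries for illustration; real rows come from your own tracker.
metric_audit = [
    {
        "metric": "tickets closed per sprint",
        "decision_informed": "none identified",  # red flag: likely substitutes for trust
        "behavior_incentivized": "ticket splitting, premature closure",
    },
    {
        "metric": "work-in-progress per engineer",
        "decision_informed": "where to rebalance load this week",  # coordinates work
        "behavior_incentivized": "finishing current work before starting new",
    },
]
```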

2
Define What You Actually Want to Know

Name the underlying question before choosing new metrics. 'Is the team productive?' is too vague. Better: 'How quickly do we deliver value?' · 'How often does our software fail?' · 'How fast do we recover?' Metrics are only meaningful when tied to an explicit question.

3
Instrument DORA Metrics

Set up data pipelines for all four DORA metrics: Deployment Frequency, Lead Time for Changes, Change Failure Rate, and MTTR. These require CI/CD integration and incident tracking but are highly resistant to gaming because they reflect actual system behavior.
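
The capture side usually starts with emitting one structured event per deployment from the pipeline itself. Below is a minimal sketch, assuming a JSON-lines event log and a deploy job that can run a script; the schema and the `record_deploy` helper are illustrative, not a standard API.

```python
import json
import subprocess
from datetime import datetime, timezone

def record_deploy(failed: bool = False, log_path: str = "deploys.jsonl") -> None:
    # Timestamp of the commit being deployed: a simple proxy for the start
    # of Lead Time for Changes (commit to production).
    committed_at = subprocess.run(
        ["git", "log", "-1", "--format=%cI"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    event = {
        "committed_at": committed_at,
        "deployed_at": datetime.now(timezone.utc).isoformat(),
        "failed": failed,  # flip later if this deploy triggers rollback or hotfix
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")
```

Run at the end of a deploy job, this produces exactly the fields a computation like the `dora_metrics` sketch above consumes.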

4
Introduce Outcome Metrics Alongside Engineering Metrics

Partner with product and data to identify user-facing outcome metrics you can connect to engineering work. Even imperfect outcome signals reframe the conversation from 'how many tickets did we close?' to 'what changed for users?'

5
Build Psychological Safety Before Increasing Transparency

Have explicit conversations about how data will — and will not — be used. Commit to treating problems revealed by metrics as systemic opportunities, not individual failures. Without this foundation, any metric will be gamed or hidden. Psychological safety is a prerequisite.

6
Gradually Shift Stakeholder Conversations

Use new metrics in stakeholder updates alongside existing ones. Narrate the story: 'We shipped X features. Our lead time improved by Y days. Our change failure rate is Z%.' Prove that new metrics are richer and more actionable before asking stakeholders to give up the familiar velocity chart.

7
Iterate and Revisit Your Metric System Quarterly

No metric system is permanent. Schedule quarterly reviews: Are our metrics still answering the right questions? Are any being gamed? Have we introduced new work types needing new measurement? Healthy metric systems evolve as organizations and products mature.

FAQ

Why do engineering teams default to measuring tickets in the first place?

Tickets are visible and countable. In the absence of clear outcome signals, managers reach for whatever data is readily available — and issue-tracking systems produce a continuous stream of quantifiable events: tickets opened, tickets closed, points completed. This makes them tempting as productivity proxies even when they don't accurately capture the value engineering is delivering.

What is Goodhart's Law and why does it matter for ticket metrics?

Goodhart's Law states that when a measure becomes a target, it ceases to be a good measure. When ticket count or velocity is used as a performance target, engineers naturally (and rationally) optimize for closing tickets rather than solving underlying problems. This corrupts the metric and creates perverse incentives that reduce actual engineering effectiveness.

What behaviors emerge when ticket count becomes a performance target?

Teams exhibit metric gaming behaviors: splitting large tickets into many small ones, avoiding hard problems that don't resolve quickly, inflating story-point estimates, closing tickets prematurely, and investing in low-impact but easily ticketed work over high-impact architectural improvements that are difficult to represent in a tracker.

What are DORA metrics and why are they considered more meaningful?

DORA metrics — Deployment Frequency, Lead Time for Changes, Change Failure Rate, and MTTR — were developed through multi-year research by Google's DevOps Research and Assessment team. They are empirically validated predictors of software delivery performance and organizational outcomes. Unlike ticket counts, they measure the full delivery pipeline and are harder to game because they reflect real system behavior.

How do ticket metrics affect engineering culture over time?

Sustained use of ticket metrics as performance signals cultivates a culture of busyness over impact. Engineers learn to look busy rather than be impactful. It discourages investment in quality, documentation, testing, and architectural work. It also creates incentives to hide complexity and avoid honest conversations about what's actually hard.

Are there situations where tracking ticket data is useful?

Yes — ticket data is valuable for operational coordination, identifying bottlenecks in workflow, and capacity planning. The problem is not the data itself but using ticket throughput as a performance target. Descriptive use is fine; prescriptive use creates dysfunction.

What are outcome metrics and how do they differ from ticket metrics?

Outcome metrics measure what users or the business actually experience as a result of engineering work: error rates, feature adoption, retention, latency, revenue impact. They are harder to collect and lag the work that caused them, but they are resistant to gaming because they reflect reality in the world rather than activity inside the team's tooling.

How does lead time differ from cycle time?

Lead time measures from when work is requested until it ships; cycle time measures from when active development begins until it ships. Lead time captures queue and prioritization delays; cycle time isolates execution speed. Together they reveal where bottlenecks live — in the backlog or in delivery.
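
In code, the distinction is just which timestamp you subtract from the ship date. A minimal sketch, with hypothetical field names:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Ticket:
    requested_at: datetime   # entered the backlog
    started_at: datetime     # active development began
    shipped_at: datetime     # running in production

def lead_time(t: Ticket) -> timedelta:
    return t.shipped_at - t.requested_at   # includes queue and prioritization delay

def cycle_time(t: Ticket) -> timedelta:
    return t.shipped_at - t.started_at     # execution speed only
```

A large gap between the two points at the backlog, not the team.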

What role does psychological safety play in metrics?

It's a prerequisite. In low-safety environments, metrics become weapons. Engineers hide problems, avoid surfacing risk, and game whatever is being measured to avoid punishment. Better metrics only work in contexts where engineers trust that honest reporting about difficulty, failure, or slowness will be met with curiosity rather than blame.

Why is it so hard to stop using ticket metrics even when leaders know they're flawed?

Ticket metrics are cheap to collect and produce daily updates. Executives feel pressure to demonstrate accountability through numbers. Velocity charts provide a reassuring (if misleading) sense of control. Changing metrics means changing the implicit accountability contract — uncomfortable even when clearly justified.

How can engineering leaders begin the transition to better measurement?

Start by being transparent about why current metrics are being used and what question they're actually answering. Introduce complementary metrics alongside ticket data rather than replacing it overnight. Build the data infrastructure. Tell a story with the new metrics before retiring the old ones. Earn organizational trust in the new system gradually.

What does Integrated by Design add to this discussion?

'Integrated by Design' explores how engineering organizations can be structured to deliver effectively through platform thinking, clear ownership, and aligned incentives. Its lens on organizational design is directly relevant: you cannot measure your way to good outcomes without first designing the system in which measurement will live.

Glossary

Ticket Metrics
The practice of measuring engineering productivity using counts, rates, or throughput of tickets in issue-tracking systems.
Goodhart's Law
When a measure becomes a target, it ceases to be a good measure — named after economist Charles Goodhart.
Velocity Theater
The antipattern of using sprint velocity as a proxy for productivity, leading to inflated estimates, ticket splitting, and busyness theater.
Engineering Productivity
The capacity to deliver value reliably, sustainably, and at speed — encompassing quality, architecture, team learning, and customer outcomes.
DORA Metrics
Deployment Frequency, Lead Time for Changes, Change Failure Rate, and MTTR — empirically validated engineering performance predictors.
Metric Gaming
Rational but counterproductive behavior of optimizing for whatever is being measured rather than the underlying goal.
Outcome Metrics
Measurements focused on business or user results produced by engineering work — resistant to gaming because they reflect reality.
Lead Time
Elapsed time from when work is requested until it ships — captures queue delays and full delivery pipeline efficiency.
Cycle Time
Time from when active development begins until the work ships — isolates execution speed within the team's control.
Engineering Culture
Shared values, norms, and behaviors that shape how a team works. Metric choices directly and powerfully influence engineering culture.

Further Reading