Most engineering leaders know the feeling of staring at a long feature roadmap and quietly asking, “Do we know if any of this will matter to customers?” Evidence-driven engineering is about refusing to guess, and insisting on proof.
In a recent live episode of the Scaling Tech Podcast, host Arin Sime and AgilityFeat COO Mariana Lopez spoke with David Bland, co-author of Testing Business Ideas and founder of Precoil. David works with leadership teams as what he calls an “independent arbiter of reality”, helping them test growth strategy against facts, not wishful thinking.
This post distills that conversation for CTOs who want their teams to ship less code that nobody uses and more products that actually move the needle. Scroll down to keep reading, or watch the full episode:
What Is Evidence-Driven Engineering?
David Bland’s core point is simple: most companies still treat engineering as a delivery function, not as a partner in deciding what to build.
Strategies are decided in a small room, then “waterfalled” down into roadmaps and backlogs. Teams are told what to build, then praised for “iterating” inside a tiny box. As David put it, teams are “iterating within a really confined space.”
Evidence-driven engineering changes that. It pulls engineering leadership into the strategy conversation, and it frames that strategy as a set of testable risks.
The Three Circles: Desirability, Viability, Feasibility
Bland uses three overlapping circles to describe the types of risk every new product carries:
Desirability (Do they want or need this?)
- Customer problems, unmet needs, demand, willingness to change. Usually led by research, design, marketing, sales.
Viability (Should we do this?)
- Revenue, cost, margins, business model, fit with company strategy. Often led by product, finance, and leadership.
Feasibility (Can we do this?)
- Technology, architecture, compliance, security, legal, operations. This is where engineering “shines,” in his words.
In many companies, these circles are separate silos. Evidence-driven engineering expects them to overlap. CTOs and engineering leaders are not only guardians of feasibility. They are partners in challenging desirability and viability assumptions as well.
“Engineering shines on ‘can we build this?’ But people need to hear that answer sooner, not after promises are made.”
Learn from Experiments, Not Opinions
A lot of expensive failures start with a simple pattern: someone decides “we know the market,” builds for a year, then discovers they were wrong.
Arin shared an example: a developer he knows and that developer’s co-founder built a clickable prototype for a B2B construction marketplace. The co-founder, a domain expert, refused to show it to actual customers. For more than a year the developer built the co-founder’s idea in isolation. When they finally showed it, it was only to investors, with no users, no metrics, and only a few verbal “yeah, I’d probably use this” comments.
Investors asked the only question that matters at that point: “Where’s your traction?” They had none. The startup died.
David’s rule of thumb: raise from customers before you raise from investors. Investors want proof that customers care. That proof only comes from testing with the market early.
A Better Pattern: Test with Clear Success Criteria
Mariana shared a counterexample from an AgilityFeat client. The client wanted users to fill out extra forms and proposed a modal that would nag them until they complied.
She pushed back, since the modal hurt the experience and likely would not fix the real problem. They compromised by turning it into an explicit experiment:
- Hypothesis: a modal prompt will get at least 10% of users to click and start the forms.
- Test: run the modal to production users for two weeks.
- Success criteria: keep the modal only if click-through is at least 10%.
Actual result: click-through was under 0.5%. The team removed the modal and used that as evidence to get the budget for proper user research instead. David highlighted the structure behind this:
“You have your hypothesis, you have your test, you have your metrics, and then you have your success criteria. That last box is the hardest to fill in.”
Without that success line drawn in advance, teams fall into “after-the-fact rationalization.” You say you need 10%, get 0.5%, then convince yourself that 0.5% is fine because the feature looks good in a demo. Evidence-driven engineering demands that teams write success criteria down up front, agree on them with leadership, and honor the result even when it hurts egos.
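To make that discipline concrete, here is a minimal sketch of pre-registering an experiment in code, using the modal story above. The class and field names are our own illustration; the 10% threshold comes from the example.

```python
from dataclasses import dataclass

@dataclass
class Experiment:
    """A pre-registered experiment: the success line is drawn before any results exist."""
    hypothesis: str
    metric: str
    success_threshold: float  # agreed with leadership up front

    def decide(self, observed: float) -> str:
        # Honor the pre-registered criterion, even when it hurts egos.
        return "keep" if observed >= self.success_threshold else "remove"

modal_test = Experiment(
    hypothesis="A modal prompt will get users to start the extra forms",
    metric="click-through rate over two weeks in production",
    success_threshold=0.10,  # the 10% line agreed before launch
)

print(modal_test.decide(observed=0.005))  # actual 0.5% -> "remove"
```

The point is not the code; it is that the decision rule exists before the data does, so nobody can move the goalposts afterward.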
Testing with Customers, Not on Them
Many leaders resist experiments because they fear hurting the brand. They worry that unpolished ideas will “confuse users” or make the company look uncertain. David reframed this as testing with customers:
- Be transparent that you are testing an early idea.
- Recruit users into a partnership, rather than surprising them.
- In B2B, shift from always pitching to listening and co-creating.
This is a culture shift, especially where sales has taught teams never to show anything imperfect. Evidence-driven engineering expects teams to be honest about uncertainty and invite customers into the learning process.
Stop Zombie Projects Before They Drain Your Org
David used a vivid phrase most CTOs will recognize: zombie projects.
Key Characteristics of Zombie Projects
- Keep pivoting and changing scope.
- Never hit meaningful revenue or impact.
- Never get killed, because nobody owns the decision.
- Consume time, morale, and budget indefinitely.
He often hears leadership say, “What if we give it a little more time, or a little more money, or a few more people?” The sunk cost fallacy keeps the project undead. In one recent engagement, David helped a client assess their portfolio and kill 43% of the work in progress. That freed up about 12 million dollars of spend. The pattern he recommends is simple:
- Define metrics and review dates before the work starts.
- Decide in advance what “pivot”, “park”, or “kill” looks like.
- Treat stopping as a valid and healthy outcome.
Most companies have a process for starting work, and almost none for stopping it. Evidence-driven engineering needs both.
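One way to make stopping as legible as starting is to write the gate down before the work begins, like any other artifact. Here is a minimal sketch, with metric names and thresholds as placeholder assumptions rather than anything from the episode:

```python
from enum import Enum

class Decision(Enum):
    CONTINUE = "continue"
    PIVOT = "pivot"   # the bet survives, the approach changes
    PARK = "park"     # paused for reasons outside the metric
    KILL = "kill"     # stopping is a valid, healthy outcome

def review_gate(observed: float, continue_at: float, pivot_at: float,
                blocked_externally: bool = False) -> Decision:
    """Apply the pivot/park/kill rules agreed before the work started."""
    if blocked_externally:
        return Decision.PARK
    if observed >= continue_at:
        return Decision.CONTINUE
    if observed >= pivot_at:
        return Decision.PIVOT
    return Decision.KILL

# Placeholder numbers: the thresholds were set at kickoff, not at the review.
print(review_gate(observed=12, continue_at=50, pivot_at=20).value)  # "kill"
```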
Use EMT to Turn Strategy into Testable Risks
When a roadmap comes to the table, David does not attack it. Instead, he defuses the debate with one question:
“What would have to be true for this to work?”
EMT: Extract, Map, Test
David steers teams to start with desirability and viability, not feasibility.
- Extract assumptions: Ask leaders and teams what must be true about customers, revenue, cost, tech, and partners for the idea to succeed. Capture all those beliefs.
- Map assumptions to risk areas: Cluster the answers into desirability, viability, and feasibility. You will often find many “do they want it and will they pay enough” risks, even when people think tech risk is the hardest part.
- Test in a planned way: Build a simple experiment plan, often in a spreadsheet, with columns like Assumption, Risk type (D/V/F), Test you will run, Owner, When you will run it, and Priority (a sketch follows this list).
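As a rough illustration, that spreadsheet can also live as structured data next to the code, so the plan is versioned and reviewable. The fields mirror David’s columns; the class name and sample rows are our own:

```python
from dataclasses import dataclass
from enum import Enum

class Risk(Enum):
    DESIRABILITY = "D"  # do they want or need this?
    VIABILITY = "V"     # should we do this?
    FEASIBILITY = "F"   # can we do this?

@dataclass
class Assumption:
    belief: str     # what must be true for the idea to work
    risk: Risk
    test: str       # the experiment you will run
    owner: str
    run_by: str     # when you will run it
    priority: int   # 1 = riskiest, test first

plan = [
    Assumption("Contractors will pay enough for faster sourcing",
               Risk.DESIRABILITY, "10 discovery interviews", "PM", "next sprint", 1),
    Assumption("Unit margin clears our target",
               Risk.VIABILITY, "pricing survey plus cost model", "Finance", "this quarter", 2),
]

# Review the riskiest assumptions first.
for a in sorted(plan, key=lambda a: a.priority):
    print(f"[{a.risk.value}] {a.belief} -> {a.test} ({a.owner}, {a.run_by})")
```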
“You can build almost anything, but a lot of it fails because customers do not want it or will not pay enough for it.”
Leadership reviews should then ask different questions:
- What are our riskiest assumptions right now?
- Which ones did we test since our last review?
- Has our overall risk increased or decreased?
This keeps strategy honest, and it gives engineering leaders a clear way to talk about risk reduction, not just delivery dates.
Measure What Matters: The What and the Why
Evidence-driven engineering needs instrumentation, but David warned against an obsession with clickstream data alone. Quantitative tools, such as feature-level analytics and product funnels, tell you what users did. They never tell you why. He encourages teams to:
- Add analytics that match the experiment’s success criteria.
- Avoid making big bets based only on A/B tests and click rates.
- Keep a steady habit of weekly customer conversations to uncover the reasons behind behavior.
For lifecycle metrics, he uses “pirate metrics” (AARRR) as a simple shared language:
- Acquisition: how people arrive.
- Activation: first meaningful action.
- Retention: how and how often they come back.
- Referral: how they invite others.
- Revenue: where and how you get paid.
Cross-functional leaders should define what each of these means for their product, then instrument a baseline. After that, each experiment should target a specific part of the funnel and ask, “Did we move this number in the right direction?”
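As a sketch of what that baseline might look like, assuming you already log raw events per user: the event names and the stage mapping below are illustrative assumptions, not from the episode.

```python
from collections import defaultdict

# Illustrative mapping from product events to pirate-metric stages.
STAGE_EVENTS = {
    "acquisition": {"signup_page_view"},
    "activation": {"first_project_created"},  # your "first meaningful action"
    "retention": {"weekly_return_visit"},
    "referral": {"invite_sent"},
    "revenue": {"payment_succeeded"},
}

def funnel_baseline(events):
    """Count distinct users reaching each AARRR stage from raw event logs."""
    users_per_stage = defaultdict(set)
    for event in events:
        for stage, names in STAGE_EVENTS.items():
            if event["name"] in names:
                users_per_stage[stage].add(event["user_id"])
    return {stage: len(users_per_stage[stage]) for stage in STAGE_EVENTS}

sample = [
    {"user_id": 1, "name": "signup_page_view"},
    {"user_id": 1, "name": "first_project_created"},
    {"user_id": 2, "name": "signup_page_view"},
]
print(funnel_baseline(sample))
# {'acquisition': 2, 'activation': 1, 'retention': 0, 'referral': 0, 'revenue': 0}
```

With a baseline like this in place, each experiment can target one stage and answer, “Did we move this number in the right direction?”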
For CTOs scaling teams, this ties directly into how you build and evolve your platform. If you are wrestling with technical bottlenecks that block this level of measurement, the advice in this episode pairs well with guidance from the AgilityFeat blog on scaling DevOps infrastructure for growing teams.
AI, Vibe Coding, and MVP Discipline
The conversation also touched on “vibe coding” with generative AI tools that can produce designs or working prototypes in minutes. David sees two main issues:
- Rework is hidden. AI-generated designs often have poor information hierarchy. AI-generated code is usually not production quality. Design and engineering still need to redo much of the work, but stakeholders may think they are “80% done.”
- Usability feedback is mistaken for value. With low-fidelity sketches, users talk about value: “This is useful” or “I would not use this.” With polished prototypes, they talk about details: “Make this button green” or “Change this flow.”
Teams then infer that because users suggest changes, they must also see value. Many still will not pay or adopt the product.
David’s concern is that AI speeds up the wrong kind of learning. It helps you polish faster, not understand demand faster. He repeated a lesson from early MVP work: sometimes you must throw away an MVP even if it “worked,” because it cannot scale safely or cleanly.
For a deeper look at how GenAI is changing team workflows, while still needing human judgment, AgilityFeat has a useful recap on generative AI transforming software development roles.
Pick the Right Customers and Challenge “I Am the Customer”
A common anti-pattern arises when someone senior says, “I am the customer, I would use this.” That may be true for them, but often they are not a representative segment.
David uses a simple three-step filter to define who you should talk to and test with:
- Do they have the problem? You can observe clear symptoms or pain.
- Are they aware of the problem? They recognize it as a problem, not just background noise.
- Are they actively seeking a solution? There is evidence in their behavior: search queries, workarounds, forum posts, existing tools.
Approach each group differently. People who have the problem but are not seeking a fix may only feel a mild “headache.” Those who have the problem, know it, and are seeking help feel a “migraine.” They are where early validation usually makes the most sense. When leaders push an idea based only on their own taste, David sometimes asks a sharper question:
“Would you bet your retirement on this?”
Many would not. That gap between how confidently they will spend company money and how carefully they spend their own is a signal that more evidence is needed. For long B2B sales cycles, he also advises breaking big, slow experiments into smaller “leading indicator” steps that move from words to action:
- Interviews and discovery calls.
- Surveys or forms where people type their intent.
- Non-binding letters of intent.
- Small prepayments or deposits.
Each step closes the gap between what customers say and what they do.
How AgilityFeat Can Support Evidence-Driven Engineering
For CTOs, building an evidence-driven culture is not only a method problem, it is also a team problem. You need product, UX, and engineering capacity that can run experiments without slowing your roadmap to a halt. AgilityFeat specializes in building nearshore teams across Latin America that plug into your stack and ways of working. That can mean:
- Nearshore squads who build and instrument MVPs.
- Staff augmentation focused on product engineering or DevOps.
- A longer-term build-operate-transfer model to establish your own center.
If you are considering expanding with nearshore talent, our guide on how nearshoring scales tech teams and our overview of staff augmentation benefits for fast-growing tech teams are practical starting points.
The key is to scale with people who are comfortable treating roadmaps as hypotheses and who know how to turn those into disciplined experiments.
Put Evidence at the Center of Your Roadmap
Evidence-driven engineering is not a new framework to roll out. It is a mindset shift for how your company decides what to fund, what to ship, and what to stop. From David Bland’s experience, a few habits are enough to start:
- Bring engineering leadership into strategy early, not after the roadmap is frozen.
- Frame big bets as “what would have to be true” and write down your assumptions.
- Design tests with clear success criteria, then respect the result.
- Build basic instrumentation and talk to customers every week so you see both the what and the why.
- Give your teams permission to kill zombie work and throw away MVPs that have served their learning purpose.
For CTOs, the question is not whether your teams can build the product. Engineering can build almost anything. The question is whether you are building the right thing, something customers value enough to change behavior and pay for.
If you want more conversations like this, you can explore past episodes at scalingtechpod.com.