Building AI Products: 4 Core Lessons from Productboard's AI PMs
The hype around AI products is loud, but building something customers actually trust and want to use? That's a different story.
At Productboard, we're not just talking about AI—we're building it into the core of our platform. Chris Patton, Principal AI PM for Productboard Pulse, and Dominik Ilichman, Senior AI PM for Productboard Spark, recently sat down to share the lessons they've learned from shipping AI products that analyze massive volumes of customer feedback and power product discovery workflows.
Their insights cut through the noise. No buzzwords, no hand-waving—just practical wisdom from the trenches of building customer-centric AI products. Here are the four core challenges every AI PM needs to solve, and how Productboard's team is tackling them.
1. Choosing the Right AI System Architecture for Your Problem
Not all AI systems are created equal. One of the biggest mistakes teams make is jumping straight to the flashiest solution without understanding what type of system their problem actually requires.
Chris and Dominik break AI systems into three fundamental types:
Interpretive Systems are all about retrieval and synthesis. These systems pull data from multiple sources, analyze it, and present insights. Think: "What's happening in my customer feedback?" or "Summarize this product's performance." The key challenge here is building sophisticated retrieval mechanisms that can accurately surface the right information from massive datasets.
Constrained Action Systems operate within a narrow band of well-defined workflows. You know exactly what actions can be taken, and the outcomes are predictable. These systems are perfect for specific tasks like updating statuses or triggering notifications based on certain conditions. The workflows are orchestrated, the goals are clear, and the AI supports a predetermined path.
Open-Ended Agentic Systems are the most complex. These systems can take multiple different actions, execute multi-step workflows, and operate non-deterministically. They require sophisticated planning, sub-agents, and verification mechanisms to ensure accuracy as they navigate complex, open-ended tasks.
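To make the middle category concrete, here's a toy sketch of a constrained action system (purely illustrative, not Productboard's code): the model may suggest an action, but only a whitelist of predefined actions with predictable outcomes can ever run.

```python
# Hypothetical sketch: a constrained action system is essentially a
# dispatch table — the AI only chooses among predefined, safe actions.

ALLOWED_ACTIONS = {
    "update_status": lambda item, status: {**item, "status": status},
    "notify_owner": lambda item, _: {**item, "notified": True},
}

def run_constrained_step(item: dict, action: str, arg: str) -> dict:
    """Execute one step of a predetermined workflow.

    The model may *suggest* an action, but only whitelisted actions
    with predictable outcomes ever run.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"Action {action!r} is outside the allowed workflow")
    return ALLOWED_ACTIONS[action](item, arg)

ticket = {"id": 42, "status": "open"}
ticket = run_constrained_step(ticket, "update_status", "triaged")
print(ticket["status"])  # triaged
```

The point of the whitelist is that the workflow stays orchestrated: an open-ended agentic system would instead plan its own sequence of steps, which is exactly what makes it harder to verify.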
How This Plays Out at Productboard
Productboard Pulse is built as an interpretive system. It analyzes anywhere from 200,000 to a million pieces of customer feedback. Product managers rely on Pulse to surface patterns, synthesize insights, and help them understand what customers actually need. Because these PMs are making critical product decisions based on this analysis, accuracy isn't optional. That's why Pulse required building sophisticated pre-processing, data management, and context window management systems to ensure reliability.
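A toy illustration of the kind of context window management an interpretive system needs (simplified, not Pulse's actual implementation): rank retrieved feedback snippets by relevance, then pack only as many as fit within a fixed token budget.

```python
# Toy sketch of context-window management for an interpretive system:
# rank candidate feedback snippets by a relevance score, then pack only
# as many as fit within a fixed token budget. (Illustrative only; real
# systems use embeddings and proper tokenizers.)

def rough_token_count(text: str) -> int:
    # Crude heuristic: ~1 token per word. Real code would use a tokenizer.
    return len(text.split())

def pack_context(snippets: list[tuple[float, str]], budget: int) -> list[str]:
    """Select the highest-relevance snippets that fit within `budget` tokens."""
    packed, used = [], 0
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = rough_token_count(text)
        if used + cost <= budget:
            packed.append(text)
            used += cost
    return packed

feedback = [
    (0.9, "Export to CSV fails on large boards"),
    (0.4, "Love the new dashboard colors"),
    (0.8, "CSV export times out after 30 seconds"),
]
print(pack_context(feedback, budget=15))
```

Even this greedy version shows the core tradeoff: at feedback volumes in the hundreds of thousands, you can never send everything to the model, so what you choose to include determines what insights are even possible.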
Productboard Spark, on the other hand, is an agentic system. It assists product managers with discovery, writes product specifications, and facilitates team collaboration through a conversational interface. Spark works on top of all your strategic product data—customer feedback, strategic materials, product specs, and backlogs. Because this dataset can be enormous, Spark uses an agent graph architecture with sub-agents that can delegate tasks and pull only the most relevant information into context at any given time.
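A hypothetical sketch of that agent-graph pattern (not Spark's actual architecture): a planner node routes each task to a specialized sub-agent, and each sub-agent receives only the slice of workspace data relevant to its job.

```python
# Hypothetical sketch of an agent-graph pattern: a planner routes a task
# to specialized sub-agents, and each sub-agent sees only the slice of
# data relevant to its job. (Not Spark's actual architecture.)

from typing import Callable

def feedback_agent(context: dict) -> str:
    return f"Top theme in feedback: {context['feedback'][0]}"

def spec_agent(context: dict) -> str:
    return f"Draft spec aligned to goal: {context['strategy']}"

SUB_AGENTS: dict[str, tuple[Callable[[dict], str], list[str]]] = {
    # agent name -> (handler, workspace keys it is allowed to see)
    "summarize_feedback": (feedback_agent, ["feedback"]),
    "draft_spec": (spec_agent, ["strategy"]),
}

def planner(task: str, workspace: dict) -> str:
    """Route a task to a sub-agent, passing only the relevant context."""
    handler, keys = SUB_AGENTS[task]
    scoped = {k: workspace[k] for k in keys}  # keeps context windows small
    return handler(scoped)

workspace = {
    "feedback": ["slow exports", "confusing filters"],
    "strategy": "reduce time-to-insight",
}
print(planner("draft_spec", workspace))
```

The scoping step is the interesting part: delegation isn't just about dividing labor, it's about keeping each sub-agent's context window focused so accuracy doesn't degrade as the dataset grows.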
The lesson: Start by understanding your problem type. Don't build an agentic system when a constrained workflow would work better. Don't settle for simple retrieval when you need sophisticated synthesis.
2. Understanding Foundational Model Capabilities (And Their Limits)
Once you know what type of system you're building, you need to pick the right model. The "big three" foundation models aren't interchangeable. They each have distinct strengths, and choosing the wrong one can sink your product before it launches.
The Model Landscape
Anthropic’s Claude excels at code generation, structured processes, and sophisticated tool use. If you're building developer tools or complex multi-step reasoning workflows, Claude is your friend.
OpenAI's GPT models shine in natural language tasks and creative writing. They're consumer-oriented and excel at content generation, conversational interfaces, and general-purpose applications.
Google Gemini brings strengths in workplace integration and multimodal processing. It's built for business-oriented applications that need to handle different types of data seamlessly.
But this landscape shifts constantly. Chris noted that Google's Gemini 3.0 release already changed some of these dynamics. The frontier models move fast, and you need to keep tabs on their evolving capabilities.
Beyond the Model: What Really Matters
Choosing a model is just the beginning. The real work is in understanding:
- How models fail and designing for graceful degradation
- What "good enough" looks like for your specific use case
- Cost versus accuracy tradeoffs at scale
- Latency requirements for your user experience
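One way to make the graceful-degradation point concrete is a fallback wrapper: try the primary model, fall back to a cheaper one on failure, and surface an honest message rather than a silent blank. This is an illustrative sketch only; `call_model` and the model names are placeholders, not a real provider API.

```python
# Illustrative fallback pattern for graceful degradation. `call_model`
# is a stand-in for a real provider SDK call; the model names here are
# placeholders, not real models.

def call_model(model: str, prompt: str) -> str:
    # Placeholder: a real implementation would call a provider API here.
    if model == "primary-model":
        raise TimeoutError("primary model exceeded latency budget")
    return f"[{model}] summary of: {prompt}"

def answer_with_fallback(prompt: str) -> str:
    for model in ("primary-model", "cheaper-fallback-model"):
        try:
            return call_model(model, prompt)
        except (TimeoutError, ConnectionError):
            continue  # degrade gracefully to the next option
    return "Sorry — the AI assistant is unavailable right now."

print(answer_with_fallback("key themes in this week's feedback"))
```

Note that the fallback chain encodes the cost-versus-accuracy and latency tradeoffs directly: which models appear in the tuple, and in what order, is a product decision as much as an engineering one.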
At Productboard, the team constantly evaluates whether their model choices still serve their users' needs as both models and requirements evolve. Building AI products means committing to continuous evaluation and optimization, not set-it-and-forget-it deployment.
3. Ensuring AI Reliability Through Rigorous Evaluation
AI systems are probabilistic, not deterministic. That makes reliability a fundamentally different challenge than traditional software testing. Instead of checking whether input X always produces output Y, teams must measure quality across dimensions that matter to users.
The Rise of AI Evals
AI evaluations (evals) have become the cornerstone of reliable AI products. Evals systematically test whether your AI system meets the quality, accuracy, and safety standards your users expect. Without them, you're flying blind.
The Productboard team emphasized several types of evals:
Accuracy-based evals measure whether the AI produces factually correct outputs. For Pulse, this means verifying that insights pulled from customer feedback genuinely reflect what customers said. For Spark, it means ensuring generated product specs align with strategic context.
Quality-based evals assess the usefulness and coherence of outputs. Does the summary actually help the user understand the key points? Is the generated text clear and actionable?
Safety and guardrails ensure the AI doesn't produce harmful, biased, or inappropriate content. This includes testing for edge cases where the system might behave unexpectedly.
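A minimal sketch of what an accuracy eval harness can look like (toy example with a fake system under test; real eval suites use labeled datasets, many metrics, and often LLM-as-judge scoring):

```python
# Toy accuracy eval: run the system under test over a small labeled set
# and report the pass rate. Real eval suites track many metrics and run
# continuously, including in CI, to catch regressions.

def fake_summarizer(feedback: str) -> str:
    # Stand-in for the AI system under test.
    return "export" if "export" in feedback.lower() else "other"

EVAL_CASES = [
    ("CSV export is broken", "export"),
    ("Exports time out constantly", "export"),
    ("The onboarding flow is confusing", "other"),
]

def run_eval(system, cases) -> float:
    passed = sum(1 for inp, expected in cases if system(inp) == expected)
    return passed / len(cases)

score = run_eval(fake_summarizer, EVAL_CASES)
print(f"accuracy: {score:.0%}")
assert score >= 0.9, "regression: accuracy below threshold"
```

The threshold assertion at the end is the part that turns an eval from a dashboard into a gate: if quality drops, the pipeline fails loudly instead of shipping a regression.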
Building an Eval Culture
One of the most powerful insights from the webinar: successful AI teams make evaluation a continuous practice, not a one-time checkpoint. At Productboard, evals run throughout development and after deployment to catch regressions and identify opportunities for improvement.
Chris emphasized the importance of "human in the loop" principles. Even with sophisticated evals, you need real humans reviewing outputs, providing feedback, and maintaining oversight. Users should always have visibility into what the AI is doing and control over any actions it takes.
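In code, the "human in the loop" principle can be as simple as an approval gate between an AI-proposed action and its execution. A hypothetical sketch, where the `approve` callback stands in for a real UI confirmation step:

```python
# Hypothetical human-in-the-loop gate: the AI proposes an action, but
# nothing executes until a person (or a policy acting on their behalf)
# approves it. `approve` stands in for a real UI confirmation step.

def execute_with_approval(proposed_action: dict, approve) -> str:
    """Show the proposed action to a reviewer; run it only on approval."""
    if approve(proposed_action):
        return f"executed: {proposed_action['type']}"
    return f"skipped: {proposed_action['type']} (rejected by reviewer)"

proposal = {"type": "archive_stale_notes", "count": 12}
# In a real product this would render a confirmation dialog:
print(execute_with_approval(proposal, approve=lambda a: a["count"] < 50))
```

The gate gives users both of the properties mentioned above: visibility (the proposal is shown before anything happens) and control (nothing runs without their sign-off).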
The teams also stressed that evals need to evolve with your product. As you add features, user expectations change, and your evaluation criteria should change too.
4. Coordinating Teams and Educating Stakeholders
AI products don't just require technical excellence. They require organizational transformation.
The Education Challenge
When you're building with a technology that's genuinely new, shipping features is only half the battle. The other half is teaching people how to think differently. This education happens on two fronts:
Internal education means bringing your engineering, design, product, and business teams along for the journey. Everyone needs to understand what AI can and can't do, what tradeoffs you're making, and why certain decisions matter. At Productboard, this meant explaining concepts like context window management, retrieval accuracy, and probabilistic outputs to stakeholders who'd never had to think about these things before.
External education means setting appropriate expectations with users. AI products work differently than traditional software. Users need to understand what to expect, how to get the best results, and what the limitations are. Dominik and Chris both emphasized the importance of transparent communication about AI capabilities.
Cross-Functional Collaboration Is Non-Negotiable
Building AI products requires tighter collaboration than traditional software. Product managers need to work with data scientists, ML engineers, and infrastructure teams far more closely than traditional development ever demanded. The old model of handing off a PRD and waiting for results doesn't work here. Success requires continuous dialogue about what's feasible, what's valuable, and how to measure success.
The webinar highlighted how critical it is for AI PMs to develop technical fluency. You don't need to become a machine learning engineer, but you do need to understand how models work, what data requirements look like, and how to communicate effectively with technical teams.
Managing Change and Expectations
Perhaps most importantly, building AI products means managing organizational change. Teams that have worked with deterministic software for years need to adjust to the probabilistic nature of AI. Stakeholders who expect instant results need to understand why iteration and evaluation matter.
Chris and Dominik both emphasized patience and transparency as core values. When things don't work perfectly, explain why. When you need more time for evaluation, make the case clearly. The teams that succeed with AI are the ones that bring everyone along for the journey.
Building AI Products That Matter
The lessons from Productboard's AI PMs come down to a few essential truths:
- Architecture choices matter just as much as model choices. Understand the problem you're solving before you pick your solution approach.
- Foundational models are powerful but not interchangeable. Choose deliberately based on your specific needs, and stay current as the landscape evolves.
- Reliability requires continuous evaluation. Build evals into your workflow from day one and keep refining them as your product matures.
- Success requires organizational alignment. Educate your teams, collaborate tightly, and manage expectations transparently.
At Productboard, these principles have shaped how we've built Pulse and Spark—two AI products that genuinely help product managers work smarter. Pulse centralizes and analyzes customer feedback at scale, surfacing insights that would be impossible to find manually. Spark accelerates product discovery and execution with an intelligent agent that understands your strategic context.
These aren't experimental features or nice-to-haves. They're core capabilities that change how product teams operate.
Want to dive deeper into Chris and Dominik's insights? Watch the full webinar on-demand to hear their detailed explanations, see real examples, and get the answers to common audience questions.