
How Corelight Built the First Agentic Cybersecurity Product and the Framework Every PM Should Steal

Author: Productboard
22nd April 2026 · AI Product Management, Spark

The night before Vijit Nair joined us at Productboard's San Francisco product community meetup, his team at Corelight shipped what he describes as one of the few truly agentic products in the cybersecurity industry. Corelight builds threat detection software sold into security operations centers, and their new product does in seconds what previously took a trained analyst one to two hours: it pulls all the relevant context, autonomously investigates a security alert, and delivers a verdict.

Vijit walked in the next evening without a polished success story. What he brought instead was something more useful.

“It took way longer than any of us anticipated. The things we did not anticipate were just how much work we’d need in terms of context engineering, agent evals, and scaffolding.”

— Vijit Nair, VP of Product, Corelight

This is the gap that most product teams are quietly living in right now. Everyone is being told to move fast with AI. Far fewer people are talking about what moving well actually looks like. Over the course of the evening, Vijit laid out a framework that gave the room a shared vocabulary for exactly that question.

The 3-Level AI Maturity Model

The framework starts with a spectrum. On one end, you have AI features where humans are entirely in control and AI plays a supporting role. On the other, you have systems that operate completely autonomously, with no human review at any step. Vijit breaks this into three named levels.

  • Level 1 — Assist. Human-led, AI-assisted. The PM and the end user drive every step of the workflow, and AI surfaces insights, drafts content, or flags issues on request. This is high control, low agency, and it’s where every team should begin.
  • Level 2 — Automation. AI-led, with a human in the loop. Roughly 50 to 70 percent of the workflow runs autonomously, and a human reviews, approves, and course-corrects. The default has flipped: AI initiates, humans validate.
  • Level 3 — Autonomous. No human review. The system detects, decides, and acts on its own. This is the destination, not the starting point.


“You think of that as going from high control, low agency on the left, to very low control, high agency on the right. And on the high control, low agency side — that’s where everybody should begin.”

— Vijit Nair, VP of Product, Corelight

What makes this model genuinely useful is the premise underlying it. Climbing from Level 1 to Level 3 is a trust-building journey with your users, not a technology roadmap. The model capabilities may be ready for full autonomy. Your customers almost certainly are not, and in high-stakes B2B environments, the consequences of skipping levels aren’t just bad UX. They’re relationship-ending.

“If you give an answer to a SOC analyst where you say a particular threat is not valid and it actually is valid, they will never look at your AI product again. Trust is a very important component in our industry.”

— Vijit Nair, VP of Product, Corelight

The same logic applies in financial services, legal tech, healthcare, and any other domain where an AI mistake carries real downstream cost. The question worth asking before your next sprint: which level are your AI features operating at today, and is that actually where your customers trust you to be?

Why Most Teams Are Building at the Wrong Level

There’s a pattern that quietly derails agentic product builds. Teams believe they’re operating at Level 2 or 3. Their customers are still calibrated to Level 1. Internal confidence and customer trust diverge, and the product ends up over-promising in ways that erode the relationship rather than build it.

Two things from the Corelight build came up repeatedly as surprises, even for a team that had been working with AI since GPT-3.5 launched in late 2022.

Data unlocks Level 2, trust unlocks Level 3

Moving up the spectrum is a shift in the relationship between user and machine, governed by two distinct keys. Moving from Assist to Automation requires high-fidelity data. Without a robust context engine, an agent cannot initiate a workflow accurately. Level 2 becomes possible once the AI has enough information to act correctly most of the time. Even with perfect data, you cannot reach Autonomous status without the user’s permission to let go of the wheel. In high-stakes fields, the "human-in-the-loop" is a psychological requirement. You only reach Level 3 when the user has seen Level 2 succeed so consistently that manual review feels redundant.


The scaffolding is the real work.

It’s tempting to think of agentic product development as: choose a model, write some prompts, ship. What Vijit’s team found is that the model work itself was the easy part. The bulk of the effort went into the infrastructure around it, including guardrails, context engineering, prompt engineering, orchestration, and building agents specifically designed to check the work of other agents before any output reaches a user. None of that was trivial, and none of it was anticipated in the original timeline.

Human validation is the actual bottleneck.

Corelight ran a blind-mode beta, running the AI silently in the background against real customer data while their team validated outputs with actual SOC analysts. Getting a human expert to sit down, work through an investigation, and confirm whether the agent was right or wrong turned out to be the hardest part of the entire build.
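A blind-mode harness can be sketched in a few lines. This is an illustrative structure, not Corelight's actual system: the agent's verdicts are logged silently, analysts add their own verdicts later, and the agreement rate tells you whether the AI has earned the next level of trust.

```python
from dataclasses import dataclass, field

@dataclass
class ShadowLog:
    """Records agent verdicts silently; nothing is surfaced to users."""
    records: list = field(default_factory=list)

    def log(self, alert_id: str, agent_verdict: str) -> None:
        # The agent's call is stored but never shown to the customer.
        self.records.append({"alert": alert_id, "agent": agent_verdict, "analyst": None})

    def add_analyst_verdict(self, alert_id: str, verdict: str) -> None:
        # A human SOC analyst independently works the same alert.
        for r in self.records:
            if r["alert"] == alert_id:
                r["analyst"] = verdict

    def agreement_rate(self) -> float:
        """Share of human-reviewed alerts where the agent matched the expert."""
        reviewed = [r for r in self.records if r["analyst"] is not None]
        if not reviewed:
            return 0.0
        return sum(r["agent"] == r["analyst"] for r in reviewed) / len(reviewed)
```

The point of the structure is the comparison step: the metric only exists once a human has worked the same alert, which is exactly why validation, not engineering, becomes the bottleneck.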

“You can’t just validate based on data. You need to go talk to a human being. You need to go to the analyst at that customer and say, ‘This is what my product found. What do you think?’”

— Vijit Nair, VP of Product, Corelight

That single step, not the engineering, not the model selection, pushed the timeline from six months to nine. The practical implication is worth writing down somewhere visible: sequence trust-building with your users as explicitly as you sequence your engineering milestones. Define what earning the right to Level 2 looks like before you start building for it.

The Hidden Prerequisite: Your Context Engine

Moving up the maturity model has a prerequisite that rarely shows up in the planning doc. Each level requires a continuous supply of high-quality context, and in most product organizations, that context is scattered across docs, Slack threads, customer call notes, and tribal knowledge that lives in one person’s head. When the AI runs on thin context, the outputs are thin. When the scaffolding relies on well-structured, current customer knowledge, the whole system gets sharper.

Vijit’s framing of what this means for the PM role was one of the more direct challenges of the evening.

“In the age of AI, you are the context machines. A few years from now, if you’ve got engineers fully building on AI coding tools and they’re vibing all the time, the bottleneck is actually going to be the context that you can provide. The product managers are going to be the bottom line.”

— Vijit Nair, VP of Product, Corelight

At Corelight, this translated into a concrete system. The team built a dedicated product context repo in GitLab containing company strategy, ICP, user personas, and competitive landscape, all in structured markdown and connected directly to their AI tooling. They also built shared “skills”: reusable prompt templates for press releases, requirements docs, and release notes, so every PM on the team starts from the same foundation rather than rebuilding context from scratch. And the playbooks that PMs wrote from real customer conversations became the literal context fed to every agent in the agentic product. The quality of what the agents produced traced directly back to the quality of those customer conversations.
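A context repo of this kind can start as nothing more than a directory of structured markdown. The layout below is illustrative, not Corelight's actual repo:

```
product-context/
├── strategy.md          # company strategy and north-star metrics
├── icp.md               # ideal customer profile
├── personas.md          # user personas, one section per persona
├── competitive.md       # competitive landscape
└── skills/              # reusable prompt templates shared across PMs
    ├── press-release.md
    ├── requirements-doc.md
    └── release-notes.md
```

Because it lives in version control, every change to strategy or personas propagates to every AI tool connected to it, and every PM starts from the same foundation.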

But someone still had to build, iterate, and upload that context. This is the problem Productboard Spark is built to solve. Spark continuously synthesizes customer feedback, discovery notes, and product signals into structured, actionable context, so when a PM sits down to brief an AI agent, write a spec, or move a feature from Level 1 to Level 2, they’re starting from a living body of customer evidence. The context machine needs fuel. Spark is the fuel system.

The Anti-Slop Rule: Human Thinking Has to Come First

There’s a failure mode that surfaces reliably the moment teams start ramping up on AI tools, and Vijit had a specific story for it.

“I remember talking to a PM and saying, ‘Can you write a press release for this?’ And the next day I got a 39-page document. There’s no 39-page press release. What are you doing?”

— Vijit Nair, VP of Product, Corelight

The problem was not that the PM used AI. It was that they used AI before doing any of the thinking. The output looked like a press release and had the structure of a press release. But reading through it, there was no customer understanding, no point of view, no argument. The AI had generated confidently in the absence of any real signal to generate from.

Vijit’s response was to institute a rule: before AI can be used on any initiative, a PM must first write a short, one-to-two page human-authored proposal. If a proposal reads like it came from a model, it goes back to the author, along with a 1:1 conversation and a direct question: walk me through this sentence. Where did it come from?

“If it all comes from AI, our competition is going to adopt this too, and they’ve got access to the same tools. The PMs have to apply their thinking. They are the ones talking to customers. They are feeling the pain that customers are feeling. That intuition, that expertise, needs to be translated by a human into context that AI tools can use.”

— Vijit Nair, VP of Product, Corelight

Everyone in the room has access to the same frontier models. Competitive differentiation comes from the quality of human thinking that gets translated into context, and that thinking only comes from actually talking to customers, sitting with their frustrations, and developing a real point of view on what matters. Productboard Spark helps PMs do that customer work faster and more systematically, so the human thinking that seeds every initiative is genuinely richer.

Jordan Nolff, Productboard’s VP of Growth and Product, and the evening’s host, extended the point from a product leadership lens.

“Our job as a PM is to figure out what to build and curate and what not to build and what to kill. As the speed of execution compresses further and further, there’s going to be more and more of an importance placed on analytics, closing the loop of understanding what is happening on the features you’re building, as a core input to the customer discovery you’re doing.”

— Jordan Nolff, VP of Growth & Product, Productboard

Moving Your Product Team Up the AI-Maturity Curve

The framework is only useful if it changes something. Here’s a practical starting point at each level.

At Level 1

  • Identify one workflow where you have enough customer trust and data quality to pilot Level 2, and be specific about what “enough trust” actually means.
  • Run a blind-mode test. Let the AI operate silently against real data before surfacing anything to users, and use that window to validate outputs with actual customers.
  • Start building your context repo now. Even a lean markdown document covering ICP, user personas, and product strategy gives every AI tool in your org a better foundation to work from.

At Level 2

  • Define your human-in-the-loop checkpoints explicitly. What triggers a human review versus what passes through automatically? Write it down.
  • Track customer trust signals separately from product usage metrics. Usage tells you people are clicking. Trust signals tell you whether they’re relying on the output.
  • Instrument the handoff points so you know whether trust is being earned incrementally or quietly eroding.
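Making checkpoints explicit can be as simple as a routing function the whole team can read and argue about. The thresholds and field names below are illustrative assumptions, not a prescription:

```python
def needs_human_review(verdict: str, confidence: float, severity: str) -> bool:
    """Decide whether an AI output auto-passes or waits for a human.

    Illustrative policy: positive findings, low-confidence calls, and
    high-severity alerts all get a human checkpoint.
    """
    if verdict == "malicious":            # never auto-close a positive finding
        return True
    if confidence < 0.9:                  # low confidence: human validates
        return True
    if severity in ("high", "critical"):  # stakes too high to auto-pass
        return True
    return False                          # benign, confident, low-severity
```

Writing the policy as code rather than tribal knowledge also gives you something to instrument: every call is a handoff point you can log and trend over time.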

Pushing toward Level 3

  • In regulated industries, expect this to require external validation: industry analysts, compliance teams, customer champions who can vouch for the system’s accuracy.
  • Build agents that check other agents. Autonomous outputs that haven’t been reviewed by a second system before reaching users carry unnecessary risk.
  • Keep Vijit’s framing in mind for why Level 3 matters in security: attackers are adopting AI faster than defenders, and only AI can battle AI at machine speed. For some industries, autonomous is a competitive necessity, not a luxury.
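The "agents that check other agents" pattern is essentially a verifier gate between the investigating agent and the user. A minimal sketch, in which both agent functions are stand-ins for real model calls:

```python
def investigate(alert: dict) -> dict:
    """Stand-in for the investigating agent (a real system would call an LLM)."""
    return {"verdict": "benign", "evidence": alert.get("evidence", [])}

def verify(finding: dict) -> bool:
    """Stand-in verifier agent: reject any verdict not backed by evidence."""
    return bool(finding["evidence"])

def run_pipeline(alert: dict) -> dict:
    finding = investigate(alert)
    if not verify(finding):
        # Fail closed: unverified output never reaches the user.
        return {"verdict": "needs_human_review",
                "reason": "verifier rejected the finding"}
    return finding
```

The design choice that matters is failing closed: when the verifier rejects a finding, the system escalates to a human rather than guessing, which is how Level 3 avoids the trust-destroying false negative Vijit warned about.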


What Makes an AI Product Defensible

One of the sharpest exchanges of the evening started with an audience question that’s quietly keeping a lot of PMs up at night: if AI can replicate workflows cheaply and quickly, what is actually defensible anymore?

Vijit’s answer was direct. There are two moats worth building, and only one of them is obvious.

The obvious one is proprietary data. If you’re building AI features, the richness and exclusivity of the data underlying them is what distinguishes your output from a competitor’s. A workflow built on top of data anyone can access is replicable. A workflow built on data only you have is not.

“If you’re only building a workflow and somebody else can replicate the workflow, that’s not a long-term defensible moat.”

— Vijit Nair, VP of Product, Corelight

The less obvious moat is deep domain expertise translated into a differentiated customer interface. Understanding a specific customer’s workflow so intimately that you can build an AI-powered experience they couldn’t get anywhere else requires a kind of creativity and customer empathy that frontier models cannot replicate. Jordan brought this to life with a concrete example.

“Look at Harvey in the legal space. Is that going to get displaced by something you could vibe-code in Claude? Maybe not, because there’s domain knowledge, expertise, and the element of what happens if I’m wrong. The vibe-coded app probably isn’t going to gain as much traction as the company that has invested expertise to ensure it can’t be wrong.”

— Jordan Nolff, VP of Growth & Product, Productboard

For B2B SaaS teams sitting on a product that was built around workflow complexity rather than data advantage, this is worth taking seriously. The engineering effort that once made a sophisticated SaaS application hard to replicate is compressing fast. The moats that will hold are the ones rooted in knowledge that takes years to accumulate and judgment that takes real expertise to apply.


The Bigger Picture

Near the end of the evening, Jordan offered a framing for where product management as a discipline is heading, and it’s one worth holding onto.

“We’re moving into a world where closing the loop, understanding what is happening on the features that you’re building, becomes a core input to the customer discovery you’re doing on market research and the competitive landscape. The curation to say, ‘Yes, AI coding agent, we are going to do this, and I’m going to define that in a complete specification so that I can keep pace with the development cycle.’ That’s the world of product management I think we’re moving toward.”

— Jordan Nolff, VP of Growth & Product, Productboard

The 3-level maturity model is a framework for evaluating your AI features, but taken seriously it’s also a framework for evaluating your team. Getting to Level 3 requires PMs who are close enough to the customer reality, and practiced enough at translating that reality into structured context, that they become the irreplaceable link between human insight and machine execution. That’s where the job is going. And by most accounts from the room that evening, it’s a harder and more interesting version of the job than product management has ever seen.

Start with the Context

If you’re building toward Level 2 or 3 and you want your customer insights to actually fuel AI rather than sitting untouched in a doc somewhere, that’s exactly what Productboard Spark is for. Spark turns the voice of your customer into structured, always-current context your team can act on.

Try Spark free →

Join Us at the Next Meetup

This conversation is part of an ongoing series Productboard is hosting across San Francisco, including fireside chats, workshops, and showcases designed to help product people learn from one another in real time. We’d love to see you at the next one.

Register for our next SF meetup →

FAQ

What is the difference between AI-assisted and fully autonomous AI in product management?

AI-assisted (Level 1) means humans lead every step and AI provides support. Fully autonomous (Level 3) means the system detects, decides, and acts without human review. Getting from one to the other requires incrementally earning customer trust through transparency, auditability, and demonstrated accuracy, not just shipping more capable models.

How do you build customer trust when launching an agentic product?

Start in blind mode: run the AI silently against real customer data before surfacing any output. Validate findings with actual end users, not just internal test sets. Show your work by letting users see the reasoning behind every AI decision. And build agents that check other agents before anything reaches the user. Trust is earned in layers, not granted at launch.

What should a product manager's role be when engineers are using AI coding agents?

As engineering velocity accelerates, the bottleneck shifts to the quality of context that PMs provide upfront: customer insights, user personas, workflow details, competitive positioning. PMs who invest in structured, rich context repositories will directly accelerate what their engineering teams can build. Those who don’t will become the slowest part of the system.

How do you prevent low-quality AI output in product teams?

Require a short human-authored document or input before AI is used on any project. This forces the thinking to happen before the generation starts. The discipline is about ensuring human judgment drives the direction before AI scales the execution.

What makes an AI product defensible in a regulated or high-stakes industry?

Two things: proprietary data that can’t be easily replicated, and deep domain expertise translated into a differentiated customer workflow. Workflow alone isn’t enough; anyone can replicate a workflow cheaply with AI. What holds is the combination of unique data and an intimate understanding of customer pain that a generic model simply can’t match.

How do you think about feature bloat when measuring AI impact by features shipped?

Feature count is a leading indicator, not the goal. The only metric that really matters is whether your North Star metrics are moving and whether the business is growing in the right direction. Leading indicators like features shipped and quality metrics tell you if you’re trending right; lagging indicators tell you if it’s actually working. Optimizing for the proxy leads you astray.

How do you decide what to build that AI won't commoditize in a few quarters? 

Focus on problems where domain expertise is genuinely hard to replicate, where being wrong carries real consequences, and where the human-digital interface needs deep specialization. The ability to understand a specific customer’s pain at that level of intimacy is still a uniquely human capability. That’s where the edge lives.
