SLIDE 1 — Title

Thanks for coming. I’ll get things moving — there’s a lot to cover, and I want to leave plenty of time for questions.

Today I’m going to walk you through Project Amber from top to bottom. What it is, the problem it solves, what we’ve built so far, how it’s structured, and where we’re taking it.

For those of you who’ve heard about it and been curious — this is your full picture. For those who’ve been involved in some way already, hopefully this gives you context on the pieces you haven’t seen.

And a quick note on the name before I dive in: Amber isn’t arbitrary. It’s a nod to our Nordic-heritage naming — Baltic amber was the resin that once connected Nordic trade routes. The idea is that Amber connects our members to the answers they need, drawing on the full body of Bluetooth knowledge.

Two things I want to flag upfront. First, everything I’m showing you is real — not a prototype built for the slide deck. It’s running on production-grade Azure infrastructure with live SIG data. Second, I’ll be straight with you about where it stands. We’re proud of what’s been built, and I’ll tell you honestly what still needs to happen before it’s in members’ hands.


SLIDE 2 — What is Project Amber?

At its core, Amber is the Bluetooth SIG’s own AI assistant — purpose-built for the Bluetooth ecosystem.

The key phrase is purpose-built. This is not ChatGPT with a Bluetooth instruction bolted onto it. It’s a system grounded in SIG-specific data: our own specifications, our qualification database, our TCRL test documentation, our ICS statements. It knows the difference between Core Spec 6.0 and 6.1 because we’ve explicitly built that versioning into the index. It understands that when a member asks about a “proximity sensor,” they probably mean the Proximity Profile, PXP. That kind of domain knowledge doesn’t come out of a generic AI — it has to be built.

The four capability areas on this slide are what that translates to in practice.

Specification Intelligence. 200-plus Bluetooth specifications, indexed. 37,000 pages broken into over 41,000 retrievable chunks. Semantic search — meaning it understands intent, not just keywords — with citations back to the exact source document and section.

Product Database. Live access to 70,000-plus qualified products through the Bluetooth SIG Qualified Products API. Search by company, model, feature. Always current — no stale snapshots.

Qualification Guidance. 5,415 TCRL test cases and over 11,600 ICS features, queryable in plain English. Ask “what tests do I need for a Bluetooth Low Energy heart rate monitor” and get a structured, accurate answer.

Natural Language Interface. You talk to it the way you’d talk to a knowledgeable colleague. Amber maps your intent to the right sources, synthesizes an answer, and shows you its work.


SLIDE 3 — The Challenge Amber Solves

Let me make the problem concrete, because most of us in this room have felt it personally.

Volume Overload. 200-plus specifications. 37,000 pages. If you know exactly which spec to open, you can find an answer eventually. If you don’t, finding the right starting point is itself a significant challenge — even for people who’ve been working in Bluetooth for years.

Complex Processes. Qualification is the best example. To qualify a product, a member needs to navigate TCRL test plans, select test cases, complete ICS statements, understand lab requirements, manage fees, and track timelines — all documented separately across different systems. There’s no single guide that walks you through it. Amber is designed to be that guide.

Expertise Silos. Deep Bluetooth knowledge lives with a small number of people who are in high demand. Every time a member opens a support ticket because they couldn’t find the answer themselves, that’s a bottleneck we created. Amber gives members the self-service capability to find their own answers, so expert time goes toward work that actually requires expert judgment.


SLIDE 4 — Amber in the SIG Ecosystem

Amber is additive. It does not replace anything we already have.

Qualification Workspace, PTS, the qualified products database, the Zendesk knowledge base, bluetooth.com — all of that continues exactly as it is today. What Amber adds is an AI intelligence layer on top. A single interface that knows how to reach into all of those sources and synthesize an answer.

Right now, if a member has a question, they might check bluetooth.com, search the knowledge base, look through Qualification Workspace, and then file a support ticket when they still can't find the answer. With Amber, they ask once and get a single cited response. Amber removes the burden of knowing which source to consult.

And it’s available 24/7. A member in Asia at two in the morning doesn’t have to wait for business hours in Kirkland.


SLIDE 5 — The Chapter Framework

We’re building Amber in chapters — each one a coherent, independently valuable set of capabilities mapped to real member needs, not just engineering milestones.

Chapter 1 is live now. Specification and Product Intelligence — everything I’ve just described.

Chapter 2 is next. Test Authoring and Validation — bringing Amber into the spec and test development process itself. The team is still defining the scope, and I’ll spend a few minutes on it because it’s genuinely different from Chapter 1.

Chapter 3 is on the roadmap. Qualification Workflow — embedding Amber natively inside Qualification Workspace so members are guided through qualification in plain language. This is tied to the QW product roadmap and Jay’s team’s work, so it’s more directional than the first two.

The chapter structure is intentional: real commitments about what’s in scope now versus later, incremental delivery, and learnings from each chapter shaping the next. We are not promising everything at once.


SLIDE 6 — How Project Amber is Built

Before I get into the chapters, I want to spend a few minutes on how this project runs — because understanding the structure helps explain how we’ve moved at this pace.

Amber is a small-team initiative built with a lean but disciplined approach. We work in a standard software development model: changes go through GitHub as pull requests, get reviewed before they’re merged, and are deployed to Azure in version-tagged increments so we always know exactly what’s running and can roll back instantly if needed. Every deployment gets tested against the live system before it’s declared stable. It’s not a large team, but it runs with real process behind it.

What’s different from a traditional SIG project is the scale and tooling. Rather than a large cross-functional team and a formal sprint cadence, we have a focused group moving quickly — with AI tooling handling a lot of the work that would otherwise require more people. Code review, documentation, architectural analysis, stakeholder communications — AI accelerates all of that. But the underlying practices are solid: version control, peer review, secrets management in Azure Key Vault, Auth0 SSO, proper CORS and session handling. We are building this to be handed to a production team, not rebuilt from scratch.

The rigor shows in the evaluation methodology. Every change gets validated against a 250-question test harness with SME-verified ground truth answers. When we say 82% win rate, that’s a reproducible number — same questions, same scoring, blind evaluation. Decisions about what to build next are driven by that data.
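As a sketch of what "blind evaluation" means mechanically (this is illustrative of the methodology, not the actual harness, and the function names are mine):

```python
# Sketch of the blind-evaluation idea: answers from the two systems are
# presented in shuffled order so the judge can't tell which system produced
# which, and the win rate is computed only after unblinding.
import random

def blind_pairs(questions, answers_a, answers_b, seed=42):
    """Pair each question with its two answers in a hidden, shuffled order."""
    rng = random.Random(seed)
    pairs = []
    for q, a, b in zip(questions, answers_a, answers_b):
        order = [("A", a), ("B", b)]
        rng.shuffle(order)  # the judge sees only the answer text, not the label
        pairs.append((q, order))
    return pairs

def win_rate(judgments, pairs, system="A"):
    """judgments[i] is 0 or 1: the index of the answer the judge preferred."""
    wins = sum(1 for j, (_, order) in zip(judgments, pairs) if order[j][0] == system)
    return wins / len(judgments)
```

Because the same questions and the same scoring are reused run to run, the resulting number is reproducible rather than anecdotal.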

On the Microsoft side: after Bangkok, Microsoft development services become the production build partner. The SIG retains full product ownership — the roadmap, requirements, and domain knowledge stay with us. Microsoft leads the infrastructure work to take the system to production scale on managed Azure services. The groundwork we’ve laid in the POC is what makes that handoff clean.


SLIDE 7 — Chapter 1 in Detail

Let me give you a sense of what Chapter 1 delivers and how we know it’s working.

Specification Search — 200-plus specs indexed, semantic retrieval, cited answers linked to the source section. Semantic search matters: you can ask “how does channel sounding work” and it finds the right content even without exact spec terminology. It handles multi-turn too — follow up with “what does that mean for direction finding” and it maintains context.
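To make "semantic retrieval with citations" concrete, here's a minimal in-memory sketch. The production system runs on Azure AI Search rather than anything like this, and the example vectors and sources are invented, but the core idea is the same: every indexed chunk carries its source metadata, so the answer can cite the exact document and section.

```python
import math

# Illustrative only: a toy semantic index. A real embedding model would
# produce the vectors; here they are hand-picked for the example.

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, index, top_k=3):
    """Rank indexed chunks by similarity, keeping their citation metadata."""
    ranked = sorted(index, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [{"text": c["text"], "source": c["source"]} for c in ranked[:top_k]]

# Each chunk records which spec and section it came from.
index = [
    {"vec": [0.9, 0.1], "text": "Channel Sounding procedure...", "source": "Core Spec 6.0 §7.4"},
    {"vec": [0.1, 0.9], "text": "Proximity Profile roles...", "source": "PXP 1.0.1 §2.1"},
]
print(retrieve([0.8, 0.2], index, top_k=1)[0]["source"])  # → Core Spec 6.0 §7.4
```

Because retrieval matches on meaning rather than exact keywords, "how does channel sounding work" lands on the right chunk even when the phrasing doesn't match the spec text.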

Product Intelligence — 70,000-plus qualified products, live API, always current. You can compare manufacturers, look up a specific model, or ask “how many Bluetooth products does Apple have qualified” and get a precise answer from real data.

Qualification Guidance — 5,415 TCRL test cases and 11,677 ICS features. Ask which test cases apply to a device type, what a specific ICS row means, or how to interpret a conformance requirement.

The headline number: 82% win rate against ChatGPT on 250 domain-specific questions — the kind of questions members actually ask, across spec content, product lookup, qualification process, ICS interpretation, TCRL guidance, security, audio, and more. Scored blind.

The honest version: we started at 25%. Alex Y identified that most losses were either missing documents in our index or routing logic refusing legitimate questions before they even reached the AI. Both fixable. After targeted fixes we went from 25% to 72% in the first pass, then to 82% as we expanded the dataset. We know exactly why it wins and why it loses — and that’s what makes me confident going forward.

For member launch readiness, what really matters is accuracy against SME-verified ground truth answers — not just the competitive benchmark. Daniel's team is converting SME notes into standardized ground-truth answers now, targeting 250 fully labeled queries before the Bangkok board meeting on March 17. That gives us a ±6% margin of error at the 95% confidence level — enough statistical backing to make a defensible go/no-go decision.
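For anyone who wants to check the arithmetic: ±6% is the standard 95%-confidence margin of error for a proportion measured on a sample of 250, assuming worst-case variance.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a confidence interval for a proportion.

    p = 0.5 is the worst case (maximum variance); z = 1.96 is the
    critical value for 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(250), 3))  # → 0.062, i.e. roughly ±6%
```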


SLIDE 8 — Chapter 2: Test Authoring & Validation

Chapter 2 is where Amber moves from answering questions to participating in the work itself.

Right now, when someone writes a test case or a specification section, there’s no real-time feedback. You write it, it goes through a committee review cycle, and issues surface late — when rework is most expensive. Chapter 2 asks: what if Amber was alongside the author the whole time?

Concretely: an author is writing a test case. Amber validates it in real time against the relevant specifications, ICS tables, errata, and the existing test suite. If there’s a conflict — the test case references a requirement modified in a later spec version, or contradicts an existing test — Amber flags it immediately. The author fixes it before it reaches committee review. Daniel described this as bringing expertise to the moment of authoring, not the moment of review.

The same agents that give feedback during writing can also run automatically on commit — a CI pipeline for specs. Push a draft section, and the agents check it for consistency, completeness, and conflicts before a human reviewer even looks at it.
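A hypothetical sketch of that commit-time flow, to make the "CI pipeline for specs" idea concrete. The check names, the requirement ID, and the findings format are all invented for illustration; a real agent would consult the indexed specs, errata, and test suite.

```python
# On every commit, each check agent runs over the changed draft section and
# the push is flagged if any agent reports a finding. Purely illustrative.

def check_spec_references(section: str) -> list[str]:
    """Flag references to requirements that a later spec version modified."""
    # Stand-in logic; a real agent would query the versioned spec index.
    if "REQ-12" in section:
        return ["references REQ-12, modified in Core Spec 6.1"]
    return []

def check_test_conflicts(section: str) -> list[str]:
    """Flag contradictions with the existing test suite."""
    return []  # stand-in: no conflicts detected

CHECKS = [check_spec_references, check_test_conflicts]

def run_pipeline(section: str) -> list[str]:
    """Run every check agent and collect the findings."""
    findings = []
    for check in CHECKS:
        findings.extend(check(section))
    return findings

issues = run_pipeline("This test exercises REQ-12 under LE 1M PHY.")
print(issues)  # → ['references REQ-12, modified in Core Spec 6.1']
```

The point of the structure is that human reviewers only see drafts that have already passed the mechanical checks.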

The three pillars on the slide — Authoring Support, Conflict Detection, Review and Quality Agents — represent that arc.

For anyone following the AsciiDoc discussion: Chapter 2 does not require AsciiDoc to begin. Our current data — Paligo exports, HTML specs, the XML corpus — is structured well enough to work with now. The AsciiDoc transition and Chapter 2 are being coordinated on rollout and messaging, but the technical dependency isn’t there.


SLIDE 9 — Chapter 3: Qualification Workflow

Chapter 3 is the furthest out, but worth understanding because it’s where this becomes transformational for member experience.

The vision: a member opens Qualification Workspace and Amber is embedded right there. They don’t need to know anything about the qualification process before they start. Amber asks about their device, walks them through test selection, helps complete the ICS form with live spec context, explains lab requirements, and guides them through submission — all in plain English.

For a first-time qualifier, this removes the biggest barrier: not knowing where to start. For experienced members, it saves time on the tedious process steps. And for us, it scales support capacity without scaling headcount.

Chapter 3 is directional — the specifics will be shaped by learnings from Chapters 1 and 2, member feedback, and the QW product roadmap. We’re not nailing down details until we’re closer.


SLIDE 10 — Amber & Project Blue

Some of you are following both, so I want to be clear about how they relate.

Project Blue modernizes the SIG’s toolchain infrastructure — Test-Documents-as-Code, AsciiDoc and Git for spec authoring, bi-directional Qual Workspace-PTS sync, structured machine-readable test data, telemetry across the toolchain.

Amber is the AI intelligence layer on top. Blue produces structured artifacts; Amber makes them conversationally accessible to anyone.

They are not the same project — separate teams, separate scope, separate timelines. But they’re genuinely complementary. The more structured Blue makes our underlying data, the more reliably Amber can access, interpret, and cite it.

Blue is the plumbing. Amber is the tap. Better plumbing means cleaner water — but you don’t wait for perfect plumbing to turn on the tap. That’s why Chapter 1 is live today, on the data we already have.


SLIDE 11 — Roadmap & Milestones

Here’s the arc — how we got here and where we’re going.

Q3 2025 — POC. A RAG prototype on Bluetooth specifications, ChromaDB and Claude. The question was: can you actually build something useful here? The answer was yes.

Q4 2025 — Azure Deployment. Moved onto Azure Container Apps with Auth0 SSO. This is where it stopped being an experiment and started being a system.

Early 2026 — Production Architecture. Azure AI Search replaced ChromaDB. Azure Key Vault for secrets management, Redis for session storage. We migrated from Claude to Azure OpenAI — a deliberate decision driven by compliance requirements, not a technical preference. The system now uses GPT-4o-mini for routing and GPT-5 for response synthesis.
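As a sketch of that two-model split (the route names, `fetch_context`, and the mocked model calls are illustrative assumptions, not Amber's actual interfaces):

```python
# Illustrative two-stage flow: a small, fast model routes the query to the
# right data source, and a larger model synthesizes the grounded answer.

def fetch_context(route: str, query: str) -> str:
    # Stand-in for retrieval against the routed source (specs, product API, TCRL).
    return f"[{route} results for: {query}]"

def handle(query: str, classify, synthesize) -> str:
    route = classify(query)               # cheap intent routing (the GPT-4o-mini role)
    context = fetch_context(route, query)
    return synthesize(query, context)     # grounded synthesis (the GPT-5 role)

# Model calls mocked with lambdas for the example.
answer = handle(
    "what tests apply to a heart rate monitor?",
    classify=lambda q: "tcrl_lookup",
    synthesize=lambda q, ctx: f"Answer grounded in {ctx}",
)
```

The design rationale is cost and latency: the expensive model only runs once the cheap router has decided which sources matter.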

March 2026 — Bangkok Board Demo. The board has been watching this project, and Bangkok is where we present Chapter 1 at near-production quality. Our standard: 95%-plus completion against the original PRD — not demo theater. SME validation is underway, and the final evaluation runs before March 3.

June 2026 — Milan F2F — Member Launch. Chapter 1 goes live to the full membership, and Chapter 2 scoping begins.

The path from POC to member launch represents an unusually lean investment for a system of this scope — which reflects both the AI-assisted approach and the focused team.


SLIDE 12 — What’s Next For You

Three things I want from you today.

Try it. Pilot access is open to SIG staff now. Ask a real question — something you’ve actually had to go find an answer to in the last month. See what it returns, check whether the citations are right, and tell me where it’s wrong. Every gap you find improves the system before it reaches members.

Share use cases. Tell me where you spend time hunting for answers, or where you see members struggling. That input directly shapes Chapter 2. If you work in spec development or test authoring, I particularly want to hear from you.

Spread the word. The more SIG staff using Amber during the pilot, the better our feedback signal before member launch. If you know colleagues in working groups or power users in the member community who’d make good early adopters, flag them to me or Daniel Cowling.

I’ll leave it there. What questions do you have?