— SLIDE 1: TITLE —

 

Thank you.

 

We’re going to spend the next thirty minutes on Project Amber —

a working AI system we’ve been running in pre-production since mid-2025.

 

This is not a concept presentation or a roadmap pitch.

The system is live.

It’s connected to real SIG data.

And you’ll see it running in the demo.

 

By the end of the session, you’ll understand what it is,

how it’s built,

where it fits in the SIG’s tooling strategy,

and what the path to a full member-facing launch looks like.

 

Daniel is our technical lead on SME content and evaluation.

I’ll be carrying most of the technical detail today.

— SLIDE 2: AGENDA —

 

Five sections.

We’ll move quickly through the context slides

because the demo is the main event.

But the framing matters for the investment conversation

we’ll have at the end.

 

First — Project Amber overview.

Second — how it fits with Project Blue.

Third — the chapter framework.

Fourth — Chapter One deep dive.

Fifth — the live demo.

— SLIDE 3: WHAT IS PROJECT AMBER? —

 

Amber is the Bluetooth SIG’s own AI assistant —

purpose-built for the Bluetooth ecosystem.

 

The name was chosen deliberately.

Baltic amber was one of the most traded commodities in the ancient world,

connecting people across vast distances to something rare and valuable.

That’s the metaphor:

connecting members to the knowledge they need,

instantly,

instead of after hours of searching.

 

The technical definition:

Amber is a Retrieval-Augmented Generation system —

a RAG architecture —

built on Azure OpenAI,

sitting on top of the SIG’s own data.

 

It doesn’t hallucinate product records or invent specifications,

because it doesn’t generate that information from scratch.

It retrieves it from sources we control,

then synthesizes a cited response.

 

Four data pillars power it.

 

Specification Intelligence —

two hundred-plus Bluetooth specifications,

thirty-seven thousand pages,

indexed into forty-one thousand, seven hundred and twenty-one retrievable chunks.

Every answer is anchored to a real document section.

 

Product Database —

live access to seventy thousand-plus Bluetooth-qualified products

via the SIG’s own Qualification API.

Not a cached snapshot — live data, as of the moment the member asks.

 

Qualification Guidance —

sixteen thousand, four hundred and ninety-six TCRL test cases

and eleven thousand, six hundred and seventy-seven ICS features,

queryable in plain English.

 

And the Natural Language Interface that ties it all together.

You ask Amber the way you’d ask a knowledgeable colleague.

It figures out where the answer lives,

retrieves it,

and presents a cited, synthesized response.
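For the technically minded, that flow can be sketched in a few lines. The pillar names, keyword rules, and toy corpus below are illustrative assumptions, not Amber’s production routing logic:

```python
# Minimal sketch of the ask flow: route the question to a data
# pillar, pull passages from that source, then have the model
# write a cited answer. Everything here is a toy stand-in.

CORPUS = {
    "specifications": [{"section": "Core Spec, Vol 6, Part B",
                        "text": "Advertising physical channel PDU..."}],
    "products": [{"section": "Qualification API",
                  "text": "Live qualified-product records..."}],
    "qualification": [{"section": "TCRL",
                       "text": "Applicable test cases..."}],
}

def route(question: str) -> str:
    """Pick the data pillar most likely to hold the answer."""
    q = question.lower()
    if any(w in q for w in ("qualified", "product", "listing")):
        return "products"
    if any(w in q for w in ("test case", "tcrl", "ics")):
        return "qualification"
    return "specifications"

def answer(question: str) -> dict:
    pillar = route(question)
    passages = CORPUS[pillar]   # stand-in for real retrieval
    return {"pillar": pillar,
            "citations": [p["section"] for p in passages]}
```

The real router is multi-stage rather than keyword-based, but the shape is the same: decide where the answer lives, retrieve, then synthesize with citations.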

 

The key word is purpose-built.

We didn’t put a ChatGPT wrapper on the SIG website.

This system was designed from the ground up for Bluetooth domain accuracy.

— SLIDE 4: THE CHALLENGE AMBER SOLVES —

 

To understand why this matters,

it helps to be specific about the pain we’re solving.

 

Volume overload.

Thirty-seven thousand pages across two hundred-plus specifications.

When a developer needs a specific conformance requirement,

they often don’t know which of the two hundred documents to start in.

By the time they find the right section,

they’ve already spent an hour.

Or given up and emailed us.

 

Complex processes.

Qualification isn’t one document.

It’s TCRL test plans, ICS forms, lab requirements,

fee schedules, and submission timelines —

all separate, all maintained independently.

Members get lost.

 

Expertise silos.

The people who know how this all fits together

are a handful of SIG staff.

That’s a bottleneck.

It doesn’t scale.

And it doesn’t serve members in different time zones

who need answers at two in the morning.

 

Amber doesn’t replace those experts.

It encodes their knowledge into a system

that’s available twenty-four hours a day,

to any authenticated member,

in any language.

— SLIDE 5: AMBER IN THE SIG ECOSYSTEM —

 

One question we get from technical audiences:

does Amber replace any of our existing tools?

 

The answer is explicitly no.

 

Amber is an AI intelligence layer

that sits on top of the tools and data we already have.

 

The SIG already owns the specifications,

Qualification Workspace,

Profile Tuning Suite,

the products database,

TCRL documentation,

Zendesk knowledge base articles,

the member portal.

 

Amber doesn’t replace any of those.

It makes them accessible through a single natural language interface.

 

A member no longer needs to know

that the answer to their question lives in a PTS document

versus a Zendesk article

versus a spec section.

They ask Amber.

Amber figures out where the answer lives,

retrieves it,

and presents a cited, synthesized response.

 

Available twenty-four seven.

Without requiring the member to know the SIG’s tool landscape in advance.

— SLIDE 6: HOW PROJECT AMBER IS BUILT —

 

This slide is about engineering discipline,

and I want to give it proper weight

because it’s what separates a production-ready system from a demo.

 

Development workflow.

Alex Y authors the code.

I review every change as a GitHub pull request before it merges.

Deployments use version-tagged container images in Azure Container Registry —

every version is rollback-ready.

Nothing goes live until it’s been tested against the live system.

Standard professional software development practice, applied consistently.

 

AI-assisted but human-validated.

We use Claude, ChatGPT, and Copilot throughout development.

They accelerate the work significantly.

But AI-assisted development doesn’t mean AI-governed quality.

Every architectural decision is validated against the evaluation harness.

The eighty-two percent win rate is reproducible:

same test set, same questions, blind scoring.

We don’t make changes based on intuition.

We make them because the numbers move in the right direction.

 

Microsoft partnership.

Microsoft’s development services team is the production build partner.

They’ve reviewed the architecture, they understand the codebase,

and the engagement model has been reviewed and approved by SIG IT for compliance.

SIG retains full product ownership —

roadmap, requirements, acceptance criteria.

Microsoft leads the infrastructure engineering.

 

The result is a system that isn’t demo theater.

It’s a production candidate that Microsoft’s team can take and run with.

— SLIDE 7: THE CHAPTER FRAMEWORK —

 

We deliver Amber in chapters —

coherent releases of capability,

each mapped to a specific member need and audience.

Not engineering milestones.

Member value milestones.

 

Chapter One — now.

Specification and Product Intelligence.

Live today in pre-production.

Natural language Q and A across all SIG specifications,

product database lookup,

and qualification guidance.

Eighty-two percent win rate against ChatGPT

across two hundred and seventeen domain-specific questions.

This is what you’ll see in the demo.

 

Chapter Two — next.

Test and Spec Authoring Assistants.

We’ve split this into two parallel tracks.

 

Chapter Two-A targets test engineers.

AI validation of test cases against specs, ICS tables, and errata in real time,

flagging conflicts before they reach committee review.

 

Chapter Two-B targets spec authors.

AI-assisted writing with inline conflict detection

within an AsciiDoc-based authoring environment.

The AsciiDoc migration is a hard prerequisite for member-facing launch.

 

Chapter Three — roadmap.

Qualification Workflow Assistant.

Amber embedded directly inside Qualification Workspace —

guiding members through the entire qualification journey,

context-aware and step by step.

 

The chapter model gives the board a clear line of sight.

Chapter One is the production investment decision

we’re asking you to approve today.

Chapters Two and Three follow from that foundation.

— SLIDE 8: HOW THE CHAPTERS CONNECT —

 

This diagram is the architectural key

to understanding Amber’s long-term design.

Everything flows through AsciiDoc in GitHub.

 

AsciiDoc in GitHub is the specification and test source of truth.

It’s machine-readable, version-controlled structured content

that enables all chapters to both read from

and write back to the same foundation.

 

Chapter One reads from the spec and test corpus to answer member questions.

Chapter Two-A writes validated test content back to that corpus.

Chapter Two-B writes structured spec content.

Chapter Three reads from it to guide members through qualification.

 

The AsciiDoc dependency matters.

Until specs are in machine-readable structured format,

AI can read them but cannot reliably write back to them.

Project Blue is building that foundation.

Amber is the intelligence layer on top.

— SLIDE 9: CHAPTER 1 DEEP DIVE —

 

Chapter One is live now.

Let me walk through what’s actually running.

 

The design principle was universal audience.

Marketers, engineers, product managers, qualification staff, members —

anyone should be able to get a useful answer

without needing Bluetooth expertise to formulate the question.

That’s a harder problem than building a system for experts.

We solved it with multi-stage routing.

 

Three capability pillars.

 

Specification Search.

Two hundred-plus specifications, thirty-seven thousand pages,

forty-one thousand, seven hundred and twenty-one semantic chunks.

A member asks a spec question.

The system runs a vector similarity search across those chunks,

retrieves the most relevant sections,

and synthesizes a cited response.

The citations link back to the specific section in the actual specification.

Eighty-eight percent accuracy in our evaluation.
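As a sketch of that retrieval step: rank pre-embedded chunks by cosine similarity to the query embedding. The three-dimensional vectors and section labels below are toy stand-ins; the real index holds the 41,721 chunks embedded by the production model:

```python
# Rank spec chunks by cosine similarity to a query vector.
# Vectors and section labels are illustrative toys.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

CHUNKS = [
    ("Vol 6, Part B, Sec 2.3 (advertising PDUs)", [0.9, 0.1, 0.0]),
    ("Vol 3, Part C (discovery modes)",           [0.5, 0.5, 0.1]),
    ("Assigned Numbers (AD types)",               [0.2, 0.1, 0.9]),
]

def top_k(query_vec, k=2):
    # Highest-similarity chunks become the context for synthesis,
    # and their section labels become the citations.
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c[1]),
                    reverse=True)
    return [section for section, _ in ranked[:k]]
```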

 

Product Intelligence.

Seventy thousand-plus qualified products,

accessed via live API call at query time — not a cached index.

This is why we hit ninety-four percent accuracy on product lookups.

ChatGPT’s training data for Bluetooth products is months to years stale.

Our data is current as of the moment the member asks.
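The pattern is simple but worth making concrete. In the sketch below, fetch_products is a stand-in for an authenticated HTTP call to the SIG’s Qualification API; the company name and listing records are hypothetical:

```python
# Live-lookup pattern: answer product questions with a query-time
# call, not with anything frozen into model weights.
# fetch_products and its records are hypothetical stand-ins.

def fetch_products(company: str) -> list[dict]:
    # In production this is an API request made at the moment the
    # member asks, so the result is never a stale snapshot.
    LIVE_DB = {"ExampleCo": [{"listing": "D000001", "name": "Widget"},
                             {"listing": "D000002", "name": "Gadget"}]}
    return LIVE_DB.get(company, [])

def product_count(company: str) -> int:
    return len(fetch_products(company))
```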

 

Qualification Guidance.

Sixteen thousand, four hundred and ninety-six TCRL test cases

and eleven thousand, six hundred and seventy-seven ICS features,

indexed and queryable.

A developer asks what tests they need for a BLE audio device

and gets a structured answer pointing to the specific TCRL requirements.

Seventy-nine percent accuracy —

the lowest of the three, and the area with the most optimization headroom.

 

Response latency came down from twenty-three-and-a-half seconds

at the start of evaluation

to under ten seconds

through async parallelization and model routing.
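The parallelization half of that is easy to illustrate: independent retrieval calls run concurrently, so wall time is roughly the slowest call instead of the sum of all of them. Stage names and delays below are illustrative, not Amber’s actual pipeline:

```python
# Concurrent retrieval with asyncio.gather: three independent
# lookups overlap, so total latency is ~max(delays), not sum(delays).
import asyncio
import time

async def stage(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)   # stands in for a search or API call
    return name

async def pipeline() -> list[str]:
    return await asyncio.gather(
        stage("spec_search", 0.1),
        stage("product_api", 0.1),
        stage("tcrl_index", 0.1),
    )

start = time.monotonic()
results = asyncio.run(pipeline())
elapsed = time.monotonic() - start   # ~0.1s here, not 0.3s
```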

— SLIDE 10: CHAPTER 2 — TEST & SPEC ASSISTANTS —

 

Chapter Two moves Amber from read-only intelligence

to active authoring assistance.

Two distinct audiences with separate delivery tracks.

 

Chapter Two-A — Test Authoring.

A test engineer writes a test case.

Amber validates it in real time

against the relevant specs, ICS tables, and errata.

Conflicts are flagged before the case reaches committee review —

not after, when the cost of fixing them is much higher.

The same validation agents that run inline during authoring

also run on commit,

so you get a CI pipeline for test content.
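A minimal sketch of that commit gate, assuming simplified rules (every test case must name the spec section and ICS feature it covers); the real validation agents are far richer:

```python
# "CI for test content": the same checks that run inline during
# authoring also gate the commit. Rules here are simplified
# assumptions, not the real validation agents.

def validate(test_case: dict) -> list[str]:
    """Return a list of problems; empty means the case is clean."""
    problems = []
    if not test_case.get("spec_section"):
        problems.append("missing spec section reference")
    if not test_case.get("ics_feature"):
        problems.append("missing ICS feature mapping")
    return problems

def commit_gate(test_cases: list[dict]) -> bool:
    # Reject the commit if any case fails, so conflicts are caught
    # before committee review rather than after.
    return all(not validate(tc) for tc in test_cases)
```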

 

Chapter Two-B — Spec Authoring.

A spec author writes in AsciiDoc.

Amber checks consistency, completeness, and cross-spec conflicts inline —

in the editor, not in review.

AsciiDoc plus Git makes this possible:

machine-readable, diff-friendly structured content

that Amber can reason about semantically.

AsciiDoc migration is the hard prerequisite for member-facing launch.

 

The goal isn’t to replace committee review.

It’s to ensure that what reaches committee review is already clean.

— SLIDE 11: CHAPTER 3 — QUALIFICATION WORKFLOW —

 

Chapter Three is the roadmap item,

but the vision is worth framing now

because it closes the member journey that Chapter One starts.

 

A member today navigates qualification

through a combination of Qualification Workspace,

documentation, staff emails, and trial and error.

 

Chapter Three puts Amber directly inside Qualification Workspace.

Context-aware.

Embedded.

Step-by-step guidance for the entire qualification process.

 

Test selection,

ICS form completion,

fee guidance,

lab requirements,

submission —

all in plain language,

aware of what the member is currently doing in the interface.

 

Faster time-to-qualification for every member,

regardless of experience level.

SIG support capacity scales without adding headcount.

The qualification journey becomes something

that doesn’t require a specialist to navigate.

— SLIDE 12: AMBER & PROJECT BLUE —

 

This is the slide I want the board to spend real time on,

because it answers the strategic question:

where does Amber sit in the SIG’s long-term architecture?

 

Project Blue modernizes the SIG’s tools infrastructure.

The key deliverable that matters for Amber is Test-Documents-as-Code —

AsciiDoc in GitHub.

Structured, machine-readable, version-controlled

specification and test content.

 

Project Amber is the conversational intelligence layer

that consumes Blue’s outputs.

 

Without Blue’s structured data,

Amber can read specs and answer questions — which is valuable.

With Blue’s structured data,

Amber can validate, author, and reason about spec content

at a much higher level of precision.

That’s Chapters Two and Three.

 

The relationship is directional and additive:

Blue builds the foundation,

Amber delivers the intelligence on top.

Neither replaces the other.

 

Blue and Amber are not competing investments.

They’re complements.

Investing in one accelerates the value of the other.

— SLIDE 13: ROADMAP & MILESTONES —

 

The chart shows where we are and where we’re going.

Two milestone markers anchor the timeline.

 

Bangkok — March twenty-twenty-six.

That’s today.

Bangkok is not the finish line —

it’s the transition point from pre-production to production.

Chapter One moves from internal soft launch

to full member-facing launch preparation.

Microsoft handoff begins.

Chapter Two scoping starts.

 

Milan — June twenty-twenty-six.

Target for full Chapter One member launch.

This is the date we’re building toward.

 

The infrastructure row at the bottom is important.

It shows that the pre-production platform was intentionally lightweight —

ChromaDB, Claude API, a fast-iteration environment.

Production is Azure-native:

managed services, enterprise security, proper CI/CD.

 

Thirty percent of the POC transfers:

the domain routing logic, the curated corrections, the terminology knowledge.

The other seventy percent is infrastructure replatforming.

And that is Microsoft’s job.

— SLIDE 14: LIVE DEMO —

 

This is the part of the presentation

where I stop talking and show you the system.

 

What you’re about to see is the live pre-production environment —

the same system that’s been running for months.

The data is real, the responses are real,

and nothing in the demo is scripted or pre-staged.

What you see is what a member would see.

 

We’ll run four scenarios.

I’ll frame each one, run the query live,

and walk through the response.

 

— DEMO 1: SPECIFICATION SEARCH —

 

First scenario — specification search.

I’m asking:

what are the advertising packet format requirements for Bluetooth Low Energy?

 

Notice a few things.

The response is cited.

It tells you exactly which specification

and which section the answer comes from.

You can verify it independently.

And the answer is synthesized —

it understands the question and explains it.

 

— DEMO 2: PRODUCT DATABASE —

 

Second scenario — product database.

I’m asking:

how many Bluetooth products does Apple have qualified?

 

This is live API data.

Not a cached index, not training data.

The number you just saw is what’s in the SIG’s qualification database

at this moment.

 

ChatGPT cannot do this.

Its product knowledge is from training,

which means it’s at minimum months out of date, and often years.

We hit ninety-four percent accuracy on product lookups in our evaluation

precisely because we’re calling the live API.

 

— DEMO 3: QUALIFICATION GUIDANCE —

 

Third scenario — qualification guidance.

I’m continuing in the same conversation,

so watch how Amber handles the context from the previous question.

 

I’m asking:

what test cases do I need to qualify a Bluetooth Low Energy audio device?

 

Two things to note.

First, the content:

this is pulling from sixteen thousand, four hundred and ninety-six TCRL test cases

and mapping them to a plain-English question.

The member doesn’t need to know the test plan structure.

Amber handles that translation.

 

Second, the context:

Amber remembered what we were talking about in the previous question.

It maintained the thread.

 

— DEMO 4: TECHNICAL DEEP DIVE —

 

Fourth scenario — cross-spec synthesis.

I’m asking:

explain the difference between BR/EDR and LE connection establishment procedures.

 

This question requires synthesizing content

from multiple sections of the Core Specification —

the BR/EDR connection procedure and the LE connection procedure

are in different volumes.

 

Amber retrieves relevant chunks from both

and synthesizes a coherent comparative answer.

That cross-spec synthesis is something a member

would otherwise need to do manually,

flipping between different parts of a very large document.

 

We’re happy to take questions from the board and run them live.

That’s the most direct test of the system —

your domain questions, unscripted.

— SLIDE 15: EVALUATION RESULTS —

 

The demo gives you qualitative confidence.

This slide gives you the quantitative basis.

 

Two hundred and seventeen domain-specific questions.

Questions drawn from actual Zendesk member tickets and SME input.

Evaluated blind against ChatGPT —

the evaluator sees two answers

without knowing which system produced which one.

Independent scoring.
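For anyone who wants the mechanics, the blind setup can be sketched like this. The labels and judge are placeholders; the real scoring was done by an independent evaluator over the full question set:

```python
# Blind pairwise scoring: the judge sees two anonymized answers in
# shuffled order and never learns which system produced which.
import random

def blind_pair(amber_answer: str, chatgpt_answer: str,
               rng: random.Random):
    """Shuffle the pair; return what the judge sees plus the hidden key."""
    pair = [("amber", amber_answer), ("chatgpt", chatgpt_answer)]
    rng.shuffle(pair)
    shown = {"Answer 1": pair[0][1], "Answer 2": pair[1][1]}
    key = {"Answer 1": pair[0][0], "Answer 2": pair[1][0]}
    return shown, key

def win_rate(judge_picks: list[str], keys: list[dict]) -> float:
    """Fraction of blind comparisons where the judge's pick was Amber's."""
    wins = sum(1 for pick, key in zip(judge_picks, keys)
               if key[pick] == "amber")
    return wins / len(judge_picks)
```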

 

Product lookup — ninety-four percent.

Live API versus stale training data.

This advantage is structural, not tunable.

 

Spec accuracy — eighty-eight percent.

Domain-indexed RAG versus general knowledge.

Purpose-built wins here.

 

Response time — ninety-two percent.

Latency fell from twenty-three-and-a-half seconds at the start of evaluation

to under ten seconds average.

 

TCRL guidance — seventy-nine percent.

The lowest category, and the most honest number.

This is where we have room to grow,

and it’s the focus of ongoing optimization before member launch.

 

The overall win rate is eighty-two percent.

Before member launch, the evaluation set expands to two hundred and fifty questions

with SME-validated ground truth.

The test harness is reproducible —

same questions, same methodology, blind scoring.

We don’t manage the number.

We run the test.

— SLIDE 16: ROAD TO PRODUCTION —

 

Three phases, each with clear criteria.

 

Now to Bangkok.

SME review of the two-hundred-and-fifty-question evaluation harness is underway.

Domain experts validating answers, not just scoring wins.

Security review complete.

Chapter One feature-complete for the demo you just saw.

 

Post-Bangkok.

Microsoft leads infrastructure.

The handoff model is thirty-seventy.

Approximately thirty percent of what we’ve built —

the domain routing logic, the curated corrections, the terminology glossary —

transfers directly.

The other seventy percent is infrastructure replatforming

onto Azure managed services.

We retain product ownership throughout.

Microsoft leads the engineering.

Chapter Two scoping runs in parallel so there’s no momentum gap.

 

Production.

Microsoft teams lead infrastructure.

SIG manages roadmap, requirements, and acceptance criteria.

Member-facing rollout at Milan.

 

The principle at the bottom of this slide

is the team’s operating commitment,

and I’d ask the board to hold us to it:

 

Production-ready beats demo theater.

 

Every chapter must meet measurable criteria before it’s declared complete.

We are not shipping for the sake of a deadline.

 

To close where we opened:

 

This is not a concept or a roadmap.

It’s a working system.

 

Bangkok is the transition from pre-production to production.

 

Milan is the member launch.

 

Thank you.

We’re happy to take questions.