— SLIDE 1: TITLE —

 

Thank you.

 

We’re going to spend the next thirty minutes on Project Amber —

a working AI system we’ve been running in pre-production since mid-2025.

 

This is not a concept presentation or a roadmap pitch.

The system is live.

It’s connected to real SIG data.

And you’ll see it running in the demo.

 

By the end of the session, you’ll understand what it is,

how it’s built,

where it fits in the SIG’s tooling strategy,

and what the path to a full member-facing launch looks like.

 

Daniel is our technical lead on SME content and evaluation.

I’ll be carrying most of the technical detail today.

— SLIDE 2: AGENDA —

 

Five sections.

We’ll move quickly through the context slides

because the demo is the main event.

But the framing matters for the investment conversation

we’ll have at the end.

 

First — Project Amber overview.

Second — how it fits with Project Blue.

Third — the chapter framework.

Fourth — Chapter One deep dive.

Fifth — the live demo.

— SLIDE 3: WHAT IS PROJECT AMBER? —

 

Amber is the Bluetooth SIG’s own AI assistant —

purpose-built for the Bluetooth ecosystem.

 

The name was chosen deliberately.

Baltic amber was one of the most traded commodities in the ancient world,

connecting people across vast distances to something rare and valuable.

That’s the metaphor:

connecting members to the knowledge they need,

instantly,

instead of after hours of searching.

 

The technical definition:

Amber is a Retrieval-Augmented Generation system —

a RAG architecture —

built on Azure OpenAI,

sitting on top of the SIG’s own data.

 

It doesn’t hallucinate product records or invent specifications,

because it doesn’t generate that information from scratch.

It retrieves it from sources we control,

then synthesizes a cited response.

 

Four data pillars power it.

 

Specification Intelligence —

two hundred-plus Bluetooth specifications,

thirty-seven thousand pages,

indexed into forty-one thousand, seven hundred and twenty-one retrievable chunks.

Every answer is anchored to a real document section.

 

Product Database —

live access to seventy thousand-plus Bluetooth-qualified products

via the SIG’s own Qualification API.

Not a cached snapshot — live data, as of the moment the member asks.

 

Qualification Guidance —

sixteen thousand, four hundred and ninety-six TCRL test cases

and eleven thousand, six hundred and seventy-seven ICS features,

queryable in plain English.

 

And the Natural Language Interface that ties it all together.

You ask Amber the way you’d ask a knowledgeable colleague.

It figures out where the answer lives,

retrieves it,

and presents a cited, synthesized response.
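For the technically minded, that flow can be sketched in a few lines. The pillar names, keyword rules, and toy corpus below are illustrative assumptions, not Amber’s production routing logic:

```python
# Minimal sketch of the ask flow: route the question to a data
# pillar, pull passages from that source, then have the model
# write a cited answer. Everything here is a toy stand-in.

CORPUS = {
    "specifications": [{"section": "Core Spec, Vol 6, Part B",
                        "text": "Advertising physical channel PDU..."}],
    "products": [{"section": "Qualification API",
                  "text": "Live qualified-product records..."}],
    "qualification": [{"section": "TCRL",
                       "text": "Applicable test cases..."}],
}

def route(question: str) -> str:
    """Pick the data pillar most likely to hold the answer."""
    q = question.lower()
    if any(w in q for w in ("qualified", "product", "listing")):
        return "products"
    if any(w in q for w in ("test case", "tcrl", "ics")):
        return "qualification"
    return "specifications"

def answer(question: str) -> dict:
    pillar = route(question)
    passages = CORPUS[pillar]   # stand-in for real retrieval
    return {"pillar": pillar,
            "citations": [p["section"] for p in passages]}
```

The real router is multi-stage rather than keyword-based, but the shape is the same: decide where the answer lives, retrieve, then synthesize with citations.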

 

The key word is purpose-built.

We didn’t put a ChatGPT wrapper on the SIG website.

This system was designed from the ground up for Bluetooth domain accuracy.

— SLIDE 4: THE CHALLENGE AMBER SOLVES —

 

To understand why this matters,

it helps to be specific about the pain we’re solving.

 

Volume overload.

Thirty-seven thousand pages across two hundred-plus specifications.

When a developer needs a specific conformance requirement,

they often don’t know which of the two hundred documents to start in.

By the time they find the right section,

they’ve already spent an hour.

Or given up and emailed us.

 

Complex processes.

Qualification isn’t one document.

It’s TCRL test plans, ICS forms, lab requirements,

fee schedules, and submission timelines —

all separate, all maintained independently.

Members get lost.

 

Expertise silos.

The people who know how this all fits together

are a handful of SIG staff.

That’s a bottleneck.

It doesn’t scale.

And it doesn’t serve members in different time zones

who need answers at two in the morning.

 

Amber doesn’t replace those experts.

It encodes their knowledge into a system

that’s available twenty-four hours a day,

to any authenticated member,

in any language.

— SLIDE 5: AMBER IN THE SIG ECOSYSTEM —

 

One question we get from technical audiences:

does Amber replace any of our existing tools?

 

The answer is explicitly no.

 

Amber is an AI intelligence layer

that sits on top of the tools and data we already have.

 

The SIG already owns the specifications,

Qualification Workspace,

Profile Tuning Suite,

the products database,

TCRL documentation,

Zendesk knowledge base articles,

the member portal.

 

Amber doesn’t replace any of those.

It makes them accessible through a single natural language interface.

 

A member no longer needs to know

that the answer to their question lives in a PTS document

versus a Zendesk article

versus a spec section.

They ask Amber.

Amber figures out where the answer lives,

retrieves it,

and presents a cited, synthesized response.

 

Available twenty-four seven.

Without requiring the member to know the SIG’s tool landscape in advance.

— SLIDE 6: HOW PROJECT AMBER IS BUILT —

 

This slide is about engineering discipline,

and I want to give it proper weight

because it’s what separates a production-ready system from a demo.

 

Development workflow.

Alex Y authors the code.

I review every change as a GitHub pull request before it merges.

Deployments use version-tagged container images in Azure Container Registry —

every version is rollback-ready.

Nothing goes live until it’s been tested against the live system.

Standard professional software development practice, applied consistently.

 

AI-assisted but human-validated.

We use Claude, ChatGPT, and Copilot throughout development.

They accelerate the work significantly.

But AI-assisted development doesn’t mean AI-governed quality.

Every architectural decision is validated against the evaluation harness.

The eighty-two percent win rate is reproducible:

same test set, same questions, blind scoring.

We don’t make changes based on intuition.

We make them because the numbers move in the right direction.

 

Microsoft partnership.

Microsoft’s development services team is the production build partner.

They’ve reviewed the architecture, they understand the codebase,

and the engagement model has been reviewed and approved by SIG IT for compliance.

SIG retains full product ownership —

roadmap, requirements, acceptance criteria.

Microsoft leads the infrastructure engineering.

 

The result is a system that isn’t demo theater.

It’s a production candidate that Microsoft’s team can take and run with.

— SLIDE 7: THE CHAPTER FRAMEWORK —

 

We deliver Amber in chapters —

coherent releases of capability,

each mapped to a specific member need and audience.

Not engineering milestones.

Member value milestones.

 

Chapter One — now.

Specification and Product Intelligence.

Live today in pre-production.

Natural language Q and A across all SIG specifications,

product database lookup,

and qualification guidance.

Eighty-two percent win rate against ChatGPT

across two hundred and seventeen domain-specific questions.

This is what you’ll see in the demo.

 

Chapter Two — next.

Test and Spec Authoring Assistants.

We’ve split this into two parallel tracks.

 

Chapter Two-A targets test engineers.

AI validation of test cases against specs, ICS tables, and errata in real time,

flagging conflicts before they reach committee review.

 

Chapter Two-B targets spec authors.

AI-assisted writing with inline conflict detection

within an AsciiDoc-based authoring environment.

The AsciiDoc migration is a hard prerequisite for member-facing launch.

 

Chapter Three — roadmap.

Qualification Workflow Assistant.

Amber embedded directly inside Qualification Workspace —

guiding members through the entire qualification journey,

context-aware and step by step.

 

The chapter model gives the board a clear line of sight.

Chapter One is the production investment decision

we’re asking you to approve today.

Chapters Two and Three follow from that foundation.

— SLIDE 8: HOW THE CHAPTERS CONNECT —

 

This diagram is the architectural key

to understanding Amber’s long-term design.

Everything flows through AsciiDoc in GitHub.

 

AsciiDoc in GitHub is the specification and test source of truth.

It’s machine-readable, version-controlled structured content

that enables all chapters to both read from

and write back to the same foundation.

 

Chapter One reads from the spec and test corpus to answer member questions.

Chapter Two-A writes validated test content back to that corpus.

Chapter Two-B writes structured spec content.

Chapter Three reads from it to guide members through qualification.

 

The AsciiDoc dependency matters.

Until specs are in machine-readable structured format,

AI can read them but cannot reliably write back to them.

Project Blue is building that foundation.

Amber is the intelligence layer on top.

— SLIDE 9: CHAPTER 1 DEEP DIVE —

 

Chapter One is live now.

Let me walk through what’s actually running.

 

The design principle was universal audience.

Marketers, engineers, product managers, qualification staff, members —

anyone should be able to get a useful answer

without needing Bluetooth expertise to formulate the question.

That’s a harder problem than building a system for experts.

We solved it with multi-stage routing.

 

Three capability pillars.

 

Specification Search.

Two hundred-plus specifications, thirty-seven thousand pages,

forty-one thousand, seven hundred and twenty-one semantic chunks.

A member asks a spec question.

The system runs a vector similarity search across those chunks,

retrieves the most relevant sections,

and synthesizes a cited response.

The citations link back to the specific section in the actual specification.

Eighty-eight percent accuracy in our evaluation.
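As a sketch of that retrieval step: rank pre-embedded chunks by cosine similarity to the query embedding. The three-dimensional vectors and section labels below are toy stand-ins; the real index holds the 41,721 chunks embedded by the production model:

```python
# Rank spec chunks by cosine similarity to a query vector.
# Vectors and section labels are illustrative toys.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

CHUNKS = [
    ("Vol 6, Part B, Sec 2.3 (advertising PDUs)", [0.9, 0.1, 0.0]),
    ("Vol 3, Part C (discovery modes)",           [0.5, 0.5, 0.1]),
    ("Assigned Numbers (AD types)",               [0.2, 0.1, 0.9]),
]

def top_k(query_vec, k=2):
    # Highest-similarity chunks become the context for synthesis,
    # and their section labels become the citations.
    ranked = sorted(CHUNKS, key=lambda c: cosine(query_vec, c[1]),
                    reverse=True)
    return [section for section, _ in ranked[:k]]
```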

 

Product Intelligence.

Seventy thousand-plus qualified products,

accessed via live API call at query time — not a cached index.

This is why we hit ninety-four percent accuracy on product lookups.

ChatGPT’s training data for Bluetooth products is months to years stale.

Our data is current as of the moment the member asks.
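The pattern is simple but worth making concrete. In the sketch below, fetch_products is a stand-in for an authenticated HTTP call to the SIG’s Qualification API; the company name and listing records are hypothetical:

```python
# Live-lookup pattern: answer product questions with a query-time
# call, not with anything frozen into model weights.
# fetch_products and its records are hypothetical stand-ins.

def fetch_products(company: str) -> list[dict]:
    # In production this is an API request made at the moment the
    # member asks, so the result is never a stale snapshot.
    LIVE_DB = {"ExampleCo": [{"listing": "D000001", "name": "Widget"},
                             {"listing": "D000002", "name": "Gadget"}]}
    return LIVE_DB.get(company, [])

def product_count(company: str) -> int:
    return len(fetch_products(company))
```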

 

Qualification Guidance.

Sixteen thousand, four hundred and ninety-six TCRL test cases

and eleven thousand, six hundred and seventy-seven ICS features,

indexed and queryable.

A developer asks what tests they need for a BLE audio device

and gets a structured answer pointing to the specific TCRL requirements.

Seventy-nine percent accuracy —

the lowest of the three, and the area with the most optimization headroom.

 

Response latency came down from twenty-three-and-a-half seconds

at the start of evaluation

to under ten seconds

through async parallelization and model routing.
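The parallelization half of that is easy to illustrate: independent retrieval calls run concurrently, so wall time is roughly the slowest call instead of the sum of all of them. Stage names and delays below are illustrative, not Amber’s actual pipeline:

```python
# Concurrent retrieval with asyncio.gather: three independent
# lookups overlap, so total latency is ~max(delays), not sum(delays).
import asyncio
import time

async def stage(name: str, seconds: float) -> str:
    await asyncio.sleep(seconds)   # stands in for a search or API call
    return name

async def pipeline() -> list[str]:
    return await asyncio.gather(
        stage("spec_search", 0.1),
        stage("product_api", 0.1),
        stage("tcrl_index", 0.1),
    )

start = time.monotonic()
results = asyncio.run(pipeline())
elapsed = time.monotonic() - start   # ~0.1s here, not 0.3s
```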

— SLIDE 10: CHAPTER 2 — TEST & SPEC ASSISTANTS —

 

Chapter Two moves Amber from read-only intelligence

to active authoring assistance.

Two distinct audiences with separate delivery tracks.

 

Chapter Two-A — Test Authoring.

A test engineer writes a test case.

Amber validates it in real time

against the relevant specs, ICS tables, and errata.

Conflicts are flagged before the case reaches committee review —

not after, when the cost of fixing them is much higher.

The same validation agents that run inline during authoring

also run on commit,

so you get a CI pipeline for test content.
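A minimal sketch of that commit gate, assuming simplified rules (every test case must name the spec section and ICS feature it covers); the real validation agents are far richer:

```python
# "CI for test content": the same checks that run inline during
# authoring also gate the commit. Rules here are simplified
# assumptions, not the real validation agents.

def validate(test_case: dict) -> list[str]:
    """Return a list of problems; empty means the case is clean."""
    problems = []
    if not test_case.get("spec_section"):
        problems.append("missing spec section reference")
    if not test_case.get("ics_feature"):
        problems.append("missing ICS feature mapping")
    return problems

def commit_gate(test_cases: list[dict]) -> bool:
    # Reject the commit if any case fails, so conflicts are caught
    # before committee review rather than after.
    return all(not validate(tc) for tc in test_cases)
```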

 

Chapter Two-B — Spec Authoring.

A spec author writes in AsciiDoc.

Amber checks consistency, completeness, and cross-spec conflicts inline —

in the editor, not in review.

AsciiDoc plus Git makes this possible:

machine-readable, diff-friendly structured content

that Amber can reason about semantically.

AsciiDoc migration is the hard prerequisite for member-facing launch.

 

The goal isn’t to replace committee review.

It’s to ensure that what reaches committee review is already clean.

— SLIDE 11: CHAPTER 3 — QUALIFICATION WORKFLOW —

 

Chapter Three is the roadmap item,

but the vision is worth framing now

because it closes the member journey that Chapter One starts.

 

A member today navigates qualification

through a combination of Qualification Workspace,

documentation, staff emails, and trial and error.

 

Chapter Three puts Amber directly inside Qualification Workspace.

Context-aware.

Embedded.

Step-by-step guidance for the entire qualification process.

 

Test selection,

ICS form completion,

fee guidance,

lab requirements,

submission —

all in plain language,

aware of what the member is currently doing in the interface.

 

Faster time-to-qualification for every member,

regardless of experience level.

SIG support capacity scales without adding headcount.

The qualification journey becomes something

that doesn’t require a specialist to navigate.

— SLIDE 12: AMBER & PROJECT BLUE —

 

This is the slide I want the board to spend real time on,

because it answers the strategic question:

where does Amber sit in the SIG’s long-term architecture?

 

Project Blue modernizes the SIG’s tools infrastructure.

The key deliverable that matters for Amber is Test-Documents-as-Code —

AsciiDoc in GitHub.

Structured, machine-readable, version-controlled

specification and test content.

 

Project Amber is the conversational intelligence layer

that consumes Blue’s outputs.

 

Without Blue’s structured data,

Amber can read specs and answer questions — which is valuable.

With Blue’s structured data,

Amber can validate, author, and reason about spec content

at a much higher level of precision.

That’s Chapters Two and Three.

 

The relationship is directional and additive:

Blue builds the foundation,

Amber delivers the intelligence on top.

Neither replaces the other.

 

Blue and Amber are not competing investments.

They’re complements.

Investing in one accelerates the value of the other.

— SLIDE 13: ROADMAP & MILESTONES —

 

The chart shows where we are and where we’re going.

Two milestone markers anchor the timeline.

 

Bangkok — March twenty-twenty-six.

That’s today.

Bangkok is not the finish line —

it’s the transition point from pre-production to production.

Chapter One moves from internal soft launch

to full member-facing launch preparation.

Microsoft handoff begins.

Chapter Two scoping starts.

 

Milan — June twenty-twenty-six.

Target for full Chapter One member launch.

This is the date we’re building toward.

 

The infrastructure row at the bottom is important.

It shows that the pre-production platform was intentionally lightweight —

ChromaDB, Claude API, a fast-iteration environment.

Production is Azure-native:

managed services, enterprise security, proper CI/CD.

 

Thirty percent of the POC transfers:

the domain routing logic, the curated corrections, the terminology knowledge.

The other seventy percent is infrastructure replatforming.

And that is Microsoft’s job.

— SLIDE 14: LIVE DEMO —

 

This is the part of the presentation

where I stop talking and show you the system.

 

What you’re about to see is the live pre-production environment —

the same system that’s been running for months.

The data is real, the responses are real,

and nothing in the demo is scripted or pre-staged.

What you see is what a member would see.

 

We’ll run four scenarios.

I’ll frame each one, run the query live,

and walk through the response.

 

— DEMO 1: SPECIFICATION SEARCH —

 

First scenario — specification search.

I’m asking:

what are the advertising packet format requirements for Bluetooth Low Energy?

 

Notice a few things.

The response is cited.

It tells you exactly which specification

and which section the answer comes from.

You can verify it independently.

And the answer is synthesized —

it understands the question and explains it.

 

— DEMO 2: PRODUCT DATABASE —

 

Second scenario — product database.

I’m asking:

how many Bluetooth products does Apple have qualified?

 

This is live API data.

Not a cached index, not training data.

The number you just saw is what’s in the SIG’s qualification database

at this moment.

 

ChatGPT cannot do this.

Its product knowledge is from training,

which means it’s at minimum months out of date, and often years.

We hit ninety-four percent accuracy on product lookups in our evaluation

precisely because we’re calling the live API.

 

— DEMO 3: QUALIFICATION GUIDANCE —

 

Third scenario — qualification guidance.

I’m continuing in the same conversation,

so watch how Amber handles the context from the previous question.

 

I’m asking:

what test cases do I need to qualify a Bluetooth Low Energy audio device?

 

Two things to note.

First, the content:

this is pulling from sixteen thousand, four hundred and ninety-six TCRL test cases

and mapping them to a plain-English question.

The member doesn’t need to know the test plan structure.

Amber handles that translation.

 

Second, the context:

Amber remembered what we were talking about in the previous question.

It maintained the thread.

 

— DEMO 4: TECHNICAL DEEP DIVE —

 

Fourth scenario — cross-spec synthesis.

I’m asking:

explain the difference between BR/EDR and LE connection establishment procedures.

 

This question requires synthesizing content

from multiple sections of the Core Specification —

the BR/EDR connection procedure and the LE connection procedure

are in different volumes.

 

Amber retrieves relevant chunks from both

and synthesizes a coherent comparative answer.

That cross-spec synthesis is something a member

would otherwise need to do manually,

flipping between different parts of a very large document.

 

We’re happy to take questions from the board and run them live.

That’s the most direct test of the system —

your domain questions, unscripted.

— SLIDE 15: EVALUATION RESULTS —

 

The demo gives you qualitative confidence.

This slide gives you the quantitative basis.

 

Two hundred and seventeen domain-specific questions.

Questions drawn from actual Zendesk member tickets and SME input.

Evaluated blind against ChatGPT —

the evaluator sees two answers

without knowing which system produced which one.

Independent scoring.
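For anyone who wants the mechanics, the blind setup can be sketched like this. The labels and judge are placeholders; the real scoring was done by an independent evaluator over the full question set:

```python
# Blind pairwise scoring: the judge sees two anonymized answers in
# shuffled order and never learns which system produced which.
import random

def blind_pair(amber_answer: str, chatgpt_answer: str,
               rng: random.Random):
    """Shuffle the pair; return what the judge sees plus the hidden key."""
    pair = [("amber", amber_answer), ("chatgpt", chatgpt_answer)]
    rng.shuffle(pair)
    shown = {"Answer 1": pair[0][1], "Answer 2": pair[1][1]}
    key = {"Answer 1": pair[0][0], "Answer 2": pair[1][0]}
    return shown, key

def win_rate(judge_picks: list[str], keys: list[dict]) -> float:
    """Fraction of blind comparisons where the judge's pick was Amber's."""
    wins = sum(1 for pick, key in zip(judge_picks, keys)
               if key[pick] == "amber")
    return wins / len(judge_picks)
```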

 

Product lookup — ninety-four percent.

Live API versus stale training data.

This advantage is structural, not tunable.

 

Spec accuracy — eighty-eight percent.

Domain-indexed RAG versus general knowledge.

Purpose-built wins here.

 

Response time — ninety-two percent.

Latency fell from twenty-three-and-a-half seconds at the start of evaluation

to under ten seconds average.

 

TCRL guidance — seventy-nine percent.

The lowest category, and the most honest number.

This is where we have room to grow,

and it’s the focus of ongoing optimization before member launch.

 

The overall win rate is eighty-two percent.

Before member launch, the evaluation set expands to two hundred and fifty questions

with SME-validated ground truth.

The test harness is reproducible —

same questions, same methodology, blind scoring.

We don’t manage the number.

We run the test.

— SLIDE 16: ROAD TO PRODUCTION —

 

Three phases, each with clear criteria.

 

Now to Bangkok.

SME review of the two-hundred-and-fifty-question evaluation harness is underway.

Domain experts validating answers, not just scoring wins.

Security review complete.

Chapter One feature-complete for the demo you just saw.

 

Post-Bangkok.

Microsoft leads infrastructure.

The handoff model is thirty-seventy.

Approximately thirty percent of what we’ve built —

the domain routing logic, the curated corrections, the terminology glossary —

transfers directly.

The other seventy percent is infrastructure replatforming

onto Azure managed services.

We retain product ownership throughout.

Microsoft leads the engineering.

Chapter Two scoping runs in parallel so there’s no momentum gap.

 

Production.

Microsoft teams lead infrastructure.

SIG manages roadmap, requirements, and acceptance criteria.

Member-facing rollout at Milan.

 

The principle at the bottom of this slide

is the team’s operating commitment,

and I’d ask the board to hold us to it:

 

Production-ready beats demo theater.

 

Every chapter must meet measurable criteria before it’s declared complete.

We are not shipping for the sake of a deadline.

 

To close where we opened:

 

This is not a concept or a roadmap.

It’s a working system.

 

Bangkok is the transition from pre-production to production.

 

Milan is the member launch.

 

Thank you.

We’re happy to take questions.