Sound ID — tdlidar_sound
Semantic audio triggers — react to a clap, music, applause or speech by name, going beyond a raw beat.
Category: Audio · Tier: Free · Needs: microphone access on the phone
What it does
Classifies what the phone’s microphone is hearing into one of 300+ everyday sound classes and streams the winning label (e.g. “clapping”, “music”, “speech”, “dog”) plus a confidence score. Where the Audio op only knows how loud and what frequency, Sound ID knows what the sound is, so you can fire visuals on meaning — applause launches confetti, music switches to a beat-reactive scene, silence falls back to ambient.
The label is a string, so it rides an OSC In DAT; the confidence is a number on a small CHOP.
OSC in
| address | type | range | rate |
|---|---|---|---|
| /tdlidar/sound/label | string | one of 300+ class names | on change |
| /tdlidar/sound/confidence | float | 0–1 | on change |
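As a rough illustration of how the two addresses above might be consumed, the sketch below keeps the latest label and confidence in a small state object. `SoundState` and `handle_osc` are hypothetical names, not part of the component. It assumes each value arrives as the first argument of its own message. Inside TouchDesigner, a function like this could be called from the OSC In DAT's `onReceiveOSC` callback using its `address` and `args` arguments.

```python
from dataclasses import dataclass

LABEL_ADDR = "/tdlidar/sound/label"
CONF_ADDR = "/tdlidar/sound/confidence"


@dataclass
class SoundState:
    """Latest values seen on the two Sound ID addresses (hypothetical helper)."""
    label: str = ""
    confidence: float = 0.0


def handle_osc(state: SoundState, address: str, args: list) -> bool:
    """Update state from one OSC message; return True if the message was used."""
    if not args:
        return False
    if address == LABEL_ADDR:
        state.label = str(args[0])
        return True
    if address == CONF_ADDR:
        # Clamp to the documented 0-1 range in case of a stray value.
        state.confidence = min(1.0, max(0.0, float(args[0])))
        return True
    return False


state = SoundState()
handle_osc(state, LABEL_ADDR, ["clapping"])
handle_osc(state, CONF_ADDR, [0.82])
print(state.label, state.confidence)
```

Because both addresses send only on change, the label and the confidence may update at different moments; keeping them in one state object lets downstream logic read a consistent pair.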
Outputs
- `out_label` (Text DAT): the current sound class name, written by the OSC In DAT’s callback.
- `out1` (CHOP): one channel, `tdlidar/sound/confidence` (0–1), how sure the classifier is.
Parameters
| par | default | what it does |
|---|---|---|
| OSC Port | 9000 | UDP port to listen on (match the app) |
Quick start (beginner)
- Enable Sound ID in the app and grant microphone access.
- Drop `tdlidar_sound`. The OSC In DAT callback writes the label into `out_label`; `out1` carries the confidence.
- Clap, play music, or talk near the phone: `out_label` changes to name it and `out1` shows how confident it is.
- Wire `out_label` into a Text TOP to display the live sound name.
Advanced patterns
- Semantic trigger (the point): in the OSC In DAT callback (or a downstream Select DAT filtering by label), test for a class such as `clapping`, `music` or `cheering`, and pulse a trigger when it matches. Gate it on `out1` (confidence) with a Logic CHOP set to greater than ~0.5 so a faint mistaken guess doesn’t fire your show.
- Confidence as intensity: map `out1` through a Math/Range CHOP and use it as the strength of the effect the label chose, so a confident “applause” lands bigger than a hesitant one.
- Debounce flapping: the label can wobble between similar classes. Hold the last confident label with a DAT Execute (only overwrite when confidence clears the threshold), or require the same label for N updates before acting.
- Scene routing: build a small Lookup/Select DAT mapping class names → scene indices, convert to a CHOP value, and drive a Switch TOP so the room’s sound picks the look.
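The gating, debouncing and scene-routing patterns above can be sketched together in plain Python. This is a minimal sketch under stated assumptions, not the component's own logic: `LabelDebouncer`, `SCENES`, `AMBIENT_SCENE` and `scene_for` are hypothetical names, the 0.5 threshold follows the suggestion above, and the scene indices are made up for illustration. It combines both debounce ideas: a low-confidence update never overwrites the held label, and a new label must repeat for `required` confident updates before it is accepted.

```python
class LabelDebouncer:
    """Accept a label only after `required` confident updates in a row."""

    def __init__(self, threshold: float = 0.5, required: int = 3):
        self.threshold = threshold
        self.required = required
        self.stable = None      # last accepted label
        self._candidate = None  # label currently being counted
        self._count = 0

    def update(self, label: str, confidence: float) -> bool:
        """Feed one (label, confidence) pair; return True when the stable label changes."""
        if confidence <= self.threshold:
            # Low-confidence guess: keep the held label and reset the streak.
            self._candidate, self._count = None, 0
            return False
        if label == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = label, 1
        if self._count >= self.required and label != self.stable:
            self.stable = label
            return True
        return False


# Hypothetical class-name -> scene-index table (e.g. to drive a Switch TOP index).
SCENES = {"clapping": 1, "music": 2, "cheering": 3}
AMBIENT_SCENE = 0  # fallback for any label without a scene of its own


def scene_for(label) -> int:
    """Look up the scene index for a label, falling back to ambient."""
    return SCENES.get(label, AMBIENT_SCENE)


deb = LabelDebouncer(threshold=0.5, required=2)
updates = [("music", 0.9), ("speech", 0.3), ("music", 0.8), ("music", 0.7)]
for label, conf in updates:
    if deb.update(label, conf):
        print("switch to scene", scene_for(deb.stable))
```

In this run the faint “speech” guess resets the streak, so the scene only switches once “music” has been reported confidently twice in a row. Raising `required` trades responsiveness for stability.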
Gotchas
- Two different operators: the label is a string on an OSC In DAT; the confidence is a float on a CHOP. Don’t try to read the label from a CHOP — it won’t appear.
- Always gate on `out1` confidence. The classifier always emits some label; a low-confidence guess is noise, not a cue.
- Labels change on the fly and can chatter between near-neighbours (e.g. “speech” ↔ “conversation”); debounce before firing anything destructive like a scene cut.
- It’s a microphone in a real room — overlapping sounds and bleed will confuse it; design cues around clear, dominant events (a big clap, sustained music), not subtle ones.
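One possible way to soften near-neighbour chatter is to collapse related class names into a single cue before reacting, so that a flip between “speech” and “conversation” does not register as a change. The sketch below is illustrative only: `CUE_GROUPS`, `cue_for` and `cue_changes` are hypothetical names, and which classes belong together is a show-design decision rather than something the classifier provides.

```python
# Hypothetical grouping; only the speech/conversation pair comes from the
# example in the list above. Adjust to the classes your show cares about.
CUE_GROUPS = {
    "speech": "talking",
    "conversation": "talking",
    "clapping": "applause",
    "cheering": "applause",
}


def cue_for(label: str) -> str:
    """Collapse near-neighbour class names into one cue; unknown labels pass through."""
    return CUE_GROUPS.get(label, label)


def cue_changes(labels):
    """Yield a cue only when it differs from the previous cue."""
    previous = None
    for label in labels:
        cue = cue_for(label)
        if cue != previous:
            previous = cue
            yield cue


print(list(cue_changes(["speech", "conversation", "speech", "music", "music"])))
```

Grouping does not replace confidence gating: a low-confidence label mapped to a cue is still a guess, so this step would normally sit after the gate.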