6557 Commits

Author SHA1 Message Date
mk
c42274a511 fix: tighten spacing between VAD enable checkbox and mode radio buttons
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 08:00:14 -03:00
mk
9fc9655dbb fix: proper radio buttons for VAD mode, standard=16ms/aggressive=10ms
- Use compound-web Form/InlineField/RadioControl/Label/HelpMessage for
  VAD mode selection (proper radio button rendering)
- Standard mode: 256 samples / 16 ms hop + 5 ms open / 20 ms close ramp
- Aggressive mode: 160 samples / 10 ms hop + 1 ms open / 5 ms close ramp
- Worklet stores WebAssembly.Module and recreates TenVADRuntime with the
  correct hop size whenever the mode changes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:52:52 -03:00
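The per-mode numbers listed in commit 9fc9655dbb can be collected into a small lookup table. This is only a sketch: the type and constant names (`VadMode`, `VadModeParams`, `VAD_MODES`, `hopMs`) are hypothetical and are not taken from the repository. Only the sample counts, ramp times and the 16 kHz rate come from the commit messages.

```typescript
// Hypothetical names; the numeric values are the ones quoted in the commit message.
type VadMode = "standard" | "aggressive";

interface VadModeParams {
  hopSamples: number; // samples per VAD decision at 16 kHz
  openRampMs: number; // gain ramp applied when the gate opens
  closeRampMs: number; // gain ramp applied when the gate closes
}

const VAD_SAMPLE_RATE_HZ = 16000;

const VAD_MODES: Record<VadMode, VadModeParams> = {
  standard: { hopSamples: 256, openRampMs: 5, closeRampMs: 20 },
  aggressive: { hopSamples: 160, openRampMs: 1, closeRampMs: 5 },
};

// Time between two VAD decisions, in milliseconds.
function hopMs(mode: VadMode): number {
  return (VAD_MODES[mode].hopSamples * 1000) / VAD_SAMPLE_RATE_HZ;
}
```

With these values `hopMs("standard")` evaluates to 16 and `hopMs("aggressive")` to 10, matching the "standard=16ms/aggressive=10ms" summary in the subject line.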
mk
e95e613c08 feat: add VAD mode setting — standard vs aggressive latency
Standard: 5 ms open / 20 ms close ramp (comfortable feel)
Aggressive: 1 ms open / 5 ms close ramp (lowest possible latency)

The mode is surfaced as a radio selector in Settings → Audio → Voice
activity detection, visible while VAD is enabled. Wired through
NoiseGateParams.vadAggressive → worklet updateParams.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:48:41 -03:00
mk
025735c490 perf: reduce TEN-VAD latency from 16 ms to 10 ms, asymmetric gate ramp
- Hop size 256 → 160 samples @ 16 kHz: VAD decision every 10 ms instead
  of 16 ms (minimum supported by TEN-VAD)
- Asymmetric VAD ramp: 5 ms open (was 20 ms) to avoid masking speech onset,
  20 ms close retained for de-click on silence

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:44:47 -03:00
mk
dc1f30b84f feat: replace Silero VAD with TEN-VAD running inside the AudioWorklet
TEN-VAD (official TEN-framework/ten-vad WASM, no npm dependency) replaces
@ricky0123/vad-web. The WASM module is compiled once on the main thread and
passed to the AudioWorklet via processorOptions, where it is instantiated
synchronously and called every 16 ms with no IPC round-trip.

- Add public/vad/ten_vad.{wasm,js} from official upstream lib/Web/
- NoiseGateProcessor: TenVADRuntime class wraps the Emscripten WASM with
  minimal import stubs; 3:1 decimation accumulates 256 Int16 samples @
  16 kHz per hop; hysteresis controls vadGateOpen directly in-worklet
- NoiseGateTransformer: fetch+compile WASM once (module-level cache),
  pass WebAssembly.Module via processorOptions; remove setVADOpen()
- Publisher: remove all SileroVADGate lifecycle (init/start/stop/destroy,
  rawMicTrack capture); VAD params folded into single combineLatest;
  fix transient suppressor standalone attach (shouldAttach now includes
  transientSuppressorEnabled)
- vite.config.ts: remove viteStaticCopy, serveVadAssets plugin, and all
  vad-web/onnxruntime copy targets (public/vad/ served automatically)
- Remove @ricky0123/vad-web, onnxruntime-web deps and resolution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 07:43:52 -03:00
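The "3:1 decimation accumulates 256 Int16 samples" step in commit dc1f30b84f could look roughly like the sketch below. It assumes a 48 kHz Float32 input (which is what a 3:1 ratio down to 16 kHz implies) and uses a plain three-sample average as the decimator. The commit does not say which filter the worklet uses, so the averaging and the `HopAccumulator` name are assumptions made for illustration.

```typescript
// Hypothetical helper: collects 48 kHz float samples into 16 kHz Int16 hops.
class HopAccumulator {
  private readonly hop: Int16Array;
  private readonly onHop: (hop: Int16Array) => void;
  private filled = 0; // Int16 samples written into the current hop
  private sum = 0; // running sum of the current group of three inputs
  private phase = 0; // position inside the current group of three (0..2)

  constructor(hopSize: number, onHop: (hop: Int16Array) => void) {
    this.hop = new Int16Array(hopSize);
    this.onHop = onHop;
  }

  // Feed one block of input; block length need not be a multiple of 3.
  push(input: Float32Array): void {
    for (let i = 0; i < input.length; i++) {
      this.sum += input[i];
      this.phase++;
      if (this.phase < 3) continue;

      // Average three inputs into one output sample (crude low-pass, an assumption).
      const avg = this.sum / 3;
      this.sum = 0;
      this.phase = 0;

      // Clamp to [-1, 1] and scale to the Int16 range.
      const clamped = Math.max(-1, Math.min(1, avg));
      this.hop[this.filled++] = Math.round(clamped * 32767);

      if (this.filled === this.hop.length) {
        // The same buffer is reused for the next hop, so consumers must read it immediately.
        this.onHop(this.hop);
        this.filled = 0;
      }
    }
  }
}
```

With a hop size of 256, six 128-sample blocks (768 input samples) produce exactly one hop, which corresponds to the 16 ms decision interval mentioned in the commit.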
mk
dbd4eef899 feat: decouple noise gate and VAD, pre-warm model for instant enable
Noise gate and Silero VAD now work fully independently — the worklet
attaches when either is enabled and bypasses the amplitude gate when
only VAD is on (noiseGateActive flag). SileroVADGate gains a two-phase
lifecycle: init(ctx) loads the ONNX model eagerly when the AudioContext
is first created; start(stream) is then near-instant when the user
enables VAD. stop() pauses without unloading the model so re-enabling
is also instant. VAD checkbox no longer requires the noise gate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 00:15:32 -03:00
mk
325094b54d fix: feed VAD the raw mic track captured before setProcessor
After setProcessor resolves, track.mediaStreamTrack returns the processed
(noise-gated) track. The VAD was seeing gated silence, closing immediately,
and deadlocking with both gates closed. Capture the raw MediaStreamTrack
before calling setProcessor and pass that to SileroVADGate instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 00:06:47 -03:00
mk
aff09d0e49 fix: use Silero v5 model for 32ms frames and lower default thresholds
The legacy model is hardcoded to 1536 samples (96ms frames); v5 uses 512
samples (32ms), reducing gate open latency by 3x. Also lower default
positive/negative thresholds to 0.2/0.1 so the gate opens at the first
sign of speech rather than waiting for high model confidence.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 00:02:17 -03:00
mk
859db651e0 feat: add VAD threshold controls and smooth gate ramp
Replace the hard 0/1 VAD gate with a 20ms ramp in the worklet to prevent
clicks on open/close transitions. Expose positive and negative speech
probability thresholds as user-adjustable settings (defaults 0.5/0.35).
Sliders with restore-defaults button added to the VAD section of the
audio settings tab.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:57:35 -03:00
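The ramp described in commit 859db651e0 replaces a hard 0/1 switch with a gain that moves linearly toward its target over a fixed time. A minimal per-sample sketch follows. The `GateRamp` class and its linear shape are assumptions; the commit only states that a 20 ms ramp is used.

```typescript
// Hypothetical linear gain ramp, evaluated once per output sample.
class GateRamp {
  private gain = 1; // current gain in [0, 1]; starts fully open
  private readonly step: number; // gain change per sample

  constructor(rampMs: number, sampleRate: number) {
    const rampSamples = Math.max(1, Math.round((rampMs / 1000) * sampleRate));
    this.step = 1 / rampSamples;
  }

  // Move one step toward 1 (open) or 0 (closed) and return the new gain.
  next(open: boolean): number {
    const target = open ? 1 : 0;
    if (this.gain < target) {
      this.gain = Math.min(target, this.gain + this.step);
    } else if (this.gain > target) {
      this.gain = Math.max(target, this.gain - this.step);
    }
    return this.gain;
  }
}
```

Each output sample would then be multiplied by `next(vadOpen)`. At 48 kHz a 20 ms ramp spans 960 samples, so the gain is near 0.5 after 480 closed samples and reaches 0 shortly after 960.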
mk
1ffee2d25e fix: serve VAD assets from node_modules in dev mode
vite-plugin-static-copy only copies files at build time; in dev the /vad/
requests fell through to the SPA 404 handler, returning text/html which
caused the WASM magic-number validation error. Add a configureServer
middleware that serves the worklet bundle, ONNX model, and WASM files
directly from node_modules with correct MIME types during development.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:50:55 -03:00
mk
4a58277090 fix: force onnxruntime-web@1.18.0 via resolutions to eliminate nested 1.24.3
vad-web's own dependency was resolved to ort@1.24.3 (nested in its
node_modules), which only has threaded WASM requiring a .mjs dynamic
import that Vite fails to serve correctly. Pin ort to 1.18.0 via yarn
resolutions so all packages share the same copy with ort-wasm-simd.wasm
(non-threaded SIMD). Also remove the now-unnecessary COOP/COEP headers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:48:44 -03:00
mk
f2988cd689 fix: downgrade onnxruntime-web to 1.18 for non-threaded SIMD WASM
ort 1.19+ dropped non-threaded WASM binaries and replaced them with a
threaded .mjs loader that Vite's dev server fails to serve correctly
(wrong MIME type / transform interception). ort 1.18 ships ort-wasm-simd.wasm
which works with numThreads=1 and needs no .mjs dynamic import.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:45:05 -03:00
mk
b25cec3aa0 fix: copy ort .mjs file, add COOP/COEP headers, set numThreads=1
The threaded ORT WASM requires ort-wasm-simd-threaded.mjs to be served
alongside the .wasm files, and needs SharedArrayBuffer (COOP/COEP headers).
Add the .mjs to the static copy targets, add the required headers to the
Vite dev server, and set ort.env.wasm.numThreads=1 as a single-threaded
fallback that avoids the SharedArrayBuffer requirement entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:41:51 -03:00
mk
edd1e1d34e fix: start VAD gate open to avoid permanent silence on model load failure
Starting the gate closed caused permanent silence if the ONNX model or
WASM files failed to load (onFrameProcessed never fired). Gate now starts
open so audio flows immediately; the first silence frame closes it. Also
ensures the gate is always reset to open when VAD is disabled.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:37:36 -03:00
mk
9f5b639190 fix: switch VAD gate to per-frame probability control
onSpeechStart/onSpeechEnd fire at segment boundaries — with constant
non-speech noise, onSpeechEnd never fires so the gate stayed open.
Switch to onFrameProcessed which fires every ~96ms and applies hysteresis
(open at >0.5, close at <0.35) matching Silero's own thresholds. Gate now
starts closed and opens only once the first speech frame is confirmed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:34:08 -03:00
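The per-frame hysteresis from commit 9f5b639190 (open above 0.5, close below 0.35, start closed) can be sketched as a two-threshold state machine. The `HysteresisGate` class is a hypothetical illustration and not the code from the repository.

```typescript
// Hypothetical two-threshold gate driven by per-frame speech probability.
class HysteresisGate {
  private open = false; // starts closed, as the commit message describes
  private readonly openAbove: number;
  private readonly closeBelow: number;

  constructor(openAbove: number, closeBelow: number) {
    this.openAbove = openAbove;
    this.closeBelow = closeBelow;
  }

  // Feed one frame's speech probability and return the resulting gate state.
  update(probability: number): boolean {
    if (!this.open && probability > this.openAbove) {
      this.open = true;
    } else if (this.open && probability < this.closeBelow) {
      this.open = false;
    }
    // Probabilities between the two thresholds leave the state unchanged.
    return this.open;
  }
}
```

For example, with thresholds 0.5 and 0.35 the probability sequence 0.4, 0.6, 0.4, 0.3, 0.45 yields closed, open, open, closed, closed. The gap between the thresholds is what keeps the gate from chattering when the probability hovers around a single cut-off.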
mk
428b76db25 feat: add Silero VAD toggle to audio pipeline
Integrates @ricky0123/vad-web's MicVAD as an optional voice activity detector
alongside the noise gate. When enabled, the Silero ONNX model classifies each
audio frame as speech or silence; silence frames mute the worklet's output via
a new VAD gate message. VAD is wired into Publisher.ts alongside the existing
noise gate transformer. Vite is configured to copy the worklet bundle, ONNX
model, and ORT WASM files to /vad/ so they're reachable at runtime.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:29:43 -03:00
mk
0788e56c51 feat: add restore defaults button to transient suppressor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:21:20 -03:00
mk
411e18c48a feat: add transient suppressor to audio pipeline
Implements a per-sample transient suppressor in the noise gate AudioWorklet
that instantly cuts gain when a sudden loud peak (desk hit, mic bump) exceeds
the slow background RMS by a configurable threshold, then releases over a
short window. Exposes enable, sensitivity, and release controls in the audio
settings tab.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 23:13:57 -03:00
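One way to read the transient suppressor in commit 411e18c48a is as a per-sample comparison between the instantaneous peak and a slowly updated background level. The sketch below makes several assumptions that the commit does not confirm: an exponential moving average of the squared signal as the "slow background RMS", a simple ratio as the threshold, and a linear release. All names are hypothetical.

```typescript
// Hypothetical per-sample transient suppressor.
class TransientSuppressor {
  private meanSquare = 0; // slow estimate of the background power
  private gain = 1; // current output gain in [0, 1]
  private readonly ratio: number; // peak-to-background ratio that triggers a cut
  private readonly releaseStep: number; // gain recovered per sample after a cut
  private readonly smoothing: number; // EMA coefficient for the background estimate

  constructor(ratio: number, releaseSamples: number, smoothing: number = 0.001) {
    this.ratio = ratio;
    this.releaseStep = 1 / Math.max(1, releaseSamples);
    this.smoothing = smoothing;
  }

  process(sample: number): number {
    const backgroundRms = Math.sqrt(this.meanSquare);
    if (backgroundRms > 0 && Math.abs(sample) > this.ratio * backgroundRms) {
      this.gain = 0; // instant cut on a sudden peak
    } else {
      this.gain = Math.min(1, this.gain + this.releaseStep); // gradual release
    }
    // Update the background estimate after the decision so the peak cannot mask itself.
    // The estimate starts at zero, so the first few dozen samples may be ducked until it settles.
    this.meanSquare += this.smoothing * (sample * sample - this.meanSquare);
    return sample * this.gain;
  }
}
```

In this sketch a higher `ratio` makes the suppressor less sensitive, and `releaseSamples` sets how quickly the gain returns to 1 after a desk hit or mic bump. These parameters play the role of the sensitivity and release controls mentioned in the commit message.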
mk
68d8bb1f92 feat: noise gate implementation 2026-03-23 20:56:58 -03:00
Valere Fedronic
385ab0a0ed Merge pull request #3805 from element-hq/robin/ringing
New ringing UI
2026-03-19 10:43:16 +01:00
Robin
4d69565312 Merge pull request #3809 from element-hq/robin/invert-buttons
Invert the colors of the camera and microphone buttons
2026-03-18 14:49:55 +01:00
Robin
fa844446b6 Invert the colors of the camera and microphone buttons
So that they use primary color tokens when unmuted, and secondary color tokens when muted. This makes them work like the screen sharing button.
2026-03-18 11:29:55 +01:00
Robin
9dfade68ee New ringing UI
This implements the new ringing UI by showing a placeholder tile for the participant being dialed, rather than an overlay.
2026-03-18 11:20:43 +01:00
renovate[bot]
6d14f1d06f Update GitHub Actions (#3804)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2026-03-14 14:48:44 +00:00
Robin
5cf5227eaa Merge pull request #3802 from element-hq/toger5/fix-qs-glob-js-yaml-security
Update qs, js-yaml, glob for security patches
2026-03-13 17:31:40 +01:00
Timo K
78240c2ec8 update qs, js-yaml, glob for security patches 2026-03-13 08:01:52 +01:00
Timo K
bf8bf80417 Revert "update qs, js-yaml, glob for security patches"
This reverts commit 7da9bca08c.
2026-03-13 08:01:08 +01:00
Timo K
7da9bca08c update qs, js-yaml, glob for security patches 2026-03-13 07:59:49 +01:00
Timo
748c8e6d0d Merge pull request #3800 from element-hq/toger5/fix-tar-minimatch-security
Bump tar, minimatch
2026-03-12 23:54:10 +08:00
Timo
40ef16a55f Merge pull request #3799 from element-hq/toger5/fix-rollup-security-alert
Update vite, vitest and rollup
2026-03-12 23:51:53 +08:00
Timo
07ad8374a9 Merge pull request #3595 from element-hq/toger5/dont-trap-in-invalid-config
Reset overwrite url if it is invalid (fails to reach the SFU)
2026-03-12 22:06:26 +08:00
Timo K
a0d5c79999 also update coverage 2026-03-12 14:52:16 +01:00
Timo K
3e171d9639 bump tar, minimatch
security alert
2026-03-12 14:48:05 +01:00
fkwp
413329cd26 Fix: zizmor findings (#3797)
* zizmor auto fixes

* add github action for security analysis with zizmor

* add access token to iOS push action

* fix zizmor findings

* add exceptions for dangerous-triggers including comments for reasoning

* improve comments

* prettier
2026-03-12 13:30:45 +01:00
Timo K
6b8f6e9405 update vite vitest and rollup
(rollup needs updating to fix a security alert)
2026-03-12 12:10:17 +01:00
fkwp
af54b39698 fix: typo pushing element registry OCI images now to the correct target (#3796)
* Push docker images to oci.element.io

* prettier

* add id-token permission as it's required by tailscale login

* pass secrets to reusable workflows

* change secret path team -> voip

* Update .github/workflows/build-and-publish-docker.yaml

Co-authored-by: Gaël Goinvic <97093369+gaelgatelement@users.noreply.github.com>

* typo

---------

Co-authored-by: Gaël Goinvic <97093369+gaelgatelement@users.noreply.github.com>
2026-03-11 16:09:02 +01:00
Timo
c05d223133 Merge pull request #3775 from element-hq/toger5/new-pip-layout
Implement new Pip Layout (with control buttons)
2026-03-11 23:07:47 +08:00
Timo K
c7f25feb66 use better test condition for mute buttons 2026-03-11 15:46:19 +01:00
Timo K
a20edca9a1 fix pip container query 2026-03-11 15:36:37 +01:00
Timo K
d00ff78d65 fix pip interaction test (button presses) 2026-03-11 15:21:36 +01:00
fkwp
839c4dd738 fix: OCI image push to element registry (#3795)
* Push docker images to oci.element.io

* prettier

* add id-token permission as it's required by tailscale login

* pass secrets to reusable workflows

* change secret path team -> voip

* Update .github/workflows/build-and-publish-docker.yaml

Co-authored-by: Gaël Goinvic <97093369+gaelgatelement@users.noreply.github.com>

---------

Co-authored-by: Gaël Goinvic <97093369+gaelgatelement@users.noreply.github.com>
2026-03-11 15:17:12 +01:00
fkwp
41f7b643fb Add zizmor checks on CI (#3792)
* zizmor auto fixes

* add github action for security analysis with zizmor

* add access token to iOS push action
2026-03-11 14:20:05 +01:00
Timo K
3a9d394529 activate click tests 2026-03-11 14:05:17 +01:00
fkwp
c9557e91d5 fix: add id-token permission as it's required by tailscale login (part 3) (#3793)
* Push docker images to oci.element.io

* prettier

* add id-token permission as it's required by tailscale login

* pass secrets to reusable workflows
2026-03-11 13:06:20 +01:00
Timo K
1e400bc550 remove unused import 2026-03-10 18:26:12 +01:00
Timo K
6485da8fff add playwright tests for new pip layout 2026-03-10 15:17:41 +01:00
Timo K
54bef07b3b linter 2026-03-10 13:57:06 +01:00
Timo K
273eedd256 keep pip as it was before on mobile 2026-03-10 13:57:06 +01:00
Timo K
38382539ad fix lints 2026-03-10 13:57:06 +01:00
Timo K
8db1c4c370 Implement new Pip Layout (with control buttons) 2026-03-10 13:57:06 +01:00