fix: use Silero v5 model for 32ms frames and lower default thresholds

The legacy model is hardcoded to 1536-sample frames (96ms); v5 uses 512-sample frames (32ms), which cuts gate-open latency to a third.

Also lower the default positive/negative thresholds to 0.2/0.1 so the gate opens at the first sign of speech rather than waiting for high model confidence.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 00:02:17 -03:00
parent 859db651e0
commit aff09d0e49
3 changed files with 9 additions and 3 deletions
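The frame-size and threshold figures in the commit message can be checked with a short sketch. It assumes a 16 kHz sample rate (implied by 1536 samples = 96ms, but not stated in the commit), and `frameMs` and `gateStates` are hypothetical helpers written for illustration. The gate is a simplified two-threshold model, not the library's actual logic.

```typescript
// Assumption: 16 kHz input audio (implied by 1536 samples = 96ms).
const SAMPLE_RATE_HZ = 16000;

// Duration of one model frame in milliseconds.
function frameMs(samples: number, sampleRateHz: number = SAMPLE_RATE_HZ): number {
  return (samples * 1000) / sampleRateHz;
}

const legacyMs = frameMs(1536); // legacy model frame
const v5Ms = frameMs(512); // v5 model frame

// Hypothetical two-threshold gate: opens once the speech probability
// reaches `positive`, closes once it drops below `negative`.
function gateStates(probs: number[], positive = 0.2, negative = 0.1): boolean[] {
  let open = false;
  return probs.map((p) => {
    if (!open && p >= positive) {
      open = true;
    } else if (open && p < negative) {
      open = false;
    }
    return open;
  });
}

console.log(legacyMs, v5Ms, legacyMs / v5Ms);
console.log(gateStates([0.05, 0.25, 0.15, 0.05, 0.3]));
```

With the lower thresholds, a frame scoring 0.25 already opens the gate, and it stays open through 0.15 because that value is still above the negative threshold.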
--- a/vite.config.ts
+++ b/vite.config.ts
@@ -79,6 +79,10 @@ export default ({
          src: "node_modules/@ricky0123/vad-web/dist/silero_vad_legacy.onnx",
          dest: "vad",
        },
+        {
+          src: "node_modules/@ricky0123/vad-web/dist/silero_vad_v5.onnx",
+          dest: "vad",
+        },
        {
          src: "node_modules/onnxruntime-web/dist/*.wasm",
          dest: "vad",