Speak or upload audio. Each utterance (segmented by a shared VAD) is decoded by both models and shown token-aligned so you can compare them directly. Both models run through sherpa-onnx as ONNX int8 on ONNX Runtime (CPU): Micro (robust) vs Qwen3-ASR-0.6B (teacher).
Per utterance, the two transcripts are token-aligned: top row = Micro, bottom row = 0.6B. Matching tokens are shown plain, differing tokens are highlighted, and · marks a token missing from one transcript. 📁 File input runs through the same pipeline.
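
For readers curious how such a two-row alignment can be produced, here is a minimal sketch using Python's standard `difflib`. The demo's actual alignment method is not stated here, so treat this as one possible approach. The function name `align_tokens` and the whitespace tokenization are assumptions for illustration; the real tokens may be model subword units or characters.

```python
from difflib import SequenceMatcher

GAP = "·"  # placeholder for a token missing from one transcript


def align_tokens(top, bottom, gap=GAP):
    """Align two token lists into rows of equal length.

    Returns (top_row, bottom_row, is_match), where is_match[i] is True
    only when both rows hold the same token at position i.
    Hypothetical helper, not the demo's actual code.
    """
    top_row, bottom_row, is_match = [], [], []
    matcher = SequenceMatcher(a=top, b=bottom, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        a, b = top[i1:i2], bottom[j1:j2]
        width = max(len(a), len(b))
        # Pad the shorter side with gap markers so the columns stay aligned.
        top_row.extend(a + [gap] * (width - len(a)))
        bottom_row.extend(b + [gap] * (width - len(b)))
        # Only "equal" spans count as matches; replace/insert/delete differ.
        is_match.extend([tag == "equal"] * width)
    return top_row, bottom_row, is_match


# Example with whitespace tokens (an assumption about tokenization).
micro = "the cat sat on mat".split()
teacher = "the cat sat on the mat".split()
top_row, bottom_row, is_match = align_tokens(micro, teacher)
print(" ".join(top_row))     # the cat sat on · mat
print(" ".join(bottom_row))  # the cat sat on the mat
```

A renderer could then walk the three lists in parallel and highlight every column where `is_match` is False, which covers both substituted and missing tokens.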