Whisper on Swedish hardware

Record.
Transcribe.
Who said what.

Audio to text with speaker diarization. Whisper-large-v3 runs on dedicated GPU hardware in Sweden. Your audio never leaves the country.

Why staik VOICE?

Your data stays in Sweden

The Whisper model runs on dedicated GPU hardware in Sweden. No audio is sent outside the country, and none of it is used for AI training.

Speaker diarization included

Pyannote 3.1 separates speakers automatically, which suits meetings, interviews and podcasts. Toggle it on or off as needed.

Drop-in replacement for OpenAI Whisper

Uses the same API as OpenAI's /v1/audio/transcriptions endpoint, so you only need to swap base_url. Supports mp3, wav, m4a, webm and ogg.
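As a rough illustration of the base_url swap, the sketch below uses the official openai Python package. The base URL and the model identifier are placeholders that this page does not specify, so treat them as assumptions and substitute the values from your own account.

```python
# Placeholder values: this page does not state the real endpoint or model id.
BASE_URL = "https://voice.example.invalid/v1"  # hypothetical endpoint
MODEL = "whisper-large-v3"                     # hypothetical model identifier


def make_client(api_key, base_url=BASE_URL):
    """Build an OpenAI client that points at a different base_url."""
    from openai import OpenAI  # requires: pip install openai
    return OpenAI(api_key=api_key, base_url=base_url)


def transcribe(client, audio_path, model=MODEL):
    """Send one audio file to the transcriptions endpoint and return its text."""
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(model=model, file=audio)
    return result.text
```

With a valid key, `transcribe(make_client(api_key), "meeting.mp3")` would return the transcript as a string. Existing code written against OpenAI's client should need no other change than the base_url and key.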

How it works

From audio file to diarized text in seconds.

1. Upload

Drop an audio file or pick from your device. Supports mp3, wav, m4a, webm and ogg up to 100 MB.
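If you upload from a script rather than the browser, a local pre-check can catch files that would be rejected. This is a minimal sketch: the function name is hypothetical, and it assumes "100 MB" means 100 × 1024 × 1024 bytes, which the page does not confirm.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".mp3", ".wav", ".m4a", ".webm", ".ogg"}
MAX_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB


def check_upload(path, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the file looks acceptable."""
    p = Path(path)
    problems = []
    # Compare the extension case-insensitively against the supported formats.
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")
    # Compare the size on disk with the upload limit.
    size = p.stat().st_size
    if size > max_bytes:
        problems.append(f"file too large: {size} bytes")
    return problems
```

The check looks only at the file extension and size; it does not verify that the contents are valid audio.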

2. AI transcribes

The WhisperX pipeline runs Whisper large-v3 on Swedish GPUs and produces word-level timestamps, while pyannote separates the speakers.
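To show how word-level timestamps and speaker turns can be combined, the sketch below labels each word with the speaker whose turn overlaps it most. It is an illustration of the general idea, not the service's actual code; the function names and the data shapes are assumptions.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Label each word with the speaker whose turn overlaps it the most.

    words: list of dicts with "word", "start" and "end" (seconds)
    turns: list of (start, end, speaker) tuples from diarization
    """
    labelled = []
    for w in words:
        best_speaker, best_overlap = None, 0.0
        for t_start, t_end, speaker in turns:
            o = overlap(w["start"], w["end"], t_start, t_end)
            if o > best_overlap:
                best_speaker, best_overlap = speaker, o
        # Words that overlap no turn keep speaker=None.
        labelled.append({**w, "speaker": best_speaker})
    return labelled
```

A word that straddles a speaker change goes to whichever turn covers more of it, and a word outside every turn is left unlabelled.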

3. Export

Get plain text, diarized text, SRT, VTT or JSON with word-level timestamps and speaker labels.
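For a sense of how the formats relate, the sketch below turns a list of timed, speaker-labelled segments into SRT text. The field names (start, end, text, speaker) are assumptions for illustration; the page does not document the JSON schema.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render segments (dicts with start, end, text, optional speaker) as SRT."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        if seg.get("speaker"):
            text = f"{seg['speaker']}: {text}"  # prefix the speaker label
        start, end = srt_timestamp(seg["start"]), srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

VTT output would differ mainly in its "WEBVTT" header line and in using a period instead of a comma before the milliseconds.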