Gemini API Integration
Ayumi uses the Google Gemini API for voice-to-text transcription. You bring your own API key (BYOK), so you have full control over costs and usage.
Getting a Gemini API Key
- Visit Google AI Studio
- Sign in with your Google account
- Navigate to the API keys section
- Create a new API key
- Copy the key
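If you want to confirm that a new key works before pasting it into Ayumi, one option is to ask the Gemini REST API which models the key can see. The sketch below is an illustration, not part of Ayumi: the function names are hypothetical, and it assumes Google's public `v1beta` endpoint and the `x-goog-api-key` header, which may change over time.

```python
import json
import urllib.request

# Public Gemini REST base URL (v1beta at the time of writing; an assumption that may change).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_list_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that lists the models visible to a key."""
    request = urllib.request.Request(f"{BASE_URL}/models", method="GET")
    # Send the key in a header rather than in the URL, so it does not end up in logged URLs.
    request.add_header("x-goog-api-key", api_key)
    return request


def model_ids_from_response(payload: dict) -> list[str]:
    """Extract bare model IDs, e.g. 'models/gemini-2.5-flash' -> 'gemini-2.5-flash'."""
    return [m["name"].removeprefix("models/") for m in payload.get("models", [])]


def list_model_ids(api_key: str) -> list[str]:
    """Send the request; urllib raises HTTPError if the key is rejected."""
    request = build_list_models_request(api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return model_ids_from_response(json.load(response))
```

Calling `list_model_ids("YOUR_API_KEY")` makes one network request. If it returns a list of model IDs, the key is valid; an HTTP error usually means the key was copied incorrectly or has been restricted.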
Configuring in Ayumi
- Open Ayumi > Settings (or Cmd+,)
- Go to the Transcription tab
- Paste your Gemini API key
- Select your preferred model
- Open the transcription settings
- Enter your Gemini API key
- Choose your model
The API key is stored securely in your device’s Keychain.
Supported Models
| Model | Description |
|---|---|
| gemini-2.5-flash | Fast and capable, good balance of speed and quality |
| gemini-2.5-flash-lite | Lighter model, faster processing |
| gemini-3-flash-preview | Latest preview model |
| Custom | Enter any Gemini model ID |
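To make the Custom option concrete: in Google's public REST API, a model ID is simply a path segment of the endpoint URL, which is why any valid Gemini model ID can be used. The following is a minimal sketch under that assumption. `generate_content_url` and `LISTED_MODELS` are hypothetical names for illustration, and how Ayumi builds its requests internally is not documented here.

```python
from urllib.parse import quote

# Public Gemini REST base URL (v1beta at the time of writing; an assumption that may change).
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"

# Model IDs listed in the table above; "Custom" accepts any other ID.
LISTED_MODELS = (
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gemini-3-flash-preview",
)


def generate_content_url(model_id: str) -> str:
    """Return the generateContent endpoint for a listed or custom model ID."""
    model_id = model_id.strip()
    if not model_id:
        raise ValueError("model ID must not be empty")
    # Percent-encode the ID so stray characters cannot alter the URL path.
    return f"{BASE_URL}/models/{quote(model_id, safe='')}:generateContent"
```

A custom ID that does not exist is not caught by this function; the API itself rejects it when the request is sent, so a typo in a custom model ID shows up as a failed transcription rather than a settings error.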
Custom Prompt Presets
You can create custom prompts to control how audio is transcribed and analyzed:
- Go to Transcription settings
- Tap Add Preset
- Give it a name and write your prompt instructions
- Optionally select a specific model for this preset
- Save
Custom presets appear in the recording view alongside the built-in options.
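Conceptually, a preset is a name, a prompt, and an optional model override. The sketch below shows one way such a preset could be turned into a Gemini `generateContent` request body, with the prompt as a text part and the recording as base64 inline audio. The `Preset` class and both functions are hypothetical illustrations, not Ayumi's actual implementation; the `contents` / `parts` / `inline_data` layout follows Google's public REST format.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class Preset:
    """Hypothetical stand-in for a custom prompt preset."""
    name: str
    prompt: str
    model: Optional[str] = None  # None means "use the model chosen in settings"


def resolve_model(preset: Preset, default_model: str) -> str:
    """Prefer the preset's own model; otherwise fall back to the default."""
    return preset.model or default_model


def build_request_body(preset: Preset, audio: bytes, mime_type: str = "audio/wav") -> dict:
    """Combine the preset's prompt and the recorded audio into one request body."""
    encoded_audio = base64.b64encode(audio).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    {"text": preset.prompt},
                    {"inline_data": {"mime_type": mime_type, "data": encoded_audio}},
                ]
            }
        ]
    }
```

For example, a preset with the prompt "Transcribe the audio and list action items" and no model override would be sent to whichever model is selected in settings. Inline audio suits short recordings; Google's documentation describes a separate file-upload route for larger files.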
Data Privacy
When using the Gemini API:
- Audio data is sent to Google’s servers for processing
- Ayumi shows a consent dialog before the first API call
- No data is stored on Ayumi’s servers; the API call goes directly from your device to Google
- See Google’s AI terms for details on how Google handles API data