Gemini API Integration

Ayumi uses the Google Gemini API for voice-to-text transcription. You bring your own API key (BYOK), so you have full control over costs and usage.

  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Navigate to the API keys section
  4. Create a new API key
  5. Copy the key

  1. Open Ayumi > Settings (or Cmd+,)
  2. Go to the Transcription tab
  3. Paste your Gemini API key
  4. Select your preferred model

  1. Open the transcription settings
  2. Enter your Gemini API key
  3. Choose your model

The API key is stored securely in your device’s Keychain.

Model                    Description
gemini-2.5-flash         Fast and capable, good balance of speed and quality
gemini-2.5-flash-lite    Lighter model, faster processing
gemini-3-flash-preview   Latest preview model
Custom                   Enter any Gemini model ID
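
For readers curious how a model ID, listed or custom, might map onto the Gemini REST API, here is a minimal Python sketch. It assumes the public v1beta generateContent endpoint. The helper name generate_content_url is hypothetical and is not taken from Ayumi's code.

```python
from urllib.parse import quote

# Base URL of the public Gemini REST API (v1beta).
GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta"


def generate_content_url(model_id: str) -> str:
    """Return the generateContent endpoint for a listed or custom model ID.

    Hypothetical helper for illustration; Ayumi's internals may differ.
    """
    model_id = model_id.strip()
    if not model_id:
        raise ValueError("model ID must not be empty")
    # Custom IDs are passed through unchanged; quote() only escapes
    # characters that are not safe inside a URL path segment.
    return f"{GEMINI_BASE}/models/{quote(model_id, safe='')}:generateContent"


print(generate_content_url("gemini-2.5-flash"))
# prints https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
```

Because the model ID is just a path segment, a Custom entry works the same way as a listed one, provided Google serves a model under that ID.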

You can create custom prompts to control how audio is transcribed and analyzed:

  1. Go to Transcription settings
  2. Tap Add Preset
  3. Give it a name and write your prompt instructions
  4. Optionally select a specific model for this preset
  5. Save

Custom presets appear in the recording view alongside the built-in options.
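
As a rough illustration of what a preset holds, the sketch below models it as a name, a prompt, and an optional model. The Preset class, the resolve_model helper, and the fallback to a default model when none is selected are all assumptions made for this example, not Ayumi's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Preset:
    """Hypothetical shape of a custom preset."""
    name: str
    prompt: str
    model: Optional[str] = None  # None: no specific model selected for this preset


def resolve_model(preset: Preset, default_model: str) -> str:
    """Use the preset's own model if one was selected, else the default (assumed behavior)."""
    return preset.model or default_model


meeting = Preset(
    name="Meeting notes",
    prompt="Transcribe the audio, then list the action items.",
)
print(resolve_model(meeting, "gemini-2.5-flash"))  # prints gemini-2.5-flash
```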

When using the Gemini API:

  • Audio data is sent to Google’s servers for processing
  • Ayumi shows a consent dialog before the first API call
  • No data is stored on Ayumi’s servers; the API call goes directly from your device to Google
  • See Google’s AI terms for details on how Google handles API data
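
To make the direct device-to-Google call concrete, here is a minimal Python sketch of what such a request could look like. It assumes the public generateContent REST endpoint with inline base64 audio and the x-goog-api-key header. The function name, the consent_given flag, and the audio/wav MIME type are hypothetical choices for this example and do not describe Ayumi's implementation. The request is only built, not sent.

```python
import base64
import json
import urllib.request


def build_transcription_request(api_key, model_id, audio_bytes, prompt,
                                consent_given, mime_type="audio/wav"):
    """Build (but do not send) a generateContent request carrying inline audio."""
    # Hypothetical consent gate: refuse to prepare an upload without consent.
    if not consent_given:
        raise PermissionError("user has not consented to sending audio to Google")
    url = ("https://generativelanguage.googleapis.com/v1beta/"
           f"models/{model_id}:generateContent")
    body = {
        "contents": [{
            "parts": [
                {"text": prompt},  # instructions, e.g. from a preset
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(audio_bytes).decode("ascii"),
                }},
            ]
        }]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_transcription_request(
    "YOUR_API_KEY", "gemini-2.5-flash", b"\x00\x01",
    "Transcribe this audio.", consent_given=True,
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send the audio straight to Google's endpoint, with no intermediate server involved, which is the property the list above describes.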