OpenAI Realtime for Bubble — Quickstart

Add live, two-way voice to your Bubble app in minutes.
No server to host · WebRTC · Ephemeral tokens

Need help? contact me@therealpablo.com

Overview

This plugin provides a Bubble visual element (Realtime Call) and a server action (Create Ephemeral Realtime Token) to start a secure WebRTC session with OpenAI Realtime models.

Latency: low latency, mic → AI → audio back
Security: the OpenAI key stays on the Bubble server; the browser only uses a short-lived token
Setup time: ~5 minutes
Demo: https://realtimevoicedemo.bubbleapps.io/version-test
Editor: https://bubble.io/page?id=realtimevoicedemo

1) Plugin Keys

In your Bubble app's Plugins tab, set the OpenAI API key:

Plugin Key Setup
Your long-lived API key never reaches the browser.
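Under the hood, the Create Ephemeral Realtime Token action mints a short-lived client secret from OpenAI's REST API and hands only that secret to the browser. A minimal server-side sketch, assuming OpenAI's documented `POST /v1/realtime/sessions` flow (the `buildSessionRequest` helper name is ours, not part of the plugin):

```javascript
// Build the request the server action sends to OpenAI.
// The long-lived API key stays server-side; only the short-lived
// client_secret from the response is passed to the browser.
function buildSessionRequest(apiKey, { model, voice, instructions }) {
  return {
    url: "https://api.openai.com/v1/realtime/sessions",
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, voice, instructions }),
    },
  };
}

// Example (server-side only; never ship the API key to the client):
// const { url, options } = buildSessionRequest(process.env.OPENAI_API_KEY, {
//   model: "gpt-4o-realtime-preview",
//   voice: "verse",
// });
// const session = await (await fetch(url, options)).json();
// const ephemeralToken = session.client_secret.value; // expires quickly
```

The plugin's server action does this for you; the sketch is only to show why the browser never needs your real key.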

2) Setup (Quick)

Note: File search is not currently supported.
Please ignore the vector_store_id field when configuring your ephemeral token.
  1. Drop the Realtime Call element on your page. The element must be visible.
  2. Add a Start button workflow:
    • Step 1: Plugins → Create Ephemeral Realtime Token (set model, voice, optional instructions)
    • Step 2: Element actions → Realtime Call → start
    • Pass the token/model/voice from Step 1's result
  3. Add a Stop button workflow: Element actions → Realtime Call → stop

3) Start / Stop Actions

Element statuses: idle | starting | mic-ok | connecting | connected | error | ended

Start

Start Action
Element actions → Realtime Call → start
Inputs:
- token (Text, required)
- model (Text, required; e.g. gpt-4o-realtime-preview)
- voice (Text, required; e.g. verse)
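For reference, start performs a standard WebRTC handshake with OpenAI Realtime using the ephemeral token. A browser-only sketch of that handshake (the plugin does this internally; `buildSdpRequest` and `startCall` are illustrative names, and the SDP-exchange endpoint follows OpenAI's documented WebRTC flow):

```javascript
// Build the SDP-exchange request sent with the short-lived token.
function buildSdpRequest(ephemeralToken, model, offerSdp) {
  return {
    url: `https://api.openai.com/v1/realtime?model=${encodeURIComponent(model)}`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${ephemeralToken}`,
        "Content-Type": "application/sdp",
      },
      body: offerSdp,
    },
  };
}

// Browser-only: requires RTCPeerConnection and getUserMedia.
async function startCall(ephemeralToken, model) {
  const pc = new RTCPeerConnection();
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  mic.getTracks().forEach((t) => pc.addTrack(t, mic)); // send mic audio
  pc.ontrack = (e) => { /* attach e.streams[0] to an <audio> element */ };
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const { url, options } = buildSdpRequest(ephemeralToken, model, offer.sdp);
  const answerSdp = await (await fetch(url, options)).text();
  await pc.setRemoteDescription({ type: "answer", sdp: answerSdp });
  return { pc, mic };
}
```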

Stop

Stop Action
Element actions → Realtime Call → stop
Ends the call and releases mic + connection.
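In plain terms, stop does the equivalent of the following sketch: it stops every microphone track (so the browser's mic indicator turns off) and closes the peer connection (hypothetical helper name, assuming the `{ pc, mic }` pair from a WebRTC session):

```javascript
// Release the microphone and tear down the WebRTC session.
function stopCall({ pc, mic }) {
  mic.getTracks().forEach((t) => t.stop()); // release mic hardware
  pc.close();                               // close the connection
}
```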

4) Saving Transcripts

You can capture live transcripts and save them to your Bubble database. Use the "Assistant Done Talking" event to save data each time the assistant finishes talking.

Screenshot of a running transcript in a Bubble repeating group
Note: The assistant_done event is triggered every time the Assistant finishes speaking. This ensures you capture complete assistant responses for your transcript.
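If you extend the element yourself, assistant transcripts arrive as JSON events on the WebRTC data channel. A sketch of pulling out the finished transcript text (the `response.audio_transcript.done` event name and `transcript` field follow OpenAI's Realtime API event reference; verify them against the current docs, and `saveToBubbleDatabase` is a hypothetical placeholder):

```javascript
// Return the finished assistant transcript from a data-channel
// message, or null for event types we don't care about.
function extractTranscript(rawMessage) {
  const event = JSON.parse(rawMessage);
  if (event.type === "response.audio_transcript.done") {
    return event.transcript; // complete text of the assistant's turn
  }
  return null;
}

// dataChannel.onmessage = (e) => {
//   const text = extractTranscript(e.data);
//   if (text) saveToBubbleDatabase(text); // hypothetical save helper
// };
```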

5) Troubleshooting

6) FAQ

Where do I configure model, voice, and instructions?
Only in the server action Create Ephemeral Realtime Token. Pass those values straight to the element's start.
Can I add push-to-talk?
Yes. Add a button that mutes/unmutes mic tracks or gates input by holding a state; extend the element if needed.
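One simple way to gate the mic, as a sketch: toggle `enabled` on the audio tracks, which keeps the connection alive but transmits silence while muted (assumes you have access to the call's `MediaStream`; the helper name is illustrative):

```javascript
// Minimal push-to-talk: toggling track.enabled sends silence
// while muted without renegotiating the WebRTC connection.
function setMicEnabled(mic, enabled) {
  mic.getAudioTracks().forEach((t) => { t.enabled = enabled; });
}

// button.onmousedown = () => setMicEnabled(mic, true);  // hold to talk
// button.onmouseup   = () => setMicEnabled(mic, false);
```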
What do the element statuses mean?
idle: Ready to start a call
starting: Collecting mic permission and token
mic-ok: Microphone access granted
connecting: Negotiating WebRTC with OpenAI
connected: Live two-way audio is active
error: Connection or permission failure
ended: Call was stopped or timed out