Transcript Editor
Time-synced transcript editing with automatic transcription, speaker identification, and seamless audio integration
Coming Soon
The Transcript Editor is currently under development and will be available soon. This documentation provides a preview of the features that will be included.
Overview
The Transcript Editor in Wubble Studio provides powerful tools for working with time-synced text alongside audio. Automatically transcribe audio content with high accuracy, edit transcripts alongside the waveform, and export captions for accessibility. Perfect for podcasts, videos, interviews, and any content requiring transcription or captions.
Edit text and audio simultaneously—changes to the transcript automatically adjust timing, and audio edits update the transcript. Speaker identification, automatic punctuation, and intelligent formatting make transcript creation fast and accurate. Export in multiple formats including SRT, VTT, and plain text.
What You Can Do
Core Features
Automatic Transcription
Upload audio or video and receive a time-stamped transcript automatically. AI-powered transcription handles multiple languages, accents, and varying audio quality.
- High accuracy: 95%+ on clear audio
- Multi-language: Support for 40+ languages and dialects
- Automatic punctuation: Proper capitalization and punctuation
- Timestamps: Word-level and sentence-level timing
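To illustrate how word-level and sentence-level timing relate, here is a minimal sketch that derives sentence timing from word timing. The word shape `{ text, start, end }` (times in seconds) and the function names are assumptions made for this example, not the Transcript Editor's actual data model.

```javascript
// Hypothetical word shape: { text, start, end }, with times in seconds.
function buildSentence(group) {
  return {
    text: group.map((w) => w.text).join(' '),
    start: group[0].start,
    end: group[group.length - 1].end,
  };
}

// Close a sentence whenever a word ends in '.', '!' or '?'.
function wordsToSentences(words) {
  const sentences = [];
  let current = [];
  for (const word of words) {
    current.push(word);
    if (/[.!?]$/.test(word.text)) {
      sentences.push(buildSentence(current));
      current = [];
    }
  }
  // Keep any trailing words that never reached terminal punctuation.
  if (current.length > 0) sentences.push(buildSentence(current));
  return sentences;
}

const sampleWords = [
  { text: 'Welcome', start: 0.0, end: 0.4 },
  { text: 'back.', start: 0.5, end: 0.9 },
  { text: 'Ready?', start: 1.4, end: 1.9 },
];
console.log(wordsToSentences(sampleWords));
```

Each sentence takes its start time from its first word and its end time from its last word, so the two timing levels stay consistent.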
Time-Synced Editing
Text and audio remain perfectly synchronized as you edit. Click any word to jump to that point in the audio. Edit the text and the audio automatically adjusts, or edit the audio and the transcript updates accordingly.
Text-Based Audio Editing
Delete words from the transcript to automatically remove that audio. No need to find the exact timestamp—just edit the text.
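One way to picture text-based audio editing: each word carries its own start and end time, so deleting words yields a list of audio ranges to cut. The sketch below assumes a hypothetical word shape `{ text, start, end }` and a hypothetical helper name. It illustrates the idea and is not Wubble Studio's implementation.

```javascript
// Hypothetical word shape: { text, start, end }, with times in seconds.
// Returns the audio ranges to remove for the deleted word indexes,
// merging consecutive deleted words into a single cut.
function cutRangesForDeletedWords(words, deletedIndexes) {
  const sorted = [...new Set(deletedIndexes)].sort((a, b) => a - b);
  const ranges = [];
  for (const i of sorted) {
    const { start, end } = words[i];
    const last = ranges[ranges.length - 1];
    if (last && last.lastIndex === i - 1) {
      // Adjacent to the previous deleted word: extend that cut.
      last.end = end;
      last.lastIndex = i;
    } else {
      ranges.push({ start, end, lastIndex: i });
    }
  }
  // Drop the bookkeeping field before returning.
  return ranges.map(({ start, end }) => ({ start, end }));
}

const spokenWords = [
  { text: 'So', start: 0.0, end: 0.2 },
  { text: 'um', start: 0.3, end: 0.6 },
  { text: 'uh', start: 0.7, end: 0.9 },
  { text: 'welcome', start: 1.0, end: 1.5 },
];
console.log(cutRangesForDeletedWords(spokenWords, [1, 2]));
// [ { start: 0.3, end: 0.9 } ]
```

Merging consecutive deletions keeps the number of cuts small, which usually matters for clean-sounding edits.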
Click to Navigate
Click any word in the transcript to jump directly to that moment in the audio. Perfect for quick navigation in long recordings.
Speaker Identification
Automatically detect different speakers in multi-person recordings like interviews, podcasts, or meetings. Assign names to speakers and organize the transcript by speaker turns for easy reading.
- Automatic detection of 2-10 different speakers
- Assign custom names/labels to each identified speaker
- Visual color coding for quick speaker identification
- Filter and search by specific speakers
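As a rough illustration of speaker turns and custom labels, the sketch below merges consecutive segments from the same speaker and swaps detected IDs for assigned names. The segment shape `{ speaker, start, end, text }`, the IDs such as `spk_0`, and the function name are all assumptions for this example.

```javascript
// Hypothetical segment shape: { speaker, start, end, text }.
// `names` maps detected speaker IDs to custom labels.
function toSpeakerTurns(segments, names = {}) {
  const turns = [];
  for (const seg of segments) {
    // Fall back to the detected ID when no custom label is assigned.
    const label = names[seg.speaker] ?? seg.speaker;
    const last = turns[turns.length - 1];
    if (last && last.speaker === label) {
      // Same speaker keeps talking: extend the current turn.
      last.text += ` ${seg.text}`;
      last.end = seg.end;
    } else {
      turns.push({ speaker: label, start: seg.start, end: seg.end, text: seg.text });
    }
  }
  return turns;
}

const detected = [
  { speaker: 'spk_0', start: 0, end: 2, text: 'Welcome to the show.' },
  { speaker: 'spk_0', start: 2, end: 4, text: 'Today we have a guest.' },
  { speaker: 'spk_1', start: 4, end: 6, text: 'Thanks for having me.' },
];
console.log(toSpeakerTurns(detected, { spk_0: 'Host', spk_1: 'Guest' }));
```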
Export in Multiple Formats
Export transcripts and captions in industry-standard formats for use across platforms and applications. Customize formatting, timing, and metadata for each export.
SRT
SubRip format for video captions and subtitles
VTT
WebVTT format for web video captions
Plain Text
Clean transcript without timestamps
JSON
Structured data with full timing information
Word
Microsoft Word document with formatting
PDF
Professional transcript layout with timestamps
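For a sense of how the two caption formats differ, here is a minimal sketch that renders the same segments as SRT and as WebVTT. SRT numbers each cue and uses a comma before the milliseconds, while WebVTT begins with a `WEBVTT` header line and uses a period. The segment shape `{ start, end, text }` and the function names are assumptions for this example, and the editor's real export may include more metadata.

```javascript
// Format seconds as HH:MM:SS<sep>mmm. SRT uses ',' and WebVTT uses '.'.
function formatTimestamp(seconds, msSeparator) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// Hypothetical segment shape: { start, end, text }, with times in seconds.
function toSrt(segments) {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${formatTimestamp(seg.start, ',')} --> ${formatTimestamp(seg.end, ',')}\n${seg.text}\n`)
    .join('\n');
}

function toVtt(segments) {
  const cues = segments.map((seg) =>
    `${formatTimestamp(seg.start, '.')} --> ${formatTimestamp(seg.end, '.')}\n${seg.text}\n`);
  return `WEBVTT\n\n${cues.join('\n')}`;
}

const captionSegments = [
  { start: 0, end: 2.5, text: 'Welcome to the show.' },
  { start: 2.5, end: 5.04, text: 'Thanks for having me.' },
];
console.log(toSrt(captionSegments));
console.log(toVtt(captionSegments));
```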
Search & Find
Search the transcript to find specific words, phrases, or topics instantly. Jump to any mention in the audio with a single click. Perfect for long recordings where you need to find specific moments quickly.
- Full-text search across entire transcript
- Highlight all occurrences of search terms
- Jump between search results with keyboard shortcuts
- Filter by speaker to search specific people
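The search behavior described above can be pictured as a case-insensitive match over transcript segments with an optional speaker filter, returning timestamps to jump to. The segment shape `{ speaker, start, end, text }` and the function name are assumptions for this sketch.

```javascript
// Hypothetical segment shape: { speaker, start, end, text }.
// Returns matching segments with the time to jump to in the audio.
function searchTranscript(segments, query, speaker) {
  const needle = query.toLowerCase();
  return segments
    // Only apply the speaker filter when one was requested.
    .filter((seg) => speaker === undefined || seg.speaker === speaker)
    .filter((seg) => seg.text.toLowerCase().includes(needle))
    .map((seg) => ({ start: seg.start, speaker: seg.speaker, text: seg.text }));
}

const interview = [
  { speaker: 'Host', start: 0, end: 3, text: 'Let us talk about pricing.' },
  { speaker: 'Guest', start: 3, end: 7, text: 'Pricing changed last year.' },
  { speaker: 'Host', start: 7, end: 9, text: 'What about support?' },
];
console.log(searchTranscript(interview, 'pricing'));
console.log(searchTranscript(interview, 'pricing', 'Guest'));
```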
Editing Workflow
Generate Transcript
Upload your audio or video file and let AI transcribe it automatically. Transcription typically finishes in less time than the recording takes to play.
Review and Correct
Read through the transcript while listening to the audio. Correct any transcription errors, fix punctuation, and verify speaker labels. Click words to play that section.
Remove Filler Words
Delete "um," "uh," false starts, and long pauses directly from the transcript. The corresponding audio is automatically removed, cleaning up your recording effortlessly.
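Finding filler words to delete can also be automated. The sketch below flags candidate words by comparing each one against a small filler list after stripping punctuation. The word shape, the `FILLERS` list, and the function name are assumptions for illustration, and the list only covers a few English fillers.

```javascript
// Illustrative filler list; extend it to suit your own recordings.
const FILLERS = new Set(['um', 'uh', 'er', 'erm']);

// Hypothetical word shape: { text, start, end }.
// Returns the indexes of words that look like fillers.
function findFillerIndexes(words) {
  const hits = [];
  words.forEach((word, i) => {
    // Lowercase and strip punctuation so "Um," still matches "um".
    const normalized = word.text.toLowerCase().replace(/[^a-z]/g, '');
    if (FILLERS.has(normalized)) hits.push(i);
  });
  return hits;
}

const take = [
  { text: 'Um,', start: 0.0, end: 0.3 },
  { text: 'today', start: 0.5, end: 0.9 },
  { text: 'uh', start: 1.0, end: 1.2 },
  { text: 'umbrella', start: 1.3, end: 1.9 },
];
console.log(findFillerIndexes(take)); // [ 0, 2 ]
```

Matching whole normalized words, rather than substrings, avoids flagging words such as "umbrella" that merely contain a filler.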
Label Speakers
For multi-person recordings, assign meaningful names to detected speakers (e.g., "Host," "Guest," specific names). This makes transcripts more readable and professional.
Export Captions
Export in your desired format. Choose SRT or VTT for video captions, plain text for blog posts or show notes, or PDF for professional documentation.
Using the API
Automate transcription and caption generation through the Wubble Studio API. Process large batches of audio, integrate transcription into your workflow, or build custom transcript tools.
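// Speech-to-text with the REST API.
// `audioFile` is a File or Blob holding the audio to transcribe.
const form = new FormData();
form.append('audio', audioFile);
const response = await fetch('https://prod-backup-backend.wubble.ai/v1/speech/speech-to-text', {
  method: 'POST',
  headers: {
    // Do not set Content-Type here: fetch adds the multipart boundary itself.
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
  },
  body: form,
});
// Fail loudly on HTTP errors instead of parsing an error body as a transcript.
if (!response.ok) {
  throw new Error(`Transcription request failed: ${response.status}`);
}
const payload = await response.json();
console.log(payload.data.transcript);
API Documentation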
See the Transcript API Reference for complete documentation of transcription services, editing operations, and export formats.
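For batch processing, one possible pattern is to loop over files and record a per-file outcome, so a single failure does not stop the run. In this sketch, `transcribeBatch` and the injected `transcribeOne` callback are hypothetical names. In practice `transcribeOne` would wrap a request like the one shown earlier, and here a stub stands in so the example runs offline.

```javascript
// Runs files one at a time and collects a success or failure record for each.
// `transcribeOne` is any async function that takes a file and resolves to a transcript.
async function transcribeBatch(files, transcribeOne) {
  const results = [];
  for (const file of files) {
    try {
      results.push({ file, ok: true, transcript: await transcribeOne(file) });
    } catch (error) {
      // Keep going: record the failure and move on to the next file.
      results.push({ file, ok: false, error: String(error) });
    }
  }
  return results;
}

// Stub standing in for the real API request (an assumption for this demo).
const fakeTranscribe = async (file) => {
  if (file === 'broken.wav') throw new Error('unsupported file');
  return `transcript of ${file}`;
};

transcribeBatch(['a.mp3', 'broken.wav'], fakeTranscribe).then((results) => {
  for (const r of results) console.log(r.file, r.ok ? r.transcript : r.error);
});
```

Processing sequentially is the simplest choice and avoids flooding the service. A concurrency limit could be added later if throughput matters.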
Common Use Cases
🎙️ Podcast Editing
Transcribe podcast episodes, remove filler words and false starts by editing the transcript, create show notes from the transcript, and export captions for video versions on YouTube.
📹 Video Captions
Generate accurate captions for YouTube, social media, or educational videos. Export in SRT or VTT format for upload to any video platform. Improve accessibility and SEO.
📝 Interview Documentation
Transcribe interviews, research discussions, or focus groups. Identify speakers automatically, search for specific topics, and export professional documentation with speaker labels.
🎓 Educational Content
Transcribe lectures, webinars, and tutorials. Export captions for accessibility compliance, create study materials from transcripts, and make content searchable by text.
💼 Meeting Notes
Automatically transcribe meetings, calls, and presentations. Search transcripts to find specific discussions, assign action items, and share written records with team members.
Best Practices
Use High-Quality Audio
Transcription accuracy improves dramatically with clean audio. Use good microphones, reduce background noise, and ensure clear speech for best results.
Review Before Exporting
Always review automated transcripts for accuracy. While AI is highly accurate, proper nouns, technical terms, and accents may need correction.
Label Speakers Meaningfully
Use clear, descriptive speaker labels. "Host," "Guest 1," actual names, or role descriptions make transcripts much more readable than "Speaker 1," "Speaker 2."
Break into Paragraphs
For readability, break long monologues into logical paragraphs. Group related thoughts together and use paragraph breaks at natural speech pauses.
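Paragraph breaks at natural pauses can be approximated from word timing by starting a new paragraph whenever the silence between two words exceeds a threshold. The word shape `{ text, start, end }`, the function name, and the 1.5 second threshold in the demo are assumptions for this sketch.

```javascript
// Hypothetical word shape: { text, start, end }, with times in seconds.
// Starts a new paragraph when the gap before a word is at least minPauseSeconds.
function splitIntoParagraphs(words, minPauseSeconds) {
  const paragraphs = [];
  let current = [];
  words.forEach((word, i) => {
    const gap = i > 0 ? word.start - words[i - 1].end : 0;
    if (current.length > 0 && gap >= minPauseSeconds) {
      paragraphs.push(current.join(' '));
      current = [];
    }
    current.push(word.text);
  });
  if (current.length > 0) paragraphs.push(current.join(' '));
  return paragraphs;
}

const monologue = [
  { text: 'First', start: 0, end: 1 },
  { text: 'point.', start: 1, end: 2 },
  { text: 'Second', start: 5, end: 6 },
  { text: 'point.', start: 6, end: 7 },
];
console.log(splitIntoParagraphs(monologue, 1.5));
// [ 'First point.', 'Second point.' ]
```

A pause threshold is only a starting point. Grouping related thoughts still benefits from a manual pass.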
Choose the Right Export Format
Use SRT or VTT for video captions, plain text for blog posts or documentation, and JSON for programmatic access. Let the end use decide the format.
Use Text Editing for Audio Cleanup
Leverage text-based audio editing to quickly remove filler words, false starts, and mistakes. It's often faster than traditional waveform editing.