Transcript Editor

Time-synced transcript editing with automatic transcription, speaker identification, and seamless audio integration

Coming Soon

The Transcript Editor is currently under development and will be available soon. This documentation provides a preview of the features that will be included.

Overview

The Transcript Editor in Wubble Studio provides tools for working with time-synced text alongside audio. It transcribes audio automatically, lets you edit the transcript next to the waveform, and exports captions for accessibility. It suits podcasts, videos, interviews, and any other content that needs transcription or captions.

Text and audio are edited together: changes to the transcript adjust the timing, and audio edits update the transcript. Speaker identification, automatic punctuation, and intelligent formatting speed up transcript creation. Export formats include SRT, VTT, and plain text.

What You Can Do

Automatic transcription with high accuracy and speaker detection
Time-synced editing where text and audio stay perfectly aligned
Edit audio by editing text: remove filler words and pauses directly from the transcript
Speaker identification and labeling for multi-person recordings
Export captions in SRT, VTT, and other standard formats
Search and navigate audio by searching text content

Core Features

Automatic Transcription

Upload audio or video and receive a time-stamped transcription automatically. AI-powered transcription handles multiple languages, accents, and varying audio quality.

  • High accuracy: 95%+ accuracy on clear audio
  • Multi-language: Support for 40+ languages and dialects
  • Automatic punctuation: Proper capitalization and punctuation
  • Timestamps: Word-level and sentence-level timing
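
The relationship between word-level and sentence-level timing can be sketched as follows. Because the Transcript Editor is still in development, the `TimedWord` and `TimedSentence` shapes and the `groupIntoSentences` helper are hypothetical names chosen for illustration, not a published data model.

```typescript
// Hypothetical shape for a word-level timestamp (not a published Wubble type).
interface TimedWord {
  text: string;
  start: number; // seconds from the start of the audio
  end: number; // seconds from the start of the audio
}

// Hypothetical shape for a sentence-level timestamp.
interface TimedSentence {
  text: string;
  start: number;
  end: number;
}

// Build one sentence from a non-empty run of words.
function toSentence(words: TimedWord[]): TimedSentence {
  return {
    text: words.map((w) => w.text).join(' '),
    start: words[0].start,
    end: words[words.length - 1].end,
  };
}

// Derive sentence-level timing from word-level timing by closing a
// sentence whenever a word ends with terminal punctuation.
function groupIntoSentences(words: TimedWord[]): TimedSentence[] {
  const sentences: TimedSentence[] = [];
  let current: TimedWord[] = [];
  for (const word of words) {
    current.push(word);
    if (/[.!?]$/.test(word.text)) {
      sentences.push(toSentence(current));
      current = [];
    }
  }
  // Keep any trailing words that never reached terminal punctuation.
  if (current.length > 0) sentences.push(toSentence(current));
  return sentences;
}
```

A sentence inherits the start of its first word and the end of its last word, so sentence-level timing never needs to be stored separately in this sketch.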

Time-Synced Editing

Text and audio remain perfectly synchronized as you edit. Click any word to jump to that point in the audio. Edit the text and the audio automatically adjusts, or edit the audio and the transcript updates accordingly.

Text-Based Audio Editing

Delete words from the transcript to remove the corresponding audio automatically. There is no need to find the exact timestamp; just edit the text.
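
One way to picture text-based editing is as a mapping from deleted words to audio ranges that should be cut. The sketch below assumes word-level timestamps sorted by time; the `TimedWord` and `TimeRange` types and the `cutRangesForDeletedWords` function are hypothetical and do not describe the editor's actual implementation.

```typescript
// Hypothetical word-level timestamp, in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// A span of audio, in seconds.
interface TimeRange {
  start: number;
  end: number;
}

// Given the transcript words (sorted by time) and the indices of the words
// the user deleted, return the audio ranges to cut. Runs of consecutive
// deleted words collapse into one range so the gaps between them go too.
function cutRangesForDeletedWords(
  words: TimedWord[],
  deletedIndices: Set<number>,
): TimeRange[] {
  const ranges: TimeRange[] = [];
  let previousDeleted = -2; // sentinel: no word has been deleted yet
  words.forEach((word, index) => {
    if (!deletedIndices.has(index)) return;
    if (index === previousDeleted + 1 && ranges.length > 0) {
      ranges[ranges.length - 1].end = word.end; // extend the current run
    } else {
      ranges.push({ start: word.start, end: word.end }); // start a new run
    }
    previousDeleted = index;
  });
  return ranges;
}
```

An audio engine would then remove each returned range from the recording, which is why no manual timestamp lookup is needed.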

Click to Navigate

Click any word in the transcript to jump directly to that moment in the audio. Perfect for quick navigation in long recordings.
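
Click-to-navigate and its inverse (highlighting the word currently being played) can both be expressed against word-level timestamps. This is a minimal sketch under that assumption; `Seekable`, `seekToWord`, and `wordIndexAt` are hypothetical names. In a browser, an `HTMLAudioElement` satisfies `Seekable` through its standard `currentTime` property.

```typescript
// Hypothetical word-level timestamp, in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Anything with a settable playback position in seconds.
interface Seekable {
  currentTime: number;
}

// Clicking a word moves playback to the moment that word starts.
function seekToWord(player: Seekable, word: TimedWord): void {
  player.currentTime = word.start;
}

// Binary search for the word being spoken at `time`, for example to
// highlight it during playback. Assumes `words` is sorted by time and
// non-overlapping. Returns -1 when `time` falls in silence between words.
function wordIndexAt(words: TimedWord[], time: number): number {
  let low = 0;
  let high = words.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    if (time < words[mid].start) {
      high = mid - 1;
    } else if (time >= words[mid].end) {
      low = mid + 1;
    } else {
      return mid;
    }
  }
  return -1;
}
```

The binary search keeps lookups fast on long recordings, where a linear scan on every playback tick would be wasteful.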

Speaker Identification

Automatically detect different speakers in multi-person recordings like interviews, podcasts, or meetings. Assign names to speakers and organize the transcript by speaker turns for easy reading.

  • Automatic detection of 2-10 different speakers
  • Assign custom names/labels to each identified speaker
  • Visual color coding for quick speaker identification
  • Filter and search by specific speakers
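
Organizing a transcript by speaker turns, with custom names applied, might look like the sketch below. It assumes each word carries a detected speaker label such as "Speaker 1"; the `SpokenWord` and `SpeakerTurn` types and the `groupBySpeakerTurn` function are hypothetical.

```typescript
// Hypothetical word with a detected speaker label.
interface SpokenWord {
  text: string;
  start: number; // seconds
  end: number; // seconds
  speaker: string; // detected label, e.g. "Speaker 1"
}

// A run of consecutive words from the same speaker.
interface SpeakerTurn {
  speaker: string;
  text: string;
  start: number;
  end: number;
}

// Merge consecutive words from the same speaker into turns. `names` maps
// detected labels to custom names; unmapped labels are kept as they are.
function groupBySpeakerTurn(
  words: SpokenWord[],
  names: Record<string, string> = {},
): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const word of words) {
    const speaker = names[word.speaker] ?? word.speaker;
    const last = turns[turns.length - 1];
    if (last && last.speaker === speaker) {
      last.text += ' ' + word.text;
      last.end = word.end;
    } else {
      turns.push({ speaker, text: word.text, start: word.start, end: word.end });
    }
  }
  return turns;
}
```

Filtering by speaker then reduces to `turns.filter((turn) => turn.speaker === 'Host')`.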

Export in Multiple Formats

Export transcripts and captions in industry-standard formats for use across platforms and applications. Customize formatting, timing, and metadata for each export.

SRT

SubRip format for video captions and subtitles

VTT

WebVTT format for web video captions

Plain Text

Clean transcript without timestamps

JSON

Structured data with full timing information

Word

Microsoft Word document with formatting

PDF

Professional transcript layout with timestamps
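
To show how the two caption formats differ, here is a minimal sketch that serializes timed cues to SRT and WebVTT. The `Cue` type and function names are hypothetical and this is not the editor's exporter. The layout follows the formats themselves: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```typescript
// Hypothetical caption cue: one block of text shown for a time span.
interface Cue {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Format seconds as HH:MM:SS followed by the separator and milliseconds.
function formatTimestamp(seconds: number, msSeparator: ',' | '.'): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number) => String(n).padStart(width, '0');
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${msSeparator}${pad(ms, 3)}`;
}

// SRT: sequence number, time line with comma milliseconds, text, blank line.
function toSrt(cues: Cue[]): string {
  return cues
    .map(
      (cue, i) =>
        `${i + 1}\n${formatTimestamp(cue.start, ',')} --> ${formatTimestamp(cue.end, ',')}\n${cue.text}\n`,
    )
    .join('\n');
}

// WebVTT: "WEBVTT" header, then cues with period milliseconds.
function toVtt(cues: Cue[]): string {
  const body = cues
    .map(
      (cue) =>
        `${formatTimestamp(cue.start, '.')} --> ${formatTimestamp(cue.end, '.')}\n${cue.text}\n`,
    )
    .join('\n');
  return `WEBVTT\n\n${body}`;
}
```

Cues could come from sentence-level timing or from speaker turns, depending on how long each caption should stay on screen.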

Search & Find

Search the transcript to find specific words, phrases, or topics instantly. Jump to any mention in the audio with a single click. Perfect for long recordings where you need to find specific moments quickly.

  • Full-text search across entire transcript
  • Highlight all occurrences of search terms
  • Jump between search results with keyboard shortcuts
  • Filter by speaker to search specific people
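
Searching text to find moments in the audio can be sketched as a phrase match over timed words that returns the timestamp of each hit. The `SpokenWord` and `SearchHit` types and the `searchTranscript` function are hypothetical; matching here ignores case and common punctuation, which is an assumption about how search might behave.

```typescript
// Hypothetical word with timing and a speaker label.
interface SpokenWord {
  text: string;
  start: number; // seconds
  end: number; // seconds
  speaker: string;
}

// Where a match begins, both in the transcript and in the audio.
interface SearchHit {
  wordIndex: number;
  start: number; // seconds; seek here to hear the match
}

// Lowercase a token and drop common punctuation so "Podcasts." matches "podcasts".
function normalizeToken(token: string): string {
  return token.toLowerCase().replace(/[.,!?;:"()]/g, '');
}

// Find every place the phrase occurs as consecutive words. When `speaker`
// is given, every word of the match must belong to that speaker.
function searchTranscript(
  words: SpokenWord[],
  phrase: string,
  speaker?: string,
): SearchHit[] {
  const terms = phrase.split(/\s+/).map(normalizeToken).filter((t) => t.length > 0);
  const hits: SearchHit[] = [];
  if (terms.length === 0) return hits;
  for (let i = 0; i + terms.length <= words.length; i++) {
    const candidate = words.slice(i, i + terms.length);
    const matches = candidate.every(
      (w, j) =>
        normalizeToken(w.text) === terms[j] &&
        (speaker === undefined || w.speaker === speaker),
    );
    if (matches) hits.push({ wordIndex: i, start: words[i].start });
  }
  return hits;
}
```

Each hit carries a start time, so jumping between results amounts to seeking the player to `hit.start`.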

Editing Workflow

1. Generate Transcript

Upload your audio or video file and let AI transcribe it automatically. Transcription typically completes in less time than the audio takes to play.

2. Review and Correct

Read through the transcript while listening to the audio. Correct any transcription errors, fix punctuation, and verify speaker labels. Click words to play that section.

3. Remove Filler Words

Delete "um," "uh," false starts, and long pauses directly from the transcript. The corresponding audio is automatically removed, cleaning up your recording effortlessly.
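
Finding candidates for this cleanup step could be automated along the lines below. This is a sketch that assumes word-level timestamps; the helper names are hypothetical, the default filler list contains only the two examples named above, and the pause threshold is left to the caller. False starts are not handled here because detecting them needs more than a word list.

```typescript
// Hypothetical word-level timestamp, in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// A span of audio, in seconds.
interface TimeRange {
  start: number;
  end: number;
}

// Only the fillers named in this guide; extend as needed.
const DEFAULT_FILLERS = new Set(['um', 'uh']);

// Return the indices of words that are fillers, ignoring case and
// common punctuation attached to the word.
function findFillerIndices(
  words: TimedWord[],
  fillers: Set<string> = DEFAULT_FILLERS,
): number[] {
  const indices: number[] = [];
  words.forEach((word, index) => {
    const bare = word.text.toLowerCase().replace(/[.,!?;:"()]/g, '');
    if (fillers.has(bare)) indices.push(index);
  });
  return indices;
}

// Return the silent gaps between consecutive words that last at least
// `minSeconds`. Assumes `words` is sorted by time.
function findLongPauses(words: TimedWord[], minSeconds: number): TimeRange[] {
  const pauses: TimeRange[] = [];
  for (let i = 1; i < words.length; i++) {
    const gap = words[i].start - words[i - 1].end;
    if (gap >= minSeconds) {
      pauses.push({ start: words[i - 1].end, end: words[i].start });
    }
  }
  return pauses;
}
```

Reviewing the flagged words and pauses before deleting them is still advisable, since a word such as "um" can occasionally be part of a quotation.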

4. Label Speakers

For multi-person recordings, assign meaningful names to detected speakers (e.g., "Host," "Guest," specific names). This makes transcripts more readable and professional.

5. Export Captions

Export in your desired format. Choose SRT or VTT for video captions, plain text for blog posts or show notes, or PDF for professional documentation.

Using the API

Automate transcription and caption generation through the Wubble Studio API. Process large batches of audio, integrate transcription into your workflow, or build custom transcript tools.

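Transcript Editor API Example (TypeScript)
// Speech-to-text with the REST API.
// `audioFile` is a File or Blob holding the recording to transcribe.
const form = new FormData();
form.append('audio', audioFile);

const response = await fetch('https://prod-backup-backend.wubble.ai/v1/speech/speech-to-text', {
  method: 'POST',
  headers: {
    // Do not set Content-Type here; fetch adds the multipart boundary itself.
    Authorization: `Bearer ${process.env.WUBBLE_API_KEY}`,
  },
  body: form,
});

// Fail early on HTTP errors instead of parsing an error body as a transcript.
if (!response.ok) {
  throw new Error(`Transcription request failed with status ${response.status}`);
}

const payload = await response.json();
console.log(payload.data.transcript);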

API Documentation

See the Transcript API Reference for complete documentation of transcription services, editing operations, and export formats.
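
For the batch processing mentioned at the start of this section, a small concurrency limiter keeps a large queue of files from hitting the API all at once. The sketch below is generic and makes no assumptions about the API itself: `mapWithConcurrency` and `Outcome` are hypothetical names, and `task` stands in for whatever function uploads one file, such as a wrapper around the `fetch` call in the example above.

```typescript
// Result of one task: either a value or the error it threw.
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Run `task` over every item with at most `concurrency` tasks in flight.
// Results keep the order of `items`; one failure does not stop the batch.
async function mapWithConcurrency<T, R>(
  items: T[],
  task: (item: T) => Promise<R>,
  concurrency = 3,
): Promise<Outcome<R>[]> {
  const results: Outcome<R>[] = new Array(items.length);
  let next = 0; // index of the next item to claim

  // Each worker claims items one at a time until none are left.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      try {
        results[index] = { ok: true, value: await task(items[index]) };
      } catch (error) {
        results[index] = { ok: false, error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A suitable concurrency value depends on rate limits, which this page does not state, so the default of 3 is only a placeholder.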

Common Use Cases

🎙️ Podcast Editing

Transcribe podcast episodes, remove filler words and false starts by editing the transcript, create show notes from the transcript, and export captions for video versions on YouTube.

📹 Video Captions

Generate accurate captions for YouTube, social media, or educational videos. Export in SRT or VTT format for upload to any video platform. Improve accessibility and SEO.

📝 Interview Documentation

Transcribe interviews, research discussions, or focus groups. Identify speakers automatically, search for specific topics, and export professional documentation with speaker labels.

🎓 Educational Content

Transcribe lectures, webinars, and tutorials. Export captions for accessibility compliance, create study materials from transcripts, and make content searchable by text.

💼 Meeting Notes

Automatically transcribe meetings, calls, and presentations. Search transcripts to find specific discussions, assign action items, and share written records with team members.

Best Practices

Use High-Quality Audio

Transcription accuracy improves dramatically with clean audio. Use good microphones, reduce background noise, and ensure clear speech for best results.

Review Before Exporting

Always review automated transcripts for accuracy. While AI is highly accurate, proper nouns, technical terms, and accents may need correction.

Label Speakers Meaningfully

Use clear, descriptive speaker labels. "Host," "Guest 1," actual names, or role descriptions make transcripts much more readable than "Speaker 1," "Speaker 2."

Break into Paragraphs

For readability, break long monologues into logical paragraphs. Group related thoughts together and use paragraph breaks at natural speech pauses.
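
Breaking at natural speech pauses can be approximated from word timings: start a new paragraph whenever the silence before a word exceeds a threshold. This is a sketch under the assumption that word-level timestamps are available; `splitIntoParagraphs` is a hypothetical helper and the one-second default is arbitrary.

```typescript
// Hypothetical word-level timestamp, in seconds.
interface TimedWord {
  text: string;
  start: number;
  end: number;
}

// Split a transcript into paragraphs, starting a new one whenever the gap
// since the previous word is at least `minPauseSeconds`.
function splitIntoParagraphs(
  words: TimedWord[],
  minPauseSeconds = 1.0,
): string[] {
  const paragraphs: string[][] = [];
  words.forEach((word, index) => {
    // The first word always opens a paragraph.
    const gap = index === 0 ? Infinity : word.start - words[index - 1].end;
    if (gap >= minPauseSeconds) paragraphs.push([]);
    paragraphs[paragraphs.length - 1].push(word.text);
  });
  return paragraphs.map((paragraph) => paragraph.join(' '));
}
```

Pauses are only a heuristic for topic changes, so the result is a starting point for manual grouping of related thoughts.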

Choose the Right Export Format

Use SRT or VTT for video captions, plain text for blog posts or documentation, and JSON for programmatic access. Consider your end use when exporting.

Use Text Editing for Audio Cleanup

Leverage text-based audio editing to quickly remove filler words, false starts, and mistakes. It's often faster than traditional waveform editing.
