Captions is an AI-powered video editing and captioning platform, founded in 2021 by Gaurav Misra and Dwight Churchill in New York, United States. The platform offers video processing tools targeting a broad user base including content creators, social media managers, educators, and small businesses. As of 2025, it has reached over 10 million users worldwide and supports the production of more than 3 million videos per month. Captions has raised $60 million in a Series C investment round involving Sequoia Capital, Andreessen Horowitz, and Kleiner Perkins, bringing its valuation to over $500 million.
Captioning and Language Tools
Captions uses automatic speech recognition and natural language processing to transcribe the audio in a video and generate subtitles without manual input. Users can customize the font, color, and size of the subtitles. Subtitles improve accessibility, contribute to search engine optimization, and help reach viewers who watch content without sound. The platform supports over 100 languages, facilitating access to global audiences.
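As an illustration of the final step in a captioning pipeline of this kind, the following Python sketch converts timed transcript segments into the widely used SubRip (SRT) subtitle format. It is not Captions' own implementation: the Segment type and the function names are hypothetical, and the segments are assumed to come from some speech recognition system.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed piece of transcript (hypothetical type for illustration)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    # Assumed example data standing in for speech recognition output.
    demo = [
        Segment(0.0, 2.5, "Welcome back to the channel."),
        Segment(2.5, 5.0, "Today we are editing a short video."),
    ]
    print(to_srt(demo))
```

Styling choices such as font, color, and size are not part of plain SRT; a player or editor would apply them separately when rendering the cues.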
Video Editing Tools
The platform offers an online interface for basic editing tasks such as cutting, trimming, adding transitions, and inserting media, without the need for complex software. The AI Edit feature lets users apply a predefined editing template that automatically trims videos, adds transitions, integrates background music, and inserts B-roll footage. This is intended to let users with no prior editing experience create professional-quality content.
AI-Powered Avatar and Advertising Tools
Through modules such as AI Creator and Mirage, Captions provides AI-generated avatars and fully automated creation of advertising content. The AI Avatar feature enables users to produce videos with a digital character, without a physical shoot. Mirage creates entire videos with AI-generated actors, scripts, voiceovers, and scenes. These tools are often used in ad campaigns styled after user-generated content (UGC) and are intended to remove the need for footage licensing and actor contracts.
Multilingual Video Production and Voice Technologies
Captions includes tools such as AI Translate and AI Lipdub, which translate videos into other languages and dub them. These translation and dubbing tools support more than 28 languages, allowing users to create multilingual content with subtitles and voiceovers. Its audio tools include AI Voice Clone, AI Voiceover, and AI Music, which allow narration in both original and synthetic voices.
Image and Clip Generation Tools
Captions integrates AI image generation models such as OpenAI's DALL·E 3, Google's Imagen 3, Luma's Photon, Ideogram, Recraft, and FLUX to support visual elements in video content. These visuals can be used as subtitle backgrounds, scene designs, or layered graphics. The AI Clip Generator automatically identifies short clips with viral potential in longer videos and exports them in formats optimized for social media platforms.
Enterprise and Professional Use
In addition to individual users, Captions offers subscription plans for businesses and large institutions. Plans such as Pro, Max, Scale, Business, and Enterprise differ in terms of capacity, processing speed, support level, and access permissions. The Enterprise plan includes features like data privacy support, corporate training, and dedicated customer managers.
Future Plans and R&D Investments
In 2024, Captions announced a $100 million R&D investment initiative in New York aimed at advancing AI-based video production technologies. The focus areas include avatar-driven content creation, real-time translation, automated script writing, and video post-production. The company predicts that generative AI tools will become central to the future of content production.