Unique3D
Create a 1M faces 3D colored model from an image!
Try PaliGemma on document understanding tasks
Generate custom audio from a text prompt
Annotate and describe images with text prompts
Transform a video's style using a text prompt
Video upscaler/restorer
Annotate videos with caption-based object bounding boxes
Generate images from prompts or images
Generate summaries from YouTube videos or uploaded videos
Chat about images by uploading them
Build and run language models visually
Upscale and enhance images with AI-powered detail
In-browser speech recognition w/ word-level timestamps
High-fidelity Virtual Try-on
Video-to-Audio Generation with Hidden Alignment
Multimodal Image-to-Video
Transcribe audio in any language using text data
Generate images from text prompts
Aesthetically Controllable Text-Driven Stylization w/o Train
Generate lifelike video animations from images and audio
Try on clothes virtually with images
Generate enhanced images by blending foreground with custom backgrounds
Try on clothes on a person image
Text-to-Video
Generate text from images or videos
Transcribe speech and generate AI response
Convert image text to markdown format
Create polished ID photos with automatic background removal
Answer questions about any uploaded image
Travel through the model latent space
Create a video from an image with camera motion
Analyse any image with Llama3.2
Fill and edit images using masks
Convert PDFs to individual page images
Generate search queries for document images
Answer questions about uploaded images and documents
Transcribe audio or YouTube videos into text
Generate music from text descriptions
Generate long spoken-style scripts from documents for audio
Ultra-high resolution image synthesis
Generate and edit audio from text prompts
VLMEvalKit Evaluation Results Collection
Generate personalized research profiles and chat with Arxiv Copilot
Run code snippets and get instant results with Qwen-2.5
High-fidelity Virtual Try-on
Describe image contents with prompts
Visual Retrieval with ColPali and Vespa
Using RAG LLM to assist your academic writing
Generate new person images with swapped clothes or poses