Baidu's native multimodal model, capable of understanding and generating text, images, sound, and video. Its capabilities include instruction following, factual reasoning, agentic planning, and tool use for creation, analysis, and global-scale services.
Listing last updated: Feb 27, 2026.
AI software for editing YouTube videos by cutting bloopers, removing silences, and generating clips.
Create hyperrealistic video montages by replacing any face. This AI handles complex angles and expressions for virtually undetectable results in both photos and animated clips.
Estimate depth and reconstruct 3D scenes from one or more images: video, SLAM, Gaussian Splatting, multi-camera, spatial consistency, and increased accuracy for robotics, AR/VR, an