Baidu has officially released its latest multimodal AI model, ERNIE-4.5-VL. The model adds an "image thinking" feature that strengthens its ability to understand and process images alongside its existing language capabilities. It is designed for efficiency: it activates only 3B parameters, which keeps response times short across AI applications. The key technical advance is that the model can do more than analyze an image; it can also act on it, for example by zooming in on the image or running an image-based search. These capabilities are expected to enrich interaction between images and text and to open new possibilities in intelligent search, e-commerce, and online education. Baidu has open-sourced the model so that developers and researchers can explore its potential and advance multimodal AI. The release marks another significant step for Baidu in strengthening its position in the competitive artificial intelligence landscape.