
Google unveils PaliGemma 2, its latest open vision-language model for advanced image understanding

PaliGemma 2 adds more detailed image captioning, comes in several sizes, and is built to be easy to fine-tune for specific needs.

Google has introduced PaliGemma 2, its latest open vision-language model (VLM). The updated model adds "long captioning," which produces more detailed and contextually relevant image captions. It is available in several sizes and, according to Google, delivers leading performance on tasks such as chemical formula recognition, music score recognition, spatial reasoning, and chest X-ray report generation. PaliGemma 2 is designed as a drop-in replacement for the original PaliGemma, so existing users can expect immediate performance gains without major code changes. Pre-trained models and code are available on Kaggle, Hugging Face, and Ollama.

