Google

Google unveils PaliGemma 2, its latest open vision-language model for advanced image understanding

Google introduces PaliGemma 2 as its latest open vision-language model, offering leading performance in various tasks and easy fine-tuning for specific needs.

Google has introduced PaliGemma 2, its latest open vision-language model (VLM). The updated model adds "long captioning," which produces more detailed and contextually relevant image captions. It is available in three sizes (3B, 10B, and 28B parameters) and offers leading performance in tasks such as chemical formula recognition, music score recognition, spatial reasoning, and chest X-ray report generation. PaliGemma 2 is designed as a drop-in replacement for the original PaliGemma, so existing users can expect performance gains without major code modifications. Pre-trained models and code are available on Kaggle, Hugging Face, and Ollama.

