Apple and NVIDIA Collaborate to Speed Up Large Language Model Performance

Apple engineers have shared new details on a collaboration with NVIDIA to speed up text generation with large language models (LLMs) using Apple's Recurrent Drafter (ReDrafter) technique. ReDrafter has been integrated into NVIDIA's TensorRT-LLM, a library that helps run LLMs faster on NVIDIA GPUs. In benchmarking, the integration led to a 2.7x speed-up in generated tokens per second for greedy decoding.