Apple and NVIDIA Collaborate to Speed Up Large Language Model Performance

Apple engineers shared new details on a collaboration with NVIDIA to speed up text generation with large language models using Apple's Recurrent Drafter (ReDrafter) technique. ReDrafter was integrated into NVIDIA's TensorRT-LLM, which helps run LLMs faster on NVIDIA GPUs. In benchmarking, the integration led to a 2.7x speed-up in generated tokens per second for greedy decoding.