Apple and NVIDIA Collaborate to Speed Up Large Language Model Performance
Apple engineers have shared new details on a collaboration with NVIDIA to speed up text generation with large language models using Apple's Recurrent Drafter (ReDrafter) technique. ReDrafter was integrated into NVIDIA's TensorRT-LLM, which helps run LLMs faster on NVIDIA GPUs. In benchmarking, the integration led to a 2.7x speed-up in generated tokens per second for greedy decoding.
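To give a feel for why a drafting approach can raise tokens per second under greedy decoding, here is a minimal sketch of generic draft-and-verify (speculative) decoding. It is not Apple's ReDrafter implementation and does not use the TensorRT-LLM API. The two "models" are hypothetical toy functions, and the names `target_model`, `draft_model`, and `speculative_greedy_decode` are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

# A "model" here is any function mapping a token context to its greedy next token.
Model = Callable[[List[int]], int]


def target_model(ctx: List[int]) -> int:
    # Hypothetical stand-in for the large model: a deterministic toy rule.
    return (sum(ctx) * 31 + len(ctx)) % 50


def draft_model(ctx: List[int]) -> int:
    # Hypothetical stand-in for a cheap drafter: agrees with the target
    # except when the context length is a multiple of 4.
    guess = target_model(ctx)
    return (guess + 1) % 50 if len(ctx) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    # Baseline: one model call per generated token.
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_greedy_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    tokens = list(prompt)
    goal = len(prompt) + n_new
    verify_passes = 0
    while len(tokens) < goal:
        # 1. The cheap drafter proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))
        # 2. The target checks every drafted position. A real system would
        #    score all positions in one batched forward pass, so this loop
        #    is counted as a single verification pass.
        verify_passes += 1
        accepted: List[int] = []
        for i in range(k + 1):
            expected = target(tokens + accepted)
            accepted.append(expected)  # always keep the target's own token
            if i == k or proposal[i] != expected:
                break  # first mismatch, or all k drafts accepted plus one bonus token
        tokens.extend(accepted)
    return tokens[:goal], verify_passes


prompt = [1, 2, 3]
baseline = greedy_decode(target_model, prompt, 20)
fast, passes = speculative_greedy_decode(target_model, draft_model, prompt, 20)
print(fast == baseline, passes)
```

Because only tokens the target itself would have chosen are kept, the output matches plain greedy decoding exactly. Any gain comes from needing fewer sequential passes through the large model whenever the drafter guesses well. How much of the reported 2.7x figure such a mechanism accounts for depends on details this toy does not model.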