@Balinares - Lemmy

Balinares@pawb.social

0 Posts
1 Comment

Joined 3 years ago

Cake day: June 9th, 2023



Balinares@pawb.social to Technology@lemmy.world • DeepSeek Permanently Reduces The Price Of Its Flagship V4 Model By 75 Percent
5 days ago
They invented a hybrid attention design that drastically reduces the amount of memory needed for the KV cache at inference time. Like, dividing it by 10. And memory is a large part of the cost of inference.
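To put rough numbers on why the KV cache matters, here is a back-of-the-envelope sketch for plain multi-head attention. The model dimensions below are made up for illustration and are not DeepSeek's actual configuration. The factor of 10 is just the figure from the comment above, applied to a hypothetical baseline.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Approximate KV cache size for standard attention.

    Every layer stores one key vector and one value vector per token
    and per KV head, hence the leading factor of 2.
    bytes_per_value=2 assumes 16-bit storage (fp16/bf16).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical dimensions, chosen only to illustrate the scale.
baseline = kv_cache_bytes(num_layers=60, num_kv_heads=64,
                          head_dim=128, seq_len=128_000)

# The "divide by 10" reduction described in the comment.
reduced = baseline // 10

print(f"baseline: {baseline / 2**30:.1f} GiB per sequence")
print(f"reduced:  {reduced / 2**30:.1f} GiB per sequence")
```

The cache grows linearly with context length and batch size, so at long contexts it can dominate accelerator memory. Shrinking it lets the same hardware serve more concurrent requests, which is where the cost saving would come from.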