@theunknownmuncher

theunknownmuncher@lemmy.world · 11 days ago

Accidental fentanyl exposure via the air or skin contact is medically and scientifically impossible.

Three people may have died, but a fairy tale is not what killed them.

theunknownmuncher@lemmy.world · 14 days ago

This is only amplified by LLMs

theunknownmuncher@lemmy.world · 14 days ago

Google Search has long since declined from its peak. 15 years ago we used to play a “game” of describing something as vaguely or strangely as possible, and we would be entertained to find it on the first page of search results anyway.

SEO and paid ranking caused a steady decline in Google search performance, and LLMs are the kill shot.

Indexing and ranking were already a solved problem, and conventional algorithms accomplished it much better than LLMs do. Unfortunately, LLMs need to find uses to justify the investment, so now they must be used for any task they can be used for, even if it’s worse than the solutions we already had.

theunknownmuncher@lemmy.world · 27 days ago

How about literally anyone competent? Why is this always framed as a choice between the lesser of two bads? Of course Biden is not as bad as Trump, but no, I’m not “longing” for Biden.

theunknownmuncher@lemmy.world · 28 days ago

Americans longing for Sleepy Joe Biden

Doubt.

theunknownmuncher@lemmy.world · 1 month ago

It’s true and a real thing, but it’s also just a BS excuse to push for removal of body cameras.

theunknownmuncher@lemmy.world · 1 month ago

The most important question to ask when evaluating end-to-end encryption: who manages the keys?

If Facebook manages all of the keys and is responsible for telling you which public key belongs to whom, then of course Facebook can read every message: it can hand you a key it controls instead of the recipient’s.
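
To make that concrete, here is a toy sketch of a key directory that hands out its own public key. Everything in it is hypothetical and only for illustration: the `HonestDirectory` and `MaliciousDirectory` classes, the tiny Diffie-Hellman parameters, and the XOR "cipher" are not how any real messenger works and are not secure. It only shows why whoever answers "which public key belongs to Bob?" ends up holding the ability to read the message.

```python
import hashlib
import secrets

# Toy Diffie-Hellman parameters. Illustrative only, NOT a secure choice.
P = 2**127 - 1  # a Mersenne prime
G = 3

def keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)

def shared_key(my_private, their_public):
    """Derive a symmetric key from my private key and their public key."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(str(secret).encode()).digest()

def xor_cipher(key, data):
    """Toy stream cipher: XOR with a key-derived stream (encrypts and decrypts)."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

class HonestDirectory:
    """Hypothetical key server that returns the key each user registered."""
    def __init__(self):
        self.keys = {}
    def register(self, user, public):
        self.keys[user] = public
    def lookup(self, user):
        return self.keys[user]

class MaliciousDirectory(HonestDirectory):
    """Hypothetical key server that answers every lookup with its own key."""
    def __init__(self):
        super().__init__()
        self.private, self.public = keypair()
    def lookup(self, user):
        return self.public

def send(directory, sender_private, recipient, plaintext):
    """Encrypt to whatever public key the directory claims is the recipient's."""
    key = shared_key(sender_private, directory.lookup(recipient))
    return xor_cipher(key, plaintext)

alice_priv, alice_pub = keypair()
bob_priv, bob_pub = keypair()
message = b"meet at noon"

# With an honest directory, Bob can decrypt.
honest = HonestDirectory()
honest.register("bob", bob_pub)
read_by_bob = xor_cipher(shared_key(bob_priv, alice_pub), send(honest, alice_priv, "bob", message))

# With a malicious directory, the server holds the matching private key.
server = MaliciousDirectory()
server.register("bob", bob_pub)
ciphertext = send(server, alice_priv, "bob", message)
read_by_server = xor_cipher(shared_key(server.private, alice_pub), ciphertext)
```

Alice’s code is identical in both cases. She cannot tell the difference unless she verifies Bob’s key through some channel the directory does not control.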

theunknownmuncher@lemmy.world · 1 month ago

Quote me in full.

Okay!

You can run at scale, on huawei. You can also run it on a cpu

Yeah, that is absolutely not what you argued.

Anyway, you’ve conceded that I’m correct that you cannot run it at scale on a CPU, because running on CPU is too slow and inefficient, and that they instead use GPU hardware like Huawei GPUs to run the model at scale. That’s good enough for me!

theunknownmuncher@lemmy.world · 1 month ago

Yes, you can run it at scale.

at scale

Shift those goalposts! We went from “at scale” to “it still runs”

theunknownmuncher@lemmy.world · edit-2 1 month ago

You’ve proved my point that you don’t know what you’re talking about by blindly linking to the git repo. Couldn’t find any source that supports your claim? I wonder why.

Sure, you can serve one request at a time to one patient user at a slow tokens-per-second rate, which makes running locally viable, but there is no RAM that has the bandwidth to run this model at scale. Even flash would be incredibly slow on CPU with multiple requests. You’d need the high bandwidth of VRAM, and running across multiple GPUs in a scalable way requires extremely high-bandwidth interconnects between them.
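
A back-of-envelope sketch of the bandwidth argument, assuming decoding is memory-bound and every active weight has to be read once per generated token. All of the numbers below are hypothetical placeholders, not measurements or specs of any real model or hardware, and the estimate ignores KV cache traffic, compute limits, and batching.

```python
def max_tokens_per_second(active_params, bytes_per_param, bandwidth_bytes_per_s):
    """Rough ceiling on single-stream decode speed when memory-bound."""
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token

# Hypothetical inputs, chosen only to show the shape of the calculation.
ACTIVE_PARAMS = 50e9         # assumed parameters touched per token (MoE active subset)
BYTES_PER_PARAM = 1          # assumes 8-bit quantized weights
CPU_RAM_BANDWIDTH = 100e9    # assumed bytes/s for system RAM
GPU_VRAM_BANDWIDTH = 3000e9  # assumed bytes/s for GPU memory

cpu_rate = max_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, CPU_RAM_BANDWIDTH)
gpu_rate = max_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, GPU_VRAM_BANDWIDTH)

print(f"CPU RAM ceiling:  {cpu_rate:.1f} tokens/s")
print(f"GPU VRAM ceiling: {gpu_rate:.1f} tokens/s")
```

Under these assumed numbers the ceiling scales directly with memory bandwidth, and that ceiling is for one stream before any other user shows up.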

theunknownmuncher@lemmy.world · edit-2 1 month ago

Nope! You don’t know what you’re talking about. At all. But you can have fun running a 1.6 trillion parameter model on CPU at basically 0 tokens per second at scale, MoE or not.

theunknownmuncher@lemmy.world · 1 month ago

I mean, sure. You could also run it by drawing marks in sand. It doesn’t make any sense to do either, though.