An LLM cannot be anything other than a bullshit machine. It just guesses at what the next word would likely be. And because it’s trained on source data that contains truths as well as non-truths, sometimes what comes out is true by chance. But it doesn’t “know” what is true and what isn’t.
No matter what they try to do, this won’t change. And it is one of the main reasons the LLM path will never lead to AGI, although parts of what makes up an LLM could possibly be used inside something that gets to the AGI level.
Altman looks like a douche by tech CEO standards.
He looks exactly the same level of crazy as Elizabeth Holmes, and I hope he ends up like her (but he won’t).
wtf, I checked out the claim about states with R in their name and it was pretty ridiculous how badly it answered it. It answers like someone running through their thoughts in front of you.
====================================================
list all of the US states with R in their name
ChatGPT said: Here’s the full list of U.S. states that have the letter R in their name:
Arizona
Arkansas
California
Colorado
Delaware
Florida
Georgia
Illinois
Maryland
Massachusetts
Missouri
Nebraska
North Carolina
North Dakota
Oregon
South Carolina
Texas – (wait… nope, no R here! ❌)
Vermont
Virginia
Washington – (nope, no R either ❌)
West Virginia
Wisconsin – (no R ❌)
Wyoming – (no R ❌)
Let me clean that up so only correct ones remain:
Final Correct List (22 states):
Arizona
Arkansas
California
Colorado
Delaware
Florida
Georgia
Maryland
Missouri
Nebraska
North Carolina
North Dakota
Oregon
Vermont
Virginia
West Virginia
New Hampshire
New Jersey
New York
Massachusetts
Connecticut – (wait, no R ❌)
Minnesota – (no R ❌)
Hold on — I think we should actually verify methodically, because it’s easy to miss one or double-count. Do you want me to go alphabetically through all 50 states and pick only the ones with “R” so we’re 100% sure? That way we won’t miss any.
I tried this with MS Copilot and got this:
===========================================
list all of the US states with R in their name
Sure! Here’s a list of all U.S. states that contain the letter “R” in their name:
🗺️ U.S. States with “R” in the Name
- Arizona
- Arkansas
- California
- Colorado
- Delaware
- Florida
- Georgia
- Maryland
- Massachusetts
- Minnesota
- Missouri
- Nebraska
- New Hampshire
- New Jersey
- New York
- North Carolina
- North Dakota
- Oregon
- South Carolina
- Texas
- Vermont
- Virginia
- Washington
- West Virginia
- Wisconsin
- Wyoming
Let me know if you want this grouped by region or visualized in a map!
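For comparison, the check itself is trivial for ordinary code. A minimal Python sketch, assuming a hand-typed list of the 50 states (the name `states_containing` is made up for illustration):

```python
# All 50 US state names, typed in by hand for this sketch.
US_STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California",
    "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa",
    "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
    "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
    "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
    "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming",
]


def states_containing(letter, states=US_STATES):
    """Return the states whose name contains the letter, ignoring case."""
    needle = letter.lower()
    return [name for name in states if needle in name.lower()]


matches = states_containing("r")
print(len(matches))
print(", ".join(matches))
```

By this count there are 21 such states. Both transcripts above leave out Rhode Island and include Massachusetts, which has no R.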
LLMs fundamentally can’t answer this kind of question, so it’s an unfair test. They don’t see letters, they just see tokens.
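To make the token point concrete, here is a rough Python sketch of what the model gets as input. The vocabulary, the IDs and the function names are all made up for illustration; real tokenizers learn far larger vocabularies from data and split words differently, so treat this as a toy and not as how any particular model works.

```python
# Hypothetical subword vocabulary. The pieces and IDs are invented for
# this sketch; a real tokenizer has tens of thousands of learned pieces.
TOY_VOCAB = {"wash": 101, "ing": 102, "ton": 103, "tex": 104, "as": 105}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split a word into vocabulary pieces, longest match first."""
    text = word.lower()
    pieces = []
    start = 0
    while start < len(text):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(text), start, -1):
            if text[start:end] in vocab:
                pieces.append(text[start:end])
                start = end
                break
        else:
            raise ValueError(f"no vocabulary piece matches {text[start:]!r}")
    return pieces


def to_ids(word, vocab=TOY_VOCAB):
    """Map a word to the list of integer IDs the model would receive."""
    return [vocab[piece] for piece in toy_tokenize(word, vocab)]


print(toy_tokenize("Washington"))  # ['wash', 'ing', 'ton']
print(to_ids("Washington"))        # [101, 102, 103]
```

The model only ever receives the list of IDs. Nothing in `[101, 102, 103]` says whether there is an R anywhere in the word, so the answer has to come from whatever associations were learned in training.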
It’s a bit like asking a random person on the street the square root of 75. Most will just guess on the spot.
But I’ve seen AI results that are basically extracts of sources. They’ll even give a link to them.
A bit off topic, but that’s pretty much a result of “prompt stuffing”. Your prompt is processed into a good old-fashioned search query, and then the search results are more or less added to the prompt. From the LLM’s perspective, it looks like a request to rework your source material in a manner consistent with your prompt. The LLM is fed the correct answer, so it doesn’t have to answer, it just has to reword the input.
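Roughly, the flow looks like the Python sketch below. Everything in it is a stand-in: the `DOCUMENTS` list plays the role of a search engine, the example.com URLs are placeholders, and `search` and `build_stuffed_prompt` are hypothetical names. It is not how any specific product does it, just the general shape.

```python
# Hypothetical in-memory "search index". A real system would query an
# actual search engine; these URLs and texts are placeholders.
DOCUMENTS = [
    {"url": "https://example.com/states",
     "text": "Rhode Island is the smallest US state by area."},
    {"url": "https://example.com/tokens",
     "text": "Language models read text as tokens rather than letters."},
]


def search(query, documents=DOCUMENTS, limit=2):
    """Naive keyword search: rank documents by words shared with the query."""
    words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(words & set(doc["text"].lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:limit]]


def build_stuffed_prompt(question, documents=DOCUMENTS):
    """Paste the search hits into the prompt ahead of the user's question."""
    hits = search(question, documents)
    sources = "\n".join(
        f"[{i}] {doc['url']}: {doc['text']}"
        for i, doc in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


print(build_stuffed_prompt("Which is the smallest US state?"))
```

The text that reaches the model already contains the source sentence and its link, which is why the output can look like an extract with a citation attached.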
So?
“Most will just guess on the spot.”
Well no, most would say “I don’t know”. Which an LLM is unlikely to do unless the training material shows that a consistent answer is “I don’t know”. It will give a fact-shaped answer that may fail, but it’s at least ‘shaped’ right.
Again, the big problem is not that LLMs are useless because they can’t do these little specific tricks. It’s that we need to keep calibrating expectations, particularly since people have bet trillions on these things, and that’s a lot of money riding on lying and cheating to make everyone overestimate them. Without a counter-perspective, I think 90% of my coworkers would have been laid off already, as the executives just soak in unchallenged marketing bullshit from the big players.
I keep wondering how AI is going to handle being fed poor data from other AI or its own output. It will be like a photocopy of a photocopy. The end result is an unreadable mess.
Copilot is free so I tried it out
Yemen and Femen, bitter rivals.
I use it to write scripts and debug error messages. It takes a good number of cycles for it to arrive at a working result, though. The scripts almost always fail on the first try and need several rounds of debugging. Once I have a working version, it becomes easier to ask for enhancements to it.
Altman should release his chat logs
Go ahead and use the tool that works for you. But don’t forget to test your tools on things you know every once in a while.
Ok boomer.