Does anyone think the current AI approach will hit a dead end?

22 points by rh121 a day ago

Billions of dollars spent, incredible hype that we will have AGI in several years. Does anyone think the current deep learning / neural net based AI approach will eventually hit a dead end and not be able to deliver its promises? If yes, why?

I realize this question is somewhat loosely defined. No doubt the current approach will continue to improve and yield results so it might not be easy to define "dead end".

In the spirit of things, I want to see whether some people think the current direction is wrong and won't get us to the final destination.

petargyurov 17 hours ago

I think it already has.

We'll get more incremental updates and nice features:

* larger context windows

* fewer hallucinations

* more prompt control (or the illusion of it)

But we won't get AGI this way.

From the very beginning LLMs were shown to be incapable of synthesising new ideas. They don't sit there and think; they can only connect dots within a paradigm that we give them. You may give me examples of AI discovering new medicines and math proofs as a counter-argument, but I see that as reinforcing the above.

Paired with data and computational scaling issues, I just don't see it happening. They will remain a useful tool, but won't become AGI.

And whether they stay affordable is a question of time; all the big players are burning mountains of cash just to edge out the competition in terms of adoption.

Is there a level of adoption that can justify the current costs to run these things?

  • Terr_ 16 hours ago

    > incapable of synthesising new ideas

    I'd argue they don't synthesize any ideas, even old ones. They skip that classic step to emit text, and the human reading that text generates their own idea and (unconsciously, incorrectly) assumes there must've been an original idea that caused the text on the other side.

    So perhaps it's more like: "LLMs aren't great at indirectly triggering humans into imagining useful novel ideas." (Especially when the user is trying to avoid having to think.)

    Yeah, I know, it sounds like quibbling, but I believe it's necessary. This whole subject is an epistemic and anthropomorphic minefield. A lot of our habitual language connotations and metaphors can mislead.

    • drweevil 12 hours ago

      Exactly this. LLMs synthesize language, based solely on statistical data, shorn of semantics. The "magic" happens when we humans re-insert the semantics into the LLM's output--because that is why language is used: to convey meaning--and assume the meaning was there all along.

      • grosswait 12 hours ago

        Aren’t the semantics in the statistical data?

        • drweevil 7 hours ago

          No. One can safely infer the semantics, because how we generally use language is what is encoded, and that how is closely related to the semantics. I imagine that this is why so-called "hallucinations" occur, when there is a subtle (or not so subtle) disconnect between the usage statistics and the semantics. (For example, satire or sarcasm isn't understood by the LLMs, so we get advice like use glue to make cheese 'stick' to pizza.)

      • solardev 11 hours ago

        Can you prove that statistics cannot encode semantics?

        • Terr_ 8 hours ago

          Compare: "Can you prove that alien explorers cannot make contact with us?"

          Nobody has the tools to begin proving a negative [0] in either of those cases, and it's possible they'll eventually occur... But so what?

          Just because it could happen someday does not mean it's happening now. Instead, we have decades of seeing humans excite themselves into perceiving semantics that aren't present [1], and nobody's provided a compelling reason to believe that this time things are finally different.

          [0] https://en.wikipedia.org/wiki/Burden_of_proof_(philosophy)

          [1] https://en.wikipedia.org/wiki/ELIZA_effect

          • solardev 4 hours ago

            I don't think this is the unprovable you think it is?

            If LLMs and statistics can't encode semantics, how do chatbots perform long-form translations with appropriate context? How do codebreakers use statistics to break an adversary's communications?

            Sometimes the statistics are semantic, like when "orange", "arancia", and the picture of that fruit all mean the same thing, but Orange the wireless carrier and orange the color are different. Those are connections/probabilities humans also learn via repeated exposure in different contexts.

            I'm not arguing that LLMs are synthesizing new ideas (or old ones), but that they ARE capable of deriving semantic meaning from statistics. Rather than:

            > language, based solely on statistical data, shorn of semantics

            Isn't it more like:

            > language, based solely on statistical data, with meanings emerging from clusters in the data
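
            A toy version of the "clusters" point, using a tiny invented corpus (an illustration only, nothing like a real embedding model): words used in similar contexts end up with similar co-occurrence vectors, and that overlap is about as much "semantics" as the statistics hand you.

            ```python
            from collections import Counter, defaultdict
            from math import sqrt

            sentences = [                                # invented mini-corpus
                "i ate a ripe orange", "i ate a ripe apple",
                "i peeled the orange", "i peeled the apple",
                "orange launched a new mobile plan",
                "vodafone launched a new mobile plan",
            ]

            # Count the words appearing within 2 positions of each word.
            vecs = defaultdict(Counter)
            for s in sentences:
                w = s.split()
                for i, word in enumerate(w):
                    for j in range(max(0, i - 2), min(len(w), i + 3)):
                        if i != j:
                            vecs[word][w[j]] += 1

            def cosine(a, b):
                dot = sum(vecs[a][k] * vecs[b][k] for k in vecs[a])
                norm = lambda x: sqrt(sum(v * v for v in vecs[x].values()))
                return dot / (norm(a) * norm(b))

            print(cosine("orange", "apple"))     # high: shared "fruit" contexts
            print(cosine("orange", "vodafone"))  # also high: shared "carrier" contexts
            ```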

    • AnimalMuppet 12 hours ago

      Well... ideas are actually encoded (to some degree) in the words in the training data. So when they synthesize new text, they are, to a degree, synthesizing new ideas.

      To a degree. The problem is that they don't actually understand the ideas in the training data. (Yeah, you can say we don't know how humans actually understand ideas. True, but not the point. However we understand ideas, LLMs don't do that.) And so they can only synthesize new ideas by rearranging words. This is much less than what humans do when thinking. In particular, it seems they could only generate ideas that are new recombinations, not breakthroughs.

      • Terr_ 7 hours ago

        > Well... ideas are actually encoded (to some degree) in the words in the training data. So when they synthesize new text, they are, to a degree, synthesizing new ideas.

        I don't think that follows: Manipulating a (lossy, imperfect) encoding [0] isn't the same as manipulating the thing it was intended to evoke.

        If it is true, then... Well, it's not true in the same way anybody is excited about, because it means "synthesizing new ideas" is something we've been able to do for many decades and which you can easily script up right now at home [1].

        [0] https://en.wikipedia.org/wiki/Encoding_(semiotics)

        [1] https://benhoyt.com/writings/markov-chain/
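
        For reference, [1] boils down to something like this: a word-level Markov chain you can run at home (a minimal sketch in that spirit, not benhoyt's actual code; it assumes some plain-text corpus.txt). It recombines the encoding without anything we'd call an idea behind it.

        ```python
        import random
        from collections import defaultdict

        def build_chain(text, order=2):
            """Map each run of `order` consecutive words to the words seen following it."""
            words = text.split()
            chain = defaultdict(list)
            for i in range(len(words) - order):
                chain[tuple(words[i:i + order])].append(words[i + order])
            return chain

        def generate(chain, length=50):
            """Emit text by repeatedly sampling a plausible next word -- no idea required."""
            key = random.choice(list(chain))
            out = list(key)
            for _ in range(length):
                nxt = random.choice(chain.get(key, [random.choice(list(chain))[0]]))
                out.append(nxt)
                key = tuple(out[-len(key):])
            return " ".join(out)

        corpus = open("corpus.txt").read()   # any plain-text file you have lying around
        print(generate(build_chain(corpus)))
        ```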

tim333 4 hours ago

I don't think scaling LLMs will get there but there are other ways to use neural nets. For example MuZero uses a deep neural network to learn games through self play and:

>Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi [a similar Japanese board game] as well as Go, and convincingly defeated a world-champion program in each case https://www.theguardian.com/technology/2017/dec/07/alphazero...

And MuZero did much the same without even being given the rules, again surpassing AlphaGo. So while LLMs may be bad at thinking and learning, other neural net arrangements may do well.
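
To make "learning games through self-play, given nothing but the rules" concrete, here's a toy you can run: tabular self-play value learning for tic-tac-toe. It is of course nothing like MuZero (no deep network, no tree search, rules hard-coded), just an illustration of the loop of playing against yourself and reinforcing whatever led to wins.

```python
import random
from collections import defaultdict

WINS = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    for i, j, k in WINS:
        if board[i] != " " and board[i] == board[j] == board[k]:
            return board[i]
    return "draw" if " " not in board else None

Q = defaultdict(float)  # learned value of (board, move, player), from self-play alone

def choose(board, player, eps):
    moves = [i for i, c in enumerate(board) if c == " "]
    if random.random() < eps:
        return random.choice(moves)                                  # explore
    return max(moves, key=lambda m: Q[("".join(board), m, player)])  # exploit

def self_play(episodes=50_000, alpha=0.3, eps=0.2):
    for _ in range(episodes):
        board, player, trace = [" "] * 9, "X", []
        while True:
            move = choose(board, player, eps)
            trace.append(("".join(board), move, player))
            board[move] = player
            result = winner(board)
            if result:
                for state, mv, p in trace:                           # Monte Carlo credit
                    target = 0 if result == "draw" else (1 if result == p else -1)
                    Q[(state, mv, p)] += alpha * (target - Q[(state, mv, p)])
                break
            player = "O" if player == "X" else "X"

self_play()
print(choose(list("XO X  O  "), "X", eps=0))  # the move it has learned to prefer here
```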

10111two 10 hours ago

Yes. Current deep learning is powerful but fundamentally statistical. It classifies, predicts, and optimizes, but it doesn’t cognize. The assumption is that by scaling parameters and data, generalized intelligence will “emerge.” There is a big possibility that the fundamental framework of traditional AI is incomplete. Its roots are inspired by Hebbian learning, which is itself descriptive in nature.

A good analogy is Newton’s law of gravity: it works tremendously well, but it couldn’t explain the orbit of Mercury. No matter how much we tweaked the maths, it didn’t fit. The framework was incomplete. It took Einstein’s general relativity, a new framework, to eventually explain it. If the framework itself doesn’t allow for the truth we are after, we can spend trillions and it’s not going to happen. It’s like having a design for a motorcycle and somehow expecting it to fly. Currently, in the AI landscape, our efforts are being poured into building ramps so that we can create the illusion of a flying machine. It can be argued that it’s also flying, but the claims are very misleading.

At JN Research, we took a principles-first route. We built Adaptrons: artificial neurons that actually exhibit graded potentials, subthreshold states, and action potentials. When networked, even small systems (hundreds to thousands of units) show memory formation, dreaming, anticipation, and original thought generation. This is not ML, not symbolic AI. It’s a new substrate for cognition. If you’re interested, check us out here: https://jn-research.com
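
(For readers who haven't met the neuroscience vocabulary above: "graded/subthreshold potentials" and "action potentials" are usually introduced with something like the leaky integrate-and-fire model sketched below. This is a generic textbook illustration to make the terms concrete, not JN Research's Adaptron, whose internals aren't described in the comment.)

```python
# Leaky integrate-and-fire neuron: a generic illustration of graded,
# subthreshold potentials and threshold-triggered "action potentials".
# Not JN Research's Adaptron -- just the standard construct the terms come from.
def simulate(inputs, threshold=1.0, leak=0.9, reset=0.0):
    v, trace = 0.0, []
    for x in inputs:
        v = leak * v + x                 # graded integration of incoming signal
        if v >= threshold:               # threshold crossed -> spike
            trace.append((round(v, 3), "action potential"))
            v = reset                    # potential resets after the spike
        else:
            trace.append((round(v, 3), "subthreshold"))
    return trace

for v, state in simulate([0.2, 0.3, 0.1, 0.6, 0.05, 0.0, 0.9, 0.4]):
    print(f"{v:6.3f}  {state}")
```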

elmerfud a day ago

I would say it already has hit a dead end. We're simply in a period of scaling right now, but the intrinsic problems with how the algorithms work can't be overcome by small tweaks.

This is seen in what people term hallucinations. AI seeks to please and will lie to you and invent things in order to please you. We can scale it up to give it more knowledge, but ultimately those failures still creep in.

There will have to be a fundamentally new design for this to be overcome. What we have now is an incredible leap forward but it has already stalled on what it can deliver.

ManlyBread 15 hours ago

GPT-5 has convinced me that this is already the case. 5 is a major bump-up in terms of version name and yet it is impossible to tell what exactly has improved over the previous version by simply using the product.

On top of that we're at what, year four of the AI "revolution"? And yet ChatGPT is still the only AI tool that is somewhat recognizable and used by the general public. Other AI-based tools either serve a niche (Cursor, Windsurf), serve as toys (DALL-E, Veo), or are so small and insignificant that barely anyone is using them. If I go to any job board and browse the offers, no company seems to name any specific AI-powered tools that they want people to be proficient in. I don't think I've ever seen any company - big or small - either bragging that they've used generative AI as a significant driver in their project or claiming that thanks to implementing AI they've managed to cut costs by x% or drive up their revenue by y%. Open source doesn't seem to have much going on either; I don't think there are examples of any projects that got a huge boost from generative AI.

Considering how much money was pumped into these solutions and how much buzz this topic has generated all over the internet in the past 4 years, it seems at least bizarre that the actual adoption is so insignificant. In many other areas of tech 4 years would be considered almost an eternity, and yet this technology somehow gets a pass. This topic has puzzled me for a while now, but only this year have I noticed other people pointing out these issues as well.

raphaeltm 12 hours ago

Certainly feels to me like it already has. I like the back and forth on the Big Technology Podcast about whether the model layer or application layer will "win". I think with the release of GPT-5 I'm more and more convinced that the application layer is what will matter more than the actual models. We'll find ways to get around the limitations by building systems that adapt to those limitations by wiring together different models for different use cases. Throwing more data and compute at training feels like it's over IMO.

bobosha 14 hours ago

I largely share Yann LeCun’s perspective that scaling LLM-based approaches will eventually hit a plateau, and that a paradigm shift will be necessary. While there is ongoing debate about what that next paradigm should be, I outline my own views on the subject in this paper [1].

[1] https://www.researchgate.net/publication/381009719_Hydra_Enh...

speedylight 11 hours ago

I think there will come a point where people realize that we will need several groundbreaking research papers (not unlike "Attention Is All You Need") in order to build a truly conscious intelligence.

Language is only one aspect of a conscious mind; there are others that handle executive function, spatial and logical thinking, reasoning, emotional regulation, and many more. LLMs only deal with the language part, and that's not nearly enough to build a true AGI: a conscious mind that lives inside a computer and that we can control.

Intelligence is an emergent property that comes as a result of all distinct functions of the brain (whether biological or artificial) being deeply intertwined.

  • 10111two 10 hours ago

    You raise several good points, and I completely agree. Intelligence is an emergent property, but there are some prerequisites to intelligence: memory formation, original thought generation (imagination and creativity), anticipation, and multifunctional triggering (think of how one of our senses triggers responses from the other senses). These emergent properties result in intelligence. We are exploring this principles-first approach at JN Research. If you are interested, I would recommend this blog post: https://jn-research.com/blog/f/beyond-ai-%E2%80%93-introduci...

    • speedylight 10 hours ago

      Hey, thank you for linking the blog post, it was super interesting and aligns very well with my own theories/framework regarding how we might be able to build a truly intelligent mind. I honestly wish I could be involved in this kind of research because I think I could have a positive impact on the field; I just lack the technical skills (for now) to conduct the research.

      Anywho, I think JN Research is onto something with the Adaptrons/Primite and hope you guys are able to take it as far as it’ll go.

  • 615341652341 11 hours ago

    I have this pet theory that LLMs are closer to how our brains generate dreams than to how consciousness emerges naturally from the brain using all the other aspects you mentioned.

    • tim333 8 hours ago

      To me they seem like the rapid language generation bit of the brain where you say stuff without thinking about it. They are good at that but bad at thinking, remembering, visualizing and the like.

      The DeepDream stuff seemed quite dreamlike (https://en.wikipedia.org/wiki/DeepDream)

matt3D 18 hours ago

Watching my children learn how to talk, I have come to the conclusion that the current LLM concept is one part of a two part problem.

Kids learn to speak before they learn to think about what they're saying. A 2- or 3-year-old can start regurgitating sentences and forming new ones that sound an awful lot like real speech, but it often seems to be just the child trying to fit in; they don't really understand what they're saying.

I used to joke that my kids' talking was sometimes just like typing a word on my phone and then repeatedly hitting the next predicted word that shows up. Since then it's evolved in a way that seems similar to LLMs.

The actual process of thought seems slightly divorced from the ability to pattern-match words, but the pattern matching serves as a way to communicate it. I think we need a thinking machine to spit out vectors that the LLM can convert into language. So I don't think they are a dead end; I think they are just missing the other half of the puzzle.

  • kuekacang 17 hours ago

    Another part is malleable memory. I imagine we as humans accumulate context daily and do reinforcement training while we sleep.

efortis 21 hours ago

Given that post, this is what ChatGPT-5 said: "… But achieving AGI purely by scaling current architectures might not happen. The field may need conceptual shifts—new structures or paradigms—rather than just bigger models."

I don’t know AI, but I’m one of the few who’s grateful for what it is at the moment. I'm coding with the free mini model and it has saved me a ton of time, and I’ve learned a lot.

ericyer 17 hours ago

It’s possible, but not certain. Current AI (like large language models) has shown incredible progress, yet it still relies heavily on scale—more data, more compute. That approach may eventually plateau if models stop gaining meaningful capabilities from just being bigger. Breakthroughs in reasoning, efficiency, and real-world understanding might require new architectures or hybrid methods that combine symbolic reasoning, memory, or other innovations. So while today’s approach can go further, it likely isn’t the final destination.

AngryData 18 hours ago

I think it has its uses, but 90% of what people think it will be used for or will replace won't happen. I don't believe LLMs are a path to general AI at all. I'm also unsure whether they will actually get that much better as time goes on, and I expect continuously diminishing returns as junk data from other AI instances, web bots, and people trying to manipulate AI responses creeps in.

But I could be totally wrong, because I'm certainly not an expert in these fields.

  • zarzavat 17 hours ago

    Do we not have general AI already? What would models need to do for it to be general AI?

    • netdevphoenix 16 hours ago

      General AI to me includes the ability to learn any skill on your own with minimal examples and no support. Intelligence to me means the ability to learn, and "general" implies any skill. The nature of the hallucinations does suggest that whatever these systems are doing is not learning. And due to the innate limitations they have as neural nets and transformers, their ability to learn is greatly limited by their own architecture.

      • zarzavat 8 hours ago

        I guess my point is, if 20 years ago you'd said to me "In the future there'll be a CLI program where you type into a box and it writes your code for you, and it's as good as a well-studied but inexperienced junior," I'd have said "that's AGI!" with no hesitation. Shortly followed by "I should go to med school instead".

        The definition of AGI is very subjective. Clearly, current models are not as good as the top humans, but the progress in the last 20 years has been immense, and the ways in which these models fall short are becoming subtle and hard to articulate.

        Yes, transformers do not learn in the same way humans do, but that's partly an intentional design decision, because humans have a big flaw: our memories can't be separated from our computation; they can't be uploaded, downloaded, modified, or backed up. Do we really want AGI to have a bus factor? With transformers you can take the context that is fed to one model and feed it to another model; try doing that with a human!

        In a sense, the transformer model with its context is an improvement on the architecture of the human brain. But we do need more tricks to make context as flexible as human long term memory.

aristofun 14 hours ago

You don’t have to be super smart or waste billions to understand a simple logical fact - you cannot achieve something that isn’t even defined well yet.

What exactly is intelligence? Nobody really knows, or yet understands where the “natural” kind comes from.

Hence everything we do so far is nothing but sophisticated cargo culting.

Case closed.

giardini a day ago

Yes. I gazed into my 8-ball at https://magic-8ball.com/, asked it "Will AI with LLMs fail?" and shook it. It responded "Most likely".

In my future I also saw lots and lots of cheap GPU chips and hardware, much gaming but fewer "developers" and a mostly flat-lined software economy for at least 8 years. Taiwan was still independent but was suffering from an economic recession.

solatrader 17 hours ago

DeepConf, photonic chips... new things and improvements are still coming. And most AI products are not that well engineered yet. Given the speed of progress made this year, it's too early to say it's a dead end. There might be some stepping stones missing for AGI, but that doesn't mean what has been built so far is wrong.

k310 21 hours ago

It has run out of data and is feeding increasingly on its own output (with help from bots that tarnish and bias it).

The first big settlement for using stolen data has come (Anthropic). How you extricate the books archive and claimants' works is unknown.

I believe that LLMs in verticals are being fed expert/cleaned data, but wasn't that always the case, i.e. knowledge bases? Much less data and power needed (less than ∞). Oh, and much less investment, IMO.

JohnFen 11 hours ago

Well, it all depends on what is meant by "AGI". There are too many, very different, definitions of what that means for the term to be very useful in precise discussion.

neuralkoi 20 hours ago

It takes time for new technologies to mature. I think people are only looking out for AGI but are not paying attention to the small changes in productivity and exploration these tools are enabling at smaller scales, including in the more unglamorous machinery that makes everything chug along.

Bombthecat 17 hours ago

Nah, not at all; there is so much going on behind the curtain. Like OpenAI finding the reason for hallucinations (a.k.a. forcing replies even when the model doesn't know the answer).

Philex 15 hours ago

Scaling law will eventually come to an end. Perhaps new technologies will emerge in the future.

karmakaze 11 hours ago

Meta already defined 'dead end' with Behemoth being underwhelming.

I'm bullish on AI's being generally capable by 2030. That date seems to be a 50/50 line for many in the field.

  • bloak 11 hours ago

    But what does "generally capable" mean?

    • karmakaze 4 hours ago

      When lay people stop differentiating what it can do vs what people do. This will happen long before those in the know say so.

jaredcwhite 9 hours ago

That already happened. The debate now isn't whether scaling efforts run up against the realities of physics, math, and finance, but what the economic and societal fallout will be.

Personally, I fear the worst.

The bubble is bursting. Hope y'all are alright in the midst of it.

atleastoptimal 20 hours ago

It won't, but it seems 95% of people on HN think (hope) it will, because they hate AI and much of big tech.

  • Terr_ 16 hours ago

    The amount of cynicism motivated by amateur hatred is way way less than the amount of optimism generated by professional greed and profit motive.

    Imagine someone in 2006: "All these cynics and doomsayers about the US housing market are just envious and bitter that they didn't get in on the ground floor with their own investments."

    Perhaps some were, but if we're going to start discounting opinions based on ulterior motives, there's a much bigger elephant in the room.

  • guappa 17 hours ago

    That is only your feeling. I can only guess that you have a somewhat irrational love for AI and refuse to see the rationality of other people, brushing them off as "irrational h8rs".

    But all of your comments seem to be dismissive of other people's opinions.

  • Bombthecat 17 hours ago

    Not hate, they want to keep their jobs lol

breckenedge a day ago

What’s the “final destination”?

  • giardini a day ago

    It does sound sinister, now that you've pointed it out.

    Nice try, Claude!

thiago_fm 12 hours ago

If you are asking about LLMs, yes, they will. They have hit it already.

All those LLM benchmarks are terrible. LLMs get better at them, but users don't perceive it. LLMs haven't improved that much in the last year.

For AI in general, the future is bright. Now that we have a lot of brainpower and hardware available, more new research will pop up.

A long time ago, Claude (your favorite LLM?) Shannon showed that entropy is a fundamental limit. There may be limits to 'intelligence' we aren't aware of.

Despite what experts say, Superintelligence or AGI might not even exist.

Is AGI knowing all the possible patterns in the universe? Nobody can even properly define it. And that definition would be wrong anyway, as not everything intelligent is a pattern.

But are cars going to drive themselves using inputs similar to a human's? Yes, probably soon.

There will also be many improvements to machinery, factories, and productivity. They will reshape the economy. No superintelligence or AGI needed, just 'human'-level pattern recognition.

netdevphoenix 16 hours ago

The problem is that most technologies don't hit a visible "dead end". Look at NLP before transformers, cars, planes, steel, woodworking, and even books. What you get is a steady slowdown in the number of revolutionary discoveries and just a long list of marketing-hyped small improvements.

There are fundamental limitations with transformers that will not go away for as long as AI equates to transformers.

The first is the lack of understanding and control by humans. Orgs want guarantees that these systems won't behave unexpectedly while also wanting innovative and useful behaviour from them. Ultimately, most if not all neural nets are black boxes, so understanding the reasoning behind a specific action at a given time, let alone their behaviour in general, is just not feasible due to their sheer complexity. We don't understand why they behave the way they do, in a scientific sense, any more than we understand why a specific rabbit did a specific thing at a particular moment in a way that lets us accurately predict when it will do it again. Because of that lack of understanding, we cannot control these things accurately. We either block some of their useful abilities to reduce the chances of undesired behaviour, or we stay exposed to it. This trade-off is a fundamental consequence of the fact that the transformers in use today are neural nets and as such have all the limitations neural nets have.

The second is our inability to completely stop the hallucinations. From my understanding, this is inherently tied to the very nature of how transformer-based LLMs produce output. There is no understanding of the notion of truth or of the real world. The model is just emulating patterns seen in its training data; it just so happens that some of those don't correlate with real-world facts (truth) even if they correlate with human grammar. Insofar as there is no understanding of truth as separate from patterns in data, however implicit, hallucinations will continue. And there is no reason to believe we will come up with a revolutionary way to train these systems so that they understand truth and not just grammar.
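
A minimal sketch of that last point, with invented numbers: the decoding step is just a weighted draw from a next-token distribution, and "statistically plausible" and "factually true" are collapsed into the same score.

```python
import random

# Hypothetical next-token distribution after the prompt
# "The capital of Australia is" -- the probabilities are invented for illustration.
next_token_probs = {
    " Sydney": 0.46,     # common in text, but wrong
    " Canberra": 0.41,   # correct
    " Melbourne": 0.13,
}

def sample_next(probs, temperature=1.0):
    # Temperature reweighting, then a weighted draw -- no truth check anywhere.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs), weights=weights, k=1)[0]

print(sample_next(next_token_probs))  # fluent either way; sometimes right, sometimes not
```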

The third is learning. Models can't learn or remember as such; in-context learning is a trick to emulate learning, but it's extremely inefficient, it doesn't scale, and models don't really have the ability to manipulate it the way humans or other animals can. This is probably the most damning of them all, as you cannot possibly have a human-level general artificial intelligence that is unable to learn new skills on its own.

I would bet money on there not being significant progress before 2030. By significant progress I mean the ability to do something that these models could not do before at all, regardless of the amount of training thrown at them, given the same computing resources we have now.

jiggawatts 17 hours ago

Up until now, LLM training has used mostly pre-existing data, hoarding what has already been produced and feeding that in as-is. Think textbooks, Wikipedia, GitHub, etc...

That's starting to run dry, hence the predictions of progress stalling, but it doesn't factor in the option of using vast volumes of synthetic data. Some of the "thinking" models are already using generated problem sets, but this is just the tip of the iceberg.

There are so, so many ways in which synthetic data could be generated! Some random examples are:

- Introduce a typo into a working program or configuration file. Train the AI to recognise the typo based on the code and the error message. (A sketch of this one follows at the end of this comment.)

- Bulk-generate homework or exam problems with the names, phrasing, quantities, etc... randomised.

- Train the AI not just on GitHub repos, but the full Git history of every branch, including errors generated by the compiler for commits that fail to build.

- Compile C/C++ code to various machine-code formats, assembly, LLVM IR, etc... and train the AI to reverse-engineer from the binary output to the source.

... and on and on.
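
Here's the first bullet as a rough sketch (a toy generator over one invented snippet; a real pipeline would do the same across whole repositories and config formats):

```python
import json, random, string, subprocess, sys, tempfile

# Known-good program to corrupt (an invented example; a real pipeline would pull
# these from actual repositories).
GOOD = "def area(r):\n    return 3.14159 * r ** 2\n\nprint(area(2))\n"

def corrupt(src):
    """Flip one character so the program (probably) breaks."""
    i = random.randrange(len(src))
    return src[:i] + random.choice(string.ascii_letters + "():=") + src[i + 1:]

def run(src):
    """Execute the snippet and capture its exit code and stderr."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(src)
    proc = subprocess.run([sys.executable, f.name], capture_output=True, text=True, timeout=5)
    return proc.returncode, proc.stderr

examples = []
while len(examples) < 3:
    broken = corrupt(GOOD)
    code, err = run(broken)
    if code != 0:                                    # keep only mutations that actually fail
        examples.append({"broken": broken,
                         "error": err.strip().splitlines()[-1],
                         "fixed": GOOD})

print(json.dumps(examples, indent=2))                # (broken, error, fix) training triples
```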