@localhost

localhost@beehaw.org · 3 months ago

Oh damn, you’re right, my bad. I got a new notification but didn’t check the date of the comment. Sorry about that.

localhost@beehaw.org · 3 months ago

That’s a 1 month old thread my man :P

But it sounds interesting, I haven’t heard of Dysrationalia before. A quick search shows that it’s a term coined, and mostly used, by a single psychologist in his book. I’ve been able to find only one study that used the term and it found that “different aspects of rational thought (i.e. rational thinking abilities and cognitive styles) and self-control, but not intelligence, significantly predicted the endorsement of epistemically suspect beliefs.”

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6396694/

All in all, this seems to me more like a niche concept used by a handful of psychologists rather than something widely accepted in the field. Do you have anything that I could read to familiarize myself with this more? Preferably something evidence-based because we can ponder on non-verifiable explanations all day and not get anywhere.

localhost@beehaw.org · 4 months ago

The author’s suggesting that smart people are more likely to fall for cons that they try to dissect but can’t find the specific method being used, supposedly because they consider themselves to be infallible.

I disagree with this take. I don’t see how that thought process is exclusive to people who are or consider themselves to be smart. I think the author is tying himself into a knot to state that smart people are actually the dumb ones, likely in preparation to drop an opinion that most experts in the field will disagree with.

localhost@beehaw.org · 6 months ago

So you’re basically saying that, in your opinion, tensor operations are too simple of a building block for understanding to ever appear out of them as an emergent behavior? Do you feel that way about every mathematical and logical operation that a high school student can perform? That they can’t ever in whatever combination create a system complex enough for understanding to emerge?

localhost@beehaw.org · 6 months ago

I don’t think that anyone would argue that the general public can even solve a mathematical matrix, much less that they can only comprehend a stool based on going down a row in a matrix to get the mathematical similarity between a stool, a chair, a bench, a floor, and a cat.

LLMs rely on billions of precise calculations and yet they perform poorly when tasked with calculating numbers. Just because we don’t consciously calculate anything to get the meaning of a word doesn’t mean that no calculations are actually done as part of our thinking process.

What’s your definition of “the actual meaning of the concept represented by a word”? How would you differentiate a system that truly understands the meaning of a word vs a system that merely mimics this understanding?

localhost@beehaw.org · 6 months ago

“technology fundamentally operates by probabilisticly stringing together the next most likely word to appear in the sentence based on the frequency said words appeared in the training data”

What you’re describing is a Markov chain, not an LLM.
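For comparison, here is a minimal sketch of what a word-level Markov chain actually does: it picks the next word purely from how often it followed the current word in the training text. The function names and the tiny corpus are made up for illustration; this is a toy, not how any particular library implements it.

```python
import random
from collections import Counter, defaultdict


def build_bigram_table(text):
    """Count how often each word follows each other word."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def generate(table, start, length, seed=0):
    """Walk the chain, sampling each next word by its observed frequency."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:  # dead end: this word never had a successor
            break
        choices, weights = zip(*followers.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return " ".join(out)


# Hypothetical toy corpus, chosen only to keep the example small.
corpus = "the cat sat on the mat and the cat ate the fish"
table = build_bigram_table(corpus)
print(generate(table, "the", 6))
```

The only state here is the previous word and a frequency table. There is no learned representation of context or meaning, which is the key difference from an LLM.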

“So long as a model has no regard for the actual you know, meaning of the word”

It does, that’s like the entire point of word embeddings.
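To make that a bit more concrete, here is a toy sketch of how embeddings encode meaning as geometry: words used in similar ways end up with vectors pointing in similar directions. The 3-dimensional vectors below are hand-picked by me purely for illustration. Real models learn vectors with hundreds or thousands of dimensions from data.

```python
import math

# Hypothetical hand-picked embeddings (illustration only, not from a real model).
embeddings = {
    "stool": [0.9, 0.8, 0.1],
    "chair": [0.8, 0.9, 0.2],
    "bench": [0.7, 0.7, 0.3],
    "cat": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(word):
    """Return the other words, most similar to `word` first."""
    target = embeddings[word]
    others = [w for w in embeddings if w != word]
    return sorted(
        others,
        key=lambda w: cosine_similarity(target, embeddings[w]),
        reverse=True,
    )


print(rank_by_similarity("stool"))
```

With these made-up vectors, the furniture words rank closer to "stool" than "cat" does. In a trained model that kind of structure comes out of the training process instead of being written in by hand.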

localhost@beehaw.org · 7 months ago

Your opening sentence is demonstrably false. GPT-2 was a shitpost generator, while GPT-4 output is hard to distinguish from a genuine human’s. DALL-E 3 is better than its predecessors at pretty much everything. Yes, generative AI right now is getting better mostly by feeding it more training data and making it bigger. But it keeps getting better and there’s no cutoff in sight.

That you can straight-up comment “AI doesn’t get better” in a tech-literate sub and not be called out is honestly staggering.

localhost@beehaw.org · 7 months ago

I don’t think your assumption holds. Corporations are not, as a rule, incompetent - in fact, they tend to be really competent at squeezing profit out of anything. They are misaligned, which is much more dangerous.

I think the more likely scenario is also more grim:

AI actually does continue to advance and gets better and better, displacing more and more jobs. It doesn’t happen instantly, so barely anything gets done about it. Some half-assed regulations are attempted but predictably end up either not doing anything, postponing the inevitable by a small amount of time, or causing more damage than doing nothing would. Corporations grow in power, build their own autonomous armies, and exert pressure on governments to leave them unregulated. Eventually all resources are managed by and for a few rich assholes, while the rest of the world tries to survive without angering them.
If we’re unlucky, some of those corporations end up being managed by a maximizer AGI with no human supervision and then the Earth pretty much becomes an abstract game with a scoreboard, where money (or whatever is the equivalent) is the score.

Limitations of the human body act as an important balancing factor in keeping democracies from collapsing. No human can rule a nation alone - they need armies and workers. Intellectual work is especially important (unless you have some other source of income to outsource it), but it requires good living conditions to develop and sustain. Once intellectual work is automated, infrastructure like schools, roads, hospitals, and housing ceases to be important for the rulers - they can give those to the army as a reward and make the rest of the population do manual work. Then if manual work and policing through force become automated, there is no need even for those slivers of decency.
Once a single human can rule a nation, there are enough rich psychopaths for one of them to attempt it.

There are also other AI-related pitfalls that humanity may fall into in the meantime - automated terrorism (e.g. swarms of autonomous small drones with explosive charges using face recognition to target entire ideologies by tracking social media), misaligned AGI going rogue (e.g. the famous paperclip maximizer, although probably not exactly this scenario), collapse of the internet due to propaganda bots using next-gen generative AI… I’m sure there’s more.

localhost@beehaw.org · 1 year ago

GPT-3 is pretty bad at it compared to alternatives (although it’s hard to compete with calculators in that field), but if it were just repeating the training dataset it would be way worse. From the study I’ve linked in my other comment (https://arxiv.org/pdf/2005.14165.pdf):

On addition and subtraction, GPT-3 displays strong proficiency when the number of digits is small, achieving 100% accuracy on 2 digit addition, 98.9% at 2 digit subtraction, 80.2% at 3 digit addition, and 94.2% at 3-digit subtraction. Performance decreases as the number of digits increases, but GPT-3 still achieves 25-26% accuracy on four digit operations and 9-10% accuracy on five digit operations, suggesting at least some capacity to generalize to larger numbers of digits.

To spot-check whether the model is simply memorizing specific arithmetic problems, we took the 3-digit arithmetic problems in our test set and searched for them in our training data in both the forms "<NUM1> + <NUM2> =" and "<NUM1> plus <NUM2>". Out of 2,000 addition problems we found only 17 matches (0.8%) and out of 2,000 subtraction problems we found only 2 matches (0.1%), suggesting that only a trivial fraction of the correct answers could have been memorized. In addition, inspection of incorrect answers reveals that the model often makes mistakes such as not carrying a “1”, suggesting it is actually attempting to perform the relevant computation rather than memorizing a table.
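For anyone curious what that kind of spot check could look like, here is a rough sketch of the idea. It is not the authors' actual code. The function name, the toy training text and the test problems are all made up by me, and a real check would run over a huge corpus and be more careful about matching (plain substring search can, for example, match inside a longer number).

```python
def memorization_rate(problems, training_text, op_symbol, op_word):
    """Fraction of (a, b) problems whose prompt appears verbatim in training_text.

    Checks two phrasings per problem: a symbolic one such as "123 + 456 ="
    and a worded one such as "123 plus 456".
    """
    matches = 0
    for a, b in problems:
        symbolic = f"{a} {op_symbol} {b} ="
        worded = f"{a} {op_word} {b}"
        if symbolic in training_text or worded in training_text:
            matches += 1
    return matches / len(problems)


# Hypothetical stand-in for the training data.
training_text = (
    "Some web page says 123 + 456 = 579. "
    "Another one mentions that 200 plus 300 is 500."
)

# Hypothetical test problems: two appear in the text above, two do not.
addition_problems = [(123, 456), (200, 300), (111, 222), (314, 159)]

rate = memorization_rate(addition_problems, training_text, "+", "plus")
print(f"{rate:.1%} of test problems found in training text")
```

If the rate is tiny but accuracy is high, the correct answers can't mostly be coming from lookup, which is the argument the paper is making.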

localhost@beehaw.org · 1 year ago

In my comment I’ve been referencing https://arxiv.org/pdf/2005.14165.pdf, specifically section 3.9.1 where they summarize results of the arithmetic tasks.

localhost@beehaw.org · 1 year ago

That’s not entirely true.

LLMs are trained to predict the next word given the context, yes. But in order to do that, they develop an internal model that minimizes error across a wide range of contexts - and an emergent feature of this process is that the model DOES perform more than pure compression of the training data.

For example, GPT-3 is able to solve addition and subtraction problems that didn’t appear in the training dataset. This would suggest that the model learned how to perform addition and subtraction, likely because it was easier or more efficient than storing all of the examples from the training data separately.

This is a simple-to-measure example, but it’s enough to suggest that LLMs are able to extrapolate from the training data and do more than just stitch relevant parts of the dataset together.