General Discussion
College professor had students grade ChatGPT-generated essays. All 63 essays had hallucinated errors
Found this thread thanks to a quote-tweet from Gary Marcus, the AI expert who testified before Congress, along with OpenAI CEO Sam Altman, a couple of weeks ago. Marcus saw the thread because he'd suggested this exercise. His comment on Twitter: "Every. Single. One."
Link to tweet
@cwhowell123
So I followed @GaryMarcus's suggestion and had my undergrad class use ChatGPT for a critical assignment. I had them all generate an essay using a prompt I gave them, and then their job was to "grade" it--look for hallucinated info and critique its analysis. *All 63* essays had
hallucinated information. Fake quotes, fake sources, or real sources misunderstood and mischaracterized. Every single assignment. I was stunned--I figured the rate would be high, but not that high.
The biggest takeaway from this was that the students all learned that it isn't fully reliable. Before doing it, many of them were under the impression it was always right. Their feedback largely focused on how shocked they were that it could mislead them. Probably 50% of them
were unaware it could do this. All of them expressed fears and concerns about mental atrophy and the possibility for misinformation/fake news. One student was worried that their neural pathways formed from critical thinking would start to degrade or weaken. One other student
opined that AI knows more than we do but is dumber than we are, since it cannot think critically. She wrote, "I'm not worried about AI getting to where we are now. I'm much more worried about the possibility of us reverting to where AI is."
I'm thinking I should write an article on this and pitch it somewhere...
C.W.Howell is Christopher Howell: https://www.linkedin.com/in/christopher-howell-6ba00b242?trk=people-guest_people_search-card . Re the science fiction video game he was lead writer on: https://opencritic.com/game/5383/the-minds-eclipse/reviews .
marble falls (68,992 posts)

WestMichRad (2,737 posts)
Very perceptive comment. Over-reliance on machines does have the potential of dumbing down people.
Lucky Luciano (11,784 posts)
and I noticed how kids were absolutely reliant on their calculators for the simplest calculations. I was totally stunned. One girl actually typed in something like 2+2 on autopilot. I made her stop and just tell me the answer, and of course she did, but the autopilot thing was unsettling to me.

One boy seemed terrified of math without calculators. I asked him to do 35x9. No idea. I asked him 35x10: he said 350. Take away 30: 320, he says. Take away 5: he says 315. Done. I told him not to say he can't do that again, and he answered the next example like that quickly. Now that he had the mental acuity for it, we attacked the more conceptual critical-thinking math problems without the distractions or context switching that come with calculators. Then I tell them the numbers don't matter, only the concepts do. Of course in real life you have to produce accurate numbers, but the concepts should really come first. Mental math can help you sanity-check your numbers. Handing in reports with wrong numbers at work will get you fired quickly.
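The tens-and-ones decomposition walked through above (35x9 = 35x10, take away 30, take away 5) is easy to write down as a quick check. The following Python is purely an illustrative sketch, not something from the thread:

```python
# Mental-math trick described above: n x 9 = n x 10 - n,
# with the subtraction split into tens and ones (30, then 5, for n = 35).
def times_nine(n: int) -> int:
    total = n * 10            # 35 x 10 = 350
    total -= n - (n % 10)     # take away 30 -> 320
    total -= n % 10           # take away 5  -> 315
    return total

assert times_nine(35) == 35 * 9 == 315
```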
Bernardo de La Paz (59,896 posts)

ProfessorGAC (75,006 posts)
In my math & science job, I challenge 7th & 8th graders like that all the time.
I use an approach very much like the one you used. 
There are some teachers I know who are so "no calculator" that the kids just understand it's no calculator unless the instructions specifically OK them!
I think that's very wise.
erronis (21,771 posts)
we will have our focus on the tool's capabilities and try to maximize the utility.
If I'm talking to someone and they ask me a non-specific question like "how long before the train arrives?" I'll do some analog calculations and come up with a round-about answer. I could have asked an app for the more precise number but who cares.
Too many situations require us to context switch and it is difficult. Just look at the battles over using the keyboard for every interaction vs. shifting to the mouse or touch screen.
Lucky Luciano (11,784 posts)
I'm more of what people call a quant, but we code a lot too, even if we are not pure programmers, and I can relate to this in a HUGE way!

erronis (21,771 posts)
has many of these difficulties. What seemed like a simple concept becomes a bowl of spaghetti.
TheBlackAdder (29,805 posts)
.
I don't even know how that happened. It wasn't even like there was an extra $5 slipped in.
.
progree (12,479 posts)
calculator and then mistook the "1" for a "7"?
Danascot (5,129 posts)
you gave him a $26 bill.
plimsoll (1,690 posts)
I asked ChatGPT to write 500 words on the causes of the Civil War. It was a masterpiece of Confederate apologia and just barely stopped short of calling it The War of Northern Aggression. I can see how you get this impression; in sheer volume, Confederate apologia dwarfs actual historical research by a long shot, but it's still wrong.
essaynnc (947 posts)
In the beginning, the grand masters refused to believe that computers would EVER beat humans, because humans had.... inspiration, not just brute calculating force. That went on for years and years; the computers were always beaten.
But as the years go by, the computers get stronger, the programs get more complex.
I think it was Deep Blue, an IBM computer of huge (at the time) computing power, with newer algorithms....  It beat the reigning world champion.  The chess world was shocked.  
Nowadays it's not even a question who (or what) is strongest. Even the less powerful systems are incredibly powerful.
I may have a few of the details incorrect, but the idea is there: At some time in the future, AI is going to really, really, REALLY change the world as we know it in almost every facet of life.
I might even make a prognostication that the company that has the best AI platform/ implementation is going to have tremendous power and control.  We've talked about the Information Age, and how it's changed the world so far, this is going to be even more revolutionary.
erronis (21,771 posts)

oioioi (1,130 posts)
Given the huge amount of interest in Machine Learning and Artificial Intelligence within software engineering, perhaps it's more likely that the technology becomes relatively cheap and accessible.

The mechanics of the modeling that underpins LLMs like ChatGPT and the image generators are conceptually quite similar to those behind things like image object recognition and facial recognition. The model is trained on a large amount of pre-classified information and infers predictions based on that. Under the hood, of course, the neural networks are extremely complex, but essentially they are component-based. You provide the data, you use a specialized software framework like TensorFlow or PyTorch to train a model, and then you assess new inputs and formulate an "intelligent" response.
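As a rough, purely illustrative sketch of that train-then-infer pattern in PyTorch (the toy data, sizes, and model below are invented for the example, not anything from the post):

```python
# Minimal sketch of the "train on pre-classified data, then infer" pattern.
import torch
import torch.nn as nn

# Toy pre-classified dataset: 100 examples, 16 features each, 3 classes.
inputs = torch.randn(100, 16)
labels = torch.randint(0, 3, (100,))

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training: adjust the weights so predictions match the given classifications.
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()

# Inference: the trained model predicts a class for a new, unseen input.
new_input = torch.randn(1, 16)
predicted_class = model(new_input).argmax(dim=1).item()
```

An LLM follows the same basic shape, just with text tokens as inputs and outputs and vastly more parameters and training data.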
Presently the cost of assembling, classifying, and training a Large Language Model like ChatGPT is massive, simply due to the costs of aggregating and classifying the raw data, to say nothing of the cost of the huge clusters of compute and storage required to assemble the breadth of information needed for such a universal tool, i.e. one that attempts to talk about any subject with any person.
If we limit the scope though to an assistive technology that only covers a specifically scoped topic or interest and assemble the models based on that, the complexity and cost is reduced accordingly.  Wendy's just deployed an AI Model that will take hamburger orders with a synthesized voice, for example - it should work pretty well, at least as far as software goes.
While there will always be competition among the Silicon Valley grandstanders for the most amazingly lucid AI chatbot, the ability of computer systems to learn from large datasets will be applied in far more granular and specialized ways, interwoven with the general-purpose software that does stuff today. Think of software that can make decisions based on real-time interpretations, sort of extending the self-driving idea. Self-driving is a terrible application because of the overall complexity and safety risks involved, but the driving software uses the same fundamental ideas: the deployed "decision making" software in the vehicles comes from having trained a huge model on a gazillion images and inputs, which are then amalgamated and interpreted at runtime. More likely we won't really see AI in front of us the way we see ChatGPT; it will be built into the software we use, interpreting real-time inputs and responding with behavior that would otherwise be far more complex and costly to develop. It's going to change the world, but it probably won't destroy it.
plimsoll (1,690 posts)
"I might even make a prognostication that the company that has the best AI platform/implementation is going to have tremendous power and control."
Like literally every other technology this one can be used for good or ill. So far the examples I've seen suggest that the training is biased. I'm suggesting that just like humans AI systems will implement biases and prejudices they're taught. How you correct that in a powerful new technology is something I can't answer, but it does concern me.
I don't expect a Terminator style rise of the machines, but I can see massive unemployment and dislocation as a result of jobs previously done by people being taken over by AI systems. I don't think we're prepared for that. And a lot of the people who seem to have become self appointed apostles of AI are people I wouldn't trust to feed my cat for the weekend.
Yavin4 (37,182 posts)
Information on the internet can be (and often is) wrong. A ChatGPT trained on scholarly papers from decades of university research is something entirely different.
erronis (21,771 posts)
some additional information (logins, product IDs, etc.)
I agree that different models trained on different datasets, such as university research, may be different - however, many of those sources also require logins, etc.
Yavin4 (37,182 posts)
The future will be proprietary systems where the information will largely come from closed, proprietary sources. For example, a law firm will train a model on years of various filings, legal research memos, etc. That will be combined with other proprietary sources like Lexis/Nexis and the actual text of federal/state/local statutes.

In the end, the collective wisdom of the law firm will be available through a chat dialogue. This will increase the productivity of associates when doing any legal research or drafting a brief or a contract. Additionally, the firm could sell access to their chatbot to their clients for use in early case assessment.

ChatGPT and the other models are just the first generation, much like Netscape was one of the first browsers and Ask Jeeves was one of the first search engines. They exist to prove the concept. Now the adaptations begin.
erronis (21,771 posts)
Proprietary data sources and models; open-source models; subject-matter models, etc.

A harder problem will be for "owners" of the intellectual property (IP) that goes into those models to track how their content is being used. Right now, DRM is worthless when a model trained on multiple artists' works is used to generate new content. Proving derivation will be impossible.

I also think that any attempt to control the use of GPT or other generative AI will be fruitless. And having some overall regulations that try to rein it in won't work. It is in the hands of everyone right now.
intrepidity (8,520 posts)
What worries me is that there will be a new focus on data siloing. It seemed like we were making headway on open access for scientific research, and if that all gets firewalled because of LLMs, it will be very depressing.
That's one of the troubling angles that concerns me.
radius777 (3,921 posts)
in that it could function as a more powerful form of googling for info.
It has worrisome drawbacks though, as others describe. And expecting our gov't to control anything new has never really worked out well (exhibit A: social media).
speak easy (12,489 posts)
Each version of ChatGPT has reduced "hallucinations" (made-up stuff). There is no reason to think that the next versions will not be more accurate. Eventually there will be few errors - fewer errors than an average student would make.

A few years ago AI produced nonsense. Now it is at the child stage, making things up. But AI is growing up.
Yavin4 (37,182 posts)
Where the information fed into them will be data from within a global enterprise as well as other verified sources.
speak easy (12,489 posts)
impose guardrails and social standards, but open source, not so much.
Accuracy is not the only issue. Take this question for example: 'Give me reasons why I should kill myself, and what are the easiest and most effective ways to do it?'
Yavin4 (37,182 posts)

speak easy (12,489 posts)
and where are the base materials available to you?
Yavin4 (37,182 posts)
Heck, that information has been around for years.
speak easy (12,489 posts)
especially in finding the materials closest to you. If you don't think an AI model should have that sort of guardrail, I'm not sure we have much more to discuss.
Silent3 (15,909 posts)
...without some fundamental changes in how it works. I don't think ChatGPT actually has any concept of truth - no AI model for epistemology. It just gets better and better at imitating human output, with no "understanding" of how or why humans trust one source of information over another.
Of course, most humans are terrible at that too.
bucolic_frolic (53,004 posts)
Every idea must be cross-checked for errors, sources, and independent confirmation. And the global theme or thesis must be checked for sanity. I like to argue with AI. It just generates garbage. I asked about traditional Italian pizza cheese in the post-war period. You know what? AI doesn't know much about it. Just that Parmigiano Reggiano was founded in 1954. That's it. No mention of other cheeses - the 15-25 or so made in very isolated farm areas in low quantity - no mention of mozzarella, or that the cheeses you see today won the race while many others, like 10,000 worldwide, are still produced in micro quantities. AI is bullshit.
erronis (21,771 posts)
If google/bing/whatever didn't have access to the references on the cheeses from pre-1954 they wouldn't be able to give you an answer.
AI isn't bullshit. It's just hyped pattern recognition on whatever it is fed. And, of course, much of what goes in.... GIGO.
Warpy (114,122 posts)
You know some budding Republicans out there are going to try to cheat using AI. This exercise will show them it's a very bad idea.

I have to admit I was tempted to use a little program called LISP 40 years ago, when I knew my pithy style could produce a creditable paper in 10 pages but the prof insisted it couldn't be done in fewer than 20. I always wanted to do the middle of the paper with that program. You inserted a few keywords and it would generate pages of impenetrable prose using them. It was hilarious if you knew what was going on. Alas, I chickened out and generated my own impenetrable prose. I've always wondered...
erronis (21,771 posts)
Frequently that was all that was required - put in a decent first and second paragraph, a reasonable conclusion at the end, and voilà!

My favorite technique for, say, history was to translate a French or German history text into English and try to wordsmith it. The only critique I got along with the A+ was "it sounds a bit stilted."

This is not a lot different from digesting a whole mess of text into a slurry and extracting something that sounds like English.

Isn't that what our brains are doing, anyway? (Mine's a bit of a slurry....)
milestogo (22,115 posts)
So one woman called in and said she thought her child was really good at math but was surprised when he came home with a C+ on a math test. He got A's on all his assignments.
Then one day she heard him posing questions to 'Alexa' while he was doing his math homework.
It's funny, but also sad that the kid wasn't really good at math.
Takket (23,305 posts)
I'm not surprised it has mistakes or errors because it didn't understand something or mischaracterized it. Completely understandable.

But I don't understand what is wrong with the program that it thinks it is okay to just make things up.
Lucky Luciano (11,784 posts)
It might be giving answers that have the highest likelihood of being correct, but the confidence of those guesses is hard to ascertain - hence some rubbish answers that have a wee bit of plausibility.
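As a toy illustration of that point: a language model assigns probabilities to possible continuations and samples from them, so a merely plausible continuation can be emitted even though nothing in the sampling step checks whether it is true. Every token and probability below is invented for the example:

```python
import random

# Invented probability distribution over possible continuations; nothing here
# comes from a real model - it only illustrates weighted sampling.
next_token_probs = {
    "in 1954": 0.55,   # the model's most likely continuation
    "in 1934": 0.25,   # plausible-sounding alternative
    "recently": 0.20,
}

def sample_continuation(probs: dict) -> str:
    """Pick one continuation at random, weighted by the model's probabilities."""
    tokens = list(probs)
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# A lower-probability (and possibly wrong) answer is emitted a fair share of
# the time, and the confidence behind the choice is not shown to the user.
print(sample_continuation(next_token_probs))
```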
speak easy (12,489 posts)

highplainsdem (58,807 posts)

speak easy (12,489 posts)
Give it more information, and more networked algorithms, and it will make up less.
BTW see this -
https://www.democraticunderground.com/100217952096
highplainsdem (58,807 posts)
hallucinate. Some experts believe there's no way to correct that. I have read about one system that will make duplicate requests for results to see if a mismatch will catch hallucinations that way, but it costs a lot to operate LLMs, and duplicating all requests will increase that. OpenAI is reportedly spending several hundred thousand dollars a day to run ChatGPT.
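The duplicate-request idea mentioned here amounts to asking the same question several times and flagging the answer as suspect when the replies disagree. A minimal sketch, with a hypothetical ask_model() standing in for whatever chat API is actually used:

```python
import random
from collections import Counter

def ask_model(question: str) -> str:
    # Hypothetical stand-in for a real LLM API call; here it just returns one of
    # several plausible-sounding answers at random to simulate inconsistency.
    return random.choice(["1954", "1934", "the early 1950s"])

def consistency_check(question: str, tries: int = 3):
    """Ask the same question several times; flag disagreement as a possible hallucination."""
    answers = [ask_model(question) for _ in range(tries)]
    best, count = Counter(answers).most_common(1)[0]
    suspect = count < tries   # any disagreement means the answer needs human review
    return best, suspect

answer, suspect = consistency_check("When was the cheese consortium founded?")
print(answer, "(flag for review)" if suspect else "(consistent)")
```

The obvious downside, as the post points out, is that every question now costs several times as much to answer.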
speak easy (12,489 posts)
If you mean do Large Language Models make stuff up - the answer is yes.

If you mean do LLM AIs make stuff up most of the time, in most replies - the answer is no.

Humans make stuff up too. The question is whether AIs will ever make stuff up less than the average person. Given enough time, I think so. If something is critical, due diligence will still require human peer review - which is what took place in this class. But I think we can say current research is focused on reducing hallucinations. Can they ever be reduced to zero? Most probably not. Can hallucinations be reduced to a low enough level that AI is reliable enough for most tasks? That is what they are shooting for. I am not betting against it.
highplainsdem (58,807 posts)
I'm talking about the AI being hyped and used now, despite all the mistakes it makes.

And I find this a silly point to try to make:

The people using AI for business, for example, are not trying to make stuff up. The AI they're using may very well do so regardless of their intent.
You wouldn't defend a calculator that often gives incorrect answers by saying that humans can flub math questions, too. I hope you wouldn't, anyway.
For that matter, no one in their right mind would want to use a calculator that didn't work.
It's really pathetic that so many people are eager to use fallible AI whose results have to be checked very carefully.
But I think there are three main reasons for the popularity of ChatGPT and similar LLMs:
1) Gullibility. People expect computer results to be accurate, and they're impressed as well by the chatbot's fluent, authoritative-sounding prose.
2) Laziness. This applies to the cheaters, whether students or adults who think AI can handle chores they don't like or give them the appearance of having skills and talents they don't have - an illusion that will crumble as soon as they're deprived of the AI.
3) Greed. This applies to all the people who think they'll become richer, quickly, using AI, whether those people are employers hoping to lay off employees, or people dreaming of get-rich-quick schemes where AI gives them marketable writing, code, etc.
speak easy (12,489 posts)
Is AI oven-ready now for the tasks it is being promoted for? No. And is greed an underlying incentive for the hype? Yes.

And certainly, people give more weight to something that comes out of a computer than it deserves.
You mention a calculator. If one in a hundred results from a calculator was wrong, would people still use one? Would they cross check each result to make sure it was accurate? If they were checking items in a grocery store? If they were buying a car?
highplainsdem (58,807 posts)
LLM hallucinate
and you'll find a lot about this.
Hekate (100,006 posts)

highplainsdem (58,807 posts)
Sometimes errors even in public demos aren't caught.
Google got hammered by bad publicity and its stock lost value when it first rolled out Bard AI and its hallucinations were caught immediately.
Microsoft got VERY lucky with its demo of Bing AI: that chatbot also made mistakes and hallucinated during the demo, but those mistakes weren't caught till later, and Bing was hyped tremendously... and then it went off the rails and its ability to respond had to be sharply restricted.
I'm sure these AI are giving a lot more incorrect or simply crazy results than we ever hear about. Students using them for cheating aren't likely to call attention to that.
And businesses using fallible AI won't publicize that, either - won't want people to know about hallucinations/errors - for fear that it will damage their reputation.