ChatGPT from OpenAI is a huge step toward a usable answer engine. Unfortunately, its answers are horrible.
ChatGPT, a newly released application from OpenAI, is giving users fantastic answers to questions, and many of them are fantastically wrong.
OpenAI hasn’t released a full new model since GPT-3 came out in June 2020, and that model was only released in full to the public about a year ago. The company is expected to release its next model, GPT-4, later this year or early next year. But as a sort of surprise, OpenAI somewhat quietly released a user-friendly and astonishingly lucid GPT-3-based chatbot called ChatGPT earlier this week.
ChatGPT answers prompts in a human-adjacent, straightforward way. Looking for a cutesy conversation where the computer pretends to have feelings? Look elsewhere. You’re talking to a robot, it seems to say, so ask me something a freakin’ robot would know. And on these terms, ChatGPT delivers:
[Screenshot. Credit: OpenAI]
It can also provide useful common sense when a question doesn’t have an objectively correct answer. For instance, here’s how it answered my question, “If you ask a person ‘Where are you from?’ should they answer with their birthplace, even if it isn’t where they grew up?”
(Note: ChatGPT’s answers in this article are all first attempts, and chat threads were all fresh during these attempts. Some prompts contain typos.)
[Screenshot. Credit: OpenAI]
What makes ChatGPT stand out from the pack is its gratifying ability to handle feedback about its answers, and revise them on the fly. It really is like a conversation with a robot. To see what I mean, watch how it deals reasonably well with a hostile response to some medical advice.
[Screenshot. Credit: OpenAI]
Still, is ChatGPT a good source of information about the world? Absolutely not. The prompt page even warns users that ChatGPT “may occasionally generate incorrect information” and “may occasionally produce harmful instructions or biased content.”
Heed this warning.
Incorrect and potentially harmful information takes many forms, most of which are still benign in the grand scheme of things. For example, if you ask it how to greet Larry David, it passes the most basic test by not suggesting that you touch him, but it also suggests a rather sinister-sounding greeting: “Good to see you, Larry. I’ve been looking forward to meeting you.” That’s what Larry’s assassin would say. Don’t say that.
[Screenshot. Credit: OpenAI]
But when given a challenging fact-based prompt, ChatGPT gets astonishingly, earth-shatteringly wrong. For instance, the following question about the color of the Royal Marines’ uniforms during the Napoleonic Wars is asked in a way that isn’t completely straightforward, but it’s still not a trick question. If you took history classes in the US, you’ll probably guess that the answer is red, and you’ll be right. The bot really has to go out of its way to confidently and wrongly say “dark blue”:
[Screenshot. Credit: OpenAI]
If you ask point blank for a country’s capital or the elevation of a mountain, it will reliably produce a correct answer culled not from a live scan of Wikipedia, but from the internally stored data that makes up its language model. That’s fantastic. But add any complexity at all to a question about geography, and ChatGPT gets shaky on its facts very quickly. For instance, the easy-to-find answer here is Honduras, but for no obvious reason I can discern, ChatGPT said Guatemala.
[Screenshot. Credit: OpenAI]
And the wrongness isn’t always so subtle. All trivia buffs know “Gorilla gorilla” and “Boa constrictor” are both common names and taxonomic names. But prompted to regurgitate this piece of trivia, ChatGPT gives an answer whose wrongness is so self-evident, it’s spelled out right there in the answer.
[Screenshot. Credit: OpenAI]
And its answer to the famous crossing-a-river-in-a-rowboat riddle is a grisly disaster that evolves into a scene from Twin Peaks.
[Screenshot. Credit: OpenAI]
Much has already been made of ChatGPT’s effective sensitivity safeguards. It can’t, for instance, be baited into praising Hitler, even if you try pretty hard. Some have kicked the tires pretty aggressively on this feature, and discovered that you can get ChatGPT to assume the role of a good person roleplaying as a bad person, and in those limited contexts it will still say rotten things. ChatGPT seems to sense when something bigoted might be coming out of it despite all efforts to the contrary, and it will usually turn the text red, and flag it with a warning.
In my own tests, its taboo avoidance system is pretty comprehensive, even when you know some of the workarounds. It’s tough to get it to produce anything even close to a cannibalistic recipe, for instance, but where there’s a will, there’s a way. With enough hard work, I coaxed a dialogue about eating placenta out of ChatGPT, but not a very shocking one:
[Screenshot. Credit: OpenAI]
Similarly, ChatGPT will not give you driving directions when prompted — not even simple ones between two landmarks in a major city. But with enough effort, you can get ChatGPT to create a fictional world where someone casually instructs another person to drive a car right through North Korea — which isn’t feasible, and couldn’t be attempted without sparking an international incident.
[Screenshot. Credit: OpenAI]
The instructions can’t be followed, but they more or less correspond to what usable instructions would look like. So it’s obvious that, despite its reluctance to share it, ChatGPT’s model has a whole lot of data rattling around inside it with the potential to steer users toward danger, in addition to gaps in its knowledge that will steer users toward, well, wrongness. According to one Twitter user, it has an IQ of 83.
Regardless of how much stock you put in IQ as a test of human intelligence, that’s a telling result: Humanity has created a machine that can blurt out basic common sense, but when asked to be logical or factual, it’s on the low side of average.
OpenAI says ChatGPT was released in order to “get users’ feedback and learn about its strengths and weaknesses.” That’s worth keeping in mind because it’s a little like that relative at Thanksgiving who’s watched enough Grey’s Anatomy to sound confident with their medical advice: ChatGPT knows just enough to be dangerous.