The Rise of AI Chat Agents and the Discourse with Dilettantes

December 8, 2022 Chirag Shah

Chirag Shah, University of Washington

ChatGPT is a simple-to-use conversational agent developed by OpenAI. Of course, this is not the first time we have seen an AI agent that generates information. Google’s LaMDA and Meta’s Galactica are recent examples of text generation tools. DALL-E (also from OpenAI), Imagen (Google), and Stable Diffusion are examples of image generation tools. Meta demonstrated Make-A-Video, a tool that creates short videos from text prompts. Week after week, more such tools come out, and each of them generates a good bit of buzz in the media. I personally feel like I’m living through a familiar déjà vu, like the movie Groundhog Day. A tool like this comes out, someone starts tweeting crazy or impressive examples from it, media outlets start covering it, some of them interview researchers (including myself) and ask the same questions, and before the dust settles on that tool and its stories, a new one pops up. And the saga repeats.

So is ChatGPT somehow different? It is certainly very effective (based on my own experience with it and that of many others). It provides remarkably coherent, meaningful, and sometimes even quite insightful responses to complex questions. Its capability for generating natural language responses is impressive, and it is able to maintain context (at least at a shallow level) across successive requests. What does this mean, then, for research and development on AI agents? Are we done? Should we all just keep iterating on ChatGPT? Are there any issues? Why not ask ChatGPT itself?! Here is what it has to say about this:

There is a continued need for research on conversational agents, as the technology continues to evolve and advance. Research helps to improve the capabilities and performance of conversational agents, and can provide insights into their potential applications and limitations. Additionally, research can help to address challenges and ethical concerns related to the use of conversational agents, such as privacy and bias. Therefore, ongoing research in this area is important for the continued development and deployment of these technologies.

That’s very humble of it! But let me put my skeptic’s hat on and list three things that we are missing as we get giddy with excitement about current and future possibilities.

First, despite its great ability to generate relevant sentences and paragraphs, ChatGPT (or for that matter, any such AI agent) doesn’t really “understand” this conversation or the language behind it. I put “understand” in quotes because we could debate whether an AI agent could even have the ability to understand anything. But that’s for a different time. As my colleague Emily M. Bender and I have argued in our paper Situating Search, a language model (LM) based information extraction or generation system does not really have cognition. In fact, Bender’s previous work on Stochastic Parrots laid out this claim more clearly: these LM-based agents are really good at parroting (in a probabilistic manner) things that they have picked up through analyzing a set of large corpora. In doing so, they are acting like dilettantes, dabbling in things without really understanding them. They are great at pretending to know stuff and giving the illusion of being knowledgeable.

This brings me to my second point: the illusion of knowledge. I have written about it before (Pulling the curtain from AI illusions), so I won’t go too much into it here. But it’s important to point out that these systems have gotten so good at pretending to know stuff that they have been fooling even some of the most seasoned experts. A regular user often doesn’t stand a chance. And because of the seamless generation of natural language plus this ability to pretend to know things, people can develop a kind of trust in these systems that they shouldn’t. Why is now different from all the years before, when we used information retrieval systems like Google? In the case of keyword-based search-retrieval systems, when one gets a set of results, there is an implicit understanding that these are not the final answers. The answers still need to be investigated and dug out of that list of relevant documents. Two fundamental things change with systems like ChatGPT. First, these systems generate answers directly, which skips the step of showing users the sources where one could look for answers. Second, these systems respond to natural language, in natural language. The only ones with whom we have had such an experience in all of our history so far are other humans. Over thousands of years of language development and use, it has become ingrained in us that a natural language to-and-fro is with a fellow human being. There is an implicit and very powerful trust there. Now, as these systems develop the same abilities, they benefit from the implicit trust we associate with the use of natural language, and I would argue that that’s misappropriation. I’m not saying that we don’t want to go down this path. Arguably, it was inevitable. For decades, we have envisioned such systems and strived to build them. But we didn’t think enough about our own readiness and responsibility.

Finally, and connected to the idea of illusion and responsibility from the previous point, we have seen how easily people get hyped up about these tools and start using them in ways they were never intended to be used. We know that even with regular search-retrieval systems like Google and Amazon, people have strong trust: if the system presents something at the top of the list, it must be good or true, and if they don’t find something, it must not exist. Now, with the natural language conversation that one can have with ChatGPT and other systems, that trust in what the system can do can be amplified even more. As a result, an average user could rely on these tools for applications and contexts they absolutely shouldn’t. For example, Woebot is a chat-based app for providing mental-health-related help. It has been shown that this app provides responses to mental health issues that can be “at best nonsensical and at worst retraumatizing.” But it pretends to know stuff and provides very intelligent-sounding natural language responses. It’s a dilettante, but an average user would be hard pressed to find an issue. They really shouldn’t use an app that may cause them more harm than it helps address their existing problems.

To its credit, ChatGPT doesn’t answer questions related to certain issues like politics and medicine. It also dodges most questions related to opinions. It steers clear of fringe theories and controversial topics. These restrictions are definitely put in place to make sure the mistakes that several other bots like Tay made are not repeated. But there is no guarantee that ChatGPT and other such agents can ever be free of biases and prejudice. Often, putting such guardrails or specific fixes in place does not go far enough to prevent other issues of inequality and misrepresentation. As we showed in our work with Google’s image search, when they seemed to have fixed a problem of bias, it was only on the surface. The underlying issues remained. Sure, this could happen with any machine learning system, but the problem here is much deeper. When the sources of information are not shown and the provided information is fabricated, a user doesn’t have the ability to do any verification or validation. As it is, people already lack the motivation or skills to do so, but with readily generated answers in natural language, there is now not even a possibility of doing this check. This has grave implications for our decision-making, our comprehension, and our democracy.

Cite this article in APA as: Shah, C. (2022, December 20). The rise of AI chat agents and the discourse with dilettantes. Information Matters, Vol. 2, Issue 12. https://informationmatters.org/2022/12/the-rise-of-ai-chat-agents-and-the-discourse-with-dilettantes/