ChatGPT and AI-Written Research Papers: Ethical Considerations for Scholarly Publishing
Bing Nie, Ting Wang, Nishith Mannuru, Brady Lund, Ziang Wang, and Somipam Shimray
In November 2022, OpenAI launched ChatGPT, a chatbot that uses natural language processing to generate responses to user input. According to Reuters, ChatGPT reached an estimated 100 million monthly active users worldwide in January 2023 (Hu, 2023). ChatGPT has an advantage over other chatbots because it is built on OpenAI's GPT-3 family of large language models (specifically the GPT-3.5 series), among the most prominent available at its launch. Its extensive training data and efficient design allow it to answer simple questions as well as more complex requests, including academic writing.
GPT can compose a research-quality essay of several hundred words in less than a minute. This breakthrough allows for the creation of entire articles by dividing the main topic into subtopics and having GPT write each section. This innovation could potentially reduce the time required for writing research essays from several hours to a few minutes, and it may even lead to the displacement of professional authors and researchers. As such, ChatGPT and related technologies are considered disruptive innovations that could revolutionize academia and scholarly publishing. It is crucial to consider the ethical implications of these technologies, such as copyright, citation practices, and the “Matthew effect,” to ensure their responsible and ethical use in academic research and publishing.
Our study reveals that the potential for bias within AI-driven language models, such as GPT-3, poses a significant threat to the integrity of science. These models are trained on vast amounts of data, primarily from the internet, so any bias or gap in the source data will be reflected in the model’s output. The design choices made when coding and training these models may introduce further biases. If the training data is not diverse enough, the resulting model may be biased toward certain groups or viewpoints. These biases can have a far-reaching impact on research, because relying on biased language models can perpetuate existing biases and misconceptions.
The issue of copyright ownership in AI-generated content is a complex and multifaceted one that requires careful consideration by legal experts and policymakers alike. As AI-driven language models such as GPT-3 become increasingly complex and capable of generating content that is indistinguishable from human writing, the question of who owns the resulting intellectual property becomes more pressing. Currently, the lack of transparency on data sources makes it difficult to know the extent to which content may be plagiarized.
As for citation practice, ChatGPT has been found to produce academic essays with missing references (see Appendix 1 of the original article for an example). The lack of references in early versions of ChatGPT could lead to disorder in scholarly publishing, as the integrity and credibility of the research may be called into question without proper citations. Indeed, when ChatGPT is asked to write a paper with citations, it often provides phantom citations to articles that do not actually exist.
Additionally, tools such as ChatGPT, which may rely on factors like citation counts when determining which publications to cite, may exacerbate the Matthew effect. This term refers to the tendency for successful researchers with high citation counts to continue to be successful and frequently cited, while lesser-known researchers struggle to gain recognition and citations. Because ChatGPT generates new papers from existing papers and researchers’ needs, it could also deepen the lack of innovation and the disconnection from practice in academic research.
In order to address these concerns, it may be necessary to develop new legal frameworks and guidelines that consider the unique challenges posed by AI-generated content. This process could involve clarifying existing copyright laws to account for the use of machine learning algorithms and training data, or establishing new regulations to ensure that AI-generated content is properly sourced and attributed.
ChatGPT and related technologies have the potential to significantly impact academia and scholarly research and publishing. However, it is important to consider the ethical implications of these technologies, particularly regarding the use of ChatGPT by academics and researchers. Collaboration between researchers, publishers, and AI-driven language model developers could be helpful for establishing guidelines and protocols that ensure the ethical, transparent, and accountable use of these technologies. Failure to develop such policies may undermine public trust in the scientific process, leading to far-reaching consequences for future research and innovation.
Hu, K. (2023). ChatGPT sets record for fastest-growing user base. Reuters, February 2, 2023.
Cite this article in APA as: Nie, B., Wang, T., Mannuru, N., Lund, B., Wang, Z., & Shimray, S. (2023, March 15). ChatGPT and AI-written research papers: Ethical considerations for scholarly publishing. Information Matters, Vol. 3, Issue 3. https://informationmatters.org/2023/03/chatgpt-and-ai-written-research-papers-ethical-considerations-for-scholarly-publishing/