Introduction

The continuous advancement of artificial intelligence (AI) and large language models (LLMs) presents several opportunities for librarians to reduce their workload and become more efficient [1, 2]. AI chatbots previously generated responses by relying on large-scale pre-trained data. Newer chatbots have made substantial strides in their capabilities, now offering the ability to search online and to supplement outputs with Retrieval-Augmented Generation (RAG), producing more accurate and relevant responses. To explore the potential of generative AI chatbots in assisting health sciences librarians with collection development, we evaluated five of them, ChatGPT o3, NotebookLM, Google Gemini 2.5, Perplexity, and Microsoft Copilot, using two prompts designed to aid librarians.

Prompt 1: Ebook Recommendations

For the first task, we developed a prompt asking each generative AI chatbot to generate a list of recent ebook titles published in the last two years, focused on physical therapy, physician assistant, communication sciences and disorders, and pharmacy.
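To make the workflow concrete, here is a minimal sketch of how a prompt of this kind could be sent programmatically. It assumes the OpenAI Python client and the "gpt-4o" model name, and its prompt wording is illustrative only; in our evaluation, the prompts were entered through each chatbot's web interface.

    # A minimal sketch of sending an ebook-recommendation prompt via an API.
    # Assumptions: the OpenAI Python client (pip install openai) and the
    # "gpt-4o" model; the prompt wording is illustrative, not the study's.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    prompt = (
        "Recommend five ebooks published in the last two years in each of "
        "these subjects: physical therapy, physician assistant, communication "
        "sciences and disorders, and pharmacy. Include author, publication "
        "year, and Library of Congress call number for each title."
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)

Output gathered this way would still need the same verification described next, since a model can hallucinate editions, years, and call numbers.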
The results from the first prompt were assessed on quality, accuracy, the presence of fabricated titles (often referred to as "hallucinations"), whether references were provided, correct citation details, and accurate Library of Congress (LC) call numbers. ChatGPT, Copilot, and Gemini provided five titles per subject as requested, while Perplexity and NotebookLM provided fewer than five, or none, for certain subjects. Every chatbot produced relevant book titles, but each also hallucinated inaccurate information about the books, including incorrect editions, publication years, links, and APA citations. All five chatbots also provided titles outside the requested date range. For example, Google Gemini suggested a communication sciences and disorders title that does exist and is written by the author the chatbot provided; however, Gemini recommended the fifth edition, which was published in 2012. The most recent edition is the seventh, published in 2025, which would have been a better response to our prompt. Copilot and ChatGPT were the most accurate, offering correct authors and titles while completing the task as requested. Because of these inaccuracies and inconsistencies, we would not recommend any of the generative AI chatbots for identifying recently published titles, though we did find them helpful for discoverability.

Prompt 2: Collection Gap Analysis

We asked each AI chatbot to complete three steps for the second task. The first step was to analyze the curriculum of Chapman University's Physician Assistant program directly from the program's webpage. The second step was to upload a list of the library's collection into each generative AI chatbot: a list of the library's physical titles, with fields for title, LC call number, location, and item status, was exported as an Excel file from the Leatherby Libraries integrated library system (Sierra, from Innovative). NotebookLM and Perplexity were unable to accept Excel files, so as an alternative we pasted the titles and call numbers into Perplexity's prompt field and converted them to a .txt file for NotebookLM (see the sketch below). For the third step, we developed a prompt asking each AI chatbot to compare and analyze the library's collection to determine whether the entire curriculum of the Physician Assistant program was represented.
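Since this kind of file conversion recurs in such workflows, it can be scripted. The following is a minimal sketch assuming pandas (with openpyxl) and column headings of "Title" and "Call Number"; the actual headings in a Sierra export may differ.

    # A minimal sketch of the Excel-to-text workaround described above.
    # Assumptions: pandas with openpyxl installed, and column headings of
    # "Title" and "Call Number" (adjust to match the actual Sierra export).
    import pandas as pd

    df = pd.read_excel("collection_export.xlsx")  # hypothetical filename

    with open("collection_for_notebooklm.txt", "w", encoding="utf-8") as out:
        for _, row in df.iterrows():
            # One title per line, tab-separated from its LC call number.
            out.write(f"{row['Title']}\t{row['Call Number']}\n")

A plain-text file like this also makes it easy to paste the same title and call number pairs into a prompt field, as we did for Perplexity.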
All five AI chatbots completed the second task, although we found inconsistencies in their analyses. Each chatbot was able to compare the provided curriculum with the list of physical items. The chatbots differed slightly in the subject gaps they identified, but all explained the reasoning behind the importance of each recommended subject area, as in the following example from ChatGPT:
Figure 1: Table of collection gap analysis responses from ChatGPT

Google Gemini was the only chatbot to provide inaccurate LC call numbers in its recommendations. ChatGPT provided additional details and accurate LC call numbers, which we found the most useful and beneficial for our current collection development cycle.

Conclusion

We found that AI chatbots can assist librarians with collection development, although they are still prone to inaccuracies. The hallucinations found in the first task indicate that information retrieval in generative AI chatbots still needs improvement. When the chatbots were used in a RAG workflow, with specific sources and data provided for analysis, the results were more promising and practical, as the second task suggests. Overall, our findings suggest that generative AI chatbots can serve as a supplementary tool, and future improvements may make them more helpful to librarians.

References

1. Brzustowicz R. From ChatGPT to CatGPT: The Implications of Artificial Intelligence on Library Cataloging. Information Technology and Libraries. 2023;42(3). DOI: https://doi.org/10.5860/ital.v42i3.16295

2. Yamson GC. Immediacy as a better service: Analysis of limitations of the use of ChatGPT in library services. Information Development. 2023;0(0):02666669231206762. DOI: https://doi.org/10.1177/02666669231206762

Authors

Ivan Portillo, iportillo@chapman.edu, https://orcid.org/0000-0002-8031-1006, Health Sciences Librarian, Director of Rinker Campus Library Services, Leatherby Libraries, Chapman University, Irvine, CA

David Carson, carsondav@ohsu.edu, https://orcid.org/0009-0002-6533-3159, Health Sciences Education & Research Librarian, Oregon Health & Science University, Portland, OR

Editor's Note: This article is based on a project originally published in the January 2025 issue of JMLA.

DCT Featured Article – June 10, 2025