Evaluating the Effectiveness and Acceptability of a GPT-4o and RAG-Based Voice Chatbot for Depression Screening Using PHQ-9

ENROLLING_BY_INVITATION

NCT06801925 • University College, London

Summary

This study aims to assess the feasibility and acceptability of a voice-based chatbot, powered by GPT-4o and Retrieval-Augmented Generation (RAG), for conducting depression screening using the Patient Health Questionnaire-9 (PHQ-9).

The PHQ-9 is a validated self-report instrument widely used to screen, diagnose, and monitor the severity of depression. It consists of nine questions that correspond to the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria for major depressive disorder. Respondents rate the frequency of symptoms experienced over the past two weeks on a scale from 0 ("not at all") to 3 ("nearly every day"). The total score (ranging from 0 to 27) indicates the severity of depressive symptoms, categorized as minimal, mild, moderate, moderately severe, or severe depression. The PHQ-9 is also used to assess functional impairment and guide treatment decisions in clinical and research settings.

The voice-based chatbot integrates GPT-4o with RAG to enhance its ability to provide informed and contextualized responses during interactions. GPT-4o serves as the conversational engine, capable of generating human-like, empathetic, and contextually appropriate dialogue. RAG enables the chatbot to retrieve and incorporate external, up-to-date knowledge from a curated database or knowledge repository, with the aim of ensuring the accuracy and reliability of its responses.
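The scoring rule described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the severity cutoffs are the commonly published PHQ-9 bands (0-4, 5-9, 10-14, 15-19, 20-27), which this listing does not state; nothing here is taken from the study's own implementation.

```python
from typing import Sequence

# Assumption: commonly published PHQ-9 severity bands (inclusive bounds).
# The trial listing names the categories but not the cutoffs.
SEVERITY_BANDS = [
    (0, 4, "minimal"),
    (5, 9, "mild"),
    (10, 14, "moderate"),
    (15, 19, "moderately severe"),
    (20, 27, "severe"),
]


def phq9_total(responses: Sequence[int]) -> int:
    """Sum the nine item responses, each rated 0 to 3."""
    if len(responses) != 9:
        raise ValueError("PHQ-9 requires exactly nine item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(responses)


def phq9_severity(total: int) -> str:
    """Map a total score (0 to 27) to its severity category."""
    for low, high, label in SEVERITY_BANDS:
        if low <= total <= high:
            return label
    raise ValueError("total must be between 0 and 27")


if __name__ == "__main__":
    answers = [1, 2, 0, 3, 1, 0, 2, 1, 0]  # made-up example responses
    total = phq9_total(answers)
    print(total, phq9_severity(total))
```

Keeping the bands in a single table makes it easy to swap in different cutoffs if a protocol specifies them.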

Detailed Description

Depression is a prevalent mental health challenge with significant personal, social, and economic costs. Traditional mental health resources face barriers such as stigma, limited availability, and long wait times. Technology, particularly AI-powered tools, provides an opportunity to bridge these gaps. This study uses GPT-4o and RAG to create a voice-interactive chatbot capable of conversational engagement, administering the PHQ-9 questionnaire, and delivering personalized feedback.

Before interacting with the chatbot, participants will complete the PHQ-9 as a self-test. The self-test results will not be disclosed to the public and will be used only for accuracy comparisons against the results given by the chatbot.

The chatbot interaction comprises three phases:

1. Warm-up conversations for rapport-building and general support.
* The chatbot initiates casual, empathetic dialogues to build rapport with users, helping them feel comfortable and at ease before transitioning to the PHQ-9 screening.
* Users can ask general questions related to mental health, and the chatbot provides informed and supportive responses.
2. Administration of the PHQ-9 questionnaire for depression screening.
* The chatbot introduces the PHQ-9 questionnaire, explaining its purpose and how the results will help assess the user's mental health.
* Through voice interaction, users respond to the nine PHQ-9 questions, and the chatbot records their responses. The chatbot can clarify questions or provide additional context if users have difficulty understanding specific items.
3. Analysis of results and delivery of tailored recommendations.
* After the user completes the PHQ-9, the chatbot analyzes the responses, calculates the total score, and categorizes the results into severity levels (e.g., mild, moderate).
* Based on the score, the chatbot provides personalized recommendations, such as self-help strategies for mild symptoms or a suggestion to seek professional mental health services for more severe cases.

Participants will interact with the chatbot and then take part in a 1-hour semi-structured interview to provide feedback on their experience. The study focuses on evaluating the acceptability and feasibility of using such LLM-based chatbots in mental health screening and on identifying potential improvements and risks.

Study Objectives

Primary Objectives

1. To evaluate the acceptability, feasibility, and accuracy of a GPT-4o and RAG-based voice chatbot (HopeBot) for depression screening using PHQ-9.
Hypothesis: Participants will show high acceptance of HopeBot (higher than 65%) and high willingness to use such LLM-based chatbots for mental health screening in the future (higher than 65%), indicating recognition of the credibility of LLMs as a supportive tool in mental health screening (higher than 65%). The results of participants' depression screening with HopeBot will match their self-test PHQ-9 results in 100% of cases.
2. To analyze the chatbot's effectiveness in identifying depressive symptoms and delivering actionable recommendations.
Hypothesis: HopeBot can help users take the PHQ-9 test in a friendly way, categorize their answers accurately, and give accurate test results. The advice it provides is based on the official PHQ-9 guidelines, and more than 70% of users will say that its responses are very effective and helpful.

Secondary Objectives

1. To assess the feasibility and performance of integrating RAG with an LLM in creating a voice-interactive chatbot for mental health.
Hypothesis: Over 65% of participants will recognize that responses using RAG are more helpful and effective.
2. To explore the strengths, limitations, and risks of deploying LLMs in the mental health domain.
Hypothesis: More than 65% of users will say that HopeBot is very convenient, more accessible, and cost-free, and that it provides non-judgmental advice. However, 50% will still express concerns about privacy and data security.
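The accuracy comparison and the percentage thresholds in the hypotheses above could be checked along the following lines. This is only a sketch under stated assumptions: the listing does not define what counts as a "match" between self-test and chatbot results, so exact equality of total scores is assumed here, and both function names are hypothetical.

```python
from typing import Sequence


def agreement_rate(self_totals: Sequence[int], chatbot_totals: Sequence[int]) -> float:
    """Fraction of participants whose chatbot total equals their self-test total.

    Assumption: a "match" means identical PHQ-9 total scores.
    """
    if len(self_totals) != len(chatbot_totals):
        raise ValueError("both sequences must cover the same participants")
    if not self_totals:
        raise ValueError("at least one participant is required")
    matches = sum(1 for s, c in zip(self_totals, chatbot_totals) if s == c)
    return matches / len(self_totals)


def exceeds_percent(positive: int, total: int, threshold_percent: int) -> bool:
    """True if positive/total is strictly higher than threshold_percent.

    Integer arithmetic avoids floating-point edge cases at the boundary.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return positive * 100 > threshold_percent * total


if __name__ == "__main__":
    # Made-up illustrative data, not study results.
    self_scores = [5, 12, 20, 3]
    bot_scores = [5, 12, 19, 3]
    print(agreement_rate(self_scores, bot_scores))
    print(exceeds_percent(66, 100, 65))
```

The strict comparison mirrors the "higher than 65%" wording, so exactly 65 of 100 participants would not satisfy the hypothesis.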

Eligibility

Age

18 - 65 years

Sex

ALL

Healthy Volunteers

Yes

Inclusion Criteria

•Adults aged 18-65 years.
•Fluent in English.
•Access to a device capable of voice interaction and stable internet connection.
•Willing to participate in chatbot interaction and a follow-up interview.

Exclusion Criteria

•Current severe psychiatric diagnoses (e.g., psychosis, bipolar disorder).
•Participants undergoing active treatment for depression with a psychiatrist.
•Discomfort with voice-based technology or inability to provide informed consent.
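
As an illustration only, the inclusion and exclusion criteria above can be expressed as a simple screening check. The `Candidate` fields and the `is_eligible` function are hypothetical names introduced for this sketch; actual screening would rely on the study team's own procedures.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical fields mirroring the listed criteria.
    age: int
    fluent_in_english: bool
    has_voice_device_and_internet: bool
    willing_to_interact_and_interview: bool
    severe_psychiatric_diagnosis: bool
    in_active_depression_treatment_with_psychiatrist: bool
    uncomfortable_with_voice_technology: bool
    can_give_informed_consent: bool


def is_eligible(c: Candidate) -> bool:
    """Return True if all inclusion criteria hold and no exclusion applies."""
    included = (
        18 <= c.age <= 65
        and c.fluent_in_english
        and c.has_voice_device_and_internet
        and c.willing_to_interact_and_interview
    )
    excluded = (
        c.severe_psychiatric_diagnosis
        or c.in_active_depression_treatment_with_psychiatrist
        or c.uncomfortable_with_voice_technology
        or not c.can_give_informed_consent
    )
    return included and not excluded
```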

Locations (1)

UCL Institute of Health Informatics

London, United Kingdom

Key Dates

Start Date

February 1, 2025

Primary Completion Date

March 31, 2025

Completion Date

May 31, 2025

Last Updated

January 30, 2025

Enrollment

100 participants (estimated)

Conditions

Depression - Major Depressive Disorder

Depression Anxiety Disorder

Interventions

GPT-4o and RAG Voice Chatbot for PHQ-9 Screening

PROCEDURE

A Study of a Deuterated Psilocin Analog (CYB003) in Humans With Major Depressive Disorder

NCT06793397

Major Depressive Disorder (MDD), Depression in Adults, +4 more

RECRUITING • NA

Preliminary Efficacy Trial of a Digital Intervention for Depression and Cannabis Use

NCT06878859

Depression - Major Depressive Disorder, Cannabis Use Disorder, +1 more

Data Source & Attribution

This clinical trial information is sourced from ClinicalTrials.gov, a service of the U.S. National Institutes of Health.

ClinicalTrials.gov last update: January 30, 2025

Data synced to Clareo: March 29, 2026

Modifications: This data has been reformatted for display purposes. Eligibility criteria have been parsed into inclusion/exclusion sections. Location data has been geocoded to enable distance-based search. For the authoritative and most current information, please visit ClinicalTrials.gov.

Neither the United States Government nor Clareo Health makes any warranties regarding the data. Check ClinicalTrials.gov frequently for updates.

Virtual Reality-Based Mindfulness as an Adjunct to Treatment as Usual in Treatment-Resistant Depression

Fluoxetine on Emotional Experience (FLEX) Study

Epigenetic Enhancement of Cognitive Training in Aging Mood Disorder Populations

The Effect of PROSE or Scleral Lenses on Mental Health