Facilitating feature selection and extraction in clinical trials with large language models

Jiaji Guo, Wen Sun, Shiting Wen*, Di Wu, Yipeng Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

Research on clinical trials requires substantial background and technical knowledge. Large language models (LLMs) have already made a significant impact in various fields. We use general-purpose LLMs to help newcomers to clinical trials begin their work quickly. We demonstrate that, in the domain of clinical trials, current cutting-edge LLMs can provide excellent recommendations for feature selection. Using the features suggested by LLMs, simpler algorithms achieved a 2.5% improvement in AUC over complex neural network models. We also demonstrate that, with adjusted prompts, LLMs can play a significant role in feature extraction: after adjusting the prompts for certain LLM-suggested features, LLM-assisted feature extraction achieved 100% accuracy on a random sample covering approximately 10% of the entire dataset.

Original language: English
Title of host publication: Advanced Data Mining and Applications
Subtitle of host publication: 20th International Conference, ADMA 2024, Sydney, NSW, Australia, December 3–5, 2024, proceedings, part IV
Editors: Quan Z. Sheng, Gill Dobbie, Jing Jiang, Xuyun Zhang, Wei Emma Zhang, Yannis Manolopoulos, Jia Wu, Wathiq Mansoor, Congbo Ma
Place of Publication: Singapore
Publisher: Springer, Springer Nature
Pages: 230-240
Number of pages: 11
ISBN (Electronic): 9789819608409
ISBN (Print): 9789819608393
DOIs
Publication status: Published - 2025
Event: 20th International Conference on Advanced Data Mining Applications, ADMA 2024 - Sydney, Australia
Duration: 3 Dec 2024 – 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 15390
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Advanced Data Mining Applications, ADMA 2024
Country/Territory: Australia
City: Sydney
Period: 3/12/24 – 5/12/24

Keywords

  • Clinical Trials
  • Application of LLMs
  • Healthcare
