Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

Grace Y. Chung, Enrico Coiera

Research output: Contribution to journal › Article › Research › peer-review

Abstract

Background. This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined.

Methods. A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 - 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted.

Results. Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values.

Conclusion. The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.
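The decision tree elements named in the abstract (a population, its assignment to two or more interventions, and outcome values) can be pictured as a small data structure. The following is a minimal sketch only, not the authors' implementation: the class names `Arm` and `TrialTree`, the method `missing_elements`, and the example trial are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Arm:
    """One branch of the tree: an intervention and its reported outcome (hypothetical)."""
    intervention: str
    outcome_value: Optional[float] = None  # None when no numerical outcome is reported


@dataclass
class TrialTree:
    """Root node: a single population assigned to two or more interventions (hypothetical)."""
    population: str
    total_subjects: Optional[int] = None
    arms: List[Arm] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """List which decision tree elements are absent, as a rough completeness check."""
        missing = []
        if not self.population:
            missing.append("population")
        if self.total_subjects is None:
            missing.append("total number of subjects")
        if len(self.arms) < 2:
            missing.append("intervention comparison")
        if any(arm.outcome_value is None for arm in self.arms):
            missing.append("outcome values")
        return missing


# Invented example: the second arm has no numerical outcome value.
trial = TrialTree(
    population="adults with hypertension",
    total_subjects=120,
    arms=[Arm("drug A", 0.42), Arm("placebo")],
)
print(trial.missing_elements())
```

An empty result from `missing_elements` would correspond to an abstract that reports all of the elements considered here; which elements count toward completeness is an assumption of this sketch.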

Language: English
Article number: 48
Pages: 1-14
Number of pages: 14
Journal: BMC Medical Informatics and Decision Making
Volume: 8
DOI: 10.1186/1472-6947-8-48
Publication status: Published - 2008
Externally published: Yes

Fingerprint

Decision Trees
Information Storage and Retrieval
Randomized Controlled Trials
Population
Semantics
Language

Bibliographical note

Copyright the Author(s) 2008. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.

Cite this

@article{8ee21a1006f645018b1d447aaaffea7e,
title = "Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?",
abstract = "Background. This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods. A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 - 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. Results. Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8{\%} of a random subset was primary studies with a single population assigned to two or more interventions. 68{\%} of these primary RCT abstracts were structured. 63{\%} contained pharmaceutical interventions. 84{\%} reported the total number of study subjects. In a subset of 21 abstracts examined, 71{\%} reported numerical outcome values. Conclusion. The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.",
author = "Chung, Grace Y. and Coiera, Enrico",
note = "Copyright the Author(s) 2008. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.",
year = "2008",
doi = "10.1186/1472-6947-8-48",
language = "English",
volume = "8",
pages = "1--14",
journal = "BMC Medical Informatics and Decision Making",
issn = "1472-6947",
publisher = "Springer, Springer Nature",
}

Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports? / Chung, Grace Y.; Coiera, Enrico.

In: BMC Medical Informatics and Decision Making, Vol. 8, 48, 2008, p. 1-14.

TY - JOUR
T1 - Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?
AU - Chung, Grace Y.
AU - Coiera, Enrico
N1 - Copyright the Author(s) 2008. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
PY - 2008
Y1 - 2008
AB - Background. This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods. A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 - 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. Results. Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. Conclusion. The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.

UR - http://www.scopus.com/inward/record.url?scp=56349154416&partnerID=8YFLogxK
U2 - 10.1186/1472-6947-8-48
DO - 10.1186/1472-6947-8-48
M3 - Article
VL - 8
SP - 1
EP - 14
JO - BMC Medical Informatics and Decision Making
T2 - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
SN - 1472-6947
M1 - 48
ER -