Abstract Details

Title High inter-rater reliability between a machine learning natural language processing algorithm and human data reviewers in identifying acute ischemic stroke in the middle cerebral artery territory from radiographic text reports

Topic Cerebrovascular Disease and Interventional Neurology

Presentation(s) Cerebrovascular Disease and Interventional Neurology Posters (7:00 AM-5:00 PM)

Poster/Presentation Number 178

Objective To investigate the inter-rater reliability between a machine learning Natural Language Processing (NLP) algorithm and trained data reviewers in identifying acute ischemic strokes in the Middle Cerebral Artery (MCA) territory in a dataset of unstructured, written-text radiology reports from a diverse cohort of patients collected from 2012 to 2018.

Background Automated analysis of unstructured radiographic text by NLP algorithms that identify the presence of specific clinical criteria has numerous applications, including stratification of patient cohorts for research and triage of time-sensitive medical reports. Here, an NLP algorithm previously developed using a neuroradiology-specific training corpus from two hospital centers was externally validated on a cohort from a different center.

Design/Methods An NLP algorithm was tested on a cohort of 4,785 CT and MRI radiographic text reports collected from Boston Medical Center between 2012 and 2018. The algorithm identified the presence of three criteria: ischemic stroke, acute stroke, and stroke location in the MCA territory. A total of 886 reports were hand-labeled by study team members trained by attending neurologists, including 801 strokes, 514 of which were acute and 616 of which were in the MCA territory.

Results Percent agreement was used to determine inter-rater reliability between the NLP algorithm and the hand labelers. Percent agreement on presence of stroke, acuity, and MCA location was 91.9%, 78.2%, and 86.5%, respectively. Discrepancies most often arose because diverse and contextual language was used to describe acuity and location.
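
As an illustration of the metric named above, percent agreement is the share of reports on which the algorithm and the hand labelers assigned the same label for a given criterion. The following is a minimal sketch, not the study's actual code: the function name `percent_agreement` and the toy label lists are hypothetical, and the abstract does not state how reports were encoded.

```python
from typing import Sequence


def percent_agreement(algorithm_labels: Sequence[bool],
                      reviewer_labels: Sequence[bool]) -> float:
    """Return the percentage of reports on which both raters gave the same label."""
    if len(algorithm_labels) != len(reviewer_labels):
        raise ValueError("Both raters must label the same set of reports.")
    if len(algorithm_labels) == 0:
        raise ValueError("At least one labeled report is required.")
    # Count reports where the two labels match, whether both True or both False.
    matches = sum(a == r for a, r in zip(algorithm_labels, reviewer_labels))
    return 100.0 * matches / len(algorithm_labels)


# Hypothetical toy data for one criterion (e.g. "acute stroke present"),
# invented for illustration only; these are not values from the study.
nlp_labels = [True, True, False, True, False]
human_labels = [True, False, False, True, False]

print(f"{percent_agreement(nlp_labels, human_labels):.1f}%")
```

In this sketch the calculation would be repeated once per criterion (presence of stroke, acuity, MCA location), each time over the reports that carry a hand label for that criterion.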

Conclusions High inter-rater reliability demonstrated that the NLP algorithm and data reviewers performed with similar discrimination in detecting the presence of acute ischemic strokes in the MCA territory within a dataset of unstructured radiographic text reports representing a racially, ethnically and socio-economically diverse cohort of patients. Acuity was the least reliable characteristic. Applications of automated text identification for stroke characteristics include triage of patients for acute stroke intervention and selection of patient cohorts for research.

Authors/Disclosures
Jack Kalin PRESENTER	Mr. Kalin has received personal compensation for serving as an employee of Celgene Corporation.
Hanife Saglam, MD (West Virginia University)	Dr. Saglam has nothing to disclose.
	No disclosure on file
	No disclosure on file
	No disclosure on file
	No disclosure on file
David M. Greer, MD, FAAN (Boston University School of Medicine)	Dr. Greer has received personal compensation in the range of $10,000-$49,999 for serving as an Editor, Associate Editor, or Editorial Advisory Board Member for Thieme, Inc. Dr. Greer has received personal compensation in the range of $5,000-$9,999 for serving as an Expert Witness for multiple. Dr. Greer has received publishing royalties from a publication relating to health care. Dr. Greer has received publishing royalties from a publication relating to health care. Dr. Greer has received publishing royalties from a publication relating to health care. Dr. Greer has a non-compensated relationship as a Treasurer-Elect with American Neurological Association that is relevant to AAN interests or activities. Dr. Greer has a non-compensated relationship as a President with Neurocritical Care Society that is relevant to AAN interests or activities.
Charlene J. Ong, MD (Boston University)	Dr. Ong has nothing to disclose.
