Article published online: 2022-06-02
© 2022 IMIA and Georg Thieme Verlag KG
Natural Language Processing: from Bedside to Everywhere
Eiji Aramaki¹, Shoko Wakamiya¹, Shuntaro Yada¹, Yuta Nakamura²
¹ Nara Institute of Science and Technology (NAIST), Nara, Japan
² Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
Summary
Objectives: Owing to the rapid progress of natural language processing (NLP), the role of NLP in the medical field has gained considerable attention from both the NLP and medical informatics communities. Although numerous medical NLP papers are published annually, there is still a gap between basic NLP research and practical product development. This gap raises questions, such as what has medical NLP achieved in each medical field, and what is the burden for the practical use of NLP? This paper aims to clarify the above questions.
Methods: We explore the literature on potential NLP products/services applied to various medical/clinical/healthcare areas.
Results: This paper introduces clinical applications (bedside applications), in which we introduce the use of NLP for each clinical department: internal medicine, pre-surgery, post-surgery, oncology, radiology, pathology, psychiatry, rehabilitation, obstetrics, and gynecology. Also, we clarify technical problems to be addressed for encouraging bedside applications based on NLP.
Conclusions: These results contribute to discussions regarding potentially feasible NLP applications and highlight research gaps for future studies.

Keywords
Natural language processing, medical application, chatbot, randomized controlled trial, social media

Yearb Med Inform 2022:243-53
IMIA Yearbook of Medical Informatics 2022
http://dx.doi.org/10.1055/s-0042-1742510

1 Introduction

Electronic health/medical records (referred to as EHR in this study) are rapidly replacing paper-based records in hospitals worldwide. Natural language processing (NLP) techniques have gained importance in the medical field. Because NLP is a hot topic in computer science, the number of medical NLP studies is increasing dramatically each year.

Despite the large number of studies, only a few practical studies have validated medical NLP applications in real-world settings. Studies using randomized controlled trials (RCTs), which provide the highest level of medical evidence, are rare. In a PubMed search for “NLP” + “RCT” or “Clinical trial,” we found only a few studies [1–4]. Instead of RCTs, several studies employed a retrospective design using EHR big data: screening of diseases, case classification, incident detection, etc. [5–8]. However, unlike medical image software, these systems have not been commercialized as products. A similar trend can be observed in the applications approved by the Food and Drug Administration (FDA) as artificial intelligence (AI) systems¹. Most were audiology devices, and no medical systems related to NLP were found.

In summary, NLP has been actively studied, but there is still a gap between basic research and practical product development. This raises several questions, including what has medical NLP achieved in each medical field, and what is the burden for practical use of NLP? To clarify these questions, this study investigates what clinical/medical NLP has achieved in different clinical/medical fields.

This review aims to provide a guide for the NLP specialist who does not know medical informatics well enough. The scope of this paper is related to studies that have the potential to directly contribute to daily clinical practice, which we call bedside applications, consisting of internal medicine, pre-surgery, post-surgery, oncology, radiology, pathology, psychiatry, rehabilitation, obstetrics, and gynecology, etc. This paper introduces existing ready-to-use systems used in the above fields and summarizes their current methodology and performance. Finally, we mention future potential NLP applications not only for hospital use but also for patient use.

¹ https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices

2 Bedside Applications

We provide an overview of how far NLP can be applied to outpatient and inpatient diagnosis, treatment, or management in each department. Historically, shared tasks have been one of the effective ways for researchers to drive fundamental innovations in clinical NLP [9]. A shared task is a competitive platform where organizers present a technically challenging and clinically meaningful task along with the dataset, gold standards, and evaluation criteria. In the early days, simple tasks were chosen, such as classifying patient records based on smoking status [10]. These days, shared tasks deal with far more complex problems, such as temporal relationship recognition among
clinical events in discharge summaries [11], risk factor identification in longitudinal series of progress notes [12], and clinical decision support [13–15]. Over time, reproducibility of solutions and techniques found in shared tasks has been demonstrated by researchers, which has promoted advancements in clinical NLP.

We surveyed how far NLP applications have been proven to be replicable in real-world clinical practice. We made no limitations on hospital departments in searching publications. We referred to (i) reviews and systematic reviews published in 2017 or later and (ii) original research articles published in 2020 or later on NLP applications for each hospital department. We searched PubMed for publications using the keyword “natural language processing” for reviews and systematic reviews, and “natural language processing” together with a hospital department name for original research articles. Because this article is not a systematic review, we focused on studies that can directly contribute to daily clinical practice. Although NLP is also helpful in research-oriented applications, such as cohort building with patient identification or phenotyping [16], evidence generation using clinical free-text [17–19], or semi-automation of meta-analysis [20] and systematic review [21–23], these are beyond the scope of this article.

2.1 Applications in Different Departments

NLP-based technology has enabled information extraction (IE) from various unstructured free-text documents such as clinic letters, progress notes, discharge summaries, and test reports. This technology can improve care quality in multiple departments, which has been demonstrated mainly in retrospective studies and sometimes in prospective studies [24–27]. NLP performance has also been validated in multicenter studies [28, 29]. See also Table 1 for details of the NLP systems introduced below.

Internal Medicine

NLP aids in the prevention, early diagnosis, treatment, and prognostic prediction of a wide range of diseases, such as cardiovascular, endocrine, metabolic, hepatobiliary, and neurological diseases [30].
(i) Disease prevention. NLP can identify risk factors, estimate risk, or predict events of disease development or readmissions [12, 31, 32]. Wang et al. automatically calculated CHA₂DS₂-VASc and HAS-BLED, the risk scores for the cerebral stroke of atrial fibrillation patients, by a rule-based approach. They also identified patients with a high risk of cerebral stroke with positive predictive values of 0.92–1.00 [33]. Buchan et al. analyzed clinical notes of patients without a history of coronary artery disease (CAD) with named entity recognition (NER) and support vector machine (SVM), and identified patients with later development of CAD with F1-score of 0.774 [34];
(ii) Early diagnosis. NLP can help clinicians recognize diseases out of their specialty that might otherwise be misdiagnosed or overlooked without proper transfer. Chase et al. achieved area under a receiver operating characteristic curve (ROC-AUC) of 0.94 in classifying patients with and without multiple sclerosis using NER and Naïve Bayes classifiers. They also identified patients suspected of undiagnosed multiple sclerosis [35];
(iii) Treatment support. Clinical decision support tools to summarize patient clinical information and suggest treatment are beginning to be realized. Seol et al. integrated a clinical decision support tool into the EHR system for pediatric asthma outpatients, which warns of the risk of acute exacerbation and recommends an optimal treatment plan based on free-text and structured data in the EHR [25]. An RCT demonstrated improvement of patient outcomes and significantly reduced physicians’ workload for manual chart review.

Pre-surgery

NLP has the potential to aid in identifying clinical conditions of preoperative, perioperative, and postoperative patients [36, 37]. In preoperative settings, NLP can (i) evaluate surgical indications and (ii) reduce the workload of preoperative assessment. Wissel et al. implemented an automatic NLP scoring system in the EHR system that identifies epileptic outpatients with indications of surgery with SVM. The system achieved ROC-AUC of 0.79 in recommending operation [24]. Fonferko-Shadrach et al. developed an NLP system to review clinic letters and automatically extract symptoms, diagnosis, and medication history of preoperative patients. The system was based on an existing entity linking tool and demonstrated F1-score of 0.911 [38].

Post-surgery

Perioperatively and postoperatively, NLP contributes to continuous quality improvement efforts. NLP can identify complications and their details in unstructured free-text clinical records, even if they are not codified with ICD-10 (International Classification of Diseases, 10th revision) [29, 39]. Bucher et al. identified surgical site infections (SSIs) with an NLP pipeline that parses and extracts information from clinical notes, reaching ROC-AUC of 0.912. The system also determined SSI subgroups based on the depth, the wound condition, and the outcome [29]. Furthermore, surgical outcomes can also be automatically extracted from unstructured free-text using NLP, which aids labor-intensive manual chart review. In orthopedics, hip dislocation after total hip arthroplasty can be detected [40]. Tibbo et al. developed an NLP system to automatically determine Vancouver classification of periprosthetic femur fractures with a sensitivity of 0.786 and a specificity of 0.948 [41].

Oncology

Oncology is another department where NLP plays an important role [30, 42].
(i) IE and cancer registration. NLP helps information retrieval on genetic, histological, and clinical characteristics of cancer, which is essential in clinical decision making and surveillance for effective public health interventions [43, 44]. The information includes histological type, differentiation, Ki-67 index, TNM (classification of malignant tumors) staging, test findings, treatment, family history, and performance status. Benjamin et al. automatically extracted quantitative information of biomarkers
from breast cancer pathology reports. They achieved an accuracy of 0.98 with a rule-based approach on top of an existing NER tool, MetaMap [45, 46];
(ii) Clinical decision support. Precision medicine is a tailor-made clinical practice considering individual patient demographics and cancer genetic characteristics. NLP can recommend optimal treatment plans by searching biomedical articles and clinical trial repositories using patient information as a query [13–15, 47]. Li et al. released a chatbot-style open access clinical decision support tool [48].

Radiology

NLP can contribute to multiple stages of the radiological clinical workflow [49–51].
(i) Patient safety. NLP can help screen patients for contraindications to diagnostic imaging. Valtchinov et al. identified implants with contraindication to magnetic resonance imaging (MRI) in clinical notes with accuracies of 0.83–0.91 with NER [52];
(ii) Imaging protocol recommendation. NLP can determine the use of contrast agents or optimal imaging protocols based on free-text in ordering comments or clinical records [53–56]. Chillakuru et al. developed a machine learning-based NLP system to recommend the use of contrast agents for brain and spinal MRI with accuracies of 0.83–0.85, of which an online demo is available. The system is based on term frequency-inverse document frequency vectorization, Gradient Boosting Decision Tree (GBDT), word embeddings, and shallow neural networks [54]. Some other scan optimization tools are commercially available [55];
(iii) Automated radiology reporting. As the workload of diagnostic radiologists rapidly grows [57], automated radiology report generation in cooperation with computer vision AI is attracting attention [58]. Most studies have dealt with chest X-rays thus far, and further application to computed tomography (CT), MRI, and nuclear medicine is expected;
(iv) Surveillance. Radiology reports sometimes point out incidental findings. NLP can help prevent such findings from being missed by the attending physician by automatically sending alerts [49–51].

Pathology

NLP is helpful for both pathologists, whose responsibility is increasing in the era of personalized medicine, and clinicians, who refer to the diagnosis for treatment planning.
(i) Support diagnosis. NLP can support pathologists by providing a better computer-based image retrieval system incorporating pathology reports [59] or by automated pathology reporting [60];
(ii) Support clinical practice. Information on pathological diagnosis is used afterward by clinicians for better treatment strategy. NLP helps convert unstructured pathology reports into a structured form [45, 57, 61]. Kim et al. automatically extracted descriptions of a specimen, procedure, and pathologic diagnosis from pathology reports regardless of clinical departments. Their deep learning-based system, which uses Bidirectional Encoder Representations from Transformers (BERT), achieved accuracies of 0.9795–0.9839 [57, 62]. At a more fine-grained level, Odisho et al. extracted seventeen types of information from prostate cancer pathology reports and achieved a weighted F1-score of 0.972 for categorical data and a mean accuracy of 0.930 for numerical data. They applied document classification with convolutional neural network (CNN) to categorical data and token classification with random forest to numerical data [61].

Psychiatry

In psychiatry, NLP can be used for IE from unstructured EHR and speech analysis on patient speech data [63, 64]. NLP can help in the screening, early diagnosis, or severity estimation of various diseases such as depression [63], bipolar disorder [65], dementia [66–68], psychosis [69, 70], and schizophrenia [71]. Dai et al. showed that NLP automatically diagnosed psychiatric diseases with free-text discharge summaries. Their system achieved a micro F1-score of 0.584 using multiple classifiers based on pre-trained Robustly Optimized BERT Pretraining Approach (RoBERTa) models [72, 73]. More fundamentally, NLP can contribute to psychiatric diagnostics. The Research Domain Criteria (RDoC), a potential counterpart of the Diagnostic and Statistical Manual of Mental Disorders (DSM), aims to integrate brain research knowledge into psychiatric disease classification [74], for which NLP shared tasks were held in 2016 and 2019 [75, 76].

Rehabilitation

NLP is used in speech therapy by incorporating it into electronic devices for augmentative and alternative communication (AAC) [77, 78]. Moreover, NLP has the potential to better unite the entire rehabilitation into the healthcare process by enabling the integration of the International Classification of Functioning, Disability, and Health (ICF) into EHRs, although there are still problems to overcome [79].

Obstetrics and Gynecology

Publications on bedside NLP applications were found in obstetrics and gynecology, although limited in number. Moon et al. showed the effectiveness of a rule-based NLP approach to highlight information discrepancies on surgical history due to misinterpretation during hospital transfer or improper copy and paste [80]. Sterckx et al. developed a birth risk prediction system to support preterm birth treatment, which was based on GBDT. NER-based features improved prediction performance when combined with structured data, with F1-score of birth prediction within 24 hours over 0.80 [81]. Barber et al. used NLP for prognostic prediction of ovarian cancer surgery, where postoperative readmission within 30 days was predicted with ROC-AUC of 0.70 using preoperative CT radiology reports [82].

Other Departments

NLP application is limited in ophthalmology and anesthesiology, where most AI systems are devoted to automated image diagnosis
[83] or intraoperative monitoring with numerical data [84]. However, some studies combine NLP for unstructured free-text documents and AI for structured EHR data to predict patient prognosis [85]. NLP also has the potential to automatically pick up patient risk factors preoperatively.

As indicated above, NLP can improve the quality and efficiency of bedside clinical practice mainly by IE from unstructured free-text for various departments and diseases, a part of which has already been put to practical use.

2.2 Cross-cutting Applications

Some NLP applications are not limited to specific hospital departments but can be helpful widely. We introduce such applications in this subsection.

Text Simplification

Clinical texts can sometimes be difficult for patients or clinicians in other departments due to jargon or abbreviations. Automated text simplification with NLP can improve both patient-staff and staff-staff communication [86, 87]. Moen et al. developed an NLP system to suggest replacements for abbreviations in Finnish clinical texts that are difficult for patients. The system achieved top-1 accuracy of 0.3464 with an unsupervised approach using cosine similarity of word embeddings [87].

Writing Support

Writing support with NLP can solve a more fundamental problem: illegible clinical texts often result from healthcare professionals' shortage of time for documentation.
(i) Auto-completion. Auto-completion is a real-time suggestion of the next word or clinical concept while a healthcare professional writes a clinical document. Gopinath et al. developed an auto-completion system for the emergency department that suggests clinical conditions, symptoms, medications, and laboratory test items during the documentation of progress notes. The system reduced the keystroke burden by 67% [88];
(ii) Auto-structuring. Some clinical documents such as progress notes or nursing notes are required to be in a structured form. NLP allows healthcare professionals to write such documents in an unstructured narrative by automatic editing and structuring. Moen et al. structured Finnish nursing notes into paragraphs whose headings were selected from standardized taxonomy with an accuracy of 0.71 using a Long Short-Term Memory (LSTM)-based sentence classification [89]. Furthermore, patient-staff conversations can be automatically structured once transcribed [90, 91];
(iii) Digital scribe. Digital scribe is different from dictation but similar to auto-structuring except for using voice input. That is, clinicians have only to record an outpatient conversation with some additional voice commands, and the NLP system analyzes and summarizes the conversation and converts it into a clinical document in a predefined format [92–95]. Wang et al. developed a digital scribe system, which was 2.17–3.12 times faster than typing and dictation during patient encounter documentation [95].

3 Problems to be Addressed

3.1 Standard Annotation Schemes

Most NLP-based IE techniques adopted in the studies we referred to thus far use supervised machine learning, which requires high-quality, large datasets for training. Creating such datasets relies on manual annotation and thus increases the cost.

The formats and conventions of writing clinical documents differ not only in document types (e.g., EHRs, radiology reports, and nursing notes), but also in hospitals, departments, and even individual doctors. This textual diversity requires medical NLP researchers to create dedicated corpora for different applications by designing distinct annotation schemes. For instance, doctors often write disease name abbreviations in EHRs owing to the nature of personal note-taking, while radiology reports contain slightly more standardized terms because they are exchanged between diagnosing doctors and radiologists. Distributions of the appearing clinical terms in different types of clinical notes of different departments also deviate substantially, leading to uneven performance even when using an identical model architecture [96].

To adapt for a wide range of clinical note types with a single annotation scheme, some studies propose general-purpose annotation guidelines that define popular medical entities (e.g., diseases, drugs, tests, remedies, and body parts), as well as semantic relationships among them (e.g., “a medicine ‘is-subscribed-for’ a disease” and “a symptom ‘was-found-in’ an anatomical part”) [96–99]. However, this approach increases the complexity of the resulting annotation schemes, making training annotators expensive. One guideline of such schemes has more than 30 pages [100]; a temporal IE corpus provides a 63-page-long guideline document [101].

The complexity of annotation schemes can also generate ambiguous boundaries between multiple entity types. For example, a general-purpose corpus [99] defines a ‘Disease’ entity and a ‘Signs or Symptoms’ entity separately, the inter-annotator agreement of which was relatively low, probably because of the annotators’ confusion.

3.2 Task Formulation

There are always several ways to formulate a medical/clinical problem into an NLP task. The difference in task formulation affects overall performance and how to create an annotated corpus. Careful design of an NLP task setting translated from clinical needs matters. Taking adverse drug event (ADE) detection as an example, we have at least three options in its task formulation: NER, relation extraction (RE), and text classification. We represent these different approaches in Figure 1. The example sentence implies that a medication “nivolumab” prescribed for a “laryngeal cancer” adversely caused “liver damage.” As we mention below, each approach has its own benefits and drawbacks. This trade-off suggests that we must carefully design NLP approaches against given medical/clinical IE issues.
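
To make the three task formulations concrete, the following minimal Python sketch encodes one sentence as a text classification label, as NER spans, and as RE triples. It is only an illustration under stated assumptions: the sentence, the entity labels (Drug, Disease, ADE), the relation labels (treats, causes), and the helper `find_entity` are all hypothetical and are not taken from Figure 1 or from any corpus cited in this review.

```python
from dataclasses import dataclass

# Hypothetical sentence built around the three mentions named in the text;
# it is NOT the sentence shown in the paper's Figure 1.
TEXT = "Nivolumab was started for laryngeal cancer, after which liver damage developed."


@dataclass(frozen=True)
class Entity:
    label: str  # illustrative label set: Drug, Disease, ADE
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive

    def surface(self, text: str) -> str:
        """Return the text covered by this span."""
        return text[self.start:self.end]


@dataclass(frozen=True)
class Relation:
    label: str  # illustrative relation names: treats, causes
    head: Entity
    tail: Entity


def find_entity(text: str, phrase: str, label: str) -> Entity:
    """Locate a phrase by case-insensitive exact match.

    This stands in for manual annotation; it is not an NER model.
    """
    start = text.lower().index(phrase.lower())
    return Entity(label, start, start + len(phrase))


# 1) Text classification: a single label for the whole sentence.
classification = {"text": TEXT, "has_ade": True}

# 2) NER: typed character spans, with no links between them.
drug = find_entity(TEXT, "nivolumab", "Drug")
disease = find_entity(TEXT, "laryngeal cancer", "Disease")
ade = find_entity(TEXT, "liver damage", "ADE")
ner = [drug, disease, ade]

# 3) RE: typed links on top of the NER spans.
relations = [
    Relation("treats", drug, disease),
    Relation("causes", drug, ade),
]

if __name__ == "__main__":
    print("classification:", classification["has_ade"])
    for e in ner:
        print(f"NER  {e.label:8s} [{e.start}:{e.end}] {e.surface(TEXT)!r}")
    for r in relations:
        print(f"RE   {r.head.surface(TEXT)} --{r.label}--> {r.tail.surface(TEXT)}")
```

In this sketch the classification view needs one label per sentence, the NER view needs three typed spans, and the RE view additionally needs two typed links between those spans, which hints at how annotation effort can grow with the richness of the chosen formulation.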