Zhao, Bo (2016)
Languages: English
Types: Doctoral thesis
Subjects: QA75
In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study has focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and the wide range of conditions that may be associated with different anatomical entities.

In this thesis, we describe KneeTex, an information extraction system that operates in this domain. As an ontology-driven information extraction system, KneeTex makes active use of an ontology to strongly guide and constrain text analysis. We used automatic term recognition to facilitate the development of a domain-specific ontology with sufficient detail and coverage for text mining applications. In combination with the ontology, the high regularity of the sublanguage used in knee MRI reports allowed us to model its processing by a set of sophisticated lexico-semantic rules with minimal syntactic analysis. The main processing steps involve named entity recognition combined with coordination, enumeration, ambiguity and co-reference resolution, followed by text segmentation. Ontology-based semantic typing is then used to drive the template filling process. We adopted an existing ontology, TRAK (Taxonomy for RehAbilitation of Knee conditions), for use within KneeTex. The original TRAK ontology was expanded from 1,292 concepts, 1,720 synonyms and 518 relationship instances to 1,621 concepts, 2,550 synonyms and 560 relationship instances. This provided KneeTex with a very fine-grained lexico-semantic knowledge base, which is highly attuned to the given sublanguage.
Information extraction results were evaluated on a test set of 100 MRI reports. The gold standard consisted of 1,259 filled template records with the following slots: finding, finding qualifier, negation, certainty, anatomy and anatomy qualifier. KneeTex extracted information with a precision of 98.00%, a recall of 97.63% and an F-measure of 97.81%, values that are in line with human-like performance.

To demonstrate the utility of formally structuring clinical narratives and possible applications in epidemiology, we describe an implementation of KneeBase, a web-based information retrieval system that supports complex searches over the results obtained via KneeTex. It is the structured nature of the extracted information that allows queries to encode not only search terms, but also relationships between them (e.g. between clinical findings and anatomical locations). This is of particular value for large-scale epidemiology studies based on qualitative evidence, whose main bottleneck is the manual inspection of many text documents.

The two systems presented in this dissertation, KneeTex and KneeBase, operate in a specific domain, but illustrate generic principles for rapid development of clinical text mining systems. The key enabler of such systems is the existence of an appropriate ontology. To tackle this issue, we proposed a strategy for ontology expansion, which proved effective in fast-tracking the development of our information extraction and retrieval systems.
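
As a quick consistency check on the evaluation figures above, the reported F-measure can be recomputed from the reported precision and recall. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function and variable names are illustrative only, not taken from the thesis.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
reported_precision = 0.9800
reported_recall = 0.9763

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.2%}")  # prints "F-measure: 97.81%"
```

Under that assumption, the recomputed value agrees with the 97.81% stated in the abstract.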
