Medical records may cover a very long disease history (up to 30 years) and include a vast number of diagnoses, symptoms, results, medications, and laboratory values. Clinical information systems would therefore benefit greatly from advanced search capabilities for retrieving the relevant data. We propose a three-stage process:

  1. offline textual information extraction from medical records in transplant medicine;
  2. the generation of facetted search capabilities on the results
    of the previous stage;
  3. the combination of the information extraction results with structured laboratory values.
    Here we focus on supporting physicians in identifying groups of patients with shared or
    differing attributes that are highly relevant for further treatment.

The patients' metadata and the structured medical data are modelled and indexed as main objects with sub-objects, which enables us to construct more complex search queries. We distinguish between predefined facets built upon simpler attributes (e.g., gender, age, blood group, existence of findings) and complex dynamic facets (especially temporal relationships between events), which the user assembles while executing the facetted search.
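The distinction between predefined and dynamic facets can be illustrated with a minimal Python sketch. It is not the system's actual data model or index: the class names `Patient` and `Event`, the helpers `gender_is`, `followed_within`, and `search`, and the sample data are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, List


@dataclass
class Event:
    """Sub-object: one dated entry in a patient's history (hypothetical)."""
    kind: str
    day: date


@dataclass
class Patient:
    """Main object: simple attributes plus a list of sub-objects."""
    patient_id: str
    gender: str
    blood_group: str
    events: List[Event] = field(default_factory=list)


def gender_is(value: str) -> Callable[[Patient], bool]:
    """Predefined facet on a simple attribute."""
    return lambda p: p.gender == value


def followed_within(first: str, second: str, max_days: int) -> Callable[[Patient], bool]:
    """Dynamic temporal facet: `second` occurs at most `max_days` after `first`."""
    def predicate(p: Patient) -> bool:
        firsts = [e.day for e in p.events if e.kind == first]
        seconds = [e.day for e in p.events if e.kind == second]
        return any(0 <= (s - f).days <= max_days for f in firsts for s in seconds)
    return predicate


def search(patients: List[Patient], *facets: Callable[[Patient], bool]) -> List[Patient]:
    """Keep the patients that satisfy every selected facet."""
    return [p for p in patients if all(facet(p) for facet in facets)]


# Invented sample data, for illustration only.
patients = [
    Patient("p1", "f", "A", [Event("transplantation", date(2010, 1, 10)),
                             Event("rejection", date(2010, 2, 1))]),
    Patient("p2", "f", "0", [Event("transplantation", date(2011, 5, 1)),
                             Event("rejection", date(2012, 6, 1))]),
    Patient("p3", "m", "A", [Event("transplantation", date(2012, 3, 3)),
                             Event("rejection", date(2012, 3, 20))]),
]

hits = search(patients, gender_is("f"),
              followed_within("transplantation", "rejection", 90))
hit_ids = [p.patient_id for p in hits]
```

In this sketch a temporal facet is just another predicate, so it combines freely with the attribute facets. A production system would evaluate such conditions against an index rather than by scanning every patient.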

The facetted search provides information about the existing values and the expected cardinality of the result set in real time, even before an attribute (e.g., ‘examination of an organ’) is restricted to a value (e.g., ‘thorax’). This feature is of great importance for the main application of the system, which is the identification of patient cohorts for medical studies. Additionally, the user may remove any restriction made earlier in the search process at any time, which gives him or her great flexibility in narrowing down the search.
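The way value cardinalities can be reported before an attribute is restricted may be sketched as follows. This is a simplified illustration, not the system's implementation: the flat records, the attribute names, and the functions `apply_restrictions` and `facet_counts` are assumptions, and a real deployment would obtain the counts from a search index instead of scanning a list.

```python
from collections import Counter

# Invented flat records with one value per attribute, for illustration only.
records = [
    {"gender": "f", "blood_group": "A", "examined_organ": "thorax"},
    {"gender": "m", "blood_group": "A", "examined_organ": "thorax"},
    {"gender": "f", "blood_group": "0", "examined_organ": "abdomen"},
    {"gender": "f", "blood_group": "A", "examined_organ": "abdomen"},
]


def apply_restrictions(records, restrictions):
    """Return the records that match every attribute=value restriction."""
    return [r for r in records
            if all(r.get(attr) == value for attr, value in restrictions.items())]


def facet_counts(records, restrictions, attribute):
    """Count each value of `attribute` among the records still remaining.

    The count for a value is the size of the result set the user would get
    by additionally restricting `attribute` to that value.
    """
    remaining = apply_restrictions(records, restrictions)
    return Counter(r[attribute] for r in remaining if attribute in r)


restrictions = {"gender": "f"}
counts = facet_counts(records, restrictions, "examined_organ")

# Adding a restriction narrows the counts further.
restrictions["blood_group"] = "A"
narrowed = facet_counts(records, restrictions, "examined_organ")

# Removing an earlier restriction is just deleting its entry.
del restrictions["gender"]
widened = facet_counts(records, restrictions, "examined_organ")
```

Because the restrictions are kept as an unordered set of attribute and value pairs, any one of them can be dropped independently of the order in which it was added.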

The user interface groups the facets thematically into ‘Stammdaten’ (master data), ‘Anamnese’ (medical history), ‘Diagnoses’, etc. Each facet shows its most frequent values with their cardinalities and highlights the values that are common to all remaining patients. In addition, lists of all remaining values can be filtered directly through a completion mechanism, which is very helpful when free-text input has produced different notations for the same value; alternatively, the user may run a free-text search with wildcards over textual values.
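The two ways of filtering a value list can be sketched roughly as below. The functions `complete` and `wildcard_search` and the sample values are hypothetical; the sketch assumes case-insensitive substring matching for completion and shell-style wildcards (`*`, `?`) as implemented by Python's standard `fnmatch` module, neither of which is confirmed as the system's actual behaviour.

```python
from fnmatch import fnmatchcase


def complete(values, typed):
    """Completion filter: keep values containing the typed text, ignoring case."""
    needle = typed.casefold()
    return sorted(v for v in values if needle in v.casefold())


def wildcard_search(values, pattern):
    """Wildcard filter: keep values matching a shell-style pattern, ignoring case."""
    folded = pattern.casefold()
    return sorted(v for v in values if fnmatchcase(v.casefold(), folded))


# Invented value list showing notation variants from free-text input.
values = {"Thorax", "thorax p.a.", "CT Thorax", "Abdomen", "Niere links"}

completed = complete(values, "thor")
matched = wildcard_search(values, "thorax*")
```

Completion matches anywhere in the value, so it also finds ‘CT Thorax’, whereas the pattern `thorax*` only matches values that begin with that word.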



Hans-Jürgen Profitlich