TERATEC 2019 Forum
Workshops - Wednesday June 12

Workshop 4 - 09:00 to 12:30
Digital and Data Sciences for precision medicine – from projects to reality

Dr Warehouse – a translational data warehouse

By Nicolas GARCELON, Head of the Data Science platform at Imagine, Institute of Genetic Diseases & Co-founder of the startup codoc

The repurposing of clinical data for research has become widespread with the development of clinical data warehouses. These data warehouses are modeled to integrate and explore structured data linked to thesauri. Such data come mainly from automated systems (biology, genetics, cardiology, etc.) or from coded data (DRG). Clinicians, however, also produce textual data such as hospital reports (hospitalization, surgery, imaging, pathology, etc.). This mass of data, barely used by conventional data warehouses, is an essential source of information. Free text can describe a patient's clinical picture in more detail, including the absence of signs and uncertainty. This wealth of information makes clinical text a valuable source for translational research. Exploiting it, however, requires algorithms and tools adapted both to this type of data and to its users (i.e. doctors and researchers).

To meet these needs, we designed a data warehouse (Dr Warehouse) centered on the clinical document. Through three use cases for translational research in the context of rare diseases, we have tried to address the problems inherent to textual data: (i) the recruitment of patients through a search engine adapted to textual data (handling negation and family history), (ii) automated phenotyping from textual data, and (iii) decision support based on phenotypic similarity among patients.
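As an illustration only, the first and third use cases can be sketched in a few lines of Python. This is not the Dr Warehouse implementation: the cue lists, the function names (`mentions_affirmed`, `search_reports`, `jaccard`), and the sample reports are all hypothetical, and Jaccard overlap is simply one common way to compare sets of phenotypes.

```python
import re

# Hypothetical cue lists; a real system would need far richer, language-specific rules.
NEGATION_RE = re.compile(r"\b(no|not|without|denies|absence of)\b")
FAMILY_RE = re.compile(r"\b(mother|father|brother|sister|family history)\b")


def mentions_affirmed(text, term):
    """True if `term` occurs in a sentence with no negation or family cue before it."""
    term = term.lower()
    for sentence in re.split(r"[.;\n]+", text.lower()):
        pos = sentence.find(term)
        if pos == -1:
            continue
        prefix = sentence[:pos]
        if NEGATION_RE.search(prefix) or FAMILY_RE.search(prefix):
            continue  # the sign is negated or refers to a relative
        return True
    return False


def search_reports(reports, term):
    """Return the ids of reports in which `term` is affirmed for the patient."""
    return sorted(pid for pid, text in reports.items() if mentions_affirmed(text, term))


def jaccard(a, b):
    """Jaccard overlap between two sets of phenotype labels (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented sample reports, for illustration only.
reports = {
    "p1": "The patient presents with epilepsy. No fever.",
    "p2": "No epilepsy. Mild hypotonia.",
    "p3": "Mother has epilepsy. The patient has hypotonia.",
}
matches = search_reports(reports, "epilepsy")
similarity = jaccard({"epilepsy", "hypotonia"}, {"hypotonia", "microcephaly"})
```

With these sample reports, only the first one matches a search for "epilepsy": the second negates the sign and the third attributes it to a relative. A plain keyword search would have returned all three, which is why negation and family history matter for patient recruitment.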

We were able to evaluate these methods on the data warehouse that we created and provided to Necker-Enfants Malades, which integrates approximately 660,000 patients and 5.5 million reports. These methods and algorithms have been integrated into Dr Warehouse, which has been distributed under an open source license since September 2017.

Nicolas Garcelon holds an agricultural engineering degree (2000) and a PhD in public health, specializing in biomedical informatics. From 2001 to 2012, he worked in the medical information department of Rennes University Hospital (CHU de Rennes). Since 2012, he has led the data science platform of the Imagine Institute, and since 2018 he has coordinated the institute's work package 1. He is also a researcher in the INSERM team "Information Sciences to support Personalized Medicine". He designs and develops hospital software for clinicians and researchers to facilitate the entry and exploration of medical data. In particular, he created Dr Warehouse®, released under an open source license in September 2017. Dr Warehouse is a biomedical data warehouse that lets clinicians search, visualize, and analyze patient data intuitively and efficiently. In November 2017, he co-founded the spin-off codoc, which distributes and installs Dr Warehouse in hospitals.


Register now and get your badge here

  • The TERATEC Forum is strictly reserved for professionals.
  • Participation in the exhibition, conferences and workshops is free (subject to seat availability).
  • Online registration is required to attend the exhibition, conferences or workshops.
  • Because the Vigipirate security plan has been raised to its highest level, you must register online in advance and bring an identity card in order to participate in the TERATEC Forum.
  • The badge is free of charge and gives you access to all TERATEC Forum events.

For any other information regarding the workshops, please contact:

Jean-Pascal JEGU
Tel: +33 (0)9 70 65 02 10
2, rue de la Piquetterie

