Research

My graduate and postgraduate research spans machine learning, natural language processing (NLP), and medical informatics. My thesis work focused on developing new methods for resolving semantic ambiguities.

As a researcher at Boston Children’s Hospital and Harvard Medical School, I work on applications of machine learning and NLP to health and medicine. Specifically, I am involved in several projects that focus on deep semantic analysis of clinical texts, including relation extraction between medical entities, phenotype creation, and other types of data mining. I lead the development of the methods and software for relation extraction; our best-performing methods were released as open source as part of the clinical Text Analysis and Knowledge Extraction System (cTAKES), the most widely adopted software for clinical text processing. The relation extractor later became the basis for the temporal relation extraction software developed for the THYME (Temporal Histories of Your Medical Events) project.