Pharmacovigilance is the field of science devoted to the collection, analysis, and prevention of Adverse Drug Reactions (ADRs). Efficient strategies for extracting information about ADRs from free-text sources are essential to support the important task of detecting and classifying unexpected pathologies possibly related to (therapy-related) drug use. Narrative ADR descriptions may be collected in different ways, e.g., by monitoring social networks or through so-called “spontaneous reporting”, the main method pharmacovigilance adopts to identify ADRs. Encoding free-text ADR descriptions according to the MedDRA standard terminology is central to report analysis. It is a complex task that has to be carried out manually by pharmacovigilance experts. Manual encoding is expensive in terms of time. Moreover, the accuracy of the encoding may suffer, since the number of reports grows day by day. In this paper, we propose MagiCoder, an efficient Natural Language Processing algorithm able to automatically derive MedDRA terms from free-text ADR descriptions. MagiCoder is part of VigiWork, a web application for online ADR reporting and analysis. From a practical point of view, MagiCoder reduces the encoding time of ADR reports: pharmacologists simply have to review and validate the MedDRA terms proposed by MagiCoder, instead of choosing the right terms among the 70K terms of MedDRA. This improvement in the efficiency of pharmacologists’ work also has a relevant impact on the quality of the subsequent data analysis.
Our proposal is based on a general approach that does not depend on the language considered. Indeed, we developed MagiCoder for the Italian pharmacovigilance language, but preliminary analyses show that it is robust to language and dictionary changes.