News

Article by Max Savery Published in Scientific Data


Searching online for answers to health questions can be challenging for anyone, especially users without medical expertise. One of the best ways to make this process easier is to provide an easily digestible summary of each web page. While it is impossible to manually curate summaries for every user’s specific query, recent advances in deep learning now make it possible to summarize health information automatically. Before these algorithms can provide summaries to actual users, however, the quality of the summaries they produce must be evaluated. For this reason, the paper "Question-driven summarization of answers to consumer health questions" introduces a new dataset, the MEDIQA-AnS collection (available at https://osf.io/fyg46/). The collection contains consumer questions, answers from reliable web pages, and manually written summaries of these answers, and it can be used to evaluate a variety of automatic summarization approaches on a wide range of health information summarization tasks. The paper can be found at https://doi.org/10.1038/s41597-020-00667-z