About
emrQA is a clinical question answering dataset that contains questions (along with question paraphrases, logical forms, and answers) posed by physicians against clinical notes in electronic medical records. For example:
Question: How was the patient's extensive liver metastases diagnosed?
Paraphrase: What diagnosis was used for the patient's extensive liver metastases?
Logical Form: {LabEvent (x) [date=x, result=x] OR ProcedureEvent (x) [date=x, result=x] OR VitalEvent (x) [date=x, result=x]} reveals ConditionEvent (|problem|)
Answer: An abdominal and pelvic ct scan with iv contrast
For more details about emrQA, please refer to the paper:
Dataset
emrQA has 1 million question-logical form pairs and 400,000+ question-answer evidence pairs.
Please visit our GitHub repository to create the dataset from the i2b2 NER dataset:
Submission
To submit your model, please follow the instructions in the GitHub repository.
Citation
If you use emrQA in your research, please cite our paper:
@article{pampari2018emrqa,
  title={emrQA: A large corpus for question answering on electronic medical records},
  author={Pampari, Anusri and Raghavan, Preethi and Liang, Jennifer and Peng, Jian},
  journal={arXiv preprint arXiv:1809.00732},
  year={2018}
}
Rank | Model | Code | Exact Match (%) | F1-score (%) |
---|---|---|---|---|
1 (Apr 13, 2020) | Baseline Model, University of Massachusetts - Amherst (Paper et al. 2020) | | 00.00 | 00.19 |
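The leaderboard reports Exact Match and F1-score but does not say how they are computed. The sketch below shows one common convention for extractive question answering (SQuAD-style answer normalization followed by token-overlap F1). It is an assumption for illustration, not the official emrQA evaluation script; the helper names `normalize`, `exact_match`, and `token_f1` are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall (assumed convention)."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Gold answer taken from the example in the About section.
gold = "An abdominal and pelvic ct scan with iv contrast"
prediction = "abdominal and pelvic CT scan"

em = exact_match(prediction, gold)
f1 = token_f1(prediction, gold)
print(f"EM = {em:.2f}, F1 = {f1:.4f}")
```

Under this convention a partially correct span receives no Exact Match credit but a nonzero F1, which is why the two columns can differ. Leaderboard figures would be these per-question scores averaged over the test set and multiplied by 100.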