This project performs Named Entity Recognition (NER) on medical forum posts to identify mentions of Drugs, Diseases, Symptoms, and Adverse Drug Reactions (ADRs). It also links the extracted ADRs to SNOMED-CT medical codes.
The project uses the CADEC (CSIRO Adverse Drug Event Corpus) dataset. The expected dataset layout:
```
cadec/
├── text/      # Raw forum posts (.txt)
├── original/  # Ground truth annotations (.ann)
├── meddra/    # ADR annotations using MedDRA terminology
└── sct/       # Annotations linked to SNOMED-CT codes
```
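The `.ann` annotation files follow the brat standoff style; the lines below are illustrative only (IDs, offsets, and text are made up):

```
T1	ADR 9 19	bit drowsy
T2	Drug 29 36	Lipitor
```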
The project is divided into several tasks:
### Task 1: Entity Enumeration

- **Script:** `task1_entity_enumeration.py`
- **Purpose:** Parse `.ann` files from `cadec/original` to count unique entities (ADR, Drug, Disease, Symptom).
- **Output:** Summary statistics of entities in the dataset (counting logic sketched below).
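A minimal sketch of the counting step, assuming brat-style lines as above; the function name and directory default mirror the dataset layout but are illustrative, not the script's exact internals:

```python
from collections import Counter
from pathlib import Path

def count_entities(ann_dir="cadec/original"):
    """Tally entity labels (ADR, Drug, Disease, Symptom) across all .ann files."""
    counts = Counter()
    for ann_file in Path(ann_dir).glob("*.ann"):
        for line in ann_file.read_text(encoding="utf-8").splitlines():
            # Entity lines start with 'T'; format: "T1<TAB>ADR 9 19<TAB>bit drowsy"
            if line.startswith("T"):
                label = line.split("\t")[1].split()[0]
                counts[label] += 1
    return counts

if __name__ == "__main__":
    print(count_entities())
```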
### Task 2: LLM Sequence Labelling

- **Script:** `task2_llm_sequence_labelling.py`
- **Interactive version:** `task2.ipynb`
- **Purpose:** Apply a pre-trained biomedical NER model to label forum posts.
- **Process** (sketched after this list):
  - Read forum posts from `cadec/text`.
  - Apply the NER pipeline.
  - Merge sub-word tokens into complete entities.
  - Map model labels to ADR, Drug, Disease, Symptom.
  - Save predictions in `.ann`-style JSON (`*_predicted_spans.json`).
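A minimal sketch of the labelling loop, assuming the Hugging Face `transformers` token-classification pipeline; the checkpoint name and `LABEL_MAP` are placeholders, not necessarily what the script uses:

```python
import json
from pathlib import Path
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens back into whole spans
ner = pipeline("token-classification",
               model="d4data/biomedical-ner-all",  # placeholder checkpoint
               aggregation_strategy="simple")

# Hypothetical mapping from model label names to the four CADEC labels
LABEL_MAP = {"DRUG": "Drug", "DISEASE": "Disease", "SYMPTOM": "Symptom", "ADR": "ADR"}

Path("predictions").mkdir(exist_ok=True)
for txt_file in Path("cadec/text").glob("*.txt"):
    text = txt_file.read_text(encoding="utf-8")
    spans = [{"start": int(e["start"]), "end": int(e["end"]),
              "label": LABEL_MAP.get(e["entity_group"], e["entity_group"]),
              "text": e["word"]}
             for e in ner(text)]
    out = Path("predictions") / f"{txt_file.stem}_predicted_spans.json"
    out.write_text(json.dumps(spans, indent=2), encoding="utf-8")
```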
### Task 3: Evaluating Predictions

- **Script:** `task3_evaluate_predictions.py`
- **Purpose:** Evaluate NER model performance against the ground truth in `cadec/original`.
- **Method:** Strict matching of entity text and label; calculates Precision, Recall, and F1-score (see the sketch below).
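Strict matching reduces to set intersection over (text, label) pairs; this is a simplified illustration of the metric computation, not the exact script:

```python
def strict_prf(gold, pred):
    """gold/pred: iterables of (entity_text, label) pairs for one document."""
    gold_set, pred_set = set(gold), set(pred)
    tp = len(gold_set & pred_set)  # exact text + label matches
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: one correct match out of two predictions
print(strict_prf([("bit drowsy", "ADR")],
                 [("bit drowsy", "ADR"), ("Lipitor", "Disease")]))
```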
### Task 4: ADR-Specific Evaluation

- **Script:** `task4.py`
- **Purpose:** Specialized evaluation of the ADR label using `cadec/meddra`.
- **Output:** Precision, Recall, and F1-score for ADR detection.
### Task 5: Flexible Evaluation Metrics

Task 5 contains three scripts for flexible evaluation metrics:

- `batch_evaluation.py`: Computes relaxed precision, recall, and F1 by counting overlapping spans as correct. Works with `*_predicted_spans.json` files.
- `relaxed_eval.py`: Computes relaxed evaluation at file and macro/micro levels; reports skipped files.
- `token_level.py`: Computes token-level and word-presence F1 scores for predictions, useful for finer-grained analysis.

**Output:** Evaluation metrics per file and macro averages (the overlap test is sketched below).
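The relaxed criterion boils down to an interval-overlap test: a prediction counts as correct if it overlaps any gold span with the same label. A minimal sketch, assuming spans carry character offsets:

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two character-offset spans share at least one character."""
    return a_start < b_end and b_start < a_end

def relaxed_prf(gold, pred):
    """gold/pred: lists of (start, end, label) tuples for one document."""
    tp = sum(1 for ps, pe, pl in pred
             if any(overlaps(ps, pe, gs, ge) and pl == gl
                    for gs, ge, gl in gold))
    covered = sum(1 for gs, ge, gl in gold
                  if any(overlaps(ps, pe, gs, ge) and pl == gl
                         for ps, pe, pl in pred))
    precision = tp / len(pred) if pred else 0.0
    recall = covered / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```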
### Task 6: ADR Normalization (Entity Linking)

- **Script:** `task6.py`
- **Purpose:** Normalize detected ADR entities by linking them to SNOMED-CT concepts.
- **Methods** (sketched below):
  - Fuzzy string matching (`fuzzywuzzy`)
  - Sentence embeddings (`sentence-transformers`) for semantic similarity
- **Output:** `adr_sct_mappings.json` with mapping results.
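Both linking strategies can be sketched side by side; the `sct_terms` dictionary and the embedding model name below are stand-ins for whatever terminology file and checkpoint the script actually loads:

```python
from fuzzywuzzy import process
from sentence_transformers import SentenceTransformer, util

# Placeholder terminology: in practice this comes from the SNOMED-CT annotations
sct_terms = {"271782001": "drowsiness", "25064002": "headache", "422587007": "nausea"}
term_list = list(sct_terms.values())

def link_fuzzy(adr_text):
    """Best lexical match via fuzzywuzzy (Levenshtein-based ratio)."""
    term, score = process.extractOne(adr_text, term_list)
    return term, score

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model
term_emb = model.encode(term_list, convert_to_tensor=True)

def link_embedding(adr_text):
    """Best semantic match via cosine similarity of sentence embeddings."""
    query = model.encode(adr_text, convert_to_tensor=True)
    scores = util.cos_sim(query, term_emb)[0]
    best = int(scores.argmax())
    return term_list[best], float(scores[best])

print(link_fuzzy("felt drowsy"))      # lexical match
print(link_embedding("felt sleepy"))  # semantic match
```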
### Optional: Readable Mapping Comparison

- **Script:** `compare_adr_mapping_readable.py`
- **Purpose:** Generate a human-readable comparison of fuzzy vs. embedding mappings for ADRs.
- **Output:** `adr_comparison_readable.txt` showing side-by-side matches.
| File / Directory | Description |
|---|---|
| `task1_entity_enumeration.py` | Counts and summarizes unique entities in CADEC. |
| `task2_llm_sequence_labelling.py` | Performs NER using a pre-trained biomedical model. |
| `task3_evaluate_predictions.py` | Standard evaluation against `cadec/original`. |
| `task4.py` | ADR-specific evaluation using MedDRA. |
| `task5_relaxed_eval.py` | Relaxed evaluation considering span overlaps. |
| `batch_evaluation.py` | Relaxed evaluation allowing overlapping spans per file. |
| `relaxed_eval.py` | Relaxed evaluation with macro/micro metrics and skipped files. |
| `token_level.py` | Token-level and word-presence evaluation metrics. |
| `task6.py` | Links ADR entities to SNOMED-CT using fuzzy & embedding methods. |
| `compare_adr_mapping_readable.py` | Human-readable ADR mapping comparison. |
| `adr_sct_mappings.json` | JSON storing ADR to SNOMED-CT mapping results. |
| `adr_comparison_only.json` | JSON containing comparison data between fuzzy & embedding methods. |
| `adr_comparison_readable.txt` | Readable text file displaying ADR mapping comparisons. |
| `cadec/` | CADEC dataset (text and annotations). |
| `predictions/` | Folder storing predicted span JSON files. |
| `venv/` | Python virtual environment. |
- Install Python 3 and the required packages:

```bash
pip install torch transformers fuzzywuzzy sentence-transformers
```
Run the scripts sequentially:
```bash
python task1_entity_enumeration.py
python task2_llm_sequence_labelling.py
python task3_evaluate_predictions.py
python task4.py
python batch_evaluation.py # Task 5 evaluation script 1
python relaxed_eval.py # Task 5 evaluation script 2
python token_level.py # Task 5 evaluation script 3
python task6.py
python compare_adr_mapping_readable.py # Optional
```
Review the outputs in the `predictions/` folder and the evaluation metrics printed to the terminal.
## About

This repository provides a complete workflow for medical NER, evaluation, and entity linking, useful for NLP research in pharmacovigilance and adverse drug event detection.