
We introduce a novel schema for mechanisms that generalizes across many types of activities, functions and influences in scientific literature. This repository contains models, datasets and experiments described in our NAACL 2021 paper: Extracting a Knowledge Base of Mechanisms from COVID-19 Papers.

Please cite our paper if you use our datasets or models in your project. See the BibTeX.
Feel free to email us.

News

Check out the DyGIE++ repo for an example notebook that loads a model trained on MECHANIC and extracts relations through a spaCy interface.

Annotated datasets

We provide two annotated datasets:

Coarse-grained mechanism relations (Direct and Indirect)
Granular mechanism relations (Subject-Predicate-Object)

From project root, run scripts/data/get_mechanic.sh to download both datasets to the data directory.

Coarse-grained relations will be downloaded to data/mechanic/coarse/[train,dev,test].json. The development and test sets are also available in tabular format: data/mechanic/coarse-gold/[dev,test]-gold.tsv
Granular relations will be downloaded to data/mechanic/granular/[train,dev,test].json. Tabular format: data/mechanic/granular-gold/[dev,test]-gold.tsv
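
As a quick sanity check after downloading, the splits can be loaded in Python. The sketch below assumes the files follow the DyGIE++ convention of one JSON document per line. The helper name load_jsonl and the fields in the demo record are illustrative and not part of this repository.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def load_jsonl(path: str) -> List[Dict]:
    """Read one JSON document per non-empty line (assumed DyGIE++-style layout)."""
    docs = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines between documents
                docs.append(json.loads(line))
    return docs


if __name__ == "__main__":
    # Demo on a throwaway file; the record below is made up for illustration.
    sample = {"doc_key": "demo-0", "sentences": [["Masks", "reduce", "transmission", "."]]}
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = Path(tmp) / "train.json"
        demo_path.write_text(json.dumps(sample) + "\n", encoding="utf-8")
        docs = load_jsonl(str(demo_path))
        print(len(docs), sorted(docs[0].keys()))
```

With the real data in place, the same helper would be called as load_jsonl("data/mechanic/coarse/train.json").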

Pretrained models

We provide models pre-trained on both datasets.

Downloads

From project root, run scripts/pretrained/get_mechanic_pretrained.sh to download all the available pretrained models to the pretrained directory. If you only want one model, here are the download links.

Dependencies

This repository is forked from DyGIE++ (Wadden et al., 2019).

This code was developed using Python 3.7. To create a new Conda environment with Python 3.7, run conda create --name mechanic python=3.7.

This library relies on AllenNLP and uses AllenNLP shell commands to kick off training, evaluation, and testing.

We use Allentune for hyperparameter search. To install a compatible version, clone the Allentune repository outside of the dygiepp directory:

git clone https://github.com/allenai/allentune.git

Then replace the corresponding files in the clone with the versions provided in this repository:

cp -r allentune_files/ [location of downloaded allentune]

Then install Allentune by running

pip install --editable .

from inside the cloned allentune folder.

After installing Allentune, install the libraries required by DyGIE++. The necessary dependencies can be installed with

pip install -r requirements.txt

Making predictions on existing datasets

To make a prediction, you can use allennlp predict. For example, to make a prediction with a pretrained granular relation model:

allennlp predict pretrained/mechanic-granular.tar.gz \
    data/mechanic/granular/test.json \
    --predictor dygie \
    --include-package dygie \
    --use-dataset-reader \
    --output-file predictions/granular-test.jsonl \
    --cuda-device 0 \
    --silent

For predicting coarse relations using a pretrained model:

allennlp predict pretrained/mechanic-coarse.tar.gz \
    data/mechanic/coarse/test.json \
    --predictor dygie \
    --include-package dygie \
    --use-dataset-reader \
    --output-file predictions/coarse-test.jsonl \
    --cuda-device 0 \
    --silent

Running these commands produces JSON-formatted predictions.
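
To run both predictions in one go, the two commands above can be assembled from Python. This is a minimal sketch rather than a script shipped with this repository: the function names are hypothetical, the flags are copied from the commands above, and dry_run defaults to True so that nothing is executed unless requested.

```python
import shlex
import subprocess
from typing import List


def build_predict_command(task: str, cuda_device: int = 0) -> List[str]:
    """Assemble the `allennlp predict` call shown above for 'coarse' or 'granular'."""
    if task not in ("coarse", "granular"):
        raise ValueError(f"unknown task: {task}")
    return [
        "allennlp", "predict",
        f"pretrained/mechanic-{task}.tar.gz",
        f"data/mechanic/{task}/test.json",
        "--predictor", "dygie",
        "--include-package", "dygie",
        "--use-dataset-reader",
        "--output-file", f"predictions/{task}-test.jsonl",
        "--cuda-device", str(cuda_device),
        "--silent",
    ]


def run_predictions(dry_run: bool = True) -> None:
    """Print (dry run) or execute the prediction command for both models."""
    for task in ("coarse", "granular"):
        command = build_predict_command(task)
        if dry_run:
            # Quote each part so the printed line can be pasted into a shell.
            print(" ".join(shlex.quote(part) for part in command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_predictions(dry_run=True)
```

Calling run_predictions(dry_run=False) from the project root would execute both commands and stop at the first failure, since check=True raises on a non-zero exit code.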

Alternatively, you can use the predict scripts provided by this library to generate both .tsv and .json files. Use

python predict_coarse.py --data_dir data/mechanic/coarse --device 0 --serial_dir pretrained/mechanic-coarse.tar.gz --pred_dir predictions/coarse-test/

for coarse relation predictions and

python predict_granular.py --data_dir data/mechanic/granular --device 0 --serial_dir pretrained/mechanic-granular.tar.gz --pred_dir predictions/granular-test/

for granular relation predictions.

Relation extraction evaluation metric

We report Precision/Recall/F1 measured by using exact and partial span-matching functions. Full details are described in our paper.
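
To illustrate how the choice of span-matching function changes the scores, here is a small sketch. It is not the evaluation code from this repository, and the partial matcher (token-level Jaccard overlap with a 0.5 threshold) is only a stand-in for the matching functions defined in the paper. Relations are assumed to be (argument 1 text, argument 2 text, label) tuples, and the example relations and label strings are made up.

```python
from typing import Callable, List, Tuple

Relation = Tuple[str, str, str]  # (arg0 text, arg1 text, label); assumed layout
SpanMatcher = Callable[[str, str], bool]


def exact_match(pred_span: str, gold_span: str) -> bool:
    """Spans match only if they are identical after case and whitespace cleanup."""
    return pred_span.strip().lower() == gold_span.strip().lower()


def partial_match(pred_span: str, gold_span: str, threshold: float = 0.5) -> bool:
    """Stand-in partial matcher: token-set Jaccard overlap at or above threshold."""
    pred_tokens = set(pred_span.lower().split())
    gold_tokens = set(gold_span.lower().split())
    if not pred_tokens or not gold_tokens:
        return False
    overlap = len(pred_tokens & gold_tokens) / len(pred_tokens | gold_tokens)
    return overlap >= threshold


def relation_matches(pred: Relation, gold: Relation, span_match: SpanMatcher) -> bool:
    """A relation matches when the label agrees and both argument spans match."""
    return (
        pred[2] == gold[2]
        and span_match(pred[0], gold[0])
        and span_match(pred[1], gold[1])
    )


def precision_recall_f1(
    preds: List[Relation], golds: List[Relation], span_match: SpanMatcher
) -> Tuple[float, float, float]:
    """Precision over predictions, recall over gold relations, and their F1."""
    matched_preds = sum(
        any(relation_matches(p, g, span_match) for g in golds) for p in preds
    )
    matched_golds = sum(
        any(relation_matches(p, g, span_match) for p in preds) for g in golds
    )
    precision = matched_preds / len(preds) if preds else 0.0
    recall = matched_golds / len(golds) if golds else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    golds = [("the virus", "cell entry", "DIRECT"), ("masks", "transmission", "INDIRECT")]
    preds = [
        ("the virus", "cell entry", "DIRECT"),
        ("face masks", "transmission", "INDIRECT"),
        ("heat", "viral load", "DIRECT"),
    ]
    print("exact:  ", precision_recall_f1(preds, golds, exact_match))
    print("partial:", precision_recall_f1(preds, golds, partial_match))
```

In this toy example the prediction "face masks" counts as correct only under the partial matcher, which is why partial scores are never lower than exact scores on the same predictions.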

Training with Allentune

We use Allentune for hyperparameter tuning. To train a model for coarse relation extraction using Allentune, you can run the script below.

python scripts/train/train_coarse_allentune.py --data_dir data/mechanic/coarse/ --serial_dir models/coarse/ --gpu_count 4 --cpu_count 12 --device 0,1,2,3

To train the model for granular relations:

python scripts/train/train_granular_allentune.py --data_dir data/mechanic/granular/ --serial_dir ./models/granular --gpu_count 4 --cpu_count 12 --device 0,1,2,3

The default number of training samples is set to 30. For more training options, pass the --h flag.
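
For intuition, assuming "samples" here means hyperparameter configurations drawn by random search (as in Allentune), the idea can be sketched in plain Python. This does not use Allentune's API. The search space, the parameter names and the toy objective are all invented for illustration, and a real run would train a model for each configuration instead.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]
Trial = Tuple[int, Config, float]  # (run index, sampled config, score)


def sample_config(rng: random.Random) -> Config:
    """Draw one configuration from a made-up search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -3),  # log-uniform in [1e-5, 1e-3]
        "dropout": rng.uniform(0.0, 0.5),
        "hidden_size": rng.choice([150, 300, 600]),
    }


def random_search(
    objective: Callable[[Config], float], num_samples: int = 30, seed: int = 0
) -> List[Trial]:
    """Evaluate num_samples independently sampled configurations."""
    rng = random.Random(seed)  # fixed seed keeps the search reproducible
    trials = []
    for index in range(num_samples):
        config = sample_config(rng)
        trials.append((index, config, objective(config)))
    return trials


def toy_objective(config: Config) -> float:
    """Stand-in for a dev-set score; peaks near lr=1e-4 and dropout=0.2."""
    lr_penalty = abs(math.log10(config["learning_rate"]) + 4) / 2
    dropout_penalty = abs(config["dropout"] - 0.2)
    return 1.0 - lr_penalty - dropout_penalty


if __name__ == "__main__":
    trials = random_search(toy_objective, num_samples=30)
    best_index, best_config, best_score = max(trials, key=lambda trial: trial[2])
    print(f"best run {best_index}: score={best_score:.3f} config={best_config}")
```

Each trial is independent of the others, which is what lets such a search be spread across several GPUs.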

To obtain predictions for the development set over all Allentune runs:

python predict_coarse_allentune.py --data_dir data/mechanic/coarse/ --device 0 --serial_dir models/coarse/ --pred_dir predictions/coarse

for the coarse relation model and

python predict_granular_allentune.py --serial_dir ./models/granular --data_dir ./data/mechanic/granular/ --pred_dir ./predictions/granular/

for the granular relation model.
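
Once development-set scores are available for every run, a single run is typically chosen for test-time inference. The sketch below picks the run with the highest development F1. The scores are made-up placeholders and the helper name is hypothetical, and how the scores are read from the prediction directory is left out because it depends on the output layout.

```python
from typing import Dict


def best_run_index(dev_f1_by_run: Dict[int, float]) -> int:
    """Return the run index with the highest dev F1 (lowest index wins ties)."""
    if not dev_f1_by_run:
        raise ValueError("no runs to choose from")
    # max() keeps the first maximal item, so iterating over sorted keys
    # resolves ties in favor of the smallest run index.
    return max(sorted(dev_f1_by_run), key=lambda index: dev_f1_by_run[index])


if __name__ == "__main__":
    # Placeholder scores for illustration only; not results from the paper.
    dev_scores = {0: 0.41, 5: 0.44, 17: 0.47}
    print("run index to use for test inference:", best_run_index(dev_scores))
```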

You can get test set predictions by specifying the index of the run you want to use for inference:

python predict_coarse_allentune.py --data_dir data/mechanic/coarse/ --device 0,1,2,3 --serial_dir models/coarse/  --pred_dir predictions/coarse

for coarse relations and

python predict_granular_allentune.py --serial_dir ./models/granular --data_dir ./data/mechanic/granular/ --pred_dir ./predictions/granular/ --test_data --test_index 17

for granular relations.

Citation

If using our dataset and models, please cite:

@inproceedings{mechanisms21,
    title={{Extracting a Knowledge Base of Mechanisms from COVID-19 Papers}},
    author={Tom Hope and Aida Amini and David Wadden and Madeleine van Zuylen and Sravanthi Parasa and Eric Horvitz and Daniel Weld and Roy Schwartz and Hannaneh Hajishirzi},
    year={2021},
    booktitle={NAACL}
}

Contact us

Please don't hesitate to reach out.

Email: tomh@allenai.org, amini91@cs.washington.edu
