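I took part in the recently concluded Kaggle NBME competition and finished 49th, earning a silver medal.
During the competition I looked into transformer models for the medical domain, so I have summarized them here. (If I have missed anything, I would really appreciate hearing about it in the comments or on social media.)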
BERT
SciBERT (2019)
base
uncased, cased
models: https://huggingface.co/models?sort=downloads&search=allenai%2Fscibert
github: https://github.com/allenai/scibert
paper: https://aclanthology.org/D19-1371/
BioBERT (2019)
base, large
cased
models: https://huggingface.co/models?sort=downloads&search=dmis-lab%2Fbiobert
github: https://github.com/dmis-lab/biobert
paper: https://academic.oup.com/bioinformatics/article/36/4/1234/5566506
ClinicalBERT (2019)
base
cased
models: https://huggingface.co/models?sort=downloads&search=emilyalsentzer%2Fclinicalbert
github: https://github.com/EmilyAlsentzer/clinicalBERT
paper: https://aclanthology.org/W19-1909/
BlueBERT (2019)
base, large
uncased
models: https://huggingface.co/models?sort=downloads&search=bionlp%2Fbluebert
github: https://github.com/ncbi-nlp/bluebert
paper: https://aclanthology.org/W19-5006/
PubMedBERT (2020)
base (large not released at the time of writing)
uncased
models: https://huggingface.co/models?sort=downloads&search=microsoft%2Fbiomednlp-pubmedbert
github: None
paper: https://arxiv.org/abs/2007.15779
BioLinkBERT (2022)
base, large
uncased
models: https://huggingface.co/models?sort=downloads&search=michiyasunaga%2Fbiolinkbert
github: https://github.com/michiyasunaga/LinkBERT
paper: https://arxiv.org/abs/2203.15827
RoBERTa
BioMedRoBERTa (2020)
base
cased
models: https://huggingface.co/models?sort=downloads&search=allenai%2Fbiomedroberta
github: https://github.com/allenai/dont-stop-pretraining
paper: https://arxiv.org/abs/2004.10964
BioLMRoBERTa (2020)
base, large
cased
models: downloadable from GitHub (Attribution-NonCommercial 4.0 International license)
github: https://github.com/facebookresearch/bio-lm
paper: https://aclanthology.org/2020.clinicalnlp-1.17/
BART
BioBART (2022)
base, large
cased
models: https://huggingface.co/models?sort=downloads&search=ganjinzero%2Fbiobart
github: https://github.com/GanjinZero/BioBART
paper: https://arxiv.org/abs/2204.03905
ALBERT
BioM-ALBERT (2021)
xxlarge
uncased
models: https://huggingface.co/models?sort=downloads&search=sultan%2Fbiom-albert-xxlarge
github: https://github.com/salrowili/BioM-Transformers
paper: https://aclanthology.org/2021.bionlp-1.24/
ELECTRA
BioELECTRA (2021)
base
uncased
models: https://huggingface.co/models?sort=downloads&search=kamalkraj%2Fbioelectra
github: https://github.com/kamalkraj/BioELECTRA
paper: https://aclanthology.org/2021.bionlp-1.16/
BioM-ELECTRA (2021)
base, large
uncased
models: https://huggingface.co/models?sort=downloads&search=sultan%2Fbiom-electra
github: https://github.com/salrowili/BioM-Transformers
paper: https://aclanthology.org/2021.bionlp-1.24/
T5
SciFive (2021)
base, large
cased
models: https://huggingface.co/models?sort=downloads&search=razent%2Fscifive
github: https://github.com/justinphan3110/SciFive
paper: https://arxiv.org/abs/2106.03598
Others
Here is a brief list of the other models I came across.
bioformer: https://huggingface.co/bioformers
pegasus-pubmed: https://huggingface.co/google/pegasus-pubmed
bigbird-pegasus-pubmed: https://huggingface.co/google/bigbird-pegasus-large-pubmed
sentence-transformer-biobert: https://huggingface.co/pritamdeka/S-BioBert-snli-multinli-stsb
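
As a rough sketch of how the checkpoints listed above could be tried out, the snippet below rebuilds the Hub search links used in the "models:" entries and wraps the usual Hugging Face `transformers` loading calls. The helper names `hub_search_url` and `load_encoder` are hypothetical, and the checkpoint ID in the final comment is an assumption: this post only gives search links, so check the exact ID on the Hub before using it.

```python
from urllib.parse import urlencode


def hub_search_url(query: str) -> str:
    """Build a Hub search URL in the same form as the "models:" links above."""
    params = urlencode({"sort": "downloads", "search": query})
    return "https://huggingface.co/models?" + params


def load_encoder(model_id: str):
    """Load the tokenizer and encoder for a Hub checkpoint (downloads on first use)."""
    # Imported lazily so the URL helper also works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


print(hub_search_url("allenai/scibert"))
# -> https://huggingface.co/models?sort=downloads&search=allenai%2Fscibert

# Example call (assumed checkpoint ID; requires network access):
# tokenizer, model = load_encoder("allenai/scibert_scivocab_uncased")
```

When swapping between the models in this list, keep the cased/uncased distinction in mind, since it changes how the tokenizer handles the input text.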