How to Better Pretrain a BERT
References
Train No Evil: Selective Masking for Task-Guided Pre-Training
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Recent Advances in Language Model Fine-tuning
Reducing Toxicity in Language Models
[1] BioBERT: a pre-trained biomedical language representation model for biomedical text mining
[2] SciBERT: A Pretrained Language Model for Scientific Text
[3] PatentBERT: Patent Classification by Fine-Tuning BERT Language Model
[4] FinBERT: Financial Sentiment Analysis with Pre-trained Language Models
[5] SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics
[6] DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis
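The common recipe behind these works (e.g. BioBERT, SciBERT, and the DAPT/TAPT setups in "Don't Stop Pretraining") is to continue masked-language-model pretraining on in-domain or task-relevant text before fine-tuning. Below is a minimal sketch of such continued pretraining, assuming the Hugging Face transformers library; the two-sentence corpus and the output directory name are placeholders, and the specific masking/learning-rate choices simply follow the original BERT defaults rather than any one of the papers above.

```python
import torch
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder in-domain corpus; in practice this is a large unlabeled text file.
corpus = [
    "the patient was administered 5 mg of warfarin daily",
    "anticoagulation therapy reduced the risk of stroke",
]

# Start from a general-domain checkpoint and keep training its MLM head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

encodings = tokenizer(corpus, truncation=True, max_length=128)

class MLMDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output so the Trainer can index individual examples."""
    def __init__(self, encodings):
        self.encodings = encodings
    def __len__(self):
        return len(self.encodings["input_ids"])
    def __getitem__(self, i):
        return {k: v[i] for k, v in self.encodings.items()}

# Dynamic masking: 15% of tokens are masked anew in every batch,
# matching the masking rate of the original BERT pretraining.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="dapt-bert",          # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

Trainer(model=model, args=args,
        train_dataset=MLMDataset(encodings),
        data_collator=collator).train()
```

After this step, the checkpoint in the output directory is fine-tuned on the downstream task as usual; methods like Train No Evil replace the uniform 15% masking above with a task-guided selection of which tokens to mask.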