Parameter | Mention recognition | Entity disambiguation |
Batch size | 32 | 24 |
Epochs | 30 | 1 |
Max length | 60 | 380 |
Learning rate | 1e-5 | 1e-5 |
Weight decay | 1e−2 | 1e−2 |
Warmup steps | 1000 | 1000 |
Dropout rate | 0.1 | 0.1 |
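
For readers who want to reuse these settings, the two columns of the table above could be written down as configuration objects, roughly as in the sketch below. The class name `TrainConfig`, the constant names, and the helper `warmup_lr` are hypothetical and chosen only for illustration. The table gives the number of warmup steps but not the shape of the schedule, so the linear warmup followed by a constant rate is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters for one task; the defaults are the values shared by both columns."""
    batch_size: int
    epochs: int
    max_length: int
    learning_rate: float = 1e-5
    weight_decay: float = 1e-2
    warmup_steps: int = 1000
    dropout_rate: float = 0.1


# Values copied from the table; only the first three rows differ between tasks.
MENTION_RECOGNITION = TrainConfig(batch_size=32, epochs=30, max_length=60)
ENTITY_DISAMBIGUATION = TrainConfig(batch_size=24, epochs=1, max_length=380)


def warmup_lr(step: int, cfg: TrainConfig) -> float:
    """Learning rate at a given step.

    ASSUMPTION: linear warmup to the peak rate, then constant.
    The table does not specify the schedule.
    """
    return cfg.learning_rate * min(1.0, step / cfg.warmup_steps)
```

Under this assumed schedule, `warmup_lr(500, MENTION_RECOGNITION)` is half the peak rate, and every step from 1000 onward uses the full rate listed in the table.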