| Parameter | Mention detection | Entity disambiguation |
| --- | --- | --- |
| Batch size | 32 | 24 |
| Epochs | 30 | 1 |
| Max length | 60 | 380 |
| Learning rate | 1e-5 | 1e-5 |
| Weight decay | 1e-2 | 1e-2 |
| Warmup steps | 1000 | 1000 |
| Dropout rate | 0.1 | 0.1 |
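
For reference, the settings in the table above could be captured in a small configuration object. This is only a sketch: the class name `TrainConfig`, the field names, and the two constant names are hypothetical and do not come from the original work. The values shared by both columns are written as defaults, so only batch size, epochs, and max length need to be given per task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters listed in the table."""
    batch_size: int
    epochs: int
    max_length: int               # maximum input length, as given in the table
    learning_rate: float = 1e-5   # identical for both tasks
    weight_decay: float = 1e-2    # identical for both tasks
    warmup_steps: int = 1000      # identical for both tasks
    dropout_rate: float = 0.1     # identical for both tasks


# One instance per table column (constant names are illustrative).
MENTION_DETECTION = TrainConfig(batch_size=32, epochs=30, max_length=60)
ENTITY_DISAMBIGUATION = TrainConfig(batch_size=24, epochs=1, max_length=380)
```

Freezing the dataclass keeps the two configurations from being modified accidentally once a training run has started.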