| Parameter | Value |
|---|---|
| Torch.Size | [x, 1, 128] |
| Self-attention hidden layer dimension | 768 |
| Dropout | 0.5 |
| Learning rate | 0.001 |
| Epochs | 10 |
| Batch size | 8 |
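
The settings listed above could be gathered into a single configuration object so that training code reads them from one place. The following is only a minimal sketch: the class name `TrainConfig` and all field names are hypothetical and do not come from the original, and the variable dimension `x` in the tensor shape is represented by `None` as an assumption.

```python
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters in the table."""

    # Shape [x, 1, 128] from the table; None stands in for the variable dimension x.
    tensor_shape: Tuple[Optional[int], int, int] = (None, 1, 128)
    # Self-attention hidden layer dimension.
    attention_hidden_dim: int = 768
    dropout: float = 0.5
    learning_rate: float = 0.001
    epochs: int = 10
    batch_size: int = 8


config = TrainConfig()

# Convert to a plain dict, e.g. for logging the run settings.
settings = asdict(config)
for name, value in settings.items():
    print(f"{name}: {value}")
```

Marking the dataclass as frozen keeps the values from being changed by accident during a run; a different set of values would be created as a new `TrainConfig` instance.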