Training configuration    Value

Batch size                20
Hidden size               1500
Num steps                 40
Init scale                0.05
Max grad_norm             10
Epoch start_decay         16
Max epoch                 60
Lr decay                  0.65
Dropout                   1.1
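
As a rough illustration, the settings above could be gathered into one configuration object. This is a minimal sketch, not the original training code: the class name `TrainingConfig`, the field names, and the helper `lr_decay_factor` are all hypothetical. The decay rule is also an assumption, since the table does not say how "Epoch start_decay" and "Lr decay" interact. The sketch assumes the learning rate stays constant through epoch 16 and is then multiplied by 0.65 once per additional epoch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values are copied from the table; the field names are hypothetical.
    batch_size: int = 20
    hidden_size: int = 1500
    num_steps: int = 40
    init_scale: float = 0.05
    max_grad_norm: float = 10.0
    epoch_start_decay: int = 16
    max_epoch: int = 60
    lr_decay: float = 0.65
    dropout: float = 1.1  # value as listed in the table


def lr_decay_factor(config: TrainingConfig, epoch: int) -> float:
    """Return the multiplier applied to the base learning rate at a 1-indexed epoch.

    Assumption: no decay through `epoch_start_decay`; afterwards the rate is
    multiplied by `lr_decay` once for every additional epoch.
    """
    return config.lr_decay ** max(epoch - config.epoch_start_decay, 0)


config = TrainingConfig()
for epoch in (1, 16, 17, 18):
    print(epoch, lr_decay_factor(config, epoch))
```

The base learning rate is not given in the table, so the helper returns only the multiplicative factor. A training loop would multiply its own base rate by this factor at the start of each epoch.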