Source | Model | Accuracy | F1 score | Text modality | Visual modality | Speech modality |
[51] | MHSAN | 78.70% | / | word2vec | 3D-CNN | openSMILE |
[52] | Multilogue-Net | 81.19% | 80.10% | CNN | 3D-CNN | openSMILE |
[53] | HFFN | 80.19% | 80.34% | GloVe | FACET | COVAREP |
[54] | MMMU-BA | 82.31% | / | word2vec | 3D-CNN | openSMILE |
[55] | MARNN | 84.31% | / | word2vec | 3D-CNN | openSMILE |
[56] | MulT | 83% | 82.80% | GloVe | FACET | COVAREP |