Abstract
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Last modified: 2021/11/30 11:06 | Author: Cheng Xinyu