Abstract
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
Last Modified: 2021/11/30 11:06 | Author: Cheng Xinyu