Posted: 2020/04/01 15:12 | Author: NICA

The first author of this paper is Ph.D. student Li Dan. The paper, titled "Semi-supervised cross-modal image generation with generative adversarial networks," was published in Pattern Recognition.


Abstract:


Cross-modal image generation is an important aspect of multi-modal learning. Existing methods usually use semantic features to reduce the modality gap. Although these methods have achieved notable progress, they still have some limitations: (1) they usually use single-modality information to learn the semantic features; (2) they require the training data to be paired. To overcome these problems, we propose a novel semi-supervised cross-modal image generation method, which consists of two semantic networks and one image generation network. Specifically, in the semantic networks, we use the image modality to assist the non-image modality in semantic feature learning via a deep mutual learning strategy. In the image generation network, we introduce an additional discriminator to reduce the image reconstruction loss. By leveraging large amounts of unpaired data, our method can be trained in a semi-supervised manner. Extensive experiments demonstrate the effectiveness of the proposed method.
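As a rough illustration only (not the authors' implementation, whose details are in the linked paper), the deep mutual learning strategy mentioned in the abstract generally trains two networks jointly: each network minimizes its own supervised cross-entropy loss plus a KL-divergence term that pulls its predicted distribution toward the other network's. A minimal NumPy sketch of that loss structure, with hypothetical function names:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q), averaged over the batch.
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

def mutual_learning_losses(logits_a, logits_b, labels):
    # Generic deep mutual learning objective: each branch gets a
    # supervised cross-entropy term plus a KL term encouraging it to
    # mimic the other branch's predictions.
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    n = labels.shape[0]
    ce_a = -np.mean(np.log(p_a[np.arange(n), labels] + 1e-12))
    ce_b = -np.mean(np.log(p_b[np.arange(n), labels] + 1e-12))
    loss_a = ce_a + kl_div(p_b, p_a)  # branch A mimics branch B
    loss_b = ce_b + kl_div(p_a, p_b)  # branch B mimics branch A
    return loss_a, loss_b
```

In the paper's setting, one branch would process the image modality and the other the non-image modality, so the image branch can guide semantic feature learning for the non-image branch; the sketch above only shows the generic two-branch loss, not the cross-modal architecture itself.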


Paper link: https://www.sciencedirect.com/science/article/pii/S0031320319303863?via%3Dihub