Indexed by:
Abstract:
This article presents a new text-to-image (T2I) generation model, the distribution regularization generative adversarial network (DR-GAN), which generates images from text descriptions via improved distribution learning. DR-GAN introduces two novel modules: a semantic disentangling module (SDM) and a distribution normalization module (DNM). The SDM combines a spatial self-attention mechanism (SSAM) with a new semantic disentangling loss (SDL) to help the generator distill key semantic information for image generation. The DNM uses a variational auto-encoder (VAE) to normalize and denoise the latent image distribution, which helps the discriminator better distinguish synthesized images from real ones. The DNM also adopts a distribution adversarial loss (DAL) to guide the generator toward the normalized real-image distribution in the latent space. Extensive experiments on two public datasets demonstrate that DR-GAN achieves competitive performance on the T2I task. Code: https://github.com/Tan-H-C/DR-GAN-Distribution-Regularization-for-Text-to-Image-Generation.
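To make the abstract's two modules concrete, below is a minimal PyTorch sketch of the two building blocks it names: a spatial self-attention block (the kind of SSAM used inside the SDM) and a VAE-style encoder that maps features to a normalized latent distribution (the core of the DNM). This is an illustrative assumption on our part, not the authors' implementation; all class names, layer widths, and wiring here are hypothetical, and the real code is in the linked repository.

# Hypothetical sketch of the abstract's two components; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialSelfAttention(nn.Module):
    """Non-local-style self-attention over the spatial positions of a feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C//8)
        k = self.key(x).flatten(2)                     # (B, C//8, HW)
        attn = F.softmax(torch.bmm(q, k), dim=-1)      # (B, HW, HW) position affinities
        v = self.value(x).flatten(2)                   # (B, C, HW)
        out = torch.bmm(v, attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

class LatentNormalizer(nn.Module):
    """VAE-style encoder: maps image features to a Gaussian latent whose
    KL term regularizes (denoises) the latent distribution."""
    def __init__(self, feat_dim: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Linear(feat_dim, latent_dim)
        self.logvar = nn.Linear(feat_dim, latent_dim)

    def forward(self, feat: torch.Tensor):
        mu, logvar = self.mu(feat), self.logvar(feat)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, kl

# Example usage with toy shapes:
# attn = SpatialSelfAttention(64); y = attn(torch.randn(2, 64, 16, 16))
# norm = LatentNormalizer(512, 128); z, kl = norm(torch.randn(2, 512))

In a DR-GAN-like setup, the normalized latent z would feed the discriminator, and an adversarial loss on z (the abstract's DAL) would push the generator's latent distribution toward the normalized real-image one.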
Keywords:
Corresponding author:
Source:
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
ISSN: 2162-237X
Year: 2022
Issue: 12
Volume: 34
Pages: 10309-10323
Impact Factor: 10.4 (JCR@2022)
ESI Field: COMPUTER SCIENCE
ESI Highly Cited Threshold: 46
JCR Quartile: Q1
CAS Journal Ranking: Tier 1
Affiliated Department: