Manuscript received October 14, 2022; revised July 22, 2022; accepted November 11, 2022.
Abstract—We address the open problem of unsupervised multimodal multi-domain image-to-image (I2I) translation using a generative adversarial network. Previous works, such as MUNIT and DRIT, are able to translate images among multiple domains, but they generate images of inferior quality and limited diversity. Moreover, they require training n(n-1) generators and n discriminators to learn translation among n domains, which is computationally expensive. In this paper, we propose a simpler yet more effective framework for unsupervised multimodal multi-domain I2I translation. Our approach consists of only a mapping network, an encoder-decoder pair (generator), and a discriminator. Our method assumes that the encoder can decompose the latent space into content and style sub-spaces, where the content space is domain-invariant and the style space is domain-specific. Unlike MUNIT and DRIT, which simply sample style codes from a standard normal distribution when translating, we employ a mapping network to learn the style of each domain, which yields better translation results. Translation is performed by the decoder, which keeps the content code and swaps in the target-domain style code. To encourage diversity in the translated images, we employ style regularizations and inject Gaussian noise into the decoder. Extensive experiments show that our framework performs comparably to or better than state-of-the-art baselines.
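The translation step described above (encode the content code, obtain a style code from the mapping network, then decode with injected noise) can be sketched with toy linear maps standing in for the networks. All dimensions, weights, and function names below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions (not from the paper).
D_IMG, D_CONTENT, D_STYLE, N_DOMAINS = 16, 8, 4, 3

# Random linear maps standing in for the trained encoder,
# mapping network, and decoder.
W_content = rng.standard_normal((D_CONTENT, D_IMG))
W_style = rng.standard_normal((N_DOMAINS, D_STYLE, D_STYLE))
W_dec = rng.standard_normal((D_IMG, D_CONTENT + D_STYLE))

def encode_content(x):
    # Domain-invariant content code extracted by the encoder.
    return W_content @ x

def mapping_network(z, domain):
    # Per-domain style code learned from a latent z, replacing
    # direct sampling from N(0, I) as done in MUNIT/DRIT.
    return W_style[domain] @ z

def decode(content, style, noise_scale=0.1):
    # Gaussian noise injected into the decoder encourages
    # diversity among translated outputs.
    noise = noise_scale * rng.standard_normal(D_IMG)
    return W_dec @ np.concatenate([content, style]) + noise

# Translate image x into target domain 1: keep its content code,
# swap in a style code produced by the mapping network.
x = rng.standard_normal(D_IMG)
z = rng.standard_normal(D_STYLE)
x_translated = decode(encode_content(x), mapping_network(z, domain=1))
```

Different draws of `z` (or of the injected noise) yield different translations of the same source image, which is what makes the framework multimodal.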
Index Terms—Unsupervised multimodal multi-domain image-to-image translation, style codes, content codes, mapping network
Lei Luo and William H. Hsu are with the Computer Science Department, Kansas State University, Manhattan, KS, 66506, USA.
Shangxian Wang is with the Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, 21210, USA.
Cite: Lei Luo*, Shangxian Wang, and William H. Hsu, "UNMMIT: A Unified Framework on Unsupervised Multimodal Multi-domain Image-to-Image Translation," International Journal of Machine Learning, vol. 13, no. 2, pp. 77-81, 2023. Copyright © 2023 by the authors. This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.