生成式对抗网络及其计算机视觉应用研究综述 (A Survey of Generative Adversarial Networks and Their Applications in Computer Vision).
Objective The appearance of generative adversarial networks (GANs) provides a new approach and framework for applications in computer vision. GANs generate high-quality samples through their unique zero-sum game and adversarial training concepts, and are therefore more powerful in both feature learning and feature representation than traditional machine learning algorithms. Remarkable achievements have been realized in the field of computer vision, especially in sample generation, which is one of the popular topics in current research. Method The research on and application of different GAN models in computer vision are reviewed, based on extensive research and the latest achievements in the relevant literature. Typical GAN methods are introduced, categorized, and compared experimentally using generated samples to present their performance, and the research status and development trends are summarized for computer vision fields such as high-quality image generation, style transfer and image translation, text-image mutual generation, and image inpainting and restoration. Finally, existing major research problems are summarized and discussed, and potential future research directions are presented. Result Since the emergence of GAN, many variants have been proposed for different fields, whether through structural improvement, theoretical development, or innovation in applications. Different GAN models have advantages and disadvantages in terms of generating examples, have achieved significant results in many fields, especially computer vision, and can generate examples indistinguishable from real ones. However, they also have unique problems, such as non-convergence, model collapse, and uncontrollability due to their high degree of freedom. The original GAN makes hardly any prior hypotheses about the data; its final goal is to realize infinite modeling power and to fit all distributions. In addition, the design of GAN models is simple.
A complex function model need not be pre-designed, and the generator and the discriminator can work normally with the back-propagation algorithm. Moreover, GAN allows one machine to interact with other machines through continuous confrontation and to learn the inherent laws of the real world given sufficient training data. Every aspect has two sides, however, and a series of problems is hidden behind the goal of infinite modeling. The generation process is so flexible that the stability and convergence of the training process cannot be guaranteed. Model collapse is likely to occur, after which further training cannot be achieved. The original GAN has the following problems: vanishing gradients; training difficulties; generator and discriminator losses that cannot indicate training progress; lack of diversity in the generated samples; and easy over-fitting. Discrete distributions are also difficult to generate due to the limitations of GAN. Many researchers have proposed new ways to address these problems, and several landmark models, such as DCGAN, CGAN, WGAN, WGAN-GP, EBGAN, BEGAN, InfoGAN, and LSGAN, have been introduced. DCGAN combines GAN with CNN and performs well in the field of computer vision. Furthermore, DCGAN sets a series of limitations on the CNN network so that it can be trained stably and can use the learned feature representations for sample generation and image classification. CGAN inputs a conditional variable (c) along with the random variable (z) and the real data (x) to guide the data generation process. The conditional variable (c) can be category labels, texts, or generation targets. This straightforward improvement proves extremely effective and has been widely used in subsequent work. WGAN uses the Wasserstein distance … The generated images of some models appear disorganized, whereas other models can roughly express the outlines of objects. However, the images generated by BEGAN have the sharpest edges and rich image diversity in the experiments.
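The zero-sum game and adversarial training idea described above can be sketched numerically. This is a minimal illustration, not code from the surveyed paper: the discriminator D maximizes log D(x) + log(1 - D(G(z))), while the generator is trained with the common non-saturating variant that maximizes log D(G(z)). The score values below are illustrative D outputs in (0, 1).

```python
import math

def d_loss(d_real, d_fake):
    """Discriminator loss: negative of the objective D maximizes,
    averaged over a batch of real/fake scores."""
    n = len(d_real)
    return -sum(math.log(r) + math.log(1.0 - f)
                for r, f in zip(d_real, d_fake)) / n

def g_loss(d_fake):
    """Non-saturating generator loss: -log D(G(z)), averaged."""
    return -sum(math.log(f) for f in d_fake) / len(d_fake)

# A discriminator that rates real samples near 1 and fakes near 0
# achieves a low d_loss; the very same fake scores give G a high
# loss. This opposing pull is the adversarial tension the abstract
# describes: improving one player worsens the other's objective.
d_real_scores = [0.9, 0.8]
d_fake_scores = [0.2, 0.1]
print(round(d_loss(d_real_scores, d_fake_scores), 3))  # low: D is winning
print(round(g_loss(d_fake_scores), 3))                 # high: G must improve
```

Because the two losses move in opposite directions on the same fake scores, neither loss value by itself indicates overall training progress, which is one of the practical difficulties listed above.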
The discriminator of BEGAN draws lessons from EBGAN, and its generator loss refers to that of WGAN. BEGAN also proposes a hyperparameter that measures the diversity of generated samples, balancing D and G and stabilizing the training process. The internal texture of the images generated by InfoGAN is poor, and the shapes of the generated objects are the same. In InfoGAN's generator, a controllable variable (c) is added in addition to the input noise (z); it contains interpretable information about the data to control the generative results, which, however, results in poor diversity. LSGAN can generate high-quality examples because an objective function with a least-squares loss replaces the cross-entropy loss, which partly resolves two shortcomings (i.e., low quality of generated samples and instability of the training process). Conclusion GAN has significant theoretical and practical value as a new generative model. It provides a good solution to the problems of insufficient samples, poor generation quality, and difficulty in extracting features. GAN is an inclusive framework that can be combined with most deep learning algorithms to solve problems that traditional machine learning algorithms cannot. However, it has theoretical problems that must be solved urgently. How to generate high-quality examples and realistic scenes is worth studying. Further GAN developments are predicted in the following areas: theoretical breakthroughs, algorithm development, evaluation systems, specialized systems, and industrial integration. [ABSTRACT FROM AUTHOR]
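The claim that LSGAN's least-squares objective partly resolves training instability can be made concrete. Below is a hedged sketch, following common LSGAN write-ups rather than the surveyed paper itself, in which the least-squares discriminator drops the sigmoid; it compares the generator's gradient signal under the original minimax loss and the least-squares loss when the discriminator confidently rejects a fake.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def minimax_g_grad(z):
    # d/dz of log(1 - sigmoid(z)), the original minimax generator
    # loss, where z is D's raw (pre-sigmoid) output for a fake.
    # It tends to 0 as D grows confident the sample is fake
    # (z very negative), so the generator's gradient vanishes.
    return -sigmoid(z)

def ls_g_grad(z, target=1.0):
    # d/dz of (z - target)^2, the least-squares generator loss on
    # D's raw output: it grows with the distance from the target,
    # so a confidently rejected fake still drives G's updates.
    return 2.0 * (z - target)

z = -6.0  # D is very sure the sample is fake (early training)
print(abs(minimax_g_grad(z)))  # near zero: G barely learns
print(abs(ls_g_grad(z)))       # large: G still receives a signal
```

The quadratic loss thus keeps useful gradients exactly in the regime where the original cross-entropy formulation saturates, which is the mechanism behind the quality and stability improvements noted above.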
Objective The emergence of generative adversarial networks (GAN) provides new techniques and means for computer vision applications. With its unique ideas of zero-sum games and adversarial training, GAN generates high-quality samples and possesses feature learning and feature representation capabilities more powerful than those of traditional machine learning algorithms. It has achieved remarkable success in machine vision, especially in sample generation, and is one of the current research hotspots. Method Taking the different GAN models and their applications in computer vision as the research object, on the basis of an extensive literature survey, especially of the latest GAN developments, and combined with comparative experiments on the different models, the basic idea, characteristics, and usage scenarios of each method are analyzed, and the strengths and weaknesses of GAN are summarized. The current state of GAN research and its range of applications in computer vision are described; the research status and development trends of generative adversarial networks in several computer vision fields, such as high-quality image generation, style transfer and image translation, mutual generation of text and images, and image restoration and inpainting, are reviewed; the theoretical improvements, advantages, limitations, and usage scenarios of each application are summarized; and possible future development directions are discussed. Result The different GAN models each have strengths and weaknesses in the quality and performance of generated samples. GAN models have made considerable achievements in image processing and can generate samples indistinguishable from real ones, but they still suffer from network non-convergence, easy model collapse, and excessive, uncontrollable freedom. Conclusion As a new generative model, GAN has high research and application value, but some theoretical constraints urgently await breakthroughs; on the application side, generating high-quality samples and realistic scenes is a direction worth studying. [ABSTRACT FROM AUTHOR]
Copyright of Journal of Image & Graphics is the property of Editorial Office of Journal of Image & Graphics and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)