USABILITY ANALYSIS OF STABLE DIFFUSION-BASED GENERATIVE MODEL FOR ENRICHING BATIK BAKARAN PATTERN SYNTHESIS
Abstract
The rapid development of technology today supports work in many fields, and batik is one field that can benefit from it. This research applies deep learning to image data of batik patterns, including the patterns typical of Bakaran batik, using a generative model, namely Stable Diffusion, with the aim of producing better and more detailed batik pattern images while preserving the original motifs. The research uses only a dataset of images of batik patterns and of patterns typical of Bakaran batik. The image data is first augmented: each image is inverted, resized to 512x512, rotated randomly, flipped horizontally at random, and then inverted again. Pre-training on the image data is then performed to find suitable parameters and conditions for the training process. The results show that Stable Diffusion version 1.4 and version 2.1 both perform well in processing and generating images of batik patterns and of patterns typical of Bakaran batik. The images generated by the two versions were scored with the Inception Score and the CLIP Score. On the CLIP Score, version 1.4 scored higher than version 2.1, for the same reason as on the Inception Score: the images produced by version 1.4 are more abstract. Of the two versions, version 1.4 was chosen because its output shows abstract imagery that reflects a good batik pattern. Stable Diffusion version 1.4 was therefore used to process the batik patterns and the patterns typical of Bakaran batik, where it shows excellent performance; its results display good, abstract batik patterns consistent with the characteristics of Bakaran batik.
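As a concrete illustration of the augmentation step described in the abstract, the following is a minimal sketch in Python, assuming the pipeline is built with torchvision; the paper does not name a library, and the rotation range and file names below are illustrative assumptions.

```python
# Minimal sketch of the described augmentation pipeline using torchvision.
# The rotation range and file names are assumptions for illustration.
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomInvert(p=1.0),          # invert the image
    transforms.Resize((512, 512)),           # resize to 512x512
    transforms.RandomRotation(degrees=30),   # random rotation (range assumed)
    transforms.RandomHorizontalFlip(p=0.5),  # random horizontal flip
    transforms.RandomInvert(p=1.0),          # invert the image again
])

image = Image.open("batik_bakaran_sample.jpg").convert("RGB")
augmented = augment(image)
augmented.save("batik_bakaran_augmented.jpg")
```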
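The generation step itself can be sketched as follows, assuming the Hugging Face diffusers library and the publicly released Stable Diffusion v1.4 checkpoint; the prompt is an illustrative placeholder, and any fine-tuning on the batik dataset performed in the study is not shown here.

```python
# Sketch of sampling a 512x512 image from Stable Diffusion v1.4 with the
# Hugging Face diffusers library. The prompt is an illustrative placeholder.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "a batik bakaran pattern, intricate traditional motif"
image = pipe(prompt, height=512, width=512).images[0]  # one generated sample
image.save("batik_generated.png")
```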
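The scoring step can likewise be sketched with the torchmetrics implementations of the two metrics; this is an assumption about tooling (the paper does not name a library), and the random batch and caption below are placeholders for the generated batik images.

```python
# Sketch of computing Inception Score and CLIP Score with torchmetrics
# (also requires torch-fidelity and transformers). The random uint8 batch
# stands in for a batch of generated batik images; the caption is a placeholder.
import torch
from torchmetrics.image.inception import InceptionScore
from torchmetrics.multimodal.clip_score import CLIPScore

images = torch.randint(0, 255, (16, 3, 512, 512), dtype=torch.uint8)

inception = InceptionScore()  # higher mean = sharper, more varied images
inception.update(images)
is_mean, is_std = inception.compute()

clip = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")
clip_value = clip(images, ["a batik bakaran pattern"] * images.shape[0])

print(f"Inception Score: {is_mean:.2f} +/- {is_std:.2f}")
print(f"CLIP Score: {clip_value:.2f}")
```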
DOI: https://doi.org/10.24167/proxies.v7i2.12472