High-resolution virtual try-on with garment extraction using generative adversarial networks

Charters, Daniel J

dc.contributor.advisor	Britz, Stefan S
dc.contributor.advisor	Bernicchi, Dino
dc.contributor.author	Charters, Daniel J
dc.date.accessioned	2025-01-23T09:17:42Z
dc.date.available	2025-01-23T09:17:42Z
dc.date.issued	2024
dc.date.updated	2025-01-23T08:00:21Z
dc.description.abstract	Image-based virtual try-on aims to depict an individual wearing a garment they were not originally wearing. While the existing literature predominantly focuses on garments taken from standalone images, this research addresses images in which the garment is already being worn by another individual. This fills a notable gap, as most current systems are tailored to standalone garment images. Given a pair of high-resolution images, the proposed system extracts the garment from one, refines it using context-aware image inpainting, and transfers it onto the subject of the second image. The methodology incorporates several off-the-shelf models for pre-processing, notably Part Grouping Network (PGN), DensePose, and OpenPose. A state-of-the-art context-aware inpainting model refines the garments, and the final synthesis uses the HR-VITON architecture, producing images at a resolution of 768 × 1024. Unlike most existing systems, our model processes both standalone and garment-on-person images. The models are evaluated on 2 032 high-resolution images under both paired and unpaired conditions, using metrics including root-mean-square error (RMSE), Peak Signal-to-Noise Ratio (PSNR), Learned Perceptual Image Patch Similarity (LPIPS), Structural Similarity (SSIM), Inception Score (IS), Fréchet Inception Distance (FID), and Kernel Inception Distance (KID). Benchmarked against HR-VITON, ACGPN, and CP-VTON, our model slightly trailed HR-VITON but notably surpassed ACGPN and CP-VTON. In realistic, unpaired conditions, the model achieved an IS of 3.152, an FID of 15.3, and a KID of 0.0063, compared with an IS of 3.398, an FID of 11.93, and a KID of 0.0034 for HR-VITON on the same data. ACGPN achieved an FID of 43.29 and a KID of 0.0373, and CP-VTON an FID of 43.28 and a KID of 0.0376; IS was not measured for either ACGPN or CP-VTON. An ablation study underscored the importance of context-aware inpainting in our network.
The findings highlight the model's ability to generate convincing, high-resolution virtual try-on images from garment-on-person extractions, addressing a prevalent gap in the literature and offering practical applications in high-resolution virtual try-on image generation.
dc.identifier.apacitation	Charters, D. J. (2024). High-resolution virtual try-on with garment extraction using generative adversarial networks. University of Cape Town, Faculty of Science, Department of Statistical Sciences. Retrieved from http://hdl.handle.net/11427/40827	en_ZA
dc.identifier.chicagocitation	Charters, Daniel J. "High-resolution virtual try-on with garment extraction using generative adversarial networks." University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2024. http://hdl.handle.net/11427/40827	en_ZA
dc.identifier.citation	Charters, D.J. 2024. High-resolution virtual try-on with garment extraction using generative adversarial networks. University of Cape Town, Faculty of Science, Department of Statistical Sciences. http://hdl.handle.net/11427/40827	en_ZA
dc.identifier.ris	TY - Thesis / Dissertation AU - Charters, Daniel J DA - 2024 DB - OpenUCT DP - University of Cape Town KW - data science LK - https://open.uct.ac.za PB - University of Cape Town PY - 2024 T1 - High-resolution virtual try-on with garment extraction using generative adversarial networks TI - High-resolution virtual try-on with garment extraction using generative adversarial networks UR - http://hdl.handle.net/11427/40827 ER -	en_ZA
dc.identifier.uri	http://hdl.handle.net/11427/40827
dc.identifier.vancouvercitation	Charters DJ. High-resolution virtual try-on with garment extraction using generative adversarial networks. University of Cape Town, Faculty of Science, Department of Statistical Sciences, 2024 [cited yyyy month dd]. Available from: http://hdl.handle.net/11427/40827	en_ZA
dc.language.rfc3066	eng
dc.publisher.department	Department of Statistical Sciences
dc.publisher.faculty	Faculty of Science
dc.publisher.institution	University of Cape Town
dc.subject	data science
dc.title	High-resolution virtual try-on with garment extraction using generative adversarial networks
dc.type	Thesis / Dissertation
dc.type.qualificationlevel	Masters
dc.type.qualificationlevel	MSc

Files

Original bundle

Name: thesis_sci_2024_charters daniel j.pdf
Size: 51.3 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.72 KB
Format: Item-specific license agreed upon to submission

Collections

Masters