Kevin Li

Diffusion Explainer: Interactive Visual Learning for Stable Diffusion

Seongmin Lee
Ben Hoover
Hendrik Strobelt
Zijie J. Wang
Anthony Peng
Austin Wright
Haekyu Park
Alex Yang
Polo Chau
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Abstract

Stable Diffusion is a popular image generation tool that turns text prompts into fascinating visuals. However, its complex model structure and operations make it hard to understand. We present Diffusion Explainer, a web-based interactive visualization tool that explains how Stable Diffusion transforms a text prompt into an image. Our tool provides both high-level summaries of the image generation process and detailed explanations of each component’s low-level operations by animating the transition across multiple levels. Diffusion Explainer is developed using modern web technologies and runs locally in users’ web browsers, making it accessible to a wider audience. Diffusion Explainer is available at the following public demo link: https://poloclub.github.io/diffusion-explainer. A video demo is available at https://youtu.be/bIQz72w-XaU.

Materials

Project
PDF
Code

BibTeX

			
@inproceedings{li2022argoscholar,
  author = {Li, Kevin and Yang, Haoyang and Montoya, Evan and Upadhayay, Anish and Zhou, Zhiyan and Saad-Falcon, Jon and Chau, Duen Horng},
  title = {Visual Exploration of Literature with Argo Scholar},
  year = {2022},
  isbn = {9781450392365},
  publisher = {Association for Computing Machinery},
  url = {https://doi.org/10.1145/3511808.3557177},
  doi = {10.1145/3511808.3557177}
}