Fine-grained Defocus Blur Control for Generative Image Models

University of Michigan, Adobe Research

TL;DR: We propose a text-to-image model that precisely controls the amount and location of defocus blur in generated images while preserving the scene content.


(a) Text-to-Image Methods: Prior methods, such as Camera-Settings-as-Tokens [1], fail to preserve scene content when the aperture is adjusted.
(b) Ours (Text-to-Image + Lens Model): Our model produces images with controllable defocus blur while preserving the scene. Each image reflects the specified aperture and focus distance: lower focus distances sharpen the foreground, while higher ones shift focus to the background. The orange bar marks the in-focus region. Higher f-stops reduce blur, producing an all-in-focus image at f/22.

Abstract

Current text-to-image diffusion models excel at generating diverse, high-quality images, yet they struggle to incorporate fine-grained camera metadata such as precise aperture settings. In this work, we introduce a novel text-to-image diffusion framework that leverages camera metadata (EXIF data), which is often embedded in image files, with an emphasis on generating controllable lens blur. Our method mimics the physical image formation process by first generating an all-in-focus image, estimating its monocular depth, predicting a plausible focus distance with a novel focus distance transformer, and then forming a defocused image with an existing differentiable lens blur model [2]. Gradients flow backwards through this whole process, so the model learns, without explicit supervision, to generate defocus effects based on content elements and the provided EXIF data. At inference time, this enables precise interactive user control over defocus effects while preserving scene contents, which is not achievable with existing diffusion models. Experimental results demonstrate that our model enables superior fine-grained control without altering the depicted scene.

Method

Our model obtains its supervision from a differentiable lens model and training examples of images with shallow and deep depth-of-field. We train a generator $G_{\theta}$ to produce an all-in-focus image. A depth model then predicts depth for this image, which, along with the image itself, is fed into a model that estimates the focus distance $f_d$ and depth scale $f_s$. Finally, a lens model combines EXIF data with these predictions to apply spatially varying blur, generating the final image. We train the all-in-focus generator using unsupervised DMD2 [3] losses on our unpaired Deep DoF dataset and optimize the entire pipeline with DMD2 losses on the unpaired Shallow DoF dataset.
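To make the last stage of the pipeline concrete, the sketch below computes a per-pixel blur size from a depth map, a focus distance, and an aperture. It is only an illustration under stated assumptions: it uses the textbook thin-lens circle-of-confusion formula rather than the differentiable lens blur model [2] used in the paper, and the function name `coc_diameter_mm`, the 50 mm focal length, and the treatment of the depth scale as a simple multiplier on predicted depth are all hypothetical choices, not details taken from the method.

```python
import numpy as np

def coc_diameter_mm(depth, focus_distance_m, f_number,
                    focal_length_mm=50.0, depth_scale=1.0):
    """Thin-lens circle-of-confusion diameter (mm on the sensor) per pixel.

    Assumptions (not from the paper): depth * depth_scale is metric depth
    in metres, and the lens is an ideal thin lens.
    """
    f = focal_length_mm / 1000.0                       # focal length in metres
    d = np.asarray(depth, dtype=np.float64) * depth_scale
    s = float(focus_distance_m)
    if s <= f:
        raise ValueError("focus distance must exceed the focal length")
    aperture = f / f_number                            # aperture diameter in metres
    # Classic thin-lens relation: c = A * f * |d - s| / (d * (s - f))
    coc_m = aperture * f * np.abs(d - s) / (d * (s - f))
    return coc_m * 1000.0                              # back to millimetres

# Toy 2x2 depth map (metres) with the camera focused at 2 m.
depth = np.array([[1.0, 2.0],
                  [4.0, 8.0]])
wide = coc_diameter_mm(depth, focus_distance_m=2.0, f_number=1.8)
narrow = coc_diameter_mm(depth, focus_distance_m=2.0, f_number=16.0)

print("f/1.8 blur (mm):\n", wide)
print("f/16  blur (mm):\n", narrow)
```

In this toy setting the pixel at the focus distance gets zero blur, and the blur at every other pixel shrinks in proportion to the f-number, which matches the qualitative behaviour described for the lens model. A spatially varying blur would then be applied with a kernel sized by this map; that step is omitted here.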

Results

Comparisons with SDXL + Dr.Bokeh

[Three comparison figures: ours vs. SDXL + Dr.Bokeh]

Comparisons with Camera Settings as Tokens

[Two comparison figures: ours vs. Camera Settings as Tokens]

Comparison with EBB! Dataset

Each example pairs a generated image with an EBB! dataset image, and each image is shown in a defocused version (aperture f/1.8) and an all-in-focus version (aperture f/16).

Prompt: A red and gray fire hydrant is positioned next to a dense green hedge along a sidewalk. The hydrant has a red cap. The surrounding area includes a concrete curb and scattered leaves on the pavement.
Prompt: Vibrant pink rose blooming in a lush garden, surrounded by green foliage and a rustic wooden fence.
Prompt: Urban graffiti on a concrete block near a road, with greenery sprouting around its base.
Prompt: A green wheeled trash bin is placed on the street in front of a house with a brown wooden door and a gray and brown picket fence. The house has a peach-colored exterior with white window shutters. Another gray trash bin is partially visible behind the fence.
Prompt: A weathered stone sphere with patches of moss, set against a backdrop of autumn foliage and a modern building.
Prompt: A gray Nissan Juke parked on a residential street with a hedge and house in the background.
Prompt: Red and white bollard next to a tree with fallen autumn leaves on the ground.
Prompt: A weathered stone lantern, standing tall amidst the vibrant autumn foliage.
Prompt: A yellow pole stands in a park surrounded by trees and greenery.
Prompt: A fire hydrant with the number 674 on it stands in a wooded area with trees and bushes in the background.

Controllability

By varying the focus distance and the aperture, we show the controllability of defocus blur using our model.
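For intuition about how these two controls interact, the following sketch computes the near and far limits of the in-focus region from the standard photographic depth-of-field formulas. It is not part of the model: the function name `depth_of_field_m`, the 50 mm focal length, and the 0.03 mm acceptable circle of confusion are assumptions chosen for illustration.

```python
import math

def depth_of_field_m(focus_distance_m, f_number,
                     focal_length_mm=50.0, coc_limit_mm=0.03):
    """Return (near, far) limits in metres of acceptably sharp focus.

    Uses the standard hyperfocal-distance formulas for an ideal lens;
    focal length and CoC limit are illustrative defaults.
    """
    f = focal_length_mm / 1000.0
    c = coc_limit_mm / 1000.0
    s = float(focus_distance_m)
    hyperfocal = f * f / (f_number * c) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2.0 * f)
    # Focusing at or beyond the hyperfocal distance keeps everything
    # out to infinity acceptably sharp.
    if s >= hyperfocal:
        far = math.inf
    else:
        far = s * (hyperfocal - f) / (hyperfocal - s)
    return near, far

# Sweep the aperture at a fixed 2 m focus distance.
for n in (1.8, 5.6, 22.0):
    near, far = depth_of_field_m(2.0, n)
    print(f"f/{n}: in focus from {near:.2f} m to {far:.2f} m")

# Sweep the focus distance at a fixed f/22 aperture.
for s in (1.0, 2.0, 10.0):
    near, far = depth_of_field_m(s, 22.0)
    print(f"focus {s} m: in focus from {near:.2f} m to {far}")
```

Under these assumptions, raising the f-number widens the in-focus band around the focus distance, and at f/22 a sufficiently distant focus setting extends the far limit to infinity, consistent with the near all-in-focus results shown at small apertures.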

More Examples

Related Work

BibTeX

@InProceedings{shrivastava2025defocusblur,
      title     = {Fine-grained Defocus Blur Control for Generative Image Models},
      author    = {Shrivastava, Ayush and Barnes, Connelly and Zhang, Xuaner and Zhang, Lingzhi and Owens, Andrew and Amirghodsi, Sohrab and Shechtman, Eli},
      booktitle = {arXiv},
      year      = {2025},
      url       = {https://arxiv.org/abs/2510.06215},
}