RealFill: Image completion using diffusion models

>>flavor+(OP)
There's definitely value in providing this functionality for photographs taken in the present.

But I think the real value -- and this is definitely in Google's favor -- is providing this functionality for photos you have taken in the past.

I have probably 30K+ photos in Google Photos that capture moments from the past 15 years. In quite a lot of them I've taken multiple shots of the same scene in quick succession, and it would be fairly straightforward for Google to detect such groupings and apply the technique to produce synthesized pictures that are better than the originals. It already does something similar for photo collages and "best in a series of rapid shots." Those surface automatically, without my having to do anything.

>>jawns+ao
> ..fairly straightforward for Google to detect such groupings and apply the technique to produce synthesized pictures that are better than the originals.

Wouldn't an operation like this require some kind of fine-tuning? Or do diffusion models have a way of using images as context, the way one would provide context to an LLM?

>>fenoma+7b1
I think simpler algorithms (e.g. comparing image histograms) can get you a long way. Regardless of the mechanism, Google Photos already has the capability to detect similar images, which it uses to generate animated GIFs.
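
To make the histogram idea concrete, here is a rough sketch of grouping near-duplicate shots by comparing grayscale histograms. It is only an illustration of the general approach, not how Google Photos actually does it. The function names, the 16-bin choice, and the 0.9 threshold are all made up for the example, and images are assumed to be already decoded into flat lists of 8-bit grayscale values.

```python
from typing import List, Sequence


def gray_histogram(pixels: Sequence[int], bins: int = 16) -> List[float]:
    """Normalized histogram of 8-bit grayscale values (0..255)."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1  # map 0..255 onto 0..bins-1
    total = len(pixels) or 1  # avoid dividing by zero on an empty image
    return [c / total for c in counts]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def group_similar(images: List[Sequence[int]], threshold: float = 0.9) -> List[List[int]]:
    """Greedy grouping: each image joins the first group whose first member
    it resembles, otherwise it starts a new group. Returns lists of indices."""
    hists = [gray_histogram(img) for img in images]
    groups: List[List[int]] = []
    for i, h in enumerate(hists):
        for group in groups:
            if histogram_intersection(hists[group[0]], h) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups
```

Histograms ignore where pixels are, so two different scenes with similar overall brightness would be grouped together. In practice you would probably also require the shots to be close in capture time, which fits the "quick succession" case upthread.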
