1. they made a lot of careful tweaks to the UNet architecture - it seems like they ran many different ablations here ("In total, our endeavor consumes approximately 512 TPUs spanning 30 days").
2. the model distillation is based on previous UFOGen work from the same team https://arxiv.org/abs/2311.09257 (hence the UFO graphic in the diffusion-gan diagram)
3. they train their own 8-channel latent encoder / decoder ("VAE") from scratch (similar to Meta's Emu paper) instead of using the SD VAEs like many other papers do
4. they use an internal dataset of 150M image/text pairs (roughly the size of laion-highres)
5. they also reran SD training from scratch on this dataset to get their baseline performance
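For a rough sense of scale on points 1 and 3, here is a back-of-the-envelope sketch. The helper names are made up for illustration. The compute figure assumes all 512 chips were busy around the clock for the full 30 days, so treat it as an upper bound on what the quote implies. The latent comparison assumes a 512x512 input and the same 8x spatial downsampling as the SD VAE; neither number is stated above.

```python
def tpu_hours(chips: int, days: int) -> int:
    """Upper bound on chip-hours, assuming every chip runs 24 hours a day."""
    return chips * days * 24


def latent_values(channels: int, height: int, width: int, downsample: int = 8) -> int:
    """Values in one image's latent. downsample=8 is an assumption, not from the paper quote."""
    return channels * (height // downsample) * (width // downsample)


if __name__ == "__main__":
    # "approximately 512 TPUs spanning 30 days"
    print("TPU-hours (upper bound):", tpu_hours(512, 30))
    # 8-channel latent vs. the 4-channel SD VAE latent, both at an assumed 512x512 input
    print("8-channel latent values:", latent_values(8, 512, 512))
    print("4-channel latent values:", latent_values(4, 512, 512))
```

Under those assumptions, the 8-channel latent carries twice as many values per image as a 4-channel one at the same spatial size.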
"Google Assistant just made a dinner reservation for me... I knew this was coming... but mind blown!" https://www.reddit.com/r/googlehome/comments/ezv3us/google_a...
"Now, you can use it on all Pixel phones in 43 U.S. states.
All it takes is a few seconds to tell your Assistant where you'd like to go. Just ask the Assistant on your phone, “Book a table for four people at [restaurant name] tomorrow night.” The Assistant will then call the restaurant to see if it can accommodate your request. Once your reservation is successfully made, you’ll receive a notification on your phone, an email update and a calendar invite so you don’t forget."
https://blog.google/products/assistant/book-table-google-ass...
https://www.reddit.com/r/googlehome/comments/ezv3us/google_a...
Or the comments under https://youtu.be/-RHG5DFAjp8
It's probably hard to trigger these days because most places support OpenTable or similar.
2022-05 - google imagen research paper posted >>31484562
2022-12 - imagen developers leave google to form ideogram
2023-08 - ideogram ships a version of imagen, free, for anyone who wants to use it https://ideogram.ai/publicly-available
2023-12 - google "imagen 2" is officially "generally available for Vertex AI customers on the allowlist (i.e., approved for access)." >>38628417
We did that despite some moral ambivalence/uneasiness around AI "art".
For example, give me a "young and exciting Dana Meadows in front of a board of systems theory"
I'm not awful at photoshopping things, and sometimes that's the only way to get a specific image one has in mind. But generating the image with AI saves time and lets us concentrate on writing and researching instead.
TBH if an artist/illustrator came along and said "Let me do the episode icons even though you can't pay me yet" I'd feel inclined to ask the AI to step aside.
This article talks a bit about the lack of legal power to fight against deepfakes: https://mcolaw.com/theres-not-much-we-can-legally-do-about-d...