I can't figure out how to try this thing. The closest I got was this sentence:
"To get started with Imagen 2 on Vertex AI, find our documentation or reach out to your Google Cloud account representative to join the Trusted Tester Program."
It also includes a link to the TTP form, though confusingly the form itself no longer seems to mention Imagen as part of the program. (It instead indicates that Imagen is GA.)
- https://cloud.google.com/vertex-ai (marketing page)
- https://cloud.google.com/vertex-ai/docs (docs entry point)
- https://console.cloud.google.com/vertex-ai (cloud console)
- https://console.cloud.google.com/vertex-ai/model-garden (all the models)
- https://console.cloud.google.com/vertex-ai/generative (studio / playground)
Vertex AI is the umbrella for all of the Google models available through their cloud platform.
There still seems to be confusion (at Google) about whether this is TTP or GA: the docs say both, and the studio has a request-access link.
More: this page has a table of features and their current access levels: https://cloud.google.com/vertex-ai/docs/generative-ai/image/...
It seems some features are GA while others are still in early access; in particular, image generation is still EA, or what they call "Restricted GA".
1. Go to console.cloud.google.com
2. Go to model garden
3. Search imagegeneration
4. End up at https://console.cloud.google.com/vertex-ai/publishers/google...
And for whatever reason that is where the documentation is.
Sample request:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/imagegeneration@002:predict"
Sample request.json:

{
  "instances": [
    {
      "prompt": "TEXT_PROMPT"
    }
  ],
  "parameters": {
    "sampleCount": IMAGE_COUNT
  }
}
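For comparison, the same call from Python (a sketch, as untested as the curl above; PROJECT_ID and TEXT_PROMPT are the same placeholders, and the third-party requests library is assumed):

import json
import subprocess

import requests

# Same access token the curl example gets from gcloud.
token = subprocess.check_output(
    ["gcloud", "auth", "print-access-token"], text=True
).strip()

ENDPOINT = (
    "https://us-central1-aiplatform.googleapis.com/v1"
    "/projects/PROJECT_ID/locations/us-central1"
    "/publishers/google/models/imagegeneration@002:predict"
)

body = {
    "instances": [{"prompt": "TEXT_PROMPT"}],
    "parameters": {"sampleCount": 2},
}

resp = requests.post(
    ENDPOINT,
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json; charset=utf-8",
    },
    json=body,
)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))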
Sample response:

{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID",
  "model": "projects/PROJECT_ID/locations/us-central1/models/MODEL_ID",
  "modelDisplayName": "MODEL_DISPLAYNAME",
  "modelVersionId": "1"
}
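Assuming the response really comes back in that shape, extracting the images is just base64 decoding (again a sketch I haven't run):

import base64
import json

# Assumes the JSON response shown above was saved to response.json.
with open("response.json") as f:
    response = json.load(f)

for i, pred in enumerate(response["predictions"]):
    # mimeType "image/png" -> file extension "png"
    ext = pred["mimeType"].split("/")[-1]
    with open(f"image_{i}.{ext}", "wb") as out:
        out.write(base64.b64decode(pred["bytesBase64Encoded"]))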
Disclaimer: Haven't actually tried sending a request...

> Imagen 2’s dataset and model advances have delivered improvements in many of the areas that text-to-image tools often struggle with, including rendering realistic hands and human faces and keeping images free of distracting visual artifacts.
Instead they use this:
>The robin flew from his swinging spray of ivy on to the top of the wall and he opened his beak and sang a loud, lovely trill, merely to show off. Nothing in the world is quite as adorably lovely as a robin when he shows off - and they are nearly always doing it.
And the result they show off is a photograph of a robin. Cool. SDXL[0] can do exactly the same thing given the same prompt; in fact, even SD 1.5 could do it easily[1].
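For anyone who wants to try the comparison, a minimal sketch using the diffusers library with the same prompt (assumes a CUDA GPU and the diffusers/torch packages; the SDXL weights download on first run):

import torch
from diffusers import StableDiffusionXLPipeline

# Load the SDXL base model in half precision.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

prompt = (
    "The robin flew from his swinging spray of ivy on to the top of the "
    "wall and he opened his beak and sang a loud, lovely trill, merely "
    "to show off."
)

# One image with the default scheduler and step count.
image = pipe(prompt).images[0]
image.save("robin.png")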
My kids found it organically and were happily creating all sorts of DALL·E 3 images.
(DALL-E pretends to do that, but it's actually just using GPT-4 Vision to create a description of the image and then prompting based on that.)
Live editing tools like https://drawfast.tldraw.com/ are increasingly being built on top of Stable Diffusion, and are far and away the most interesting way to interact with image generation models. You can't build that on DALL-E 3.
Then this: https://civitai.com/
And I have completely abandoned DALL-E and will likely never use it again.
[1] https://cloud.google.com/vertex-ai/docs/generative-ai/image/...
What a shitshow.
It installs dozens upon dozens of models and related scripts painlessly.
Results (these are the only two images I generated): https://imgur.com/a/JIiuDt9
[1] https://github.com/GoogleCloudPlatform/generative-ai/blob/ma...
[2] https://console.cloud.google.com/vertex-ai/publishers/google...
I still can't understand how it got released and advertised.