zlacker

[return to "Elon Musk sues Sam Altman, Greg Brockman, and OpenAI [pdf]"]
1. BitWis+3T 2024-03-01 16:30:05
>>modele+(OP)
Wouldn't you have to prove damages in a lawsuit like this? What damages does Musk personally suffer if OpenAI has in fact broken their contract?
2. Kepler+yW 2024-03-01 16:46:58
>>BitWis+3T
A non-profit took his money, then decided to go for-profit and compete with his own companies' AI efforts?
3. a_wild+3Z 2024-03-01 16:58:12
>>Kepler+yW
Yeah, OpenAI basically grafted a for-profit entity onto the non-profit to bypass their entire mission. They’re now extremely closed AI, and are valued at $80+ billion.

If I donated millions to them, I’d be furious.

4. api+901 2024-03-01 17:02:55
>>a_wild+3Z
It's almost like the guy behind an obvious grift like Worldcoin doesn't always work in good faith.

What gives me even less sympathy for Altman is that he took OpenAI, whose mission was open AI, and not only turned it closed but then immediately started a world tour, trying to weaponize fear-mongering to convince governments to effectively outlaw actually open AI.

5. Spooky+p11 2024-03-01 17:10:31
>>api+901
Everything around it seems so shady.

The strangest thing to me is that the shadiness seems completely unnecessary, yet it demands a very critical eye toward anything associated with OpenAI. Google seems like the good guy in AI, lol.

6. yaomin+t91 2024-03-01 17:45:41
>>Spooky+p11
It's a shame that Gemini is so far behind ChatGPT. Gemini Advanced failed softball questions when I tried it, but GPT works almost every time, even when I push the limits.

Google wants to replace the default voice assistant with Gemini; I hope they can close the gap and add natural voice responses too.

7. nebula+gl1 2024-03-01 18:38:34
>>yaomin+t91
Did you try Gemini 1.5 or just 1.0? I got an invite to try 1.5 Pro, which they said is supposed to be roughly equivalent to 1.0 Ultra, I think.

1.0 Ultra completely sucked, but when I tried 1.5 it was actually quite close to GPT-4.

It handles most things as well as ChatGPT 4, and in some cases it actually doesn't get stuck the way GPT does.

I'd love to hear other people's thoughts on Gemini 1.0 vs. 1.5. Are you guys seeing the same thing?

I have developed a personal benchmark of 10 questions that resemble common tasks I'd like an AI to do (write some code, translate a PNG containing text into usable content and then do operations on it, work with a simple Excel sheet, and a few other broadly similar tasks).

I recommend everyone who is serious about evaluating these LLMs think of a series of things they feel an "AI" should be able to do and then prepare a matching series of questions. That way you have a common reference, so you can quickly spot any advancement (or lack of it).
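
For concreteness, a minimal harness sketch in Python (assuming the `openai` client; the model name, prompts, and pass/fail checks below are made-up placeholders, not my actual questions):

    # Minimal personal-benchmark runner. Prompts and checks are
    # illustrative placeholders; swap in your own tasks.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Each entry: (task name, prompt, pass/fail check on the reply text)
    BENCHMARK = [
        ("code", "Write a Python function that reverses a linked list.",
         lambda reply: "def " in reply),
        ("excel", "Column A holds prices and column B quantities; "
                  "give a formula for total revenue.",
         lambda reply: "SUMPRODUCT" in reply.upper()),
    ]

    def run(model):
        passed = 0
        for name, prompt, check in BENCHMARK:
            reply = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            ).choices[0].message.content
            ok = check(reply)
            passed += ok
            print(f"{name}: {'PASS' if ok else 'FAIL'}")
        print(f"{passed}/{len(BENCHMARK)} passed on {model}")

    run("gpt-4")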

GPT-4 kinda handles 7 of the 10. I say kinda because it also gets hung up on the 7th task (reading a game price chart PNG with an odd number of columns and boxes) depending on how you ask. They have improved slowly and steadily over the last year to reach this point.

Bard failed all the tasks.

Gemini 1.0 failed all but 1.

Gemini 1.5 passed 6/10.

8. sema4h+OH1 2024-03-01 20:38:48
>>nebula+gl1
>a personal benchmark of 10 questions that resemble common tasks

That is an idea worth expanding on. Someone should develop a "standard" public list of 100 (or more) questions/tasks against which any AI version can be tested to get the program's current "score" (though some tasks might need a subjective evaluation when pass/fail isn't clear-cut).
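
A sketch of what one entry in such a list might look like, with a rubric for the subjective cases (this schema is just an illustration, not an existing standard):

    # Hypothetical schema for one entry in a shared public task list.
    # Scoring is either automatic (a checkable condition) or a 0-2 rubric
    # for tasks where pass/fail isn't clear-cut.
    from dataclasses import dataclass, field

    @dataclass
    class Task:
        task_id: int
        prompt: str
        scoring: str            # "auto" or "rubric"
        auto_check: str = ""    # e.g. a regex the answer must match
        rubric: dict = field(default_factory=dict)

    TASKS = [
        Task(1, "Summarize this CSV of monthly sales into a table with totals.",
             scoring="rubric",
             rubric={0: "wrong or missing table",
                     1: "table present but totals incorrect",
                     2: "correct table and totals"}),
    ]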

9. jprete+Sv2 2024-03-02 03:48:22
>>sema4h+OH1
That's what a benchmark is, and they're all gamed by everyone training models, even unintentionally, because the benchmarks end up in the training data.

The advantage of a personal set of questions is that you might be able to keep it out of the training set, if you don't publish it anywhere, and if you make sure cloud-accessed model providers aren't logging the conversations.
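
One crude way to check whether a question has already leaked is a memorization probe: give the model the first half of the question verbatim and see if it reproduces the rest. A rough sketch (assuming the `openai` client; the 0.8 threshold is a guess):

    # Crude memorization probe: if the model completes the second half of a
    # benchmark question near-verbatim, assume it has leaked into training data.
    from difflib import SequenceMatcher
    from openai import OpenAI

    client = OpenAI()

    def probably_leaked(question, model="gpt-4"):
        half = len(question) // 2
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user",
                       "content": "Continue this text exactly: " + question[:half]}],
        ).choices[0].message.content
        score = SequenceMatcher(None, question[half:], reply).ratio()
        return score > 0.8  # threshold is arbitrary; calibrate on known-clean text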
