I'd imagine if one was fully onboard with the AI/LLM commercialization train, there's no better spot than OpenAI right now.
And no one seems to have heard what Ilya Sutskever's status at OpenAI is either.
It seems from the outside like he locked Tesla into a losing, NN-only direction for autonomous driving.
(I don’t know what he did at openai.)
Plugins were a failure. GPTs are a little better, but I still don't see the product market fit. GPT-4 is still king, but not by that much any more. It's not even clear that they're doing great research, because they don't publish.
GPT-5 has to be incredibly good at this point, and I'm not sure that it will be.
Relative to his level of fame, his actual level of contribution as far as pushing forward AI, I’m not so sure about.
I deeply appreciate his educational content and I’m glad that it has led to a way for him to gain influence and sustain a career. Hopefully he’s rich enough from that that he can focus 100% on educational stuff!
I really hope his next gig is at an actually open AI company.
Are you sure about your perspective?
https://scholar.google.com/citations?view_op=view_citation&h...
https://github.com/karpathy/nanoGPT
And its accompanying video series:
https://karpathy.ai/zero-to-hero.html
Another example (although I honestly don't remember if he made this one between jobs) is: https://github.com/karpathy/micrograd
He contributed to pushing forward AI, no “actual” about it. The loss of a great educator should be viewed with just as much sadness as the loss of a great engineer.
So, he has made theoretical contributions to the space, contributions to prominent private organizations in the space, and broadly educated others about the space. What more are you looking for?
Anybody have better info than my idle guess?
—Carver Mead, 1979 (employee at Xerox PARC), discussing why Xerox needed to focus more on adopting integrated circuits into the computers they had already developed, instead of continuing to just make increasingly-obsolete copiers.
(Eg Mercedes has achieved level 3 already).
He did pioneering research in image captioning - aligning visual and textual semantic spaces - the conceptual foundation of modern image generators. He also did an excellent analysis of RNNs - one of the first and best explanations of what happens under the hood of a language model.
[0]: >>39365288
----
"Xerox's top executives were for the most part salesmen of copy machines. From these leased behemoths the revenue stream was as tangible as the `click` of the meters counting off copies, for which the customer paid Xerox so many cents per page (and from which Xerox paid its salespersons their commissions). Noticing their eyes narrow [at R&D's attempts at asking to market their computer, one] could almost hear them thinking: 'If there is no paper to be copied, where's the `click`?' In other words: 'How will I get paid?' "
—Michael Hiltzik's "Dealers of Lightning" (p272)
Being a popular AI influencer is not necessarily correlated with being a good researcher though. And I would argue there is a strong indication that it is negatively correlated with being a good business leader / founder.
Here's to hoping he chills out and goes back to the sorely needed lost art of explaining complicated things in elegant ways, and doesn't stray too far back into wasting time with all the top sheisters of the valley.
Edit: the more I think about it, the more I realize that it probably screws with a person to have their tweets get b-lined to the front page of hackernews. It makes you a target for offers and opportunities because of your name/influence, but not necessarily because of your underlying "best fit"
In 1979, I doubt copiers were 'increasingly obsolete'; I'd expect the market was growing rapidly. Laser printers, email, the Internet, didn't yet exist; PCs barely existed, and not in offices. Almost everywhere would have used typewriters, I suppose.
[1] https://en.wikipedia.org/wiki/Mead%E2%80%93Conway_VLSI_chip_...
If OpenAI, Tesla and Google cannot retain him, then probably nobody can. Probably he'll be doing YouTube videos all day long.
>Laser printers, email, the Internet, didn't yet exist
Actually, all three did; the latter was in the form of ARPANET [to be technical, not "The Internet"].
For one potentially compelling example that happily (sadly?) isn’t using LLMs: the SimulaVR people are developing their own Linux fork of some kind, claiming it’s necessary for comfortable VR use for office work. And I sorta believe them!
https://en.wikipedia.org/wiki/Carver_Mead
Learning about the interconnectedness of all this historic intellectual "brain theft," keeps me excited for an AGI-future, post-copyright/IP. What are we going to accomplish [globally] when you can't just own brilliant ideas?!
To attract someone at Karpathy's level you would need a project that is both wildly challenging (and yet not the typical startup "challenging" because it's a poorly thought out idea) and requires the kind of resources (compute, data, human brains in vats, etc) that would make your place look far more interesting than OpenAI.
But, hardest of all, you would need startup founders that could tame their egos enough to let someone like Karpathy shine. I haven't talked to a Bay area startup founder in a while who wouldn't completely squander that kind of talent by attempting to force their own half-baked ideas on him, and then try to force him out months later when he couldn't ship those poorly thought out products citing lack of "leadership".
Here's a proposal. Send $500M to NVIDIA for GPUs. Send $100M to AMD to balance the odds. Spend $100M to build a new chip as a hail mary.
Spend the rest on $10M comp packages.
How close did I get?
Good.
I have no idea what's really going on inside that company but the way the staff were acting on twitter when Altman got the push was genuinely scary, major red flags, bad vibes, you name it, it reeked of it.
He led a team working on one of the most common uses of DNNs; if that isn't 'pushing AI forward', I think you're confused. It's certainly pushing it forward quite a bit more than the publishing game, where 99% of the papers are ignored by the people actually building real applications of AI.
> Actually, all three did; the latter was in the form of ARPANET [to be technical, not "The Internet"].
True, but a technicality. Very few people knew they even existed, and they had zero impact on Xerox copier sales.
The guy has many tens of millions of dollars most likely.
I mean, I don't know why people still try to devalue educating the masses. Anyone who's had to knowledge-share knows how hard it is to make a concise but approachable explanation for someone who knows relatively little about the field.
In addition, he's still probably standing well above the 80% mark in terms of technical prowess. Even without influencer fame, I'm sure he can get into any studio he wishes.
if only we compensated that knowledge properly. Youtube seems to come the closest, but Youtube educators also show how much time you have to spend attracting views instead of teaching expertise.
> It makes you a target for offers and opportunities because of your name/influence, but not necessarily because of your underlying "best fit"
That's unfortunately life in a nutshell. The best fits rarely end up getting any given position. May be overqualified, filtered out in the HR steps, or rejected for some ephemeral reason (making them RTO, not accepting their counteroffer, potentially illegal factors behind closed doors, etc).
it's a crappy game so I don't blame anyone for using whatever cards they are dealt.
Television content for children is often called 'Children's Programming'
IMO governments, like websites, should be boring but effective, focused on small day to day improvements, not all flash and empty marketing chasing cultural trends...
Does not mean they did not exist. See citations, below:
https://en.wikipedia.org/wiki/Laser_printing (see 2nd intro paragraph)
https://en.wikipedia.org/wiki/History_of_email (see 3rd intro paragraph)
Take CS 231, for example, which stands as one of Stanford's most popular AI/ML courses. Think about the number of students who have taken this class from around 2015 to 2017 and have since advanced in AI. It's fair to say a good chunk of credit goes back to that course.
Instructors who break it down, showing you how straightforward it can be, guiding you through each step, are invaluable. They play a crucial role in lowering the entry barriers into the field. In the long haul, it's these newcomers, brought into AI by resources like those created by Karpathy, who will drive some of the most significant breakthroughs. For instance, his "Hacker's Guide to Neural Networks," now almost a decade old, provided me with one of the clearest 'aha' moments in understanding back-propagation.
Idk, I just tried Gemini Ultra and it's so much worse than GPT4 that I am actually quite shocked. Trying to ask it any kind of coding question ends up being this frustrating and honestly bizarre waste of time as it hallucinates a whole new language syntax every time and then asks if you want to continue with non-working, in fact non-existing, option A or the equally non-existent option B until you realise that you've spent an hour trying to make it at least output something that is even in the requested language and finally that it is completely useless.
I'm actually pretty astonished at how far Google is behind and that they released such a bunch of worthless junk at all. And have the chutzpah to ask people to pay for it!
Of course I'm looking forward to gpt-5 but even if it's only a minor step up, they're still way ahead.
I was surprised and touched by their loyalty, but maybe I missed something you noticed.
edit: as pointed out, this was indeed a pretty esoteric example. But the rest of my attempts were hardly better, if they had a response at all.
That blog post inspired Alec Radford at Open AI to do the research that produced the "Unsupervised sentiment neuron": https://openai.com/research/unsupervised-sentiment-neuron
Open AI decided to see what happened if they scaled up that model by leveraging the new Transformer architecture invented at Google, and they created something called GPT: https://cdn.openai.com/research-covers/language-unsupervised...
The language in question was only open sourced after GPT-4's training cutoff, so I couldn't compare. That's actually why I tried it in the first place. And yes, I do expect it to be better - GPT-4 isn't perfect, but I don't recall it ever hallucinating quite that hard. In fact, its answer was basically that it didn't know.
And when I asked it questions with other, much less esoteric code like "how would you refactor this to be more idiomatic?" I'd get either "I couldn't complete your request. Rephrase your prompt and try again." or "Sorry, I can't help with that because there's too much data. Try again with less data." GPT-4 was helpful in both cases.
LiDAR directly measures the distance to objects. What Tesla is doing is inferring it from two cameras.
There has been plenty of research to date [1] that LiDAR + Vision is significantly better than Vision Only especially under edge case conditions e.g. night, inclement weather when determining object bounding boxes.
[1] https://iopscience.iop.org/article/10.1088/1742-6596/2093/1/...
"In fact, I’d go as far as to say that
The concept of attention is the most interesting recent architectural innovation in neural networks."
when the initial attention paper was less than a year old, and two years before the transformer paper.This isn’t a race to write the most lines of code or the most lines of text. It’s a race to write the most correct lines of code.
I’ll wait half an hour for a response if I know I’m getting at least staff engineer level tier of code for every question
I'd say that that his work on AI has been significant and his ability to teach has contributed to that greatly.
What do you know about his work?
He's been leading the vision team at Tesla, implementing in the field all the papers available on autonomous driving and vision (he explicitly wrote that). He has not published about it, surely due to obligations to Tesla.
Sufficiently accurate responses can be fed into other systems downstream and cleaned up. Even code responses can benefit from this by restricting output tokens using the grammar of the target language, or iterating until the code compiles successfully.
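Roughly, the "iterate until it compiles" loop looks something like the sketch below; ask_llm is just a placeholder for whatever completion API you're using (not a real library call), and "compiles" here only means the file byte-compiles - swap in your real checks or tests.

    import os, pathlib, subprocess, tempfile

    def ask_llm(prompt: str) -> str:
        # Placeholder: plug in your chat-completion call of choice here.
        raise NotImplementedError

    def generate_compiling_code(task: str, max_attempts: int = 3):
        prompt = f"Write a single Python file that does the following:\n{task}"
        for _ in range(max_attempts):
            code = ask_llm(prompt)
            fd, tmp = tempfile.mkstemp(suffix=".py")
            os.close(fd)
            pathlib.Path(tmp).write_text(code)
            # "Compiles" = byte-compiles; replace with a real build/test step if you have one.
            result = subprocess.run(["python", "-m", "py_compile", tmp],
                                    capture_output=True, text=True)
            if result.returncode == 0:
                return code
            # Feed the compiler error back into the next attempt.
            prompt += f"\n\nYour previous attempt failed with:\n{result.stderr}\nPlease fix it."
        return None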
And for a decent number of LLM-enabled use cases the functionality unlocked by these models is novel. When you're going from 0 to 1 people will just be amazed that the product exists.
But seriously, right now, with full attention on LLMs and many brains at work, there is no single key person. The question of 'who said it first' isn't that important for progress. With experts leaving, OpenAI will gradually lose its leadership. Others will catch up. Which is good in general; no one should have a monopoly on AI. I wish it were that easy with hardware too...
As the popularity has exploded, and ethical questions have become increasingly relevant, it is probably worth taking some time to nail certain aspects down before releasing everything to the public for the sake of being first.
But so far nobody is even in the same ballpark. And not just freely distributed models, but proprietary ones backed by big money, as well.
It really makes one wonder what kind of secret sauce OpenAI has. Surely it can't just be all that compute that Microsoft bought them, since Google could easily match that, and yet...
People keep repeating this. I seriously don't know why. Stereo vision gives pretty crappy depth, ask anyone who has been playing around with disparity mapping.
Modern machine vision requires just one camera for depth. Especially if that one camera is moving. We humans have no trouble inferring depth with just one eye.
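To see why stereo depth degrades so badly at range, here's a back-of-the-envelope sketch (illustrative only, with assumed camera parameters - not anyone's actual pipeline) of the pinhole relation Z = f*B/d: a fixed sub-pixel matching error barely matters at 5 m but throws the estimate off by tens of metres at 80 m.

    # Illustrative numbers only: assumed focal length, baseline and matching error.
    focal_px = 640.0          # focal length in pixels (assumed)
    baseline_m = 0.12         # distance between the two cameras, metres (assumed)
    disparity_err_px = 0.25   # sub-pixel matching error (assumed)

    def depth_from_disparity(disparity_px: float) -> float:
        # Pinhole stereo relation: Z = f * B / d
        return focal_px * baseline_m / disparity_px

    for true_depth_m in (5.0, 20.0, 80.0):
        true_disparity = focal_px * baseline_m / true_depth_m
        estimated = depth_from_disparity(true_disparity - disparity_err_px)
        print(f"{true_depth_m:5.1f} m -> estimated {estimated:6.1f} m "
              f"(error {estimated - true_depth_m:+.1f} m)")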
He might be talented, but if he can’t be trusted, he needs to go.
It is practically unusable and I'll likely cancel paid plan soon.
Here's a gem of an educator. Check out his other videos.
I don’t see that as an issue though, just a natural consequence of his great work in teaching neural networks!
Even the HN discussion around this had comments like "this feels my baby learning to speak.." which are the same comparisons people were saying when LLMs hit mainstream in 2022
This is insane
It's magic, until it isn't.
I'd agree with that, however I've always wondered how easy it is for folks at that level to get hands on keyboards and not wind up spending their days polishing slide decks for talks instead.
If ChatGPT doesn't have product-market fit, what actually has?
Here are some hilarious highlights: https://twitter.com/Suhail/status/1757573182138290284
I don't think it's hugely surprising given the massive hype. No doubt OpenAI are doing impressive things, but it's normal for the market to over value it initially as everyone tries to get onboard, and then for it to fall back to a more sensible level.
- it costs too much
- it's ugly
- humans have only vision
Tesla engineers wanted LiDAR badly, but they have been allowed to use it only on one model.
I think that autonomous driving in all conditions is mostly impossible. It will be widely available in very controlled and predictable conditions (highways, small and perfectly mapped cities etc).
And about Mercedes vs Tesla capabilities, it's mostly marketing... If you're interested I'll find an article that talked about that.
Strikes a balance between sounding engaging and soothing at the same time.
Personally, the chat UI is the main limiting factor in my own adoption, because a) it’s not in the tool I’m trying to use, and b) it’s quicker for me to do the work than describe the work I need doing.
He uses elegant hand-drawn notes rather than Manim - although 3blue1brown's open sourced visualization library is beautiful too, I think this makes it extra impressive.
I'd probably say they should:
- allocate the $500M to the new chip, $100M to each of AMD and NVIDIA, then:
- never officially hire any more staff (these founders are 10E6X devs!)
- start "subtly" liquidating the AMD and NVIDIA chips after a year ("Tell HN: IlPathAI are liquidating their GPUs? >I bought a used GH200 off eBay and the shipping label was covering up previous shipping label for Andrej's shipping container treehouse >Are they getting quick cash to finance their foundry run on the new chips? It's that good??").
- Release a vague "alignment" solution on a chatgpt-generated kickstarter clone, take 3 years to "develop" it.
- Raise a series A (maybe a pre-A, or a post-seed. Honestly, maybe even a re-seed with this valuation!) off the hype (some obviously stable diffusion-generated images help here).
- Sell 30% of their shares in a secondary, profit some billions.
- When everyone starts getting suspicious, time to take out those GH200s you "sold" on ebay out of storage (those buyers were just sockpuppets - investors from the family/"friends" round), repackage them in some angled magnesium alloy. Release them to great fanfare. Crowd briefly ecstatic, concern sets in - "this has the same performance as the GH200? That was like 4 years ago!".
- Call the "performance issues" some form of "early access syndrome" and succeed in shifting blame back onto the consumer.
- Release a "performance patch" that in actuality just freaking overclocks and overvolts the device, occasionally secretly training on the user's validation set using an RDMA exfiltration exploit. This gets them to 2028, when the modified firmware on all devices spontaneously causes a meltdown - should've written it in Rust - that should've been a signed int! The fans thought it was suddenly 1773, ran in reverse so fast the whole device melted (aww all that IP down the drain)!
- When asked how on earth that could make any sense, dodge the question with the news that "We just had the unfortunate news that one of the greybeards who wrote the firmware previously at Siemens and then the DoE, has programmed his last PLC. He died glowing peacefully last night surrounded by layered densities of gasses. We are too sad and bankrupt to go on."
- Declare bankruptcy
- Become alt-right pundits on Y.com (If they haven't already wrapped around to AA.com - they managed to grab that domain after some airline went bankrupt after an embarrassing incident involving a 787 Max, the latest Stable Diffusion model, a marketing executive, some loose screws, and a Boeing QA contractor back there who might Not Be Real).
- Start a war with a "totally harmless" post, later admit it was "poorly worded". - Use some saved funds to "find a way" for IlPathAI, Inc. to leave bankruptcy, pivot to a chat app (you actually just buy HipChat again). Resell that after reusing it for a few particularly juicy govt. tenders. Pivot to defense contracting. End up with enough money for the rest of the millennium.
- Write a joint autobiography called "The Alignment Problem", send it to your "kickstarter" backers. Print the book using old school metal typecasting because they forgot TeX, and the current language models only spit out hallucinated Marvel dialog. Screw up the kerning since you learned typesetting on the TikTok page of a French clockwork museum. Claim this was on purpose.
- The whole time, maintain an amazingly educational YouTube channel teaching Machine Learning to those who love to learn.
- Release "AGI" but it's actually just 5 lines of PyTorch that would have solved Tesla's FSD problems with mirrors. Send Douglas Hofstadter a very slightly smaller copy every day until he recurses into true AGI.
---
Well I started out serious at least (OK, only the first and second-to-last bullets were). I do genuinely believe that $100M would not be enough to produce competitive IP right now - You'd likely have to budget a majority of that to the final production run! I wonder how much you'd have to spend on making custom chips to break even with spending the money on research in the performance/model architecture side of things, on average.
They were loyal to money, nothing to be touched by.
I'm not particularly interested in having it outright program for me (other than say to sketch how to do something as inspiration, which I'll rewrite rather than copy) because I think typically I'd want to do it a certain way and it would take far longer to NLP an LLM to write it in whatever syntax than to WhateverSyntaxProgram it myself.
I could understand the sentiment when you think that OpenAI is really doubling down just on LLMs recently, and forgoing a ton of research in other fronts.
They’re rapidly iterating though, and it’s refreshing to see them try a bunch of new things so quickly while every other company is comparatively slow to release anything.
Also, all the evidence is in this thread. Clearly people are unhappy with wasting time on LLMs, when the time that was wasted was the result of obviously bad output.
But try convincing a democracy that politicians should be paid more.
For lots of applications the speed/quality/price trade offs make a lot of sense.
For example if you are doing vanilla question answering over lots of documents then 3.5 or Mixtral are better than GPT4 because the speed is important.
People just get bored and go do something else for a while sometimes. Or he's got some beef.
I love using the smaller models; Starling LM 7B and Mistral 7B have been enough for many tasks like the ones you mentioned.
[1] https://youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1TpCC7...
So me changing can never be used as an appraisal of my old organisation.
Disclaimer: regarding money, if I got enough within at most a year to retire forever after that, I might be tempted. Which won't happen, because a) I'd just leave a year later anyway and b) nobody would pay me high 7 figures just to not quit.
Yeah putting people out of work on an industrial scale is probably gonna have a pretty big effect on global GDP
Edit: I forgot, NASA trained astronaut!
Add to that a company environment that seems to be built on money-crazed, stock-option-piling engineers and a CEO that seems to have gotten power-crazed.. I mean, they grew far too fast, I guess..
Transformers were invented with the support of Google (by the researchers, not by Google).
Open community has been creating better and better models with a group effort; like how ML works itself, it's way easier to try 100,000 ideas on a small scale than it is to try a couple of ideas on a large scale.
Initially it felt like the singularity was at hand. You played with it, got to know it, the computer was talking to you, it was your friend, it was exciting; then you got bored with your new friend and it wasn't as great as you remembered.
Dating is often like this. You meet someone, have some amazing intimacy, then you really get to know them, you work out it wasn't for you, and it's time to move on.
If it's a conversation with "format this loose data into XML" repeated several times and then a "now format it to JSON", I find it often has trouble determining that what you just asked for is the most important thing; I think the attention model gets confused by all the preceding text.
I've been anti-Google for a while now so I'm not biased.
I don't think OpenAI have this sewn up.
I believe the basic pay is £86k. They're not brain surgeons or rocket scientists, so even that is not that bad.
But I believe the average gravy train bumps this up 3X with extras.
It's a literal gravy train of subsidies and expenses and allowances! Sure the basic pay is, well, it's arguably not that bad ... but the gravy on top is tremendous. Not to mention the network contacts which plug their gravy train into the more lucrative gravy superhighway later.
Does anyone here know?
For some advanced reasoning you're 100% right, but many times you're doing document conversion, summarizing, or RAG; in all these cases GPT-3.5 performs as well as, if not better than, GPT-4 (we can't ignore cost and speed) and it's very hard to distinguish between the two.
Multi languages would be so useful for me.
I see how most people would prefer a better but slower model when price is equal, but I'm sure many prefer a worse $2/mo model over a better $20/mo model.
Though the TTS side has some trouble switching languages if only single words are embedded. A single German word inside an English sentence can really get butchered. More training needed on multilingual texts (and perhaps preserving italics). But anyways this is really only an issue for early language learning applications in my experience.
Yeah, voters don't want to pay MPs more. Yet when voters are asked, they want highly intelligent, motivated people. They want them to have technical expertise, which means time spent in higher education. Then they want them to work a full time job in Parliament during the week, but also be open to constituency concerns on the weekend. And once all of this is pointed out, voters concede that maybe MPs deserve to be paid on par with professionals like doctors. (It's a different matter that UK doctors are underpaid).
> But I believe the average gravy train bumps this up 3X with extras.
Citation needed. They're on a shorter leash now with expenses. Don't go citing one or two bad apples either, show us what the median MP claims as expenses. According to you, it should be around £170k a year.
In general, politicians and their aides in the UK are underpaid. Most capable people find they're better off working in Canary Wharf or elsewhere in London. An example is the head of economic policy for the Labour Party earning £50k while writing policy for a £2 trn economy. (https://www.economist.com/britain/2023/01/19/british-politic...)
The Altman saga, allowing military use, and other small things step by step tarnish your reputation and push you toward mediocrity or worse.
Microsoft has many great development stories (read Raymond Chen's blog to be awed), but what they did in the end to other competitors and how they behaved removed their luster, permanently for some people.
But used as autocomplete, it's definitively a time saver. Most of us read faster than we type.
People say that, but I don't get this line of reasoning. There was something new, I learned to work with it. At one point I knew what question to ask to get the answer I want and have been using that form ever since.
Nowadays I don't get the answer I want for the same input. How is that not a result of declining quality?
That on top of my own experiences, and heaps of anecdotes over the last year.
> How would they honestly be getting worse?
The models behind GPT-4 (which is rumored to be a mixture model)? Tuning, RLHF (which has long been demonstrated to dumb the model down). The GPT-4, as in the thing that produces responses you get through API? Caching, load-balancing, whatever other tricks they do to keep the costs down and availability up, to cope with the growth of the number of requests.
--
[0] - >>39361705
That would actually increase their standing in my eyes.
Not too far from where I live, Russian bombing is destroying homes of people whose language is similar to mine and whose "fault" is that they don't want to submit to rule from Moscow, direct or indirect.
If OpenAI can somehow help stop that, I am all for it.
3b1b's main selling point is the extreme level of polish on his visualizations - something that takes a lot of time (money) to develop
The sad part is that it takes extreme luck to make it on YouTube. I wish educating skills counted for more, but unfortunately they don't, really.
https://www.youtube.com/watch?v=NfnWJUyUJYU&list=PLkt2uSq6rB...
I got some bad news for you then.
And, according to the UN, Turkey has used AI-powered, autonomous loitering drones to hit military convoys in Libya [1].
Regardless of us vs. them, AI shouldn't be a part of warfare, IMHO.
[0]: https://www.theguardian.com/world/2023/dec/01/the-gospel-how...
[1]: https://www.voanews.com/a/africa_possible-first-use-ai-armed...
In my experience, stopping to talk even for a moment already makes it submit. This makes a real conversation with pauses for thought difficult, because of the need to hurry before it cuts off.
> Personally, the chat UI is the main limiting factor in my own adoption, because a) it’s not in the tool I’m trying to use, [...]
though I haven't tried it, through some combination of the effort to set it up & it not particularly appealing to me anyway. The best it could possibly be would be like pair programming (back seat) with someone who does things the same way as you, and reviewing their code. I read faster than I type, but probably don't review non-trivial code faster than I type it. (That's not a brag, I just mean I think it's harder and takes longer to reason about something you haven't written, to understand it, and be confident you're not missing anything or haven't (both) failed to consider xyz.)
You can create and encourage small teams, but then they need to coordinate somehow. Coordination & communication overhead grows exponentially. Then you get all the "no silos" guys and then it's all over..
I am not saying this is anything, but it's definitely tingling my "something's up" senses.
Nor should nuclear weapons, guns, knives, or cudgels.
But we don’t have a way to stop them being used.
- NPR: https://www.npr.org/2021/06/01/1002196245/a-u-n-report-suggests-libya-saw-the-first-battlefield-killing-by-an-autonomous-d
- Lieber Institute: https://lieber.westpoint.edu/kargu-2-autonomous-attack-drone-legal-ethical/
- ICRC: https://casebook.icrc.org/case-study/libya-use-lethal-autonomous-weapon-systems
- UN report itself (Search for Kargu): https://undocs.org/Home/Mobile?FinalSymbol=S%2F2021%2F229&Language=E&DeviceType=Desktop&LangRequested=False
- Kargu itself: https://www.stm.com.tr/en/kargu-autonomous-tactical-multi-rotor-attack-uav
From my experience, the Turkish military doesn't like to talk about all the things they have.

- price per thing you use it with matters (a lot)
- making sure that under no circumstances the information involved is leaked (including being trained on) matters a lot in many use cases; while OpenAI does by now have some supports for this, the degree to which you can enforce it is not enough for some use cases. In some cases this is a hard constraint due to legal regulations.
- geopolitics matters, sometimes. Being dependent on a US service is sometimes a no-go (using self-hosted US software is usually fine, though). Even if you only operate in the EU.
- it's much easier to domain-adapt if the model's source/weights are accessible to a reasonable degree; while GPT-4 has a fine-tuning API, it's much, much less powerful, a direct consequence of GPT-4's highly proprietary nature
- a lot of companies are not happy at all if they become highly reliant on a single service which can change at any time in how it acts, its pricing model, or whether it's available in your country at all. So basing your product on a less powerful but replaceable or open-source AI can be a good idea, especially if you are based in a country not on the best terms with the US.
- do you trust Sam Altman at all? I do not, and it seems short-sighted to do so. In which case some of the points above become more relevant
- 3.5-level models, especially in combination with domain adaptation, can be "good enough" for some use cases
A teacher can usually adapt the content depending on the audience; I would not teach the research in my field at the same level to professionals, PhDs, master's students, bachelor's students, amateurs, or even school students.
If what I'm teaching is fairly complex, it requires a lot of background that I could teach, but I would not have the time to do so, because it would be to the detriment of other students. So, while I usually teach 'from scratch', depending on my audience I will obfuscate some details (that I can answer separately if a question is asked) and usually I will dramatically change the speed of the lessons depending on the previous background, because I need to assume that the student has the prerequisite background to understand at that speed fairly complex material.
As an example, I gave some explanations to a student from zero to transformers, it took several hours with lots of questions, the same presentation to a teacher not in the field took me 1h30 and to a PhD in a related field took 25 minutes, the content was exactly the same, and it was from scratch, but the background in the audience was fairly different.
He should teach more and take it seriously
Then he can go from being in the top .1% of income earners to the bottom .1%!/s
> Nowadays I don't get the answer I want for the same input. How is that not a result of declining quality?
Is it really the same input? An argument could easily be made that as you’ve gotten accustomed to ChatGPT, you ask harder questions, use less descriptive of language, etc.
Perhaps some automation/ai combination where you feed it learning videos and it helps create the "other" content.
Just imagine what valuation OpenAI would have as a grid monopolist combined with nVidia, ARM, Intel and AMD! Hundreds of trillions of dollars!
If that's the foundation your luster is built on - then it's not really ridiculous.
GPT popularized LLMs to the world with GPT-3, not too long before GPT-4 came out. They made a lot of big, cool changes shortly after GPT-4 - and everyone in their mother announced LLM projects and integrations in that time.
It's been about 9 months now, and not a whole lot has happened in the space.
It's almost as if the law of diminishing returns has kicked in.
Building a business requires a latticework of talented people doing their jobs properly. And building a system of checks and controls for less trusted people.
My guess is it isn't; these systems are hard to trust, and the rhetoric "we're aiming for AGI" suggests to me that they know this and AGI might be the only surefire way out.
If you tried to replace all of a dev's duties with current LLMs it would be a disaster; making sense of all that info requires focus and background thinking processes simultaneously, which I don't believe we have yet.
I don't think so. In order to be virtuous, one should have some skin in the game. I would respect dedicated pacifists in Kyiv a lot more. I wouldn't agree with them, but at least they would be ready to face pretty stark consequences of their philosophical belief.
Living in the Silicon Valley and proclaiming yourself virtuous pacifist comes at negligible personal cost.
I will check out the links. Thanks a lot.
My experience is limited. I got it to berate me with a jailbreak. I asked it to do so, so the onus is on me to be able to handle the response.
I'm trying to think of unethical things it can do that are not in the realm of "you asked it for that information, just as you would have searched on Google", but I can only think of things like "how to make a bomb", suicide related instructions, etc which I would place in the "sharp knife" category. One has to be able to handle it before using it.
It's been increasingly giving the canned "As an AI language model ..." response for stuff that's not even unethical, just dicey, for example.
The algorithm is very good at rewarding good content. (It’s also good at rewarding other things, but that is besides the point)
Note well: I haven't actually used it myself, so I'm speculating (guessing) rather than saying that this is how it is.
Talking to corporate HR is subjectively worse for most people, and objectively worse in many cases.
Google DeepMind is still an AI research powerhouse that is producing a ton of innovation both internal and publicly published.
Ironically that may be exactly what Sutskever thought about Altman.
The second that this tech was developed it became literally impossible to stop this from happening. It was a totally foreseeable consequence, but the researchers involved didn't care because they wanted to be successful and figured they could just try to blame others for the consequences of their actions.
The same nonsense happened with Apple, where like a month after they first released Apple Watch people were yelling "What's next???!!!! Apple is dying without Steve Jobs!"
Such an absurdly reductive take. Or how about just like nuclear energy and knives, they are incredibly useful, society advancing tools that can also be used to cause harm. It's not as if AI can only be used for warfare. And like pretty much every technology, it ends up being used 99.9% for good, and 0.1% for evil.
GPTs are also pretty good, and being able to invoke them in regular chat is also handy, but the lack of monetization and the ability to easily surface them outside of chatgpt is also kind of a problem. These problems are more fixable than the plugin issue IMO since I think the architecture of plugins is a limiting factor.
Miqu is pretty good. Sure, it's a leak...but there's nothing special there. It's just a 70b llama2 finetune.
I don't have logs detailed enough to be able to look it up, so I can't prove it. But for me, learning to work with AI tools like ChatGPT consists specifically of developing an intuition of what kind of answer to expect.
Maybe my intuition skewed a little over the months. It did not do that for open source models though. As a software developer understanding and knowing what to expect from a complex system is basically my profession. Not just the systems I build, maintain and integrate, but also the systems I use to get information, like search engines. Prompt engineering is just a new iteration of google-fu.
Since this intuition has not failed me in all those other areas and since OpenAI has an incentive to change the workings under the hood (cutting costs, adding barriers to keep it politically correct) and it is a closed source system that no-one from the outside can inspect, my bet is that it is them and not me.
If we cared about preventing LLMs from being used for violence, we would have poured more than a tiny fraction our resources into safety/alignment research. We did not. Ergo, we don't care, we just want people to think we care.
I don't have any real issue with using LLMs for military purposes. It was always going to happen.
Overall a chatbot like GPT-4 may be useful, but not that useful as it stands.
If you can write well, it's not really going to improve your writing. Granted, you can automate a few tasks, but it does not give you 10X or even 2X improvement as sometimes advertised.
It might be useful here and there for coding, but it's not reliable.
Maybe Congress needs the equivalent of UX and product types who actually care about what the people want... and can explain how it works to us in fancy how-to videos.
Any unionising effort consists of employees convincing other employees to join them. Some people will care more about the union's goals than others, and you can be certain that those who care more will pester those that care less to join their cause.
What happened at OpenAI was not a union effort, but I believe the comparison is excellent to understand normal dynamics of employee-based efforts.
To me it feels like it detects if the question could be answered more cheaply by the code interpreter model or 4 Turbo, and then it offloads it to that, and they just kinda suck compared to OG 4.
I’ve watched it fumble and fail to solve a problem with CI, took it 3 attempts over 5 minutes real time and just gave up in the end, a problem that OG 4 can do one shot no preamble.
We may lack the motivation and agreement to ban particular methods of warfare, but the means to enforce that ban exists, and drastically reduces their use.
Watching tools decline is frustrating.
Anyway, I'm very sure there are good MPs, but I'll not go so far as to say these people are underpaid.
I plugged the question into AI ... see below. Not to mention the subsidised "everything". Holidays in mates villas (and what mates, eh). The "director" positions on various companies, and, and ... it's not just the monetary value of these things. It's an absolute gravy train.
Generated Hypothetical Answers: we can provide some hypothetical scenarios based on varying levels of responsibility:
Scenario 1: Backbench MP without additional roles:
Salary: £86,584
Maximum Expense Claims:
Office: £85,000
Accommodation (Constituency only): £9,300
Travel: Assuming moderate travel expenses, let's estimate £10,000
Other Expenses: £5,000
Total: £86,584 + £85,000 + £9,300 + £10,000 + £5,000 = £195,884

Scenario 2: MP with Ministerial role and chairing a committee:
Salary: £86,584 + Ministerial salary (e.g., £50,000)
Expense Claims: Similar to Scenario 1, let's use the same estimates
Committee Chair allowance: £11,600

Total: £86,584 + £50,000 + £85,000 + £9,300 + £10,000 + £5,000 + £11,600 = £257,484
Remember: These are just hypothetical examples, and the actual value for any individual MP can be significantly higher or lower depending on their specific circumstances.
Unfortunately, there are no deep piles of gold without deep piles of corpses. It is inevitable, though. Prompted by the US military, other countries have also always pioneered or acquired advanced tech, and I don't see why AI would be any different: "Never send a human to do a machine's job" is as ominous now as it is dystopian, as machines increasingly become more human-like.
To give a relevant example, graph theory concepts can be found both in so many real-world systems but also in programming languages and computer systems.
I can give you something analogous though: I’m a big fan of old school east coast hip-hop. You have the established mainline artists from back then (“Nas”, “Jay-Z”, “Big L”, etc), then you have a the established underground artists (say, “Lord Finesse” or “Kool G Rap”), and then you have the really really underground guys like “Mr. Low Kash ‘n Da Shady Bunch”, “Superscientifiku”, “Punk Barbarians”, “Harlekinz”, etc.
A lot of those in that third “tier” are every bit as good as the second tier. And both tiers contain a lot of artists that could hit the quality point of the mainline artists, they just never had access to the producer and studio time that the mainline did.
I know these artists because I love going digging for the next hidden gem. Spotify recommended me perhaps one or two of all the super-underground guys.
Ironically more West-coast style, but here is a great example (explicit!): https://youtu.be/BUwJMVKSMtY?t=129
Dude could’ve measured up to the best of the west coast. Spotify monthly listener count? 891.
Algorithms are sadly win-more.
Now I’m just silently hoping a math nerd will feel inclined to share their hidden math channel gems :+)
From wikipedia:
> Between 2009 and 2012, ANNs began winning prizes in image recognition contests, approaching human level performance on various tasks, initially in pattern recognition and handwriting recognition.
That was when Neural networks became a big thing every tech person knew about, 2014 it was already in full swing and you had neural networks do stuff everywhere, like recognizing faces or classifying images.
Do we, though? Sometimes, against smaller misbehaving players. Note that it doesn't necessarily stop them (Iran, North Korea), even though it makes their international position somewhat complicated.
Against the big players (the US, Russia, China), "threat of warfare and prosecution" does not really work to enforce anything. Russia rains death on Ukrainian cities every night, or attempts to do so while being stopped by AA. Meanwhile, Russian oil and gas are still being traded, including in EU.
In LLMs it’s even worse. To make it concrete, for how I use LLMs I will not only not pay for anything with less capability than GPT4, I won’t even use it for free. It could be that other LLMs could perform well on narrow problems after fine tuning, but even then I’d prefer the model with the highest metrics, not the lowest inference cost.
The todo comments can be prompted against; just tell it to always include complete, runnable code, as its output will be executed in a sandbox without prior verification.
Keep in mind GPT-3.5 is not an overnight craze. It takes months before normal people even know what it is.
Deepfakes are going to become a concern of everyday life whether you stop OpenAI from generating them or not. The cat is out of the proverbial bag. We as a society need to adjust to treating this sort of content skeptically, and I see no more appropriate way than letting a bunch of fake celebrity porn circulate.
What scares me about deepfakes is not the porn, it's the scams. The scams can actually destroy lives. We need to start ratcheting up social skepticism asap.
When my kiddo was a sophomore in HS he decided that he wanted to be an engineer, and I thought that it would be really good for him to learn calc- my feeling was that if he got out of HS without at least getting through Calculus he'd have a really hard time.
So _I_ learned calculus. I started with basic math on Kahn and moved to the end of the Calc AB syllabus. I have, like, 500K points there. And I've watched a whole lot of STEM on YT.
Yesterday I finished a lab with Moritz Klein's Voltage Controlled Oscillators, where I was able to successfully understand the function of all the sections in the circuit.
I've been trying to follow Aaron Lanterman's Georgia Tech lectures on analog electronics.
The issue is that I have other stuff going on in my life. Like, my son studies more than I work at my full time job.
And I don't really have the pressure on me to learn the more advanced math that he's using. In fact, in the couple of years since he graduated HS, I've not really found a use for calc in my day-to-day work on any of the technical things I've done (mostly programming) and so I've lost a lot of it.
So, by contrast, my son who will be graduating as a BS in ME in May, has a far better and deeper understanding of the engineering material than I do.
And it's not just a time issue- I quit my programming job last summer because I have just enough work as a musician to pay the rent, which leaves me plenty of time to do stuff. And it's not that I don't know how to learn at a college level- I taught in an English Dept for 8 years and quit a PhD in the humanities ABD.
That's all just my experience.
I love STEM (and trades education) material on Youtube, but I really think that it's missing something to think that you could get " a better undergraduate education in STEM on youtube".
https://www.3blue1brown.com/blog/some2
https://www.youtube.com/playlist?list=PLnQX-jgAF5pTZXPiD8ciE...
1. With advanced math I feel I retain at the n-1 level. Unless I’m using it, it fades. That’s frustrating but I don’t think it’s the fault of the deliverer.
I do think working through problems has to be part of the practice, I’ve bought workbooks to have something to try to drive the knowledge into muscle memory. It still fades, but maybe not as much.
2. Calculus, in particular seems super unimportant to real life. Stats and Linear Algebra, somewhat similar in Math Level, seem much more applicable. I’m very happy to see Stats being offered in high school now as an alternative to Calculus. For Calculus, you almost need to learn 3-4 rules and someone says “trust me, just memorize these, don’t spend too much time on this.” And you would be able to live a happy productive life.
Something I've been thinking a lot about is the transition into post scarcity and how we need to dramatically alter the incentive structures and payment allocations.
I've been asking this question for about a decade and still have no good solutions: "What do you do when x% of your workforce is unemployable?" (being that x% of jobs are removed without replacement. Imagine sophisticated and cheap robots. Or if needed, magic)
This is a thought experiment, so your answer can't be "there'll be new jobs." Even if you believe that's what'll happen in real life, it's not in bounds of the thought experiment. It is best to consider multiple values of x because it is likely to change and that would more reflect a post scarcity transition. It is not outside the realms of possibility that in the future you can obtain food, shelter, and medical care for free or at practically no cost. "Too cheap to meter" if you will.
I'll give you two answers that I've gotten that I find interesting. I do not think either is great, and they each have issues. 1) Jobs programs: have people do unnecessary jobs simply to create work for which we can compensate them. 2) Entertainment: people are, on average, far more interested in watching people play chess against one another than against computers, despite the computer being better. So there are reasons this might not go away.
MOOCs are great for access, but they are not, and definitely should not be treated as, replacements. That, I am certain, would have a net negative result. I'm in grad school and there's something I tell students on the first day:
> The main value in you paying (tuition) and attending is not just to hear me lecture, but to be able to stop, interrupt, and ask questions, or visit me in office hours. If you are just interested in lectures, I've linked several high-quality ones on our website, as well as several books, blogs, and other resources. Everyone should use these. But you can't talk to a video or book; you can talk to me. You should use all of these resources to maximize your learning. I will not be taking attendance.
I'm sure many of you have had lectures with a hundred students if you went to a large school (I luckily did not). You're probably aware how different that is from a smaller course. It's great for access and certainly is monetarily efficient, but it's certainly not the most efficient way to educate an individual. MOOCs are great because they increase the ability of educators to share notes. We pull from one another all the time (with credit, of course), because if someone else teaches in a better way than I do, I should update the way I teach. MOOCs are more an extension of books. Youtube is the same, but at the end of the day you can't learn math without doing math. Even Grant states this explicitly.
To the general public sure but not research which is what produces the models.
The idea that diminishing returns has hit because there hasn't been a new SOTA model in 9 months is ridiculous. Models take months just to train. Open AI sat on 4 for over half a year after training was done just red-teaming it.
For both tier 2 and tier 3 its basically the same process. This is for Spotify btw, I have no idea how different the workflow would be for something like Apple Music.
Say the genre you want to dig around in is Hip-Hop. You are aware of Eminem and Mac Miller, and vaguely aware of a guy named Nas. By intuition you'd probably already be able to tell that Nas is more at the edge among the mainline artists.
You click on "Nas", and scroll down to Fans also like. Right now, for "Nas", it is showing "Mobb Deep", "Mos Def", "Rakim", "Big L", "Wu-Tang Clan", "Gang Starr", "Ghostface Killah", "Method Man" and "Common".
This is a mix T1 and T2. "Wu-Tang"s in there along with assorted members, but some of the other artists are much lesser known quantities.
It's a bit hard for me to decide what a Hip-Hop layman would consider the most unknown name here, but I'd venture it'd be "Big L". We click on him, do the same thing. Now we're really getting somewhere, with guys like "Inspectah Deck" and "Smif-n-Wessun". Click, dig, we get a bunch of names among which "Lord Finesse" stands out. The "Show more" at the end of Fans also like is also invaluable.
In total the dig order for me to get to the very bottom of the undeground is "Nas" > "Big L" > "Smif-n-Wessun" > "Lord Finesse" > "Channel Live" > "Ed OG & Da Bulldogs" > "Trends of Culture" > "Brokin English Klik" (358 monthly listeners).
I wouldn't consider each of those going a tier (layer) deeper. As a guy who knows waaay too much about Hip-Hop, I'd separate them into:
- T1: "Nas", "Big L"
- T2 "Smif-n-Wessun", "Lord Finesse"
- T3 "Channel Live", "Ed OG & Da Bulldogs", "Trends of Culture", "Brokin English Klik"
Perhaps "Brokin English Klik" should be in its own T4 and 3 tiers lacks the fidelity to be necessarily accurate. Not sure.
A little shortcut would be using "The Edge of $Genre" playlists. They're the pair playlists to "The Sound of $Genre" (broad slice) and "The Pulse of $Genre" (most popular) generated via everynoise.com, although as that guy got fired from Spotify its up in the air how long those will keep working.
Edit: oh, and if you run into a playlist that caters to that deep underground (in my case, that was "90's Tapes"*), that's worth its bytes in gold.
*https://open.spotify.com/playlist/2H0rNGEBShvHSGebM2m37c?si=...
TikTok's recommendation algorithm is probably one of the best. It puts content first, giving what seems only a passing weight to follower count.
That doesn't mean that having a big follower count doesn't increase your chance to go viral and gain a lot of views, but it is much more likely for great content from a small creator to go viral than mediocre content from someone with 500,000 followers.
You can also see this in that successful TikTok profiles often have a much higher view-to-follower ratio than something like YouTube.
Actually for all the attention that the top Youtubers get (in terms of revenue), the reality is that it's going to be impossible to replace teaching income with popular Youtube videos alone.
Based on what I've seen, 1 million video views on Youtube gets you something like $5-10K. And that's with a primarily US audience that has the higher CPM / RPM. So your channel(s) would need to get to about 6 million views per year, primarily US driven, in order to get to earning a median US wage.
In reality you have to know the strengths and weaknesses of any tool, and small/fast LLM can do a tremendous amount within a fixed scope. The people at Mistral get this.
So the assertion that small models aren’t as good just isn’t correct. They are amazing at certain things, and are incredibly faster and cheaper than larger models.
This can be self-fulfilling.
In an organization beyond a certain size, there will be more almost-adequate fits than there are leadership positions. This becomes something like a recognized baseline, which seems like it really needs to be scrutinized closely to see exactly who might be slightly above or below the line.
Or in a small company where there is not any almost-fit whatsoever, imagination can result in an ideal that is equally recognizable, but also might not be fully attainable.
Either way it could be OK but not exactly the best-fit.
If good fortune smiles and the rare more-than-adequate-fit appears anywhere on the horizon though, it's so unfamiliar they fly right over the radar.
LLM are not AGI, they are tools that have specific uses we are still discovering.
If you aren’t trying to optimize your accuracy to start with and just saying “I’ll run the most expensive thing and assume it is better” with zero evaluation you’re wasting money, time, and hurting the environment.
Also, I don’t even like running Mistral if I can avoid it - a lot of tasks can be done with a fine tune of BERT or DistilBERT. It takes more work but my custom BERT models way outperform GPT-4 on bounded tasks because I have highly curated training data.
Within specialized domains you just aren’t going to see GPT-4/5/6 performing on par with expert curated data.
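For the curious, here's a rough sketch of what such a fine-tune can look like, assuming the Hugging Face transformers/datasets libraries; the CSV files and their text/label columns are hypothetical stand-ins for your curated data.

    # Rough sketch, not a benchmarked recipe. Assumes transformers + datasets installed
    # and curated CSVs with "text" and "label" columns (hypothetical file names).
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "distilbert-base-uncased"
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    data = load_dataset("csv", data_files={"train": "curated_train.csv",
                                           "test": "curated_test.csv"})
    data = data.map(lambda batch: tok(batch["text"], truncation=True,
                                      padding="max_length", max_length=128),
                    batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="bounded-task-clf",
                               num_train_epochs=3,
                               per_device_train_batch_size=16),
        train_dataset=data["train"],
        eval_dataset=data["test"],
    )
    trainer.train()
    print(trainer.evaluate())  # compare against your GPT-4 baseline on the same split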
Might be hard to imagine today but back then OCR and image recognition was typically done with normal statistical regression models, and the neural networks they had then were worse than those.
Just a friendly heads-up, it’s “bee-lined.”
I normally wouldn’t point that out, but “b-lined” could be read to suggest the opposite of your intention; a lower priority, a la “B-list celebrity.”
It's not like the technology is going to disappear.
Convinced they do it on purpose.
To get 6M views you need to make one video a week that gets roughly 115k views (6,000,000 / 52 ≈ 115,385).
Ok, I’m going to call b/s here unless your expectations of Google have not gone way down over the years. Google was night and day different results twenty years ago vs ten years ago vs today. If 2004 Google search was a “10 out of 10”, then 2014 it was an “8 out of 10”, and today barely breaks a “5” in quality of results in comparison and don’t even bother with the advanced query syntax you could’ve used in the 00’s, they flat ignore it now.
(Also, side note, reread what you said in this post again. Just a friendly note that the overall tone comes across a certain way you might not have intended)
People don't participate in murder and they think others shouldn't either.
People don't participate in wars (which are essentially large scale murder) and they think others shouldn't.
Murder happens anyway. War happens anyway.
Yet if someone says 'war bad' people jump and say 'virtue signaling', but no one does that when people say 'murder bad'.
There's some really weird moral entanglement happening in the minds of people that are so eager to call out virtue signaling.
The specific policies of OpenAI or Google or whatnot are irrelevant. The technology is out of the bag.
You can easily talk while you’re doing something else.
1) Then the fans should start to get ready for the American Revolution, only three years to go...
2) But actually, all the fans will by that time have read your above comment, so they'll be prepared for what's to come.
3) But actually actually, fans running in reverse will only suck in cold air through the exhaust port and blow out warm through the intake (a bit like politicians?), so they'll still be cooling devices.
sure it would be nice if we could have Aristotelian philosopher kings style politicians but that's not human nature.
Members of Congress have plenty of support devoted to both what people say they want and what they actually positively respond to. That’s...the entire political side of the operation.
Natural language interfaces belong at the periphery, as the interface between the human and the machine. Other than that, I want my computers dumb as rocks, really fast, and totally predictable - which is basically the opposite of what you get from LLMs.
You can check his GitHub: https://github.com/karpathy
It may be that you're expecting it to do too much at once. Try giving smaller requests.
Agree he had a decent overall track record at Stanford, but that’s not how tenure works — it might have got his foot in the door as an assistant professor somewhere. He chose a much more lucrative path.