1. Audit / evaluation / quality assurance teams exist across multiple verticals from multinationals to government, and cannot reliably function when overly subservient to the production or “value creating” side.
2. Boeing is a good and timely example of the consequences of said internal checks and balances collapsing under “value creation” pressure. That was a catastrophic failure which still can’t reasonably be compared to the downside of misaligned AI.
>>doktri+(OP)
I agree with you on both points, but they do have QA, which covers your point 1. The long-term risk team was more of a research/futurology/navel-gazing entity than a QA/audit function. I would say that any safety/alignment test you can feasibly run should be part of the CI/CD pipeline and also be run during training. That's not what that group was doing.