
Hey,

I worked on a really perf-sensitive system, and for perf tests we would re-run the last x commits each time to get rid of the busy-VM syndrome.

It meant that the margin of error could be much smaller, since every commit was measured under the same conditions.
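To make that concrete, here is a rough sketch of the idea, not what we actually ran. It assumes you already have timings for the last x commits measured in the same session on the same VM; `relative_to_recent` and the commit names are made up for illustration.

```python
from statistics import median

def relative_to_recent(timings, head):
    """Compare the head commit against the other commits measured in the same session.

    timings: dict mapping a commit id to the list of durations (seconds)
             measured for that commit during this session.
    head:    the commit id under test.
    Returns head's median time divided by the median of the other commits'
    median times (1.0 means no change, 1.2 means 20% slower).
    """
    head_time = median(timings[head])
    others = [median(runs) for commit, runs in timings.items() if commit != head]
    return head_time / median(others)

# Hypothetical session on a quiet VM.
quiet = {"c1": [10.0, 11.0, 10.0], "c2": [10.0, 10.0, 12.0], "head": [12.0, 12.0, 13.0]}
# Same commits on a busy VM where everything runs twice as slow.
busy = {commit: [2 * t for t in runs] for commit, runs in quiet.items()}

print(relative_to_recent(quiet, "head"))
print(relative_to_recent(busy, "head"))
```

Because the slowdown from a busy VM hits every commit in the session, it mostly cancels out in the ratio, which is why the margin of error shrinks.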

You might want to consider it as a midway step between VMs and scheduling on laptops (those poor laptop batteries!)

replies(1): >>imslav+z5

>>ed_ell+(OP)
That's a good way to address the noise on VMs! We do something different but in a similar spirit: when we compare to the main branch, we calculate the baseline based on 1-2 weeks' worth of historical data on main (we identify the latest step change with a simple linear regression). This way we approximate the baseline based on ~100 data points, which also helps to address the variance.
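A minimal sketch of that kind of baseline, for illustration only. It fits a single step by least squares (the same as regressing on a 0/1 "after the step" indicator) and averages the points after it. The function names, `min_segment`, and the 5% `min_shift` threshold are assumptions for this example, not the actual implementation.

```python
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean of values."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def find_step(history, min_segment=5):
    """Return the split index where a single step function fits best.

    Tries every split that leaves at least min_segment points on each side
    and keeps the one with the smallest total squared error.
    Returns None when the history is too short to split.
    """
    best_index, best_cost = None, float("inf")
    for i in range(min_segment, len(history) - min_segment + 1):
        cost = sse(history[:i]) + sse(history[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index

def baseline(history, min_segment=5, min_shift=0.05):
    """Baseline from measurements on main, oldest first (positive timings).

    If the best step shifts the mean by more than min_shift (relative),
    only the points after the step are averaged; otherwise all of them.
    """
    i = find_step(history, min_segment)
    if i is not None:
        before, after = mean(history[:i]), mean(history[i:])
        if abs(after - before) / before > min_shift:
            return after
    return mean(history)

# Hypothetical history: main got slower halfway through the window.
history = [100.0] * 50 + [120.0] * 50
print(baseline(history))
```

A PR result would then be compared against `baseline(history)` rather than against a single noisy run of main.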

Of course, re-running the code from main and the PR side by side on the same VM would be the best, but it would cost a lot more money (especially once you factor in GPUs). We considered it but opted for the strategy I outlined above; it's mainly a trade-off between accuracy and cost.