Is there any reference that explains the deep technicalities of backtesting and how it is supposed to actually influence your model development? It seems to me that one could spend a huge amount of effort on backtesting that would distract from building out models and tooling and that that effort might not even pay off given that the backtesting environment is not the real market environment.
https://en.wikipedia.org/wiki/Long-Term_Capital_Management was an example of both of those. They based their predictions on past behaviour, which proved incorrect. And if other market participants figure out that a large player is in trouble and will have to sell a load of bonds, they all drop their bids to take advantage of that.
A lot of deviations from efficient market theory are like that - not deeply technical but about human foolishness.
We do not use it as a way to determine profitability.
By assessing risk, do you just mean checking that it doesn't dump all your money and that you can at least maintain a stable investment cache?
Are you willing to say more about correctness? Is the correctness of the models, of the software, or something else?
Correctness has to do with whether the algorithm performed the intended actions in response to the inputs/events provided to it, nothing more. For the most part, the correctness of an algorithm can be tested the same way most software is tested, i.e. with unit tests. But it's also worth testing the algorithm against live data or by backtesting it, since it isn't feasible to cover every possible scenario in giant unit tests; you can get pretty good coverage of a variety of real-world scenarios by backtesting.
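The idea above can be sketched as a tiny harness. This is a minimal illustration, not anyone's actual system: all names (`Tick`, `ThresholdAlgo`, `replay`) are made up, and the "strategy" is a toy. The point is only that backtesting-as-correctness-testing looks like a unit test where the fixture is a recorded event stream and the assertion is on the actions the algorithm emitted.

```python
# Hypothetical sketch: checking algorithm correctness by replaying recorded
# events and asserting on the actions produced, unit-test style.
from dataclasses import dataclass


@dataclass
class Tick:
    """One recorded market event (all fields illustrative)."""
    symbol: str
    price: float


class ThresholdAlgo:
    """Toy strategy: emit BUY below `low`, SELL above `high`."""

    def __init__(self, low: float, high: float):
        self.low = low
        self.high = high
        self.actions = []  # actions emitted in response to events

    def on_tick(self, tick: Tick) -> None:
        if tick.price < self.low:
            self.actions.append(("BUY", tick.symbol, tick.price))
        elif tick.price > self.high:
            self.actions.append(("SELL", tick.symbol, tick.price))


def replay(algo: ThresholdAlgo, events: list[Tick]) -> list[tuple]:
    """Feed a recorded event stream to the algorithm, as a backtest harness would."""
    for ev in events:
        algo.on_tick(ev)
    return algo.actions


# Correctness check against one recorded scenario: given these inputs,
# the algorithm must have performed exactly these intended actions.
recorded = [Tick("XYZ", 101.0), Tick("XYZ", 94.0), Tick("XYZ", 106.0)]
algo = ThresholdAlgo(low=95.0, high=105.0)
actions = replay(algo, recorded)
assert actions == [("BUY", "XYZ", 94.0), ("SELL", "XYZ", 106.0)]
```

Swapping in a day's worth of captured live data for `recorded` turns the same assertion structure into a backtest: you get coverage of real-world event sequences without having to enumerate them by hand in unit tests.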