zlacker

[return to "GitHub Copilot available for JetBrains and Neovim"]
1. Spinna+ad 2021-10-27 18:42:37
>>orph+(OP)
How well can Copilot write unit tests? This seems like an area where it could be really useful and actually improve software development practices.

2. manque+Qe 2021-10-27 18:49:51
>>Spinna+ad
Writing tests just for the sake of coverage, which is what a lot of orgs do, is already practically useless. Copilot could maybe generate such tests, but since they don't materially impact quality now, automating them wouldn't make much difference.

One of the main value props of writing meaningful unit tests is that it helps the developer think differently about the code under test, and that improves the quality of the code's composition.
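
To make that concrete (hypothetical Python, not anyone's real code): a coverage-chasing test can execute every line without verifying anything, while a meaningful test forces you to pin down the behavior:

    import pytest

    def discounted_price(price, rate):
        if rate < 0 or rate > 1:
            raise ValueError("rate must be between 0 and 1")
        return round(price * (1 - rate), 2)

    def test_runs_for_coverage():
        # Executes every line (100% coverage), verifies nothing.
        discounted_price(100.0, 0.2)
        try:
            discounted_price(100.0, 1.5)
        except ValueError:
            pass

    def test_meaningful():
        # Asserting exact values is what makes you think about the code.
        assert discounted_price(100.0, 0.2) == 80.0
        assert discounted_price(100.0, 0.0) == 100.0
        with pytest.raises(ValueError):
            discounted_price(100.0, 1.5)

Both tests are green and the first one maxes out the coverage dashboard, but only the second one would catch a real bug.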

3. Graffu+Vl 2021-10-27 19:23:53
>>manque+Qe
Why is that useless? Codebases I have worked on that had high code coverage requirements had very few bugs.

* It promotes actually looking at the code before considering it done

* It promotes refactoring

* It helps to prevent breaking changes for stuff that wasn't supposed to change
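
A tiny (made-up) example of that last point, where a test pins down behavior that other code depends on:

    import re

    def slugify(title):
        # Lowercase, collapse runs of non-alphanumerics into single dashes.
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

    def test_slugify_output_is_stable():
        # URLs elsewhere depend on these exact slugs; if someone "improves"
        # slugify and the output shifts, this fails before anything ships.
        assert slugify("Hello, World!") == "hello-world"
        assert slugify("  GitHub Copilot  ") == "github-copilot"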

4. tikhon+cW 2021-10-27 23:03:36
>>Graffu+Vl
I saw a cool study recently (summarized well here[1]): an empirical experiment on how well code coverage predicts a test suite's ability to catch bugs. They found that the number of test cases correlated well with the suite's effectiveness, but, when controlling for the number of tests, code coverage didn't.

It was a pretty thorough study:

> Our study is the largest to date in the literature: we generated 31,000 test suites for five systems consisting of up to 724,000 lines of source code. We measured the statement coverage, decision coverage, and modified condition coverage of these suites and used mutation testing to evaluate their fault detection effectiveness. We found that there is a low to moderate correlation between coverage and effectiveness when the number of test cases in the suite is controlled for.
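
(If "mutation testing" is unfamiliar: it seeds small artificial bugs, called mutants, into the code and measures how many the test suite catches. A toy sketch of the idea; real tools such as mutmut or PIT automate this across a whole codebase:)

    def add(a, b):
        return a + b   # original

    def add_mutant(a, b):
        return a - b   # mutant: "+" flipped to "-"

    def weak_suite(fn):
        # Passes for the original AND the mutant: the mutant "survives",
        # i.e. this suite has low fault-detection effectiveness.
        return fn(0, 0) == 0

    def strong_suite(fn):
        # Fails for the mutant: the mutant is "killed".
        return fn(2, 3) == 5

    assert weak_suite(add) and weak_suite(add_mutant)
    assert strong_suite(add) and not strong_suite(add_mutant)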

Given their data, their conclusion seems pretty plausible:

> Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness.

That's certainly how I approach testing: I value having a thorough test suite, but I do not treat coverage as a target or use it as a requirement for other people working on the same project.

[1]: https://neverworkintheory.org/2021/09/24/coverage-is-not-str...
