https://twitter.com/ridiculous_fish/status/14527512360594513...
Property-based testing tools (tools which generate random inputs and check that stated properties hold for all of them; a minimal Hypothesis sketch follows this list):
* QuickCheck: https://hackage.haskell.org/package/QuickCheck
* Hypothesis: https://hypothesis.readthedocs.io/en/latest/
* JUnit QuickCheck: https://github.com/pholser/junit-quickcheck
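To illustrate the style these tools share, here's a minimal sketch using Hypothesis (`pip install hypothesis`; the encode/decode pair is a made-up example of a property worth checking, not anything from the linked docs):

from hypothesis import given, strategies as st

# Made-up functions under test: any pair that should round-trip works here.
def encode(s: str) -> bytes:
    return s.encode("utf-8")

def decode(b: bytes) -> str:
    return b.decode("utf-8")

@given(st.text())  # Hypothesis generates many arbitrary strings, including nasty ones
def test_roundtrip(s):
    assert decode(encode(s)) == s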
Fuzz testing tools (tools which mutate the inputs to a program in order to find interesting / failing states in that program). Generally paired with code coverage; a minimal Python sketch follows this list:
* American Fuzzy Lop (AFL): https://github.com/google/AFL
* JQF: https://github.com/rohanpadhye/JQF
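For a feel of the coverage-guided approach in Python terms (my own sketch; AFL and JQF play the same role for native binaries and the JVM), a minimal harness using Google's atheris (`pip install atheris`):

import sys
import atheris

@atheris.instrument_func  # gives the fuzzer coverage feedback for this function
def parse_age(data: bytes) -> int:
    # Toy target with a bug: an input like "-5" passes the digit check
    # but violates the assertion, which the fuzzer will find quickly.
    text = data.decode("utf-8", errors="ignore").strip()
    age = int(text) if text.lstrip("-").isdigit() else 0
    assert age >= 0, "negative age slipped through validation"
    return age

atheris.Setup(sys.argv, parse_age)
atheris.Fuzz()  # mutates inputs until it finds a crashing one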
Mutation / fault-based test tools (tools which review your existing unit coverage by making small changes to your _production_ code and reporting any mutants that none of your tests catch):
* PITest: https://pitest.org/
# sample call: https://en.wikipedia.org/w/api.php?action=query&format=json&list=geosearch&gscoord=37.7891838%7C-122.4033522&gsradius=10000&gslimit=100
Then I defined a variable, base_url = "https://en.wikipedia.org/w/api.php?"
Then, like magic, Copilot suggested all the remaining keys that would go in the query params. It even knew which params were to be kept as-is, and which ones would come from my previous code:
action = "query" # action=query
format = "json" # or xml
lat = str(latitude.value) # 37.7891838
lon = str(longitude.value) # -122.4033522
gscoord = lat + "%7C" + lon
...
api_path = base_url + "action=" + action + "&format=" + format + ... + "&gscoord=" + gscoord
As a guy who gets easily distracted while programming, Copilot saves me a lot of time and keeps me engaged with my work. I can only imagine what it'll look like 10 years from now.
> Copilot regurgitating Quake code, including sweary comments (twitter.com/mitsuhiko)
I added an issue. https://github.com/github/feedback/discussions/6847
Anyone else install it in neovim?
https://plugins.jetbrains.com/plugin/17718-github-copilot/re...
I filed a PR because it was a bit frustrating to go through the entire setup and then find out I needed to be granted access.
https://plugins.jetbrains.com/plugin/17718-github-copilot/ve...
EDIT: Hmm, it installed but it refuses to run.
EDIT2: Looks like you can force the plugin to work by editing the plugin.xml contained in github-copilot-intellij-1.0.1.jar within the plugin archive. Just remove the line that includes Rider as incompatible. The same should work for CLion.
api_path = base_url + urllib.parse.urlencode({
'action': action,
'format': letThisBeVariable,
...
'gscoord': str(latitude.value) + '|' + str(longitude.value)
})
see: https://docs.python.org/3/library/urllib.parse.html#urllib.p...
Mantra: when inserting data into a context (like a URL), escape the data for that context.
https://docs.python-requests.org/en/latest/user/quickstart/#...
When I was playing around with the Fastai language-modeling courses a couple of years ago, I used the Python tokenize module to feed my model, and with excellent parser libraries like Lark[0] out there it wouldn't take long to build real quality parsers.
Of course I could be totally wrong and they might just be dumping pure text in, shudder.
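For reference, getting syntax-aware tokens out of the stdlib is only a few lines (a sketch; how you'd feed the tokens to a model is elided):

import io
import tokenize

source = "def add(a, b):\n    return a + b\n"

# Yields TokenInfo tuples (type, string, start, end, line) for real Python
# syntax, instead of treating the source as a flat character stream.
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))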
This could have been prevented very simply: GitHub could have avoided training Copilot on GPL code.
What they can still do is offer a new model excluding GPL code for people who care about it.
[1] - https://www.gnu.org/licenses/gpl-faq.en.html#SourceCodeInDoc...
Looks like for VSCode the shortcut on Linux is Alt-], see: https://github.com/github/copilot-docs/blob/main/docs/visual...
But for neovim, it doesn't mention anything about it in the docs: https://github.com/github/copilot.vim/blob/release/doc/copil...
And nothing happens when pressing Alt-].
Although our own work shows Copilot is pretty good at adding security flaws on its own:
https://arxiv.org/abs/2007.02220
First, as much as I don't like the idea of Copilot, it seems to be good for boilerplate code. However, the fact that boilerplate code exists is not because of some natural limitation of code; it exists because our programming languages are subpar at making good abstractions.
Here's an example: in Go, there is a lot of `if err != nil { return err }` error-handling boilerplate. Rust decided to make a better abstraction and shortened it to the `?` operator.
(I could have gotten details wrong, but I think the point still stands.)
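To make the abstraction point concrete in Python (a rough analogy of my own, not the actual Go or Rust syntax; `read_file` and `parse` are made-up helpers, and exceptions play the role Rust's `?` plays):

import json

# Made-up helpers that report failure through return values, Go-style.
def read_file(path):
    try:
        with open(path) as f:
            return f.read(), None
    except OSError as e:
        return None, e

def parse(text):
    try:
        return json.loads(text), None
    except ValueError as e:
        return None, e

# Boilerplate style: every call site repeats the same check,
# roughly the shape of Go's `if err != nil { return err }`.
def load_config(path):
    text, err = read_file(path)
    if err is not None:
        return None, err
    cfg, err = parse(text)
    if err is not None:
        return None, err
    return cfg, None

# With a propagating abstraction (exceptions here, `?` in Rust),
# only the happy path is left to write:
def load_config_terse(path):
    with open(path) as f:
        return json.load(f)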
So I think a better way to solve the problem that Copilot solves is with better programming languages that help us have better abstractions.
Second, I personally think the legal justifications for Copilot are dubious at best and downright deceptive at worst, to say nothing of the ramifications. I wrote a whitepaper about the ramifications that also refutes the justifications. [1]
(Note: the whitepaper was written quickly, to hit a deadline, so it's not the best. Intro blog post at [2].)
I'm also working on licenses to clarify the legal arguments against Copilot. [3]
I also hope that one of them [4] is a better license than the AGPL, without the virality and applicable to more cases.
Edit: Do NOT use any of those licenses yet! I have not had a lawyer check and fix them. I plan to do so soon.
[1]: https://gavinhoward.com/uploads/copilot.pdf
[2]: https://gavinhoward.com/2021/10/my-whitepaper-about-github-c...
>>> import requests, pprint
>>>
>>>
>>> url = "https://en.wikipedia.org/w/api.php"
>>> resp = requests.get(
... url,
... params=dict(
... action="query",
... list="geosearch",
... format="json",
... gsradius=10000,
... gscoord=f"{latitude.value}|{longitude.value}"
... )
... )
>>>
>>> pprint.pprint(resp.json())
{'batchcomplete': '',
'query': {'geosearch': [{'dist': 26.2,
'lat': 37.7868194444444,
'lon': -122.399905555556,
'ns': 0,
...
I typed the following prompt:
def search_wikipedia(lat, lon):
"""
use "requests" to do a geosearch on Wikipedia and pretty-print the resulting JSON
"""
And it completed it with:
r = requests.get('https://en.wikipedia.org/w/api.php?action=query&list=geosearch&gsradius=10000&gscoord={0}|{1}&gslimit=20&format=json'.format(lat, lon))
pprint.pprint(r.json())
It was a pretty thorough study:
> Our study is the largest to date in the literature: we generated 31,000 test suites for five systems consisting of up to 724,000 lines of source code. We measured the statement coverage, decision coverage, and modified condition coverage of these suites and used mutation testing to evaluate their fault detection effectiveness. We found that there is a low to moderate correlation between coverage and effectiveness when the number of test cases in the suite is controlled for.
Given their data, their conclusion seems pretty plausible:
> Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness.
That's certainly how I approach testing: I value having a thorough test suite, but I do not treat coverage as a target or use it as a requirement for other people working on the same project.
[1]: https://neverworkintheory.org/2021/09/24/coverage-is-not-str...
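A toy illustration of the asymmetry (my own, not from the study): the first test reaches 100% line coverage of `apply_discount` but asserts nothing, so a mutant that flips the arithmetic survives; only the second test kills it.

def apply_discount(price, rate):
    # A mutation tool might flip `-` to `+` here.
    return price - price * rate

def test_covers_everything_checks_nothing():
    apply_discount(100.0, 0.5)  # executes every line: 100% coverage, zero assertions

def test_actually_checks_the_result():
    assert apply_discount(100.0, 0.5) == 50.0  # this one kills the mutant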
I think that could be crucial.
If I read a computer science book, and from that produce a unique piece of code which was not present in the book, I have created a new work which I hold copyright over.
If I train a machine learning algorithm on a computer science book, and that ML algorithm produces some output, that output does not have a new copyright.
In essence, there must be originality for a work to gain a new copyright, and originality likely requires a human author. See this Wikipedia page: https://en.wikipedia.org/wiki/Threshold_of_originality#Mecha...
Similarly, if copilot synthesizes a bunch of MIT code and produces a suggestion, that may be MIT still, while if a human does the exact same reading and writing, if it is an original enough derivative, it may be free of the original MIT license.
Where did it come from then? And what license did the original have?
> and is in hundreds of repositories - many with permissive licenses like WTFPL and many including the same comments.
If the original was GPL or proprietary, then all of these copies with different licenses are violating the license of the original. Just because it exists everywhere does not mean Copilot can use it without violating the original license.
> It's not really a large amount of material, either.
No, but I would argue that it is enough for copyright because it is original.
> GitHub claims they haven't found any "recitations" that appeared fewer than 10 times in the training data.
Key word is "claim". We can test that claim. Or rather, you can: if you have access to Copilot, try the test I suggested at https://news.ycombinator.com/item?id=28018816 . Let me know the result. Even better, try it with:
// Computes the index of them item.
map_index(
because what's in that function is definitely copyrightable.
> With the exceptions mentioned above, what you get back from asking for more code won't just be more and more of a particular work. Realistically I think you'd be able to get significantly more from Google Books.
That can only be tested with time. Or with the test I gave above.
I think that with time, more and more examples will appear until it is clear that Copilot is a problem.
Nevertheless, courts in the US and UK recently ruled that an AI cannot be an inventor (South Africa notably went the other way). If an AI cannot be an inventor, why can it hold copyright? And if it can't hold copyright, I argue its output is infringing.
Again, only time will tell which of us is correct according to the courts, but I intend to demonstrate to them that I am.
> The three parts take roughly the same portion of time, and when I'm writing tests
I want to push back on that bit, and I have some strong feelings about it. At my current dayjob, writing tests (if it was even done for all code) would easily take anywhere between 50% and 75% of the total development time.
I wish things were easy enough for writing test code not to be a total slog, but sadly there are too many factors in place:
- what should the test class be annotated with and which bits of the Spring context (Java) will get started with it
- i can't test the DB because the tests don't have a local one with 100% automated migrations, nor an in memory one because of the need to use Oracle, so i need to prevent it from ever being called
- that said, the logic that i need to test involves at least 5 to 10 different service calls, which then use another 5 to 20 DB mappers (myBatis) and possibly dozens of different DB calls
- and when i finally figure out what i want to test, the logic for mocking will definitely fail the first time due to Mockito idiosyncrasies
- after that's been resolved, i'll probably need to stub out a whole bunch of fake DB calls, that will return deeply nested data structures
- of course, i still need all of this to make sense, since the DB is full of EAV and OTLT patterns (https://tonyandrews.blogspot.com/2004/10/otlt-and-eav-two-big-design-mistakes.html) as opposed to proper foreign keys (instead you end up with something like target_table and target_table_row_id, except named way worse and not containing a table name but some enum that's stored in the app, so you can't just figure out how everything works without looking through both)
- and once i've finally mocked all of the service calls, DB calls and data initialization, there's also validation logic that does its own service calls which may or may not be the same, thus doubling the work
- of course, the validators are initialized based on reflection and target types, such as EntityValidator being injected, however actually being one of ~100 supported subclasses, which may or may not be the ones you expect due to years of cruft, you can't just do ctrl+click to open the definition, since that opens the superclass not the subclass
- and once all of that works, you have to hope that the 95% of the test code that vaguely corresponds to what the application would actually be doing won't fail at any number of points, just so you can do one assertion
I'm not quite sure how things can get that bad or how people can architect systems to be coupled like that in the first place, but at the conclusion of my quasi-rant i'd like to suggest that many of the systems out there definitely aren't easily testable, or testable at all. That said, it's nice that at least your workflow works out like that!
Sadly it's easier said than done, since it's not an easy thing to fix for an existing system. We've spent quite some time improving things to ease the pain of writing tests; it was getting better, but it will never reach the level it could have if we had been aware of this problem in the first place - there are tens of thousands of tests and we cannot rewrite them all.
I'm not too familiar with your tech stack. But there are two things you mentioned that are especially tricky to handle for testing: DB and service calls.
For the DB, there are typically two ways to handle it: use a real DB, or mock it.
A real DB makes people more confident, and you don't need to mock too many things. The problem is that it can be slow and not parallelizable, or worse, like in your case, there's no idempotent environment at all. We had automated migrations, but the tests ran against the SQL Server on the same machine, so they were not parallelizable and took more than a day to run on a single machine. On CI there are tens of machines but it still takes hours to finish. In the end, we generalized things a little bit and used SQLite for testing in a parallel manner. (Many people advise against this because it's different from production, but the tradeoff really saved us.) A more ideal approach is to have SQL sandboxing like Ecto (written in Elixir). Another ideal approach is an in-memory lib that is close to the DB; for example, the ORM Entity Framework has an in-memory implementation, which is extremely handy because it's written in C# itself.
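In Python terms the SQLite trick is nearly free (a generic sketch, not our actual .NET stack): each test opens its own throwaway in-memory database, so tests parallelize trivially.

import sqlite3

def make_test_db():
    # A private, throwaway in-memory database per test: fast, isolated,
    # and safe to run in parallel.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    return conn

def test_insert_and_read_back():
    conn = make_test_db()
    conn.execute("INSERT INTO users (email) VALUES (?)", ("a@example.com",))
    assert conn.execute("SELECT email FROM users").fetchone() == ("a@example.com",)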
If there's no way to leverage a real DB, you have to mock it. One thing that might help is to use the Inversion of Control pattern for DB access; there are many doctrines - DDD repositories, Hexagonal, Clean Architecture - but they're essentially similar on this point. This way you'll have a clean layer to mock, and you can hide patterns like EAV under those modules (a minimal sketch follows). As you lean on those modules enough, they will evolve, and helpers will emerge that simplify the mocking process. Given your description, the best bet I would say is to evolve in this direction if there's no hope of using real DBs, since you can tuck as much domain logic as possible into the "core" without touching any of the infrastructure, so that the infrastructure tests can stay simple and generic.
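A minimal Python sketch of that layering (the names are my own illustration): domain code depends on a narrow repository interface, and tests swap in a hand-rolled in-memory fake instead of mocking DB machinery.

from typing import Optional, Protocol

class UserRepository(Protocol):
    # The clean seam: domain code sees this, never SQL, mappers, or EAV tables.
    def find_email(self, user_id: int) -> Optional[str]: ...

class InMemoryUserRepository:
    # Test double: no database and no mocking framework needed.
    def __init__(self, emails: dict):
        self._emails = emails

    def find_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)

def notification_address(repo: UserRepository, user_id: int) -> str:
    # Domain logic under test, oblivious to how storage actually works.
    email = repo.find_email(user_id)
    return email if email is not None else "unknown@example.com"

def test_falls_back_when_user_missing():
    repo = InMemoryUserRepository({1: "a@example.com"})
    assert notification_address(repo, 2) == "unknown@example.com"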
For service calls, the obvious thing is to mock those calls. The not-so-obvious thing is to have well-defined service boundaries in the first place. I cannot stress this enough. When people fail to do this, they spend a lot of time mocking services while feeling they've tested nothing, because most things are mocked. Microservices got too much hype over the years, but very few people pay enough attention to how to define service boundaries. The ideal microservice should be mostly independent, only occasionally calling others. DDD strategic design is a great tool for designing good service boundaries (while DDD tactical design is yet another hype, just like how people care more about Jira than real Agile, making good things toxic). We were still struggling with this, because refactoring across microservices is substantially harder than refactoring code within a service, but we do try to avoid more mistakes by carefully designing bounded contexts across the system.
With that said, when the service boundaries are well-defined, and if you have things like SQL sandboxing, it's a breeze to test things because most of the data you're testing against is in the same service's DB, and there are very few service calls need to be mocked.
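And for the service-call side, a minimal sketch with the stdlib (`BillingClient` and `checkout` are made up): inject the client at the boundary and stub exactly one well-defined call.

from unittest import mock

class BillingClient:
    # Made-up thin wrapper around a remote billing service.
    def charge(self, order_id: int) -> str:
        raise NotImplementedError("real HTTP call in production")

def checkout(client: BillingClient, order_id: int) -> str:
    # One well-defined remote touchpoint, injected rather than hard-coded.
    return "charged:" + client.charge(order_id)

def test_checkout_charges_once():
    client = mock.create_autospec(BillingClient, instance=True)
    client.charge.return_value = "r-42"
    assert checkout(client, 7) == "charged:r-42"
    client.charge.assert_called_once_with(7)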
No, it's on you to not assume you know everything about my thought process before I show you otherwise.
Could I have communicated better? Yes. But I didn't assume you knew everything about my thought process; I just thought spelling it out wasn't necessary until you assumed that you knew my argument better than I did.
> You seem to coming at this as if the law is a purely mechanistic thing that can quickly resolve disputes, overlooking how these things play out in the real world, like Oracle v google going on for a decade or the even longer litigation involving SCO and IBM.
Once again, you are assuming. Yes, I know law is not mechanistic. Yes, I know going to court would take a long time.
Going to court is not the only thing I am doing. I also created new licenses, which I would not have if I only cared about what happened in court.
Going to court would be to attempt to argue for and enforce my viewpoint (indirectly). It would be a last-ditch attempt.
The first thing I am doing is creating new licenses specifically meant to "poison the well" for machine learning on code in general and Copilot in particular. [1]
With those licenses, I hope to make companies nervous about using Copilot on anything that might include code under my licenses. This hesitation may only apply to such code, but the FAQs for those licenses ([2] is an example) are also designed to make lawyers nervous about the GPL and other licenses.
If I succeed in making the hesitation big enough, then Copilot as a paid service would be dead, and hopefully enough companies will prohibit the use of Copilot, as is already being done. [3]
Going to court, then, would only happen if I found someone infringing.
This will be especially helped by the fact that the vast majority of the code under those licenses will be in a language I'm building right now. If there's open source code in the language, then I can search that code for infringements caused by Copilot.
> I mean, what makes you so sure the court is going to give you a quick judgment on the infringement, or that it's going to agree with you about the size of code fragment that that is sufficient to infringe?
Do you think I would be stupid enough to pick an example to bring before court that would not be obviously infringing?
Winning in court is not just about being right, it's also about picking your battles, and I would be very choosy.
> Surely you can can agree that sufficiently small code fragments won't meet this threshold because they're too basic or obvious.
Yes, and as I said above, I won't use any of those.
> Because your whole argument here rests upon that assumption, it comes off as a wish fulfillment scenario where Copilot disappears because nobody likes the risk calculus;
You realize that this is the entire basis for the cybersecurity industry? The entire point is to make it economically infeasible for bad guys to do bad things in cyberspace; it's to skew the "risk[/reward] calculus" in favor of the good guys so much that the bad guys just stop operating.
Making the risk calculus riskier for your opponent is how wars and legal cases are fought too, but such tactics are not confined to the warroom or courtroom. That's why my opening salvo is licenses to sow doubt, to change the perception of the risk calculus. Battles like this are won by "winning minds," which in this case means convincing enough people to be nervous about it.
> your stated goal of 'making Copilot a dead product' seems more emotional than rational.
This is something where you are partially right. There is a lot of emotion behind it, not because I'm an emotional person (I'm actually on the spectrum and less emotional than the average person), but because I objectively considered the ramifications of what GitHub is doing with Copilot, realized how bad those ramifications were, and that lit a fire under me.
I wrote about the ramifications and refuted the dubious legal justifications in a whitepaper [4] for the FSF call for papers [5]. (Intro blog post at [6].)
But if you will read through the paper, you will find that there is rationality in my thoughts. I just happen to think this is a fight worth taking. Thus, the emotion.
> In reality it will take you a long time to get a result, and if enough people find Copilot useful (which I suspect they will), legal departments will adapt to that risk calculus and just figure out the cost of blowing or buying you off in the event that their developers carelessly infringe.
"Buying me off" would include checking that Copilot didn't output my code, and if it did, to follow the license. I'm not sure they would like the added work to use something that is supposed to save work on the easiest part of programming. But even if they did, I would be satisfied.
And that points to another part of my "thought process": the reason that I think I've got a chance is because I think the "reward" side of the risk/reward calculus is not very high with Copilot because it is the easiest part of programming.
Almost everything in programming is harder than writing boilerplate, and as I said in another comment [7], I think there are still better ways of reducing boilerplate. In fact, the language I am working on is designed to help with that. So my perception, which I acknowledge could be wrong, is that the reward for using Copilot is not high, which means I may not have to raise the risk level much for people to change their minds about it.
But the most important point would be to make legal departments and courts recognize that copyright still has teeth, or rather, argue well enough to convince people of that fact, despite what GitHub is saying.
> If it sufficiently improves industrial productivity, it will become established while you're trying to litigate and afterwards people will just avoid crossing the threshold of infringement.
This would be a win in my book too. I am going to be the first person to write boilerplate code in my language, which means that anyone who writes in this language will be "copying" me. I don't care about the boilerplate, though; they can copy that as much as they want.
> Honestly, this exchange makes me glad that I don't publish software and thus don't care about license conditions on a day to day basis.
I feel you on that. The only reason I do is because I feel like my future customers deserve the blueprints to the software they are using the same way the buyers of a building deserve to get the building's blueprints from the architect. If I did not have that opinion, I would probably not publish either.
[1]: https://gavinhoward.com/2021/07/poisoning-github-copilot-and...
[2]: https://yzena.com/yzena-network-license/#frequently-asked-qu...
[3]: https://news.ycombinator.com/item?id=27714418
[4]: https://gavinhoward.com/uploads/copilot.pdf
[5]: https://news.ycombinator.com/item?id=27998109
[6]: https://gavinhoward.com/2021/10/my-whitepaper-about-github-c...
[7]: https://news.ycombinator.com/item?id=29019777
Edit: Clarification and fix typo.
Yep: https://github.com/search?p=1&q=evil+floating+point+bit+leve...
> Quite sure the FSF would be perfectly fine with that.
I believe the person republishing GCC code under MIT would be liable.
Also, I'm not recommending that you use code you know has been incorrectly licensed - just that in cases where certain "folk code" is seemingly widely available under permissive terms, Copilot isn't doing much that an honest human wouldn't.
A better example against Copilot would be trying to get it to regurgitate some code that has a simple known origin and is always under a non-permissive license.
TL;DR: the subtlety of its wrongness destroys my ability to follow my train of thought. I always had to step out of my train of thought and evaluate the correctness of the suggestion instead of just writing.
For simple things, intellisense, autocomplete and snippets are far more effective.
For anything more complex, I already know what I want to write.
For exploratory stuff I RTFM
Copilot was ineffective at every level for me.