zlacker

[parent] [thread] 11 comments
1. esskay+(OP)[view] [source] 2022-10-17 08:44:25
Hi Ryan, thanks for posting here.

Something similar to what happened to the OP happened to me a couple of days ago. I'm on friendly terms with the developer of a competing codebase and have confirmed the following with them. Both my codebase and theirs are closed source and hosted on GitHub.

Halfway through building something, I was given a block of code by Copilot that contained a copyright line with my competitor's name, company number and email address.

Those details have never, ever been published in a public repository.

How did that happen?

replies(2): >>elcome+84 >>omgomg+Gb2
2. elcome+84[view] [source] 2022-10-17 09:26:23
>>esskay+(OP)
> Those details have never, ever been published in a public repository.

The simplest answer would be that this is false: it was published somewhere, but you are not aware of it.

replies(3): >>grecy+O7 >>trucul+S7 >>eloisi+Dc
◧◩
3. grecy+O7[view] [source] [discussion] 2022-10-17 10:14:06
>>elcome+84
An equally simple answer is that Copilot is pulling (or at least analyzing) code from repositories that are not public.
replies(1): >>elcome+dv
◧◩
4. trucul+S7[view] [source] [discussion] 2022-10-17 10:14:39
>>elcome+84
Is it possible to verify with GitHub code search (cs.github.com)?
◧◩
5. eloisi+Dc[view] [source] [discussion] 2022-10-17 11:00:31
>>elcome+84
IMO that doesn’t absolve Microsoft at all. If someone uploads ripped MP3s to the internet somewhere, it doesn’t mean you could aggregate them, burn CDs and sell them.
◧◩◪
6. elcome+dv[view] [source] [discussion] 2022-10-17 13:19:34
>>grecy+O7
I think that's very unlikely. They have said repeatedly that they are not using private code. Being caught lying about this would be very bad for GitHub.
replies(3): >>inkedd+6y >>andrep+jw1 >>grecy+2s2
◧◩◪◨
7. inkedd+6y[view] [source] [discussion] 2022-10-17 13:35:14
>>elcome+dv
Yet here we are.
◧◩◪◨
8. andrep+jw1[view] [source] [discussion] 2022-10-17 17:33:23
>>elcome+dv
This is some highly impressive logic right here.

Proposition: "They don't use private code".

Proof: "They said they don't use private code. Either the private code appearing is published somewhere else, or they are using private code. Lying would be bad. Therefore the code is published somewhere else, and they don't use private code".

replies(1): >>afiori+3C3
9. omgomg+Gb2[view] [source] 2022-10-17 21:09:36
>>esskay+(OP)
Well, they have been published now.

If this can leak so easily, it makes me wonder how safe API keys are. They are supposed to be hidden away, we know, but so is proprietary code.

◧◩◪◨
10. grecy+2s2[view] [source] [discussion] 2022-10-17 22:51:54
>>elcome+dv
Bugs and unexpected behaviour catch us all.

I’m not saying they’re intentionally lying, just that one possible explanation is that it is looking through non-public repositories.

replies(1): >>elcome+FD3
◧◩◪◨⬒
11. afiori+3C3[view] [source] [discussion] 2022-10-18 09:58:22
>>andrep+jw1
I would say that the logic is more like:

Proposition: "They either do not use private code or they did something very very stupid."

Proof: "Not using private code is very easy (for example, Google does not train its models on Workspace users' data, which is why they get inferior features), and they promised multiple times not to use private code, so doing so would be hard to justify."

◧◩◪◨⬒
12. elcome+FD3[view] [source] [discussion] 2022-10-18 10:14:38
>>grecy+2s2
They would definitely notice such a bug. This would at least double or triple the amount of data they use. This is not something you can do by mistake.