zlacker

GitHub accused of varying Copilot output to avoid copyright allegations

submitted by belter+(OP) on 2023-06-10 13:46:56 | 124 points 101 comments

NOTE: showing posts with links only
1. belter+i7 2023-06-10 14:40:46
>>belter+(OP)
https://storage.courtlistener.com/recap/gov.uscourts.cand.40...
3. rolph+Rc 2023-06-10 15:14:36
>>belter+(OP)
[The judge overseeing the case has permitted the plaintiffs to remain anonymous in court filings because of credible threats of violence [PDF] directed at their attorney. The Register understands that the plaintiffs are known to the defendants.]

https://storage.courtlistener.com/recap/gov.uscourts.cand.40...

◧◩
14. rolph+Qh 2023-06-10 15:46:58
>>taneq+he
proper attribution to the writer seems to be a big part of this. there is also a suggestion that ms knows all about it but passes the liability buck to the end user of copilot suggestions.

[Lawyer and developer Matthew Butterick announced last month that he'd teamed up with the Joseph Saveri Law Firm to investigate Copilot. They wanted to know if and how the software infringed upon the legal rights of coders by scraping and emitting their work without proper attribution under current open-source licenses.]

https://www.theregister.com/2022/11/07/in_brief_ai/

https://www.theregister.com/2022/10/19/github_copilot_copyri...

◧◩◪
25. arp242+1o 2023-06-10 16:28:04
>>former+Sf
Some more here:

https://storage.courtlistener.com/recap/gov.uscourts.cand.40...

https://storage.courtlistener.com/recap/gov.uscourts.cand.40...

https://storage.courtlistener.com/recap/gov.uscourts.cand.40...

Friendly people.

I've received emails like that too over the years. What hugely controversial thing do I do? I have a website where I sometimes write about $stuff and I post on HN. Keeping the basic info private is probably a good thing especially if they're based in the US, because "SWATting" etc, but beyond that it doesn't seem "credible" in the sense that it's very likely someone will show up at their door with a gun.

Since the first two are redacted, I wonder if they sent them with their real names.

◧◩◪◨⬒
30. willia+0p 2023-06-10 16:34:04
>>l__l+4n
This might help shed some light:

https://en.wikipedia.org/wiki/Idea%E2%80%93expression_distin...

◧◩◪
59. seanhu+vJ 2023-06-10 18:17:34
>>Brian_+Ud
Generally the reason behind adding randomness to machine learning is avoiding "local minima" in the search space of the optimization function(s) used for training the model. If your training produces a very smooth descent to an optimum it can lead to the model converging on a solution that is not globally the best. Adding some randomness helps to avoid this.

Specifically for GPT models, the temperature parameter is used to get outputs which are a bit more "creative" and less deterministic. https://help.promptitude.io/en/ai-providers/gpt-temperature
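
To make the temperature point concrete, here is a minimal sketch of temperature-scaled sampling. It is not how Copilot or any particular GPT implementation does it; the function names and the example logits are made up for illustration. The idea is just that logits are divided by the temperature before the softmax, so a low temperature concentrates probability on the top token and a high temperature flattens the distribution.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature.

    Hypothetical helper for illustration; not taken from any real model's code.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index at random according to the scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability sits on the first token, so repeated runs give nearly the same output; at a higher temperature the other tokens get picked more often, which is where the run-to-run variation comes from.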

◧◩◪◨⬒⬓⬔
63. moyix+ZL 2023-06-10 18:29:38
>>edgyqu+DC
I think you're misremembering here; as far as I know (and as far as I can tell from searching just now) MS has never sued ReactOS. There was a claim made back in 2006 on the mailing list that a portion of syscall.S was copied, and this caused ReactOS to do their own audit:

https://en.wikipedia.org/wiki/ReactOS#Internal_audit

80. Walter+S61 2023-06-10 20:22:50
>>belter+(OP)
One of the specific complaints is:

https://devclass.com/2022/10/17/github-copilot-under-fire-as...

It's a 25 or so line function that looks like a pedestrian implementation of a sparse matrix transpose algorithm. The author should have patented it to protect it, not copyrighted it.
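
For a sense of scale, here is a rough sketch of what a plain sparse matrix transpose can look like. This is not the function at issue in the complaint; it is a generic counting-sort style transpose over the compressed sparse row (CSR) layout, and the name `csr_transpose` and its argument layout are my own assumptions.

```python
def csr_transpose(n_rows, n_cols, indptr, indices, data):
    """Transpose a CSR matrix; returns (indptr, indices, data) of the result.

    Hypothetical illustration only. The input has n_rows rows and n_cols
    columns; the output describes the n_cols x n_rows transpose in CSR form.
    """
    nnz = len(data)

    # Count how many stored entries fall in each column of the input.
    t_indptr = [0] * (n_cols + 1)
    for c in indices:
        t_indptr[c + 1] += 1

    # Prefix sum turns the counts into row start offsets of the transpose.
    for c in range(n_cols):
        t_indptr[c + 1] += t_indptr[c]

    t_indices = [0] * nnz
    t_data = [0] * nnz
    next_slot = t_indptr[:-1]  # slicing copies, so t_indptr is not modified

    # Scatter each entry (r, c, value) into row c of the transpose.
    for r in range(n_rows):
        for k in range(indptr[r], indptr[r + 1]):
            c = indices[k]
            dest = next_slot[c]
            t_indices[dest] = r
            t_data[dest] = data[k]
            next_slot[c] += 1

    return t_indptr, t_indices, t_data


# The 2x3 matrix [[1, 0, 2], [0, 3, 0]] in CSR form.
t_indptr, t_indices, t_data = csr_transpose(2, 3, [0, 2, 3], [0, 2, 1], [1, 2, 3])
print(t_indptr, t_indices, t_data)
```

The whole thing is a count, a prefix sum, and a scatter loop, which is roughly the size of function being described.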

◧◩
90. popalc+gI1 2023-06-11 01:20:51
>>JVille+Ip
When it comes to copyright infringement vs fair use in US law there are certain requirements that have to be met, one of which is "transformativity".

This concept has a specific technical meaning:

https://www.nolo.com/legal-encyclopedia/fair-use-what-transf...

It seems obvious to me that to call model weights "lossy compression" is not only incorrect from a technical (software dev) point of view, but also from this legal perspective.

The weights serve a different purpose than the original works from which they are derived, and wouldn't/couldn't POSSIBLY exist were it not for the original work of the authors of the models.

It's bad practice to go around espousing strong and condemnatory opinions about topics you don't have a full grasp of. In this case, it's both the technical details and the legal system.

It makes you look like a fool and costs you your credibility amongst peers in future encounters.
