zlacker

11 comments
1. cmrdpo+(OP) 2023-06-10 15:44:42
Copilot is to license violations (especially of copyleft licenses) what cryptocurrency mixers are to money laundering.

My employer (IMHO smartly) forbids the use of LLMs on company IP, on company laptops, etc. Many others, I'm sure, are doing the same, and many more will follow.

replies(3): >>bushba+t5 >>theRea+57 >>fooste+Nc
2. bushba+t5 2023-06-10 16:21:18
>>cmrdpo+(OP)
Once the IP rules are figured out, it’ll open the door to a lot of use cases. This reminds me more of P2P file sharing being a precursor to paid streaming services.
3. theRea+57 2023-06-10 16:32:12
>>cmrdpo+(OP)
Nobody uses Copilot intentionally to violate copyright law. People do use crypto mixers intentionally to violate money laundering laws.
replies(2): >>SpicyL+5b >>cmrdpo+Hj6
◧◩
4. SpicyL+5b 2023-06-10 16:50:59
>>theRea+57
Nobody affirmatively says “yes, my goal is to violate copyright law, and Copilot is the best tool I’ve found”. But it doesn’t seem impossible to me that the value of Copilot comes partially from the fact that it can copy and paste code from copyrighted repositories in ways which would be illegal for you or me to do. I’m not sure that has been proven yet, but I wouldn’t be shocked if it is in the future.
replies(1): >>shagie+Dm
5. fooste+Nc 2023-06-10 17:00:24
>>cmrdpo+(OP)
Sorry your employer forbids the use of tooling that makes your life better and reduces drudgery. Perhaps you should vote with your feet and find a less Luddite employer.
replies(2): >>reaper+Te >>indror+Jn
◧◩
6. reaper+Te 2023-06-10 17:14:43
>>fooste+Nc
> Sorry your employer forbids the use of tooling that makes your life better and reduces drudgery. Perhaps you should vote with your feet and find a less Luddite employer.

Does your company allow you to outsource your work to people in a poorer nation for a fraction of the cost that you are paid? Why not? Perhaps you should vote with your feet and find a less Luddite employer.

replies(1): >>Dylan1+To
◧◩◪
7. shagie+Dm 2023-06-10 17:56:04
>>SpicyL+5b
It provides the same value as someone who copies and pastes code from Stack Overflow or any of the predecessors without concerning themselves with the license.

I am certain that I can find code from Linux or gcc or emacs on Stack Overflow that is under a GPL license and not compatible with the CC license Stack Overflow uses... and yet it's there. What's more, people will copy that code into their own, ignoring the CC license too.

How is that really any different from using Copilot, if the original license and attribution are something to respect?

Note that I do think the original license is something to respect, which is why, for any code I write whose copyright matters (toy program for home? meh. Hobby project repo that I'm working on and will publish? yep. Employer's code for work? absolutely.), I either don't touch questionable sources or run a license check on them before using them.

The key thing is that I don't consider the use of Copilot to be any more controversial than copying from Stack Overflow - which has been done by countless programmers for a decade before Copilot existed and no one got up in arms about it then.

replies(1): >>cmrdpo+ll6
◧◩
8. indror+Jn 2023-06-10 18:00:40
>>fooste+Nc
My company forbids the use of LLMs that aren't validated (and we make one).

Our managers get emails if we make calls to known LLMs, and there's guidance on locally running LLMs and using their output ("it's okay for small things maybe, but be careful"). Why?

Because legal's job is to protect the company from legal threats. Sometimes that means making some awkward choices, like hand-wringing over the use of GPL-licensed software in publicly exposed example code (such as sample apps), purely because some aspects of the GPL haven't been tested in American courts, much less international ones.

So the use cases for LLMs there are mostly source-to-source transformations ("Turn this function and documentation into Javadoc format please") or similar -- stuff where you can show that the LLM isn't introducing anything that might maybe possibly have any hint of externally licensed software.

replies(1): >>renewi+Xs
◧◩◪
9. Dylan1+To 2023-06-10 18:06:17
>>reaper+Te
If you have the skills for that, hell yes find an employer that will let you do it, either explicitly or implicitly.
◧◩◪
10. renewi+Xs 2023-06-10 18:21:49
>>indror+Jn
Wild. I suppose it's good that people who like these conditions can find employers like this and people like me who don't can find employers not like this.

I could never countenance operating under these conditions.

◧◩
11. cmrdpo+Hj6 2023-06-12 16:07:30
>>theRea+57
Copilot is a product -- at least indirectly -- of Microsoft, a company that for about a decade made very public pronouncements about how it disagreed with the GPL (or copyleft generally), found it problematic, and actively tried to discourage its use.

Today's MS isn't really the same, and they've clearly made their peace with Linux. But it still happens that the GPL is in some fundamental ways at odds with commercial exploitation of open source code. So any corporate entity is going to struggle with it, because at best it requires being very careful in distribution, or trying to negotiate or cut a deal with the licensor. At worst it can lead to legal problems and IP leakage on your own product.

So, I'm not claiming any conspiracy, or any intent to violate licenses. But it is in the convenient interests of companies like MS/OpenAI/GitHub to treat open source work as effectively public domain rather than under copyright, and to push the limits there.

The risk to an employer is of course the accidental introduction of such copylefted material into their codebase through Copilot or similar tools.

I suspect two sources of disconnect with the broader community on Hacker News, which doesn't seem to see the issue here:

a) Many of the folks on this forum are working in the full-stack/web space, where fundamentally novel, patented, or conceptually difficult algorithms and data structures are rare. For them Copilot is an absolute blessing in helping to reduce the tedium of boilerplate. However, in the world of embedded systems, operating systems, compilers, game engine dev, database internals, etc., there are other aspects at work. In certain contexts, Copilot has been shown to reproduce complicated or difficult code taken from copyrighted or copylefted (or maybe even patented) sources without attribution. And apparently now with some explicit obfuscation.

To put it another way: it's unlikely that Copilot's going to violate licenses when it helps you turn your value/model objects from one structure into another, or write a call into a SQL ORM. But it's quite possible that if I'm writing a DB join algorithm or some complicated math in a rendering engine or a compiler optimization phase, it could "crib notes" from a source under a restrictive license... because those things are absolutely in its training set and the LLM doesn't "know" about the licensing behind them.

b) Misunderstanding of, lack of knowledge of, or outright hostility to copylefted or attribution licenses, which require special handling.

◧◩◪◨
12. cmrdpo+ll6 2023-06-12 16:13:59
>>shagie+Dm
Browsing Stack Overflow, and even blindly copying and pasting from it, is an intentional action that follows from research by the user, and the source of the material pasted is known or discoverable.

Using Copilot is an automated process, and the source of the material used in training is deeply obfuscated inside the model.

That's why I make the analogy back to cryptocurrency mixers.
