zlacker

[parent] [thread] 16 comments
1. Waterl+(OP)[view] [source] 2022-10-16 22:30:27
I think people may be drastically over-valuing their code. If it was emitting an entire meaningful product, that would be something else. But it’s emitting nuts and bolts.

If the issue is more specifically copyright infringement, then leverage the legal apparatus in place for that. Their lawyers might listen better.

This is not a strongly held opinion and if you disagree I would love to hear your constructive thoughts!

replies(10): >>jacoop+h1 >>chiefa+52 >>heavys+C5 >>jimlon+J8 >>Havoc+ib >>summer+Sb >>ironma+Cf >>kitsun+gS >>blueha+331 >>jeppes+S61
2. jacoop+h1[view] [source] 2022-10-16 22:40:20
>>Waterl+(OP)
I mean it starts like this, but if Copilot gets a pass, companies might just use AI as a way to launder code and avoid complying with Free licenses.
replies(2): >>Shamel+p9 >>benhur+BN
3. chiefa+52[view] [source] 2022-10-16 22:47:41
>>Waterl+(OP)
To some extent I agree with your opening. That is, in plenty of cases CP is showing how mundane most code is. It's one commodity stitched to another stitched to another.

That's not considering any legal / license issues, just a simple statement about the data used to train CP.

4. heavys+C5[view] [source] 2022-10-16 23:18:57
>>Waterl+(OP)
If they're that trivial and valueless, Microsoft should have no problem coming up with their own training sets instead of stealing them en masse from the public.

If I create something, I get to define the terms of its use, reproduction, distribution, etc. "Value" plays no part in whether someone can appropriate and distribute that creation without permission from the creator.

5. jimlon+J8[view] [source] 2022-10-16 23:49:17
>>Waterl+(OP)
>If it was emitting an entire meaningful product, that would be something else. But it’s emitting nuts and bolts.

Take a single C file or even a long function from the leaked Windows NT codebase and include it in your code. See how happy Microsoft will be with it. They spent millions of dollars on their legal teams. Eroding copyright protections will harm the weakest most. How many open source contributors can afford copyright lawyers?

◧◩
6. Shamel+p9[view] [source] [discussion] 2022-10-16 23:55:57
>>jacoop+h1
Having worked for a couple of big companies with IT departments, I can tell you they are effectively all breaking the law already in this regard (except for maybe hardware companies), because it's basically impossible to enforce and no one cares.

The best way to make sure your code isn’t copied is not to publish it.

replies(1): >>drran+RU
7. Havoc+ib[view] [source] 2022-10-17 00:09:29
>>Waterl+(OP)
Copyright makes no such distinction
8. summer+Sb[view] [source] 2022-10-17 00:14:02
>>Waterl+(OP)
> I think people may be drastically over-valuing their code. If it was emitting an entire meaningful product, that would be something else. But it’s emitting nuts and bolts.

Please refrain from this kind of blatant gaslighting. You're not the one to assess its value or usefulness, and your point is at most tangential to the issue. The problem is that the model systematically took non-public-domain code without any permission from the authors, not whether it's useful or not. This complaint is worth hearing, and the Copilot team should be more accountable for this problem, since it could lead to more serious copyright infringement fights for its users.

9. ironma+Cf[view] [source] 2022-10-17 00:54:39
>>Waterl+(OP)
The OP was a professor of mine, and his library represents the product of thousands of hours of research. Probably every line in there is extremely valuable.
◧◩
10. benhur+BN[view] [source] [discussion] 2022-10-17 07:50:09
>>jacoop+h1
But then those companies can get in legal trouble, not GitHub.
11. kitsun+gS[view] [source] 2022-10-17 08:39:30
>>Waterl+(OP)
This is about standards. Laws for thee and not for me? It's just particularly hypocritical that the same companies that will sue anyone for violating their copyright have no issue violating copyright themselves.
◧◩◪
12. drran+RU[view] [source] [discussion] 2022-10-17 09:06:43
>>Shamel+p9
Can you name one of these big, rich, and careless companies, please?
replies(3): >>jefftk+G51 >>viridi+Ox2 >>Shamel+U87
13. blueha+331[view] [source] 2022-10-17 10:40:24
>>Waterl+(OP)
I suppose on the one hand you are right, people may well over-value their code. However, the argument isn't really about the value or any monetary damage done through this. It's about a violation of ownership and trust.

Right or wrong, copyright doesn't care about how valuable something is. Everything is equally (not in reality but in theory) protected. GitHub is a platform many people have trusted with protecting ownership of their copyrighted code through reasonable levels of security.

I think the big discussion point here is around ensuring that this tool is acting correctly and respecting the rights of the individual. It's very easy for a large company to accidentally step on people and not realise it, or brush it away. People want to make sure that isn't happening, and right now there are some very compelling examples where it looks like it is. The fact that this isn't opt-in and there's no way to opt out your public repositories means the choice has been taken away from people. Previously you were free to license your code as you saw fit; now we have some examples where that license may not be respected as a result of an expensive GitHub feature.

I think this is where the conversation is centring. It's not about whether your code is valuable or not. It's whether a large company is making a profit by stepping on an individual's right of ownership or not.

On the note of leveraging legal apparatus to figure it out, I think you're right. The problem is: what individual open source maintainer is going to have the funds to bring a reasonably equal legal challenge to such a large organisation? I maintain a relatively well used open source project and I sure as hell don't. Realistically my option is to either spend a lot of personal time and resources to challenge it (if I think wrong-doing is happening) or just suck it up. Given that there's no easy way to figure out if wrong-doing is happening because it's all in the AI soup, it makes it even harder to consider that approach.

I think the point is a lot less about the value of the code, and much more about a massive organisation playing fast and loose with an individual's rights.

None of this is to say GitHub have actually done anything wrong here. I'm sure we'll figure that out in time, but it would be great if they could figure out a way to provide more concrete explanations.

◧◩◪◨
14. jefftk+G51[view] [source] [discussion] 2022-10-17 11:03:26
>>drran+RU
I'll name a counterexample: Google (used to work there) is very careful with the provenance of external code, to the point that for simpler things it's often easier to write something internally than use the standard external thing.
15. jeppes+S61[view] [source] 2022-10-17 11:13:42
>>Waterl+(OP)
GitHub Copilot is a paid product.

It doesn't matter whether I think my code is valuable; the point is that GitHub is using everyone's code for its own profit - without opt-in, attribution, or paying for a license.

◧◩◪◨
16. viridi+Ox2[view] [source] [discussion] 2022-10-17 18:12:31
>>drran+RU
I can, roughly. One of the big international US-based financial institutions. Zero real concern for any licensing associated with software, across multiple teams I'd worked on in multiple lines of business. You find a library that works, you use it. Present in systems that touch dollars in the trillions per week.

I always found this weird while I was working at this company, but then, they have no reason to care about ephemeral threats that have never been brought to bear in a meaningful way. No consequences = no reason to spend literal billions retooling the entire tech side of your company over a decade.

◧◩◪◨
17. Shamel+U87[view] [source] [discussion] 2022-10-18 22:59:00
>>drran+RU
Uh, vaguely? [Someone who isn't me] is aware of this happening at an American retailer.

It basically happens like this:

"Oh this code solves our problems and has a nice community around it for network effects!"

**developers proceed to adopt codebase without checking the license**

**months later**

"Oh, huh this license has some interesting language in it..."

Then the employee doesn't mention it, because the risk of having to re-do a bunch of work feels higher than the risk of getting in trouble for violating a license. Basically, unless it's Oracle, people just kinda shrug it off as a "wontfix".

My whole thing is that any system depending on people to read and follow a license is quite flawed in terms of enforcement, and is largely designed specifically so that powerful incumbents can make claims, not individual developers.

Laws have to be enforced or people will ignore them. If there's no practical way to enforce a law that doesn't involve violating freedoms - you're kinda fucked.
