I’m not sure what would be acceptable output for a code generation tool if rewriting the examples isn’t ok and reimplementing something that performs the same function still isn’t ok. Are we automatically granting de facto code patents on all published code now?
Why would it be? If a function performs the data transform I need, you'd better believe I'm copy-pasting that sucker with a hyperlink to where I found it.
But then again, I'm not trying to win in court.
[Lawyer and developer Matthew Butterick announced last month that he'd teamed up with the Joseph Saveri Law Firm to investigate Copilot. They wanted to know if and how the software infringed upon the legal rights of coders by scraping and emitting their work without proper attribution under current open-source licenses.]
https://www.theregister.com/2022/11/07/in_brief_ai/
https://www.theregister.com/2022/10/19/github_copilot_copyri...
I understand why these might feel different to you, but textbooks and stack overflow are also proprietary, licensed pieces of work. I don’t see why there would be much of a legal distinction.
There are two worlds.
In one, every time someone publishes code with a license attached, they've taken a chunk out of the set of valid lines of software that can be written without license encumbrance. This is the world the poster you are replying to imagines we're headed toward, and this case does a fantastic job of setting up a test case/precedent for it.
The other world is one where everyone accepts that all programming code is math, and that copyrighting it is like erecting artificial barriers to facilitate information asymmetry, i.e. trying to own 2 + 2. In this second hypothetical world, we summarily reject IP as a thing.
The 2nd world is what I'd rather live in, as the first truly feels more and more like hell to me. However, given the first one is the world we're in, I'd like to see the mental gymnastics employed to undermine Microsoft's original software philosophy.
EDIT: Voir dire will be a hoot. Any wagers on how many software people make it onto the jury if any?
Don't know how you would even write code in your own style. As soon as you start altering it, the result is different. It's more/less efficient.
- Code is not intellectual property; I don't see this as easily defensible. It takes time, effort, and in some cases seriously heavy resources to come up with some of the tech companies rely on. Should all private companies rescind copyright on literally everything their staff write?
- Intellectual property is a nonsense concept altogether; in this case, I don't think you're ever going to get your way in the court of public opinion.
Code that reverts to a conserved sequence of bytes, interchangeable, with no functional variations.
Code that is such common knowledge it has become street graffiti belongs in world 2,
versus code that creates a functionality not available by direct command, which is innovative and should be attributed. This sounds like what the 1st world should be.
Non-trivial elements include names, comments, logging, error checking, structure, and ordering of operations that aren’t sequential.
If this were true of copyright, we would’ve run out of permissible novels a long time ago. There’s plenty to complain about with how software IP works, but copyright seems pretty sane. The alternative of protecting IP via trade secret is not a world I want to live in. That seems bad for open source.
https://en.wikipedia.org/wiki/Idea%E2%80%93expression_distin...
There really are a lot of other scenarios that involve writing software to make software. It's not possible to list them all; the list changes while I type.
In simple terms:
mov ebx, eax ; an obvious function; no IP
mov eax, eax ; seems useless unless you know what de-referencing is. probably IP
This is of course an example that doesn't consider granularities at the level of patents on a language, or macro directives.
The central idea of programming languages is that the grammar is very restrictive compared to natural languages. It's quite likely that, with the exception of variable names and whitespace, some function you wrote to implement a circular buffer is coincidentally identical to code that exists in Sony's or Lockheed Martin's codebases.
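To make the "identical except for variable names and whitespace" point concrete, here is a rough sketch in Python using the standard `tokenize` module. The two circular-buffer snippets and the `normalize` helper are made up for illustration; this is not how any real infringement analysis works, just a way to see two independently named functions collapse to the same token stream.

```python
import io
import keyword
import tokenize

def normalize(source: str) -> list[str]:
    """Token list with identifiers renamed by order of first use.

    Comments and blank-line tokens are dropped, and indentation is reduced
    to INDENT/DEDENT markers so the indent width does not matter.
    """
    names = {}  # original identifier -> placeholder such as "id0"
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue
        if tok.type in (tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT):
            out.append(tokenize.tok_name[tok.type])
        elif tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append(names.setdefault(tok.string, f"id{len(names)}"))
        else:
            out.append(tok.string)
    return out

# Two hypothetical "advance the write index" helpers written by different people.
SRC_A = """
def push(buf, head, item):
    buf[head] = item
    return (head + 1) % len(buf)
"""

SRC_B = """
def enqueue(ring,   write_pos, value):
        ring[write_pos] = value   # store the new element
        return (write_pos + 1) % len(ring)
"""

# A version that actually behaves differently (steps backwards).
SRC_C = """
def push(buf, head, item):
    buf[head] = item
    return (head - 1) % len(buf)
"""

print(normalize(SRC_A) == normalize(SRC_B))  # True
print(normalize(SRC_A) == normalize(SRC_C))  # False
```

Once names and layout are stripped away, the first two are the same sequence of tokens, which is the kind of coincidence the comment above is describing.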
Plus there's the birthday problem -- coincidences can happen way more than you expect. And even with prose, constraints like non-fiction can narrow things down quickly. If everyone on HN had to write a three-sentence summary of, say, how a bicycle works, there would probably be coincidentally identical summaries.
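For anyone who wants the birthday-problem arithmetic, a minimal sketch follows. It assumes every writer picks uniformly and independently from `n` equally likely outputs, which is a big simplification (real code and prose are far from uniform, and that only makes collisions more likely). The function name is my own.

```python
import math

def collision_probability(k: int, n: int) -> float:
    """Chance that at least two of k independent uniform picks
    from n possibilities are identical."""
    if k > n:
        return 1.0  # pigeonhole: more picks than possibilities
    # P(all distinct) = product over i in [0, k) of (1 - i/n);
    # summing logs avoids underflow for large k.
    log_all_distinct = sum(math.log1p(-i / n) for i in range(k))
    return 1.0 - math.exp(log_all_distinct)

# Classic case: 23 people, 365 birthdays.
print(round(collision_probability(23, 365), 3))  # 0.507
```

The point is that the collision chance passes 50% once `k` is only on the order of the square root of `n`, so a space of plausible outputs has to be enormous before identical results become unlikely.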
Aside from obligatory syntactic bits, what is the most common line of code across all software ever developed?
It'll probably be C or Java. HTML doesn't count.
And it's probably something boring like:
i++;
It was ASM code I think, and their defense was that there was basically one way to write a function that does this.
Even if a programming grammar is more restrictive, there’s some length where things become almost certainly unique.
Look at FizzBuzz. If you were to set strict requirements on performance (and allow for iterative testing), the results from different groups of people would be identical. They would reach the same conclusion because that's how code works: it's far more aligned with math than with creative writing.
So you cannot take an existing code solution and translate it to your own style. You are altering the program, the efficiency, and therefore the solution itself. Even when you do something like changing 1 single variable name!
Lots of little things.