zlacker

GitHub Copilot, with “public code” blocked, emits my copyrighted code

submitted by davidg+(OP) on 2022-10-16 19:33:52 | 914 points 768 comments

NOTE: showing posts with links only
9. res0na+i7[view] [source] 2022-10-16 20:35:30
>>davidg+(OP)
The repo he linked to on twitter is a public repo though. Am I missing something?

https://twitter.com/DocSparse/status/1581462433335762944

◧◩◪
22. msbarn+I8[view] [source] [discussion] 2022-10-16 20:47:02
>>machin+H7
> I'm pretty sure DALL-E was trained only on not copyright material

Nope. DALL-E generates images with the Getty Watermark, so clearly there’s copyrighted materials in its training set: https://www.reddit.com/r/dalle2/comments/xdjinf/its_pretty_o...

◧◩◪
34. willia+5a[view] [source] [discussion] 2022-10-16 20:58:07
>>ghowar+99
I’m a programmer and a songwriter and I am not worried about these tools and I don’t think they are bad for society.

What did the photograph do to the portrait artist? What did the recording do to the live musician?

Here’s some highfalutin art theory on the matter, from almost a hundred years ago: https://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_...

◧◩◪
38. a4isms+ma[view] [source] [discussion] 2022-10-16 21:01:20
>>tpxl+O7
> one set of rules for the poor, another set of rules for the masses.

Conservatism consists of exactly one proposition, to wit:

There must be in-groups whom the law protects but does not bind, alongside out-groups whom the law binds but does not protect.

—Composer Frank Wilhoit[1]

[1]: https://crookedtimber.org/2018/03/21/liberals-against-progre...

◧◩◪◨
49. ghowar+vb[view] [source] [discussion] 2022-10-16 21:10:11
>>willia+5a
Do you know what's different about the photograph or the recording?

They are still their own separate works!

If a painter paints a person for commission, and then that person also commissions a photographer to take a picture of them, is the photographer infringing on the copyright of the painter? Absolutely not; the works are separate.

If a recording artist records a public domain song that another artist performs live, is the recording artist infringing on the live artist? Heavens, no; the works are separate.

On the other hand, these "AIs" are taking existing works and reusing them.

Say I write a song, and in that song, I use one stanza from the chorus of one of your songs. Verbatim. Would you have a copyright claim against me for that? Of course, you would!

That's what these AIs do; they copy portions and mix them. Sometimes they are not substantial portions. Sometimes they are, with verbatim comments (code), identical structure (also code), watermarks (images), composition (also images), lyrics (songs), or motifs (also songs).

In the reverse of your painter and photographer example, we saw US courts hand down judgment against an artist who blatantly copied a photograph. [1]

Anyway, that's the difference between the tools of photography (creates a new thing) and sound recording (creates a new thing) versus AI (mixes existing things).

And yes, sound mixing can easily stray into copyright infringement. So can other copying of various copyrightable things. I'm not saying humans don't infringe; I'm saying that AI does by construction.

[1]: https://www.reuters.com/world/us/us-supreme-court-hears-argu...

◧◩◪◨⬒
57. insani+Xb[view] [source] [discussion] 2022-10-16 21:13:48
>>Samoye+0b
> This is more like a wealthy person stealing your entire art catalog, laundering it in some fancy way, and then claiming they are the original creator.

If I take a song, cut it up, and sing over it, my release is valid. If I parody your work, that's my work. If you paint a picture of a building and I go to that spot and take a photograph of that building it is my work.

I can derive all sorts of things, things that I own, from things that others have made.

Fair use is a thing: https://www.copyright.gov/fair-use/

As for talking about the originals, would an artist credit every piece of inspiration they have ever encountered over a lifetime? Publishing a seed seems fine as a nice thing to do, but pointing at the billion pictures that went into the drawing seems silly.

◧◩
58. Hamuko+Yb[view] [source] [discussion] 2022-10-16 21:13:57
>>jijji+29
https://nedroidcomics.tumblr.com/post/41879001445/the-intern...
◧◩◪
60. anonyd+6c[view] [source] [discussion] 2022-10-16 21:15:29
>>mjr00+D7
Not to mention Microsoft could countersue using their enormous patent war chest, which they have a history of doing[0]

[0] https://techcrunch.com/2012/03/22/microsoft-and-tivo-drop-th...

64. jen20+Mc[view] [source] 2022-10-16 21:22:57
>>davidg+(OP)
Not entirely sure how this could happen! “naikrovek” assured me not three days ago on this very site that I was “detached from reality” [1] for thinking that this would happen again.

To be fair I thought it might be at least a week or two.

[1]: https://news.ycombinator.com/item?id=33194643

◧◩◪◨
76. Rimint+Vd[view] [source] [discussion] 2022-10-16 21:34:58
>>insani+ib
> That can't possibly be a valid claim, right?

It has indeed happened.

https://boingboing.net/2018/09/05/mozart-bach-sorta-mach.htm...

Sony later withdrew their copyright claim.

There are two pieces to copyright when it comes to public domain:

* The work (song) itself -- can't copyright that

* The recording -- you are the copyright owner. No one, without your permission, can re-post your recording

And of course, there is derivative work. You own any portion that is derivative of the original work.

◧◩
81. stefan+je[view] [source] [discussion] 2022-10-16 21:39:59
>>thorum+k9
Seems other people tried it? https://twitter.com/larrygritz/status/1581713252144517120
88. ahmedb+Ye[view] [source] 2022-10-16 21:45:49
>>davidg+(OP)
“AI-focused products/startups lack a business model aligning the incentives of both the company and the domain experts (Data Dignity)”

https://blog.barac.at/a-business-experiment-in-data-dignity

Yes I am quoting myself

◧◩◪◨⬒⬓
96. lupire+Cf[view] [source] [discussion] 2022-10-16 21:51:53
>>insani+pc
YouTube does this moderation in order to avoid legal pressure from copyright holders, as in

https://en.m.wikipedia.org/wiki/Viacom_International_Inc._v.....

◧◩◪◨
104. codefr+Yf[view] [source] [discussion] 2022-10-16 21:55:25
>>c7b+af
There have been a number of stories about musicians being hit with copyright claims. Here is the first result on Google:

https://www.radioclash.com/archives/2021/05/02/youtuber-gets...

As for being sued for looking at source, here is the first result on Google:

https://www.wired.com/story/missouri-threatens-sue-reporter-...

◧◩◪
107. psychp+6g[view] [source] [discussion] 2022-10-16 21:56:19
>>bayind+Fd
I disagree.

Those are effectively cases of cryptomnesia[0]. Part and parcel of learning.

If you don't want broad access to your work, don't upload it to a public repository. It's very simple. Good on you for recognising that you don't agree with how GitHub looks at data in public repos, but it's not their problem.

[0] https://en.m.wikipedia.org/wiki/Cryptomnesia

◧◩
109. heavys+ag[view] [source] [discussion] 2022-10-16 21:57:36
>>kweing+v6
Your post is a good example of the tu quoque fallacy[1].

[1] https://en.wikipedia.org/wiki/Tu_quoque

◧◩◪
120. vghfgk+Jg[view] [source] [discussion] 2022-10-16 22:03:02
>>ghowar+99
The best proposals I’ve heard to deal with the societal/economic problems this sort of A.I. poses were made by Jaron Lanier: https://youtu.be/rGqiswuJuQI?t=1190 …I can see why his proposals of providing (micro-)compensation to people whose tremendous efforts end up being mined by these algorithms would not be popular with researchers/companies who stand to benefit vastly (…presumably including the investors who own this site?). The lobbying power/political power/awareness/financial resources of your average (atomised) artist/programmer/musician etc. is pretty much nil in comparison… Forgive the clumsy analogy, but I have a feeling the whole thing might end up something like a haulage company that doesn’t want to pay any taxes to help fix the roads, though?
◧◩◪◨
127. banana+8h[view] [source] [discussion] 2022-10-16 22:07:11
>>c7b+af
https://twitter.com/mpoessel/status/1545178842385489923

Among many others. Classical music may have fallen into the public domain, but modern performances of it are copyrightable, and some of the big companies use copyright matching systems, including YouTube's, that often flag new performances as copies of existing recordings.

◧◩
129. yjftsj+fh[view] [source] [discussion] 2022-10-16 22:08:02
>>deworm+re
> It prints this code because you have it open in another editor tab.

People upthread have reproduced and demonstrated that that's not the issue here.

EDIT: Actually, OP says "The variant it produces is not on my machine." - https://twitter.com/DocSparse/status/1581560976398114822

> Wish people who don't know at all how it works stopped acting all outraged when they're laughably wrong.

Physician, heal thyself.

◧◩◪◨⬒
132. vghfgk+yh[view] [source] [discussion] 2022-10-16 22:09:53
>>lbotos+od
…yes, as I understand it there are ‘mechanical’ rights vs. publishing rights… (for example hip hop artists may recreate a sample to avoid paying mechanical royalties, but still end up paying for publishing) https://www.lawinsider.com/dictionary/mechanical-rights
◧◩◪◨⬒
164. frob+4l[view] [source] [discussion] 2022-10-16 22:40:34
>>codefr+Yf
Just to be clear, because it's in the title, the reporter was threatened with a lawsuit for looking at source code. I cannot find anyone actually sued for it. BTW, here's an article saying said reporter wasn't sued: https://www.theregister.com/AMP/2022/02/15/missouri_html_hac...

Anyone with a mouth can run it and threaten a lawsuit. In fact, I threaten to sue you for misinformation right now unless you correct your post. Fat lot of good my threat will do, because no judge in their right mind would entertain said lawsuit; it's baseless.

◧◩
165. jacoop+fl[view] [source] [discussion] 2022-10-16 22:41:57
>>jonnyc+c5
They avoided answering this question at all costs.

Because it exposes their direct hypocrisy in this: it's fair use for OSS but not for us.

Questions here are very important, and it's no surprise GitHub avoided answering anything about Copilot's legality:

https://sfconservancy.org/GiveUpGitHub/

◧◩◪◨⬒⬓
184. ghowar+Zm[view] [source] [discussion] 2022-10-16 22:59:34
>>c7b+Eg
> Are you sure?

Yes, I'm sure.

> I'm not familiar with the exact data set they used for SD and whether or not Disney art was included, but my understanding is that their claim to legality comes from arguing that the use of images as training data is 'fair use'.

They could argue that. But since the American court system is currently (almost) de facto "richest wins," their argument will probably not mean much.

The way to tell if something was in the dataset would be to use the name of a famous Disney character and see what it pulls up. If it's there, then once the Disney beast finds out, I'm sure they'll take issue with it.

And by the way, I don't buy all of the arguments for machine learning as fair use. Sure, for the training itself, yes, but once the model is used by others, you now have a distribution problem.

More in my whitepaper against Copilot at [1].

[1]: https://gavinhoward.com/uploads/copilot.pdf

◧◩◪
187. LelouB+on[view] [source] [discussion] 2022-10-16 23:02:06
>>eyelid+5g
Some day, I think the Windows source will be public, at least for reference purposes.

They already have one open source part I know of, the new conhost[0].

[0] https://github.com/microsoft/terminal

◧◩
194. 9wzYQb+Yn[view] [source] [discussion] 2022-10-16 23:07:45
>>kweing+v6
> [people] are usually indifferent or positive about other applications of AI

That sounds like the pro-innovation bias: https://en.m.wikipedia.org/wiki/Pro-innovation_bias

◧◩◪◨
201. ghowar+Oo[view] [source] [discussion] 2022-10-16 23:14:05
>>epolan+go
Because they centralize control, as I said in [1].

Put another way, AIs are tools that give more power to already powerful entities.

[1]: https://news.ycombinator.com/item?id=33227303

◧◩◪
209. defasd+9p[view] [source] [discussion] 2022-10-16 23:16:57
>>lupire+Nf
Code Snippets Data

Depending on your preferred telemetry settings, GitHub Copilot may also collect and retain the following, collectively referred to as “code snippets”: source code that you are editing, related files and other files open in the same IDE or editor, URLs of repositories and files paths.

https://github.com/features/copilot/#faq

◧◩◪◨⬒⬓
213. keving+np[view] [source] [discussion] 2022-10-16 23:18:57
>>jzb+eo
Not safe for work, but one example I saw going around:

https://twitter.com/ebkim00/status/1579485164442648577

Not sure if this was fed the original image as an input or not.

Also seen a couple cases where people explicitly trained a network to imitate an artist's work, like the deceased Kim Jung Gi.

◧◩
226. defasd+hq[view] [source] [discussion] 2022-10-16 23:26:47
>>bmitc+le
> We built a filter to help detect and suppress the rare instances where a GitHub Copilot suggestion contains code that matches public code on GitHub. You have the choice to turn that filter on or off during setup. With the filter on, GitHub Copilot checks code suggestions with its surrounding code for matches or near matches (ignoring whitespace) against public code on GitHub of about 150 characters. If there is a match, the suggestion will not be shown to you. We plan on continuing to evolve this approach and welcome feedback and comment.

From the FAQ https://github.com/features/copilot/
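
To make the quoted description concrete, here is a minimal sketch of a whitespace-insensitive match against a set of public snippets. It is not GitHub's implementation, which is not public; the function names (`normalize`, `buildIndex`, `matchesPublicCode`) and the exact-window approach are assumptions for illustration, and it only covers exact matches after whitespace removal, not the "near matches" the FAQ mentions.

```javascript
// Hypothetical sketch only; GitHub has not published how its filter works.
const WINDOW = 150; // "about 150 characters", per the FAQ text quoted above

// Remove all whitespace so formatting differences do not hide a match.
function normalize(code) {
  return code.replace(/\s+/g, "");
}

// Index every windowSize-length substring of the normalized public code.
function buildIndex(publicSnippets, windowSize = WINDOW) {
  const index = new Set();
  for (const snippet of publicSnippets) {
    const text = normalize(snippet);
    for (let i = 0; i + windowSize <= text.length; i++) {
      index.add(text.slice(i, i + windowSize));
    }
  }
  return index;
}

// True if any window of the suggestion (plus whatever surrounding code the
// caller passes in) appears verbatim in the index.
function matchesPublicCode(suggestion, index, windowSize = WINDOW) {
  const text = normalize(suggestion);
  for (let i = 0; i + windowSize <= text.length; i++) {
    if (index.has(text.slice(i, i + windowSize))) return true;
  }
  return false;
}
```

A filter like this would miss a variant that renames a variable or reorders a statement inside every window, which is one plausible way a lightly altered copy could get through.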

◧◩◪◨⬒
235. matkon+4r[view] [source] [discussion] 2022-10-16 23:37:45
>>paulgb+aq
https://alexanderwales.com/wp-content/uploads/2022/08/image....

Left: “Girl with a Pearl Earring, by Johannes Vermeer” by Stable Diffusion. Right: Girl with a Pearl Earring by Johannes Vermeer.

This specific one is not a copyright violation, as the original is old enough for its copyright to have expired. But the same may happen with other images.

from https://alexanderwales.com/the-ai-art-apocalypse/ and https://alexanderwales.com/addendum-to-the-ai-art-apocalypse...

◧◩◪◨⬒⬓
237. matkon+9r[view] [source] [discussion] 2022-10-16 23:38:22
>>jzb+eo
https://alexanderwales.com/wp-content/uploads/2022/08/image....

Left: “Girl with a Pearl Earring, by Johannes Vermeer” by Stable Diffusion. Right: Girl with a Pearl Earring by Johannes Vermeer.

This specific one is not a copyright violation, as the original is old enough for its copyright to have expired. But the same may happen with other images.

from https://alexanderwales.com/the-ai-art-apocalypse/ and https://alexanderwales.com/addendum-to-the-ai-art-apocalypse...

◧◩◪◨
270. ghowar+zw[view] [source] [discussion] 2022-10-17 00:23:41
>>csalle+kt
I explained more in my comment at [1].

The big difference is that cars were a tool that helped regular people by being a force multiplier. Stable Diffusion and DALL-E are not force multipliers in the same way. Sure, you may now produce images that you couldn't before, but there are far fewer profitable uses for images than for cars. Images don't materially affect the world, but cars can.

[1]: https://news.ycombinator.com/item?id=33227303

◧◩◪
277. willia+Tx[view] [source] [discussion] 2022-10-17 00:40:24
>>SrslyJ+rp
It absolutely does.

Here’s like the first link after a DuckDuckGo search for “copyright utilitarian”:

> However, copyright law does not extend to useful items. Therefore, complications may arise when sculptural works are also “useful” items. In these instances, copyright law will protect purely artistic elements of a useful article as long as the useful item can be identified and exists independently of the utilitarian aspects of the article (this concept is sometimes called the “separability test”). 17 U.S.C. §. A “useful article” is an article that has a purpose beyond pure aesthetic value.

https://www.rtlawoffices.com/articles/can-i-copyright-my-des...

Is it safe to assume the rest of the downvotes were from people who were also incorrect?

◧◩
292. zahrc+Zy[view] [source] [discussion] 2022-10-17 00:50:49
>>kweing+v6
Quod licet Iovi, non licet bovi[0]

[0] https://en.m.wikipedia.org/wiki/Quod_licet_Iovi,_non_licet_b...

◧◩◪◨⬒⬓⬔
297. willia+iz[view] [source] [discussion] 2022-10-17 00:53:46
>>ghowar+if
Asked to give practical advice to starting writers, he said, “Read.”

https://www.nytimes.com/2022/09/30/books/early-cormac-mccart...

◧◩◪◨⬒
300. reacha+pz[view] [source] [discussion] 2022-10-17 00:55:12
>>Americ+5p
Copyright is for original whole works. Utility functions don’t fall under that I don’t think.

I suppose whoever wants to pay the fees would “own” these things ?

https://www.copyright.gov/circs/circ61.pdf

◧◩◪
305. Shamel+Tz[view] [source] [discussion] 2022-10-17 01:00:10
>>numpad+xz
> The data can comfortably be downloaded with img2dataset (240TB in 384, 80TB in 224)

https://laion.ai/blog/laion-5b/

Not exactly what you asked, but hopefully useful? The model weights are about 4 GiB I believe.

◧◩◪◨
310. aeyes+wA[view] [source] [discussion] 2022-10-17 01:05:18
>>IncRnd+Wn
It is also allowed to be used under LGPL terms: https://github.com/DrTimothyAldenDavis/SuiteSparse/blob/mast...

But that doesn't make it any better.

◧◩
321. Krishn+bD[view] [source] [discussion] 2022-10-17 01:35:09
>>kweing+v6
> I know artists who are vehemently against DALL-E, Stable Diffusion, etc. and regard it as stealing, but they view Copilot and GPT-3 as merely useful tools.

An example: https://twitter.com/DaveScheidt/status/1578411434043580416

> I also know software devs who are extremely excited about AI art and GPT-3 but are outraged by Copilot.

The fear is not unwarranted though. I can clearly see AI replacing most jobs, not just in tech but in art, crafts, music and even science. There will probably be no field untouched by AI in this decade, and they will be completely replaced by the next decade.

We have multiple extinction events for humanity lined up: Climate Change, Nuclear Apocalypse and now AI.

We will have to not just work towards reducing harm to the Planet, but also work towards stopping meaningless Wars and figuring out how to deal with unemployment and economic crisis that is looming on the horizon. The only ones to suffer in the end would be the "elites" (or will they be the first depending on how quickly Civilization goes towards Anarchy?).

Can't say for sure. But definitely gloomy days ahead.

◧◩
322. cercat+fD[view] [source] [discussion] 2022-10-17 01:35:30
>>kweing+v6
I often quote this comment regarding AI advances and jobs [0]:

> Yes, many of us will turn into cowards when automation starts to touch our work, but that would not prove this sentiment incorrect - only that we're cowards.

>> Dude. What the hell kind of anti-life philosophy are you subscribing to that calls "being unhappy about people trying to automate an entire field of human behavior" being a "coward". Geez.

>>> Because automation is generally good, but making an exemption for specific cases of automation that personally inconvenience you is rooted in cowardice/selfishness. Similar to NIMBYism.

It's true cowardice to assume that our own profession should be immune from AI while other professions are not. Either dislike all AI, or like it. To be in between is to be a hypocrite.

For me, I definitely am on the side of full AI, even if it automates my job away, simply because I see AI as an advancing force on mankind.

[0] https://news.ycombinator.com/item?id=32461138#32463198

328. willia+ED[view] [source] 2022-10-17 01:39:19
>>davidg+(OP)
Copyright only covers the expressive parts and not the utilitarian parts.

Here is some reading material for those of you who disagree with reality:

https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...

https://en.wikipedia.org/wiki/Idea–expression_distinction

https://h2o.law.harvard.edu/cases/5004

https://www.loeb.com/en/insights/publications/2020/04/johann...

◧◩◪◨
331. _ryanj+bE[view] [source] [discussion] 2022-10-17 01:45:22
>>binary+7C
Good call. Done. https://twitter.com/docsparse/status/1581461734665367554?s=4...
◧◩◪◨⬒
333. kmeist+nE[view] [source] [discussion] 2022-10-17 01:46:49
>>bigiai+hy
This is already a problem with anyone who ever copypastes from Stack Overflow. You're all violating CC-BY-SA[0] and nobody really cares about this.

[0] https://stackoverflow.com/help/licensing

◧◩◪◨⬒⬓⬔⧯▣▦▧▨
338. SAI_Pe+UF[view] [source] [discussion] 2022-10-17 02:00:43
>>datafl+zt
SCO v. IBM[1] included claims of sections as small as "…ranging from five to ten to fifteen lines of code in multiple places that are of issue…" in some of the individual claims of the case.

[1] https://en.wikipedia.org/wiki/SCO_Group,_Inc._v._Internation....

◧◩◪◨⬒
345. nl+sG[view] [source] [discussion] 2022-10-17 02:04:26
>>paulgb+aq
> In those cases, it seems like not such a leap to say that the AI has obviously seen that artist’s work and that the output is a derivative work.

"Copying" a style is not a derivative work:

> Why isn't style protected by copyright? Well for one thing, there's some case law telling us it isn't. In Steinberg v. Columbia Pictures, the court stated that style is merely one ingredient of expression and for there to be infringement, there has to be substantial similarity between the original work and the new, purportedly infringing, work. In Dave Grossman Designs v. Bortin, the court said that:

> "The law of copyright is clear that only specific expressions of an idea may be copyrighted, that other parties may copy that idea, but that other parties may not copy that specific expression of the idea or portions thereof. For example, Picasso may be entitled to a copyright on his portrait of three women painted in his Cubist motif. Any artist, however, may paint a picture of any subject in the Cubist motif, including a portrait of three women, and not violate Picasso's copyright so long as the second artist does not substantially copy Picasso's specific expression of his idea."

https://www.thelegalartist.com/blog/you-cant-copyright-style

◧◩
362. mike_d+EI[view] [source] [discussion] 2022-10-17 02:28:01
>>_ryanj+2z
> When you use Google to translate from English to Spanish, it’s not like the service has ever seen that particular sentence before.

But that is exactly how it works. Translation companies license (or produce) huge corpuses of common sentences across multiple languages that are either used directly or fed into a model.

Third party human translators are asked to assign rights to the translation company. https://support.google.com/translate/answer/2534530

◧◩◪◨⬒⬓
383. pabs3+nL[view] [source] [discussion] 2022-10-17 03:02:01
>>Thorre+dJ
The replies appear to be here:

https://nitter.net/ryanjsalva/with_replies

◧◩
390. stevew+wM[view] [source] [discussion] 2022-10-17 03:20:13
>>kweing+v6
The last time I happened to point this out[1], all I got was a bunch of HNers nitpicking the words I chose, but not addressing the core issue.

I have to assume this is just people being protective of their own profession and, consequently, setting a high bar for what constitutes performance in that profession.

[1] https://news.ycombinator.com/item?id=32895251#32895709

◧◩◪◨
396. BeefWe+1O[view] [source] [discussion] 2022-10-17 03:40:11
>>crazyg+SH
> Well probably no, they didn't pick and choose at all, they just "chose" everyone who put code online with a license. Which is a legal statement of ownership by each of those people, and implies legal liability as well.

What you're describing is a choice. They chose which people to believe, with zero vetting.

> The point is that with ML training data, such a vast quantity is required that it's unreasonable to expect humans to be able to research and guarantee the legal provenance of it all.

I'm not sure what you're presenting here is actually true. A key part of ML training is the training part. Other domains require a pass/fail classification of the model's output (see image identification, speech recognition, etc.) so why is source code any different? The idea that "it's too much data" is absolutely a cop-out and absurd, especially for a company sitting on ~$100B in cash reserves.

Your argument kind of demonstrates the underlying point here: They took the cheapest/easiest option and it's harmed the product.

> A crawler simply believes that licenses, which are legally binding statements, are made by actual owners, rather than being fraud. It does seem reasonable to address the issue with takedowns, however.

Yes, and to reiterate, they chose this method. They were not obligated to do this, they were not forced to pick this way of doing things, and given the complete lack of transparency it's a large leap of faith to assume that their training data simply looked at LICENSE files to determine which licenses were present.

For what it's worth, it doesn't seem that that's what OpenAI did when they trained the model initially in their paper[1]:

    Our training dataset was collected in May 2020 from 54 million
    public software repositories hosted on GitHub, containing
    179 GB of unique Python files under 1 MB. We filtered
    out files which were likely auto-generated, had average line
    length greater than 100, had maximum line length greater
    than 1000, or contained a small percentage of alphanumeric
    characters. After filtering, our final dataset totaled 159 GB.

I have not seen anything concrete about any further training after that, largely because it isn't transparent.

[1]: https://arxiv.org/pdf/2107.03374.pdf
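
As a rough illustration of the filtering rules in the quoted passage, here is a sketch of a per-file check. It is not the paper's code: the function name `keepFile` and the 0.25 alphanumeric cutoff are assumptions (the quote only says "a small percentage"), and the "likely auto-generated" heuristic and the 1 MB size limit are left out because the quote does not say how they were applied.

```javascript
// Hypothetical sketch of the quoted filtering rules; thresholds marked
// "assumed" are not given in the quoted passage.
function keepFile(source, minAlnumFraction = 0.25 /* assumed cutoff */) {
  if (source.length === 0) return false; // assumed: empty files are dropped

  const lengths = source.split("\n").map((line) => line.length);
  const total = lengths.reduce((sum, n) => sum + n, 0);
  const averageLineLength = total / lengths.length;
  const maxLineLength = lengths.reduce((max, n) => (n > max ? n : max), 0);

  // Fraction of characters that are ASCII letters or digits.
  const alnumCount = (source.match(/[A-Za-z0-9]/g) || []).length;
  const alnumFraction = alnumCount / source.length;

  return (
    averageLineLength <= 100 && // quoted: average line length greater than 100 is dropped
    maxLineLength <= 1000 && // quoted: maximum line length greater than 1000 is dropped
    alnumFraction >= minAlnumFraction
  );
}
```

Note that nothing in these rules looks at a LICENSE file or at authorship; they only screen for files that look like ordinary hand-written source.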

◧◩◪◨⬒⬓⬔
402. blende+KP[view] [source] [discussion] 2022-10-17 04:06:56
>>rtkwe+Lt
Have you been following the Andy Warhol Prince drawing case?

It is currently before SCOTUS, so we should see a ruling for the USA sometime in the next year or so.

https://en.m.wikipedia.org/wiki/Andy_Warhol_Foundation_for_t...

◧◩◪◨⬒⬓⬔⧯▣▦
413. mattkr+SS[view] [source] [discussion] 2022-10-17 04:50:17
>>boulos+oI
That’s a really apt comparison, since the Supreme Court just heard Andy Warhol Foundation for the Visual Arts v. Goldsmith, which hinges on whether Warhol’s use of a copyrighted photo of Prince as the basis for “Orange Prince” was Fair Use.

Warhol’s estate seems likely to lose and their strongest argument is that Warhol took a documentary photo and transformed it into a commentary on celebrity culture. Here, I don’t even see that applying: it just looks like a bad copy.

https://www.scotusblog.com/2022/10/justices-debate-whether-w...

437. firean+hX[view] [source] 2022-10-17 05:58:09
>>davidg+(OP)
This exact code can be found 1000 times on GitHub and many of those are MIT licensed https://github.com/search?q=%22cs+*cs_transpose+%28%22&type=.... Copilot, or any other developer or person, has no way of knowing where the original implementation came from or its original license. The cat is out of the bag, get used to it.
◧◩
443. choppa+eY[view] [source] [discussion] 2022-10-17 06:10:13
>>_ryanj+2z
Your long long paragraph about neighboring code editors is disproven: https://news.ycombinator.com/item?id=33227395

You’re really not going to solve this problem with marketing (“blog posts”) or some pro-Github story from data scientists. You need a DMCA / removal request feature akin to Google image search and you need work on understanding product problems from the customer perspective.

◧◩◪◨⬒⬓⬔⧯▣
458. atchoo+j01[view] [source] [discussion] 2022-10-17 06:32:28
>>Thorre+rI
The photograph of the art, which will be more recent, might have copyright protections.

It looks like it wouldn't in the UK, probably wouldn't in the US but would in Germany. The cases seem to hinge on the level of intellectual creativity of the photograph involved. The UK said that trying to create an exact copy was not an original endeavour whereas Germany said the task of exact replication requires intellectual/technical effort of its own merit.

https://www.theipmatters.com/post/are-photographs-of-public-...

◧◩◪◨⬒⬓⬔
485. Stagna+341[view] [source] [discussion] 2022-10-17 07:12:56
>>ghowar+Zm
>The way to tell if something was in the dataset would be to use the name of a famous Disney character and see what it pulls up.

I tried out of curiosity. Here[1] are the first 8 images that came up with the prompt "Disney mickey mouse" using the stable diffusion V1.4 model. Personally I don't really see why Disney or any other company would take issue with the image generation models, it just seems more or less like regular fan art.

[1]: https://i.imgur.com/cIHBCRe.png

◧◩◪◨
494. yywwbb+O51[view] [source] [discussion] 2022-10-17 07:32:23
>>dmix+l01
The first few decades of the 19th century were exceptionally grim in the UK though. Poverty and inequality both increased and a reactionary government enacted draconian policies curtailing freedom of speech, as Britain was probably as close to the brink of a social revolution as it ever was. It took several decades for things to actually start improving for most common people and most of the actual progress in that area only occurred in the 1940s and 50s.

See https://en.m.wikipedia.org/wiki/Peterloo_Massacre for example

◧◩◪◨⬒⬓⬔
516. d1sxey+X91[view] [source] [discussion] 2022-10-17 08:18:58
>>Ineffa+p61
If someone uploads something and says 'hey, this is some code, this is the appropriate licence for it', it is their mistake, it is in violation of GitHub's terms of service, and may even be fraudulent[0].

I'm also not sure that Copilot is just reproducing code, but that's a separate discussion.

> If I reproduced part of a book from a source that claimed incorrectly it was released under a permissive license, I would still be liable for that misuse. Especially if I was later made aware of the mistake and didn’t correct it.

I don't believe that's correct in the first instance (at least from a criminal perspective). If someone misrepresents to you that they have the right to authorise you to publish something, and it turns out they don't have that right, you did not willfully infringe and are not liable for the infringement from a criminal perspective[1]. From a civil perspective, likely the copyright owner could still claim damages from you if you were unable to reach a settlement. A court would probably determine the damages to award based on real damages (including loss of earnings for the content creator), rather than anything punitive if it's found that

Further, most jurisdictions have exceptions for short extracts of a larger copyrighted work (e.g. quotes from a book), which may apply to Copilot.

This is my own code, I wrote it myself just now. Can I copyright it?

```
function isOdd (num) {
  // An odd number leaves a nonzero remainder when divided by 2.
  if (num % 2 !== 0) {
    return true;
  } else {
    return false;
  }
}
```

What about the following:

```
function isOddAndNotSunday (num) {
  const date = new Date();
  // getDay() returns 0 for Sunday, so > 0 means "not Sunday".
  if (num % 2 !== 0 && date.getDay() > 0) {
    return true;
  } else {
    return false;
  }
}
```

Where do we draw the line?

[0]: https://docs.github.com/en/site-policy/github-terms/github-t...

[1]: https://www.law.cornell.edu/uscode/text/17/506

547. enriqu+bg1[view] [source] 2022-10-17 09:22:40
>>davidg+(OP)
Just a heads-up that the person who writes this is Tim Davis[0], author of the legendary CHOLMOD solver[1], which hundreds of thousands of people use daily when they solve sparse symmetric linear systems in common numerical environments.

Even if CHOLMOD is easily the best sparse symmetric solver, it is notoriously not used by scipy.linalg.solve, though, because numpy/scipy developers are anti-copyleft fundamentalists and have chosen not to use this excellent code for merely ideological reasons... but this will not last: thanks to the copilot "filtering" described here, we can now recover a version of CHOLMOD unencumbered by the license that the author originally distributed it under! O brave new world, that has such people in it!

[0] https://people.engr.tamu.edu/davis/welcome.html

[1] https://github.com/DrTimothyAldenDavis

◧◩◪◨⬒
572. tauwau+Ej1[view] [source] [discussion] 2022-10-17 10:08:36
>>concor+qh1
Had to find this after a long time

IT Crowd Piracy Warning https://www.youtube.com/watch?v=ALZZx1xmAzg

◧◩
587. Jameso+6n1[view] [source] [discussion] 2022-10-17 10:43:26
>>divide+bl1
Gitea is very nice: https://gitea.io/en-us/
◧◩◪◨⬒⬓⬔
635. robert+Zx1[view] [source] [discussion] 2022-10-17 12:12:36
>>LtWorf+Kk1
Are you sure? According to Statista there were over 700k bankruptcies in the US from 2000-2020 [0]. How many bailouts have there been?

[0] https://www.statista.com/statistics/817918/number-of-busines...

◧◩◪◨⬒⬓⬔⧯
647. LtWorf+hD1[view] [source] [discussion] 2022-10-17 12:54:32
>>robert+Zx1
A lot more. For example, but not only: https://www.businessnewsdaily.com/15639-trump-covid-19-sba-l...
◧◩◪◨
657. Dave3o+fI1[view] [source] [discussion] 2022-10-17 13:22:27
>>CapsAd+Qi
> How responsible should Microsoft be for someone's badly licensed code on their platform?

That's actually a very real problem that mega money has been spent on. The same legal problem appears on sites like YouTube around fair use and copyright. As for fair use, it doesn't apply here; see:

https://softwareengineering.stackexchange.com/questions/1217...

Regardless, platforms are partially responsible for the content that their users upload to them. Most try to absolve themselves of this responsibility with their terms of service, but legally that's just not possible.

Personally I'm an advocate for fair use, but I'm also an advocate for strong copyright laws and their enforcement. In the short time the internet has been available to most people in the world, a habit has formed of stealing others' work and claiming it as your own. Quite often this is for some financial gain.

◧◩◪◨⬒⬓
675. MSFT_E+8Y1[view] [source] [discussion] 2022-10-17 14:37:47
>>cdogl+Uh1
https://theweek.com/feature/briefing/1016752/the-real-cost-o...

Maybe not right this moment but our actions have consequences in the future.

For those who only see the next quarter, they're stoked.

For those who understand infinite growth is impossible and would simply like a livable world, they're horrified.

◧◩◪◨⬒⬓
686. Camper+1w2[view] [source] [discussion] 2022-10-17 16:48:18
>>fsloth+h41
Sounds a lot like Oracle's justification for owning the Java API ( https://en.wikipedia.org/wiki/Google_LLC_v._Oracle_America,_.... ) in which de minimis things like variable and structure declarations were used by Oracle to justify a copyright-maximal approach that would have utterly laid waste to open source development.

The code in question is not something that anyone needs to own. Rather, it's what anyone would write, faced with the same problem. It's stupid to make humans do a robot's job in the name of preserving meaningless "IP rights".

709. paulis+uY2[view] [source] 2022-10-17 18:50:23
>>davidg+(OP)
Here is one repository on github with the code:

The repo: https://github.com/Shreeyak/cleargrasp

https://github.com/Shreeyak/cleargrasp/blob/master/api/depth...

It looks like the license of the repo is Apache 2.0

◧◩◪◨⬒⬓
718. latexr+Nn3[view] [source] [discussion] 2022-10-17 21:07:40
>>bryanr+nu1
> only emotionally crippled people like fashion

Please don’t straw man¹. That’s neither what I said, nor what I intended to convey, nor what I believe.

¹ https://news.ycombinator.com/newsguidelines.html

◧◩◪◨⬒⬓⬔⧯
728. ISL+pM3[view] [source] [discussion] 2022-10-17 23:39:50
>>bugfix+OT
Find a good and ambitious copyright attorney with some free capacity.

Also, register your code with the copyright office.

Edit: Apparently, with the #1 post on HN right now, you could also just go here: https://githubcopilotinvestigation.com/

◧◩◪◨⬒⬓⬔
756. matkon+YF5[view] [source] [discussion] 2022-10-18 15:04:21
>>rtkwe+Lt
It is a clear case of derivative work (see also https://commons.wikimedia.org/wiki/Commons:Derivative_works - internal docs, but their explanation of copyright status tends to be well done)
◧◩◪◨⬒⬓⬔⧯
757. matkon+iG5[view] [source] [discussion] 2022-10-18 15:05:16
>>Fillig+fA
It is a clear case of derivative work (see also https://commons.wikimedia.org/wiki/Commons:Derivative_works - internal docs, but their explanation of copyright status tends to be well done)

This specific one would not be a problem, but doing it with a still copyrighted work would be.

◧◩◪◨⬒⬓⬔
768. robert+Vrf[view] [source] [discussion] 2022-10-21 08:39:05
>>latexr+uo3
As I say, interest is being useful to someone else, via the loan of capital.

Gambling - I don't do it, but I'd need more specifics to see why gambling is bad in this sense. It's a voluntary pursuit that I think is a bad idea, but that doesn't make it illegal.

Price gouging is still being useful, just at a higher price. Someone could charge me £10 for bread and if that was the cheapest bread available, I'd buy it. If it is excessive and for essential goods, it is increasingly illegal, however. 42 out of 50 states in the US have anti-gouging laws [0], which, as I say, isn't what I'm talking about. I'm talking about legal things.

Underpaying workers - this certainly isn't illegal, unless it's below minimum wage, but also "underpaying" is an arbitrary term. If there's a regulatory/legal/corrupt state environment in which it's hard to create competitors to established businesses, then that's bad because it drives wages down. Otherwise, wages are set by what both the worker and employer sides will bear. And, lest we forget, there is still money coming into the business by it being useful. Customers are paying it for something. The fact that it might make less profit by paying more doesn't undermine that fundamental fact.

As for supporting laws to undermine competitors, that is something people can do, yes. Microsoft, after their app store went nowhere, came out against Apple and Google charging 30% for apps. Probably more of a PR move than a legal one, but businesses trying to influence laws isn't bad, because they have a valid perspective on the world just as we all do, unless it's corruption. Which is (once more, with feeling) illegal, and so out of scope of my comment. And again, unless the laws are there to establish a monopoly incumbent, which is pretty rare, and definitely the fault of the government that passes the laws, the company is still only really in existence because it does something useful enough to its customers that they pay it money.

[0] https://www.law360.com/articles/1362471
