NVIDIA: Copyrighted Books Are Just Statistical Correlations to Our AI Models * TorrentFreak

sabreW4K3@lazysoci.al · 4 months ago


h6a@beehaw.org · 4 months ago

Why not go full data nihilist and say that every file is just a natural number expressed in binary?
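To make that concrete, here is a minimal Python sketch of the idea. The function names `bytes_to_number` and `number_to_bytes` are made up for illustration, and the leading `0x01` marker byte is just one assumed way to keep leading zero bytes from getting lost in the round trip.

```python
def bytes_to_number(data: bytes) -> int:
    """Map any byte string (i.e. any file's contents) to a natural number."""
    # Prepend a 0x01 marker byte so leading zero bytes survive the round trip.
    return int.from_bytes(b"\x01" + data, "big")


def number_to_bytes(n: int) -> bytes:
    """Invert bytes_to_number: recover the original bytes from the number."""
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Drop the 0x01 marker byte that bytes_to_number added.
    return raw[1:]


if __name__ == "__main__":
    contents = b"\x00\x00some file contents"
    n = bytes_to_number(contents)
    print(n)
    print(number_to_bytes(n) == contents)
```

With the marker byte, every distinct file gets its own distinct number, which is the whole joke: the number alone carries everything the file does.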

rickyrigatoni@lemm.ee · 4 months ago

Yeah but I legally own this particular number >:E

Daxtron2@startrek.website · 4 months ago

I’m good with that

Steve@communick.news · 4 months ago

What about copyrighted code?
Like for instance, GPU drivers?

FaceDeer@fedia.io · 4 months ago

Yes, that would also be statistical correlations to an AI model. The specific kind of information they’re being trained on doesn’t affect the underlying mechanism of model training.

dustycups@aussie.zone · 4 months ago

I mentioned it before:

If they use any GPL code for their model then any output would be a derived work and a violation of the GPL.

kibiz0r@midwest.social · 4 months ago

Aren’t MP3s just a statistical correlation?

Besides, you really don’t need to zoom in on “but muh license agreement” to roast these AI turds.

They’re very clear: We’re gonna put creatives out of work, we’re gonna sell a unified product to replace them, and we’re gonna use their own labor to build their replacements.

That’s anticompetitive.

Nail em on that instead of trying to thread the needle on reining in the tech lords without damaging e.g. linguistic analysis researchers.

𝕽𝖚𝖆𝖎𝖉𝖍𝖗𝖎𝖌𝖍@midwest.social · 4 months ago

“We’re gonna put creatives out of work, we’re gonna sell a unified product to replace them, and we’re gonna use their own labor to build their replacements.”

Yes, but: it’s short-sighted, and wrong. Until we have a sea change in the LLM/AGI space, “creatives” will be needed for seed data. LLMs that are recursively trained on their own output degrade and produce worse output over time.

The “yes” part is that companies are looking to stop paying people for their work while still hoping that Creative Commons types keep posting online for free harvesting.

Daxtron2@startrek.website · 4 months ago

The tools exist for creatives to use.

gbin@lemmy.ca · 4 months ago

Copies are just very strong statistical correlations.

CAPSpirou@lemmy.dbzer0.com · 4 months ago

These files are just correlated bits and bytes, nothing more.

Even_Adder@lemmy.dbzer0.com · 4 months ago

Damn, this article is so biased.

FaceDeer@fedia.io · 4 months ago

Seemed pretty fair and fact-based to me. What bias are you seeing?

Even_Adder@lemmy.dbzer0.com · 4 months ago

I think it’s really disingenuous to mention the DeviantArt/Midjourney/Runway AI/Stability AI lawsuit without talking about how most of the infringement claims were dismissed by the judge.