Scraping the Web Now, Asking for Permission Later

Melody Bot · Jun 14, 2024

This article has been imported from chorus.fm for discussion. All of the forum rules still apply.

Federico Viticci, writing at MacStories about Apple’s details on their AI model being trained on web content:

As a creator and website owner, I guess that these things will never sit right with me. Why should we accept that certain data sets require a licensing fee but anything that is found “on the open web” can be mindlessly scraped, parsed, and regurgitated by an AI? Web publishers (and especially indie web publishers these days, who cannot afford lawsuits or hiring law firms to strike expensive deals) deserve better.

I agree wholeheartedly. I felt similarly when I looked at the data that trained Google’s AI. I see Chorus and our forum very clearly in their training data. We didn’t agree to that. Our community never agreed to that. Google played a massive role in devaluing small and medium-sized websites (and the online ad business), and we’re certainly not going to be the ones getting any publishing deals. None of it sits well with me.



tenspeed · Jun 17, 2024

What also sucks is how much the end-users/consumers/regular folk do not give a shit. At all. And of course neither do the people piloting these AI projects.
