BusinessInsider.com
July 2024

The copyright lawsuits against OpenAI are piling up as the tech company seeks data to train its AI


OpenAI is facing a growing number of lawsuits from authors and news organizations who say the company used their copyrighted material to train ChatGPT.

  • Publishers want compensation from OpenAI for using their works to train AI models.
  • The Center for Investigative Reporting filed a lawsuit against the company this week.
  • The New York Times and other outlets also have similar lawsuits against OpenAI.

OpenAI uses any and all publicly available data to train ChatGPT, including books and articles from the internet. Now, those who own them want to be paid for their work.

Training data is an essential part of creating the AI models that are taking over the tech world. Leading tech companies like Google, Meta, OpenAI, Anthropic, and Microsoft are all scrambling to find new sources of data. Meta at one point even considered buying Simon & Schuster, one of the world's biggest publishing houses.

Part of the problem is that publishers are increasingly accusing these companies of hoovering up copyrighted data. They'd like to be paid for their work. Meta and OpenAI have argued in comments to the US Copyright Office that putting copyrighted material on the internet makes it "publicly available" and thus covered by fair use.

But OpenAI will still have to make that argument in court, as the company faces lawsuits from several groups over its use of copyrighted material.

The Center for Investigative Reporting, a news nonprofit sometimes known by its acronym, CIR, which merged with Mother Jones and Reveal earlier this year, sued OpenAI and Microsoft last week in federal court. The lawsuit accuses OpenAI of being "built on the exploitation of copyrighted works belonging to creators around the world, including CIR."

Lawyers for the CIR accused OpenAI and Microsoft of using copyrighted material from Mother Jones to train their GPT and Copilot AI models.

"OpenAI and Microsoft started vacuuming up our stories to make their product more powerful, but they never asked for permission or offered compensation, unlike other organizations that license our material," Monika Bauerlein, CEO of the Center for Investigative Reporting, said in an announcement about the lawsuit. "This free rider behavior is not only unfair, it is a violation of copyright."

The lawsuit says that "16,793 distinct URLs from Mother Jones's web domain" appeared in a published list of the top web domains present in the company's WebText training set.

In a separate class action lawsuit from the Authors Guild, two authors claimed that the company used information from their books to train ChatGPT. The New York Times also filed a similar lawsuit against the company in December 2023.

In May, court documents in the Authors Guild lawsuit revealed that OpenAI deleted two huge datasets used to train GPT-3. Lawyers for the guild said the two sets likely contained "more than 100,000 published books."

The two employees responsible for putting together the data no longer work for OpenAI, court documents say.

OpenAI has begun signing licensing agreements with news organizations to fairly use their work. The company has signed such agreements with The Associated Press, publishers of The Wall Street Journal and New York Post, The Atlantic, Prisa Media, Le Monde newspaper, Financial Times, and Business Insider parent Axel Springer.

But the scale of content required for these bots to continuously learn will require far more than a handful of licensing agreements.

One solution is synthetic data, which is artificially generated rather than collected from the real world, and can easily be generated by machine learning algorithms.

OpenAI has considered synthetic data as an option to train its models, but CEO Sam Altman has raised concerns about producing quality data.

"As long as you can get over the synthetic data event horizon, where the model is smart enough to make good synthetic data, everything will be fine," Altman said at a tech conference in May 2023. The company has also explored a process in which AI models work together — one AI system produces data, while another judges it.

OpenAI did not immediately respond to a request for comment from Business Insider.

Read the original article on Business Insider




