This Week in AI: Generative AI and the problem of compensating creators | TechCrunch

Keeping up with a fast-moving industry like AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

By the way, TechCrunch is planning to launch an AI newsletter soon. Stay tuned.

This week in AI, eight major U.S. newspapers owned by investment giant Alden Global Capital, including the New York Daily News, the Chicago Tribune and the Orlando Sentinel, sued OpenAI and Microsoft for copyright infringement relating to the companies’ use of generative AI. Like The New York Times in its ongoing lawsuit against OpenAI, the publications accuse OpenAI and Microsoft of scraping their IP without permission or compensation to build and commercialize generative models such as GPT-4.

“We have spent billions of dollars gathering information and reporting news at our publications, and we cannot allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense,” Frank Pine, the executive editor overseeing Alden’s newspapers, said in a statement.

Given OpenAI’s existing partnerships with publishers and its apparent unwillingness to stake its entire business model on a fair use argument, the lawsuit seems likely to end in a settlement and a licensing deal. But what about the rest of the content creators whose work is swept into model training without payment?

It seems OpenAI is thinking about that.

A recently published research paper co-authored by Boaz Barak, a scientist on OpenAI’s Superalignment team, proposes a framework to compensate copyright owners “proportionally to their contributions to the creation of AI-generated content.” How? Through cooperative game theory.

The framework evaluates the degree to which content in a training data set (text, images or some other type of data) influences what a model generates, drawing on a game theory concept known as the Shapley value. Based on that evaluation, it determines each content owner’s “rightful share” (i.e., compensation).

Let’s say you have an image-generating model trained on the artwork of four artists: John, Jacob, Jack and Jebediah. You ask it to draw a flower in Jack’s style. With the framework, you can determine how much influence each artist’s work had on the art the model generates and, thus, how much compensation each artist should receive.

There is a downside to the framework, however: it’s computationally expensive. The researchers’ workarounds rely on estimates of compensation rather than exact calculations. Would that satisfy content creators? I’m not so sure. If OpenAI ever puts the framework into practice, we’ll certainly find out.
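To make the four-artist scenario concrete, here’s a minimal sketch of an exact Shapley-value computation. This is a toy illustration, not the paper’s method: the coalition valuation below is an invented, additive function, whereas the paper has to estimate these contributions from model behavior — which is exactly the expensive part.

```python
from itertools import combinations
from math import factorial

ARTISTS = ("John", "Jacob", "Jack", "Jebediah")

# Hypothetical "worth" of a coalition: how good the model's flower-in-Jack's-style
# output is when trained only on that subset's artwork. These numbers are invented
# for illustration; Jack dominates because the prompt asked for his style.
BASE = {"John": 0.1, "Jacob": 0.1, "Jack": 0.6, "Jebediah": 0.2}

def coalition_value(coalition):
    return sum(BASE[a] for a in coalition)

def shapley_values(players, value):
    """Exact Shapley values: each player's average marginal contribution,
    computed via the standard subset-weighted formula."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(n):
            # Weight for coalitions of size r that exclude p: r!(n-r-1)!/n!
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                phi[p] += weight * (value(subset + (p,)) - value(subset))
    return phi

shares = shapley_values(ARTISTS, coalition_value)
print(shares)  # Jack receives the largest share
```

Because this toy valuation is additive, each artist’s share simply collapses to their individual contribution. With interacting contributions, the sum ranges over every possible coalition (2^(n-1) subsets per player), which is what makes exact computation intractable at training-set scale and why the paper falls back on estimates.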

Here are some other AI stories worth noting from the past few days:

  • Microsoft reaffirms facial recognition ban: Language added to the terms of service for Azure OpenAI Service, Microsoft’s fully managed wrapper around OpenAI tech, now explicitly prohibits integrations from being used “by or for” police departments in the U.S. for facial recognition.
  • The nature of AI-native startups: AI startups face a different set of challenges from your typical software-as-a-service company. That was the message from Rudina Seseri, founder and managing partner of Glasswing Ventures, at the TechCrunch Early Stage event in Boston last week; Ron has the full story.
  • Anthropic launches a business plan: AI startup Anthropic is launching a new paid plan aimed at enterprises, as well as a new iOS app. The plan, called Team, gives customers higher-priority access to Anthropic’s Claude 3 family of generative AI models, plus additional admin and user management controls.
  • CodeWhisperer no more: Amazon CodeWhisperer is now Q Developer, part of Amazon’s Q family of business-oriented generative AI chatbots. Available through AWS, Q Developer helps with some of the tasks developers do in the course of their daily work, like debugging and upgrading apps, much like CodeWhisperer did.
  • Just walk out of Sam’s Club: Walmart-owned Sam’s Club says it’s turning to AI to help speed up its “exit technology.” Instead of requiring store staff to check members’ purchases against their receipts on the way out, Sam’s Club customers who pay at a register or through the Scan & Go mobile app can now leave certain store locations without having their purchases double-checked.
  • Fish harvesting, automated: Harvesting fish is an inherently messy business. Shinkei is working to improve it with an automated system that dispatches fish more humanely and reliably, which could result in a totally different seafood economy, Davin reports.
  • Yelp’s AI assistant: Yelp this week announced a new AI-powered chatbot for consumers (powered by OpenAI models, the company says) that helps them connect with businesses relevant to their tasks, like installing lighting fixtures, upgrading outdoor spaces and so on. The company is rolling out the AI assistant under a new “Projects” tab on its iOS app, with plans to expand to Android later this year.

More Machine Learning

Image Credit: United States Department of Energy

It seems like there was quite a party at Argonne National Lab this winter, when they brought in a hundred AI and energy sector experts to talk about how the rapidly evolving tech could be helpful to the country’s infrastructure and R&D in that area. The resulting report is more or less what you’d expect from that crowd: a lot of pie in the sky, but informative nonetheless.

Looking at nuclear power, the grid, carbon management, energy storage and materials, the themes that emerged from the meeting were, first, that researchers need access to high-powered compute tools and resources; second, that they need to learn to spot the weak points of simulations and predictions (including those enabled by the first thing); and third, the need for AI tools that can integrate and make accessible data from multiple sources and in multiple formats. We’ve seen all these things happening across the industry in various ways, so it’s no big surprise, but nothing gets done at the federal level without a few officials putting it on paper, so it’s good to have it on the record.

Georgia Tech and Meta are working on part of that with a big new database called OpenDAC, a trove of reactions, materials and calculations intended to help scientists more easily design carbon capture processes. It focuses on metal-organic frameworks, a promising and popular material type for carbon capture that comes in thousands of variations, which haven’t been extensively tested.

The Georgia Tech team, together with Oak Ridge National Lab and Meta’s FAIR, used some 400 million compute hours to simulate quantum chemistry interactions on these materials, far more than a single university could easily muster. Hopefully it will be helpful to climate researchers working in the field. It’s all documented here.

We hear a lot about AI applications in the medical field, though most are in what you might call an advisory role, helping experts notice things they might not otherwise have seen, or spotting patterns that would have taken a technician hours to find. That’s partly because these machine learning models find connections between data points without understanding what caused what or what follows what. Researchers at Cambridge and Ludwig-Maximilians-Universität München are working on that, since moving beyond basic correlative relationships could be hugely helpful in creating treatment plans.

The work, led by LMU professor Stefan Feuerriegel, aims to make models that can identify causal mechanisms, not just correlations: “We give the machine rules for recognizing the causal structure and correctly formalizing the problem. Then the machine has to learn to recognize the effects of interventions and understand how real-life consequences are reflected in the data that has been fed into the computers,” he said. It’s still early days for them, and they’re aware of that, but they believe their work is part of an important decade-scale development period.

Ro Encarnación, a graduate student at the University of Pennsylvania, is working on a new angle in the “algorithmic justice” field we’ve seen pioneered (primarily by women and people of color) over the last seven or eight years. Her work focuses more on users than platforms, documenting what she calls “incidental auditing.”

When TikTok or Instagram puts out a filter that’s slightly racist, or an image generator that does something shocking, what do users do? Complain, sure, but they also keep using it, and learn how to work around or even lean into its built-in problems. This may not be a “solution” in the way we usually think of one, but it demonstrates the diversity and resilience of the user side of the equation; users aren’t as fragile or passive as you might think.