Meta admits scraping Aussie data to train AI tools
Photos and posts Australians have shared on Facebook and Instagram dating back to 2007, including images of children, are being used to train Meta’s generative AI tools, an inquiry has heard.
Meta executives confirmed the company’s use of users’ data at the Senate inquiry into Adopting Artificial Intelligence on Wednesday.
One representative said it was “incredibly helpful to have a lot of data from Australians” to develop AI technology.
The executives also revealed options allowing users to prevent their data being used, like those available in Europe, would not be extended to Australians.
The inquiry, which is expected to present a final report next week, is tasked with examining AI trends, opportunities and risks, as well as its impact on elections and the environment.
Meta privacy policy global director Melinda Claybaugh told the Senate committee the tech giant ingested content users shared on its platforms to train its generative AI tools, Llama and Meta AI, if they chose to share it publicly.
Ms Claybaugh also said Meta did not use photos posted by children to train AI but, under questioning from Labor Senator Tony Sheldon, revealed any photos of children that had been shared by adults were being used.
“I want to be very clear that we are not using data from accounts of under 18-year-olds to train our models,” she said.
“We are using public photos posted by people over 18.”
Ms Claybaugh said Australian Facebook and Instagram users could avoid having their content used to train AI by removing it from public view, but said they would not be offered an AI opt-out option available in some other nations.
“We are offering an opt-out to users in Europe, however that is not a settled legal situation,” she said.
“The solution in Europe is specific to Europe.”
Meta Asia Pacific public policy vice-president Simon Milner also defended the company’s use of Australians’ information, telling senators risks including bias could be addressed by training AI models using more local data.
“It is incredibly helpful to have a lot of data from Australians in the training models because that enables us to have output from those models, both in terms of what we produce and what others produce, which reflects the diversity of Australian society,” he said.
“Having a rich corpus of Australian data is extremely important to be able to provide good services for Australians.”
Mr Milner also revealed Meta employed a team of four people to monitor political issues around elections and referendums in Australia, but declined to comment on Meta’s alleged use of a Books3 dataset that included several pirated Australian novels.
He said the company’s 20,000-word privacy policy was onerous for users to read, but asking users to opt in to share their data would be frustrating.
“You’re trying to get that balance right all the time but a kind of compulsory opt-in at all times, it would be extremely annoying for most people across the internet,” Mr Milner said.
“We know that for a fact.”
The Senate committee, which has also heard from tech firms including Amazon, Microsoft and Google, is expected to present a final report by September 19.