In dusty factories, cramped Internet cafés, and makeshift home offices around the world, millions of people sit at computers tediously labeling data.
These workers are the lifeblood of the burgeoning artificial intelligence (AI) industry. Without them, products such as ChatGPT simply would not exist. That’s because the data they label helps AI systems “learn.”
But despite the vital contribution this workforce makes to an industry that is expected to be worth US$407 billion by 2027, the people who comprise it are largely invisible and frequently exploited. Earlier this year, nearly 100 data labelers and AI workers from Kenya who do work for companies like Facebook, Scale AI, and OpenAI published an open letter to United States President Joe Biden in which they said:
“Our working conditions amount to modern day slavery.”
To ensure AI supply chains are ethical, industry and governments must urgently address this problem. But the key question is: how?
What is data labeling?
Data labeling is the process of annotating raw data — such as images, video, or text — so that AI systems can recognize patterns and make predictions.
Self-driving cars, for example, rely on labeled video footage to distinguish pedestrians from road signs. Large language models like ChatGPT rely on labeled text to understand human language.
AI models depend on these labeled datasets. Without them, AI systems would be unable to function effectively.
Tech giants like Meta, Google, OpenAI, and Microsoft outsource much of this work to data labeling factories in countries such as the Philippines, Kenya, India, Pakistan, Venezuela, and Colombia.
China is also becoming a global hub for data labeling.
Outsourcing companies that facilitate this work include Scale AI, iMerit, and Samasource. These are very large companies in their own right. For example, Scale AI, which is headquartered in California, is now worth US$14 billion.
Cutting corners
Major tech firms like Alphabet (the parent company of Google), Amazon, Microsoft, Nvidia, and Meta have poured billions into AI infrastructure, from computational power and data storage to emerging computational technologies.
Large-scale AI models can cost tens of millions of dollars to train. Once deployed, maintaining these models requires continuous investment in data labeling, refinement, and real-world testing.
But while AI investment is significant, revenues have not always met expectations. Many industries continue to view AI projects as experimental with unclear profitability paths.
In response, many companies are cutting costs, and the cuts fall on those at the very bottom of the AI supply chain, who are often highly vulnerable: data labelers.
Low wages, dangerous working conditions
One way companies involved in the AI supply chain try to reduce costs is by employing large numbers of data labelers in countries in the Global South, such as the Philippines, Venezuela, Kenya, and India. Workers in these countries face stagnating or shrinking wages.
For example, the hourly rate for AI data labelers in Venezuela ranges from 90 US cents to US$2. In comparison, the rate in the United States is between US$10 and US$25 per hour.
In the Philippines, workers labeling data for multi-billion dollar companies such as Scale AI often earn far below the minimum wage.
Some labeling providers even resort to child labor for labeling purposes.
However, there are many other labor issues within the AI supply chain.
Many data labelers work in overcrowded and dusty environments which pose a serious risk to their health. They also often work as independent contractors, lacking access to protections such as health care or compensation.
The mental toll of data labeling work is also significant, with repetitive tasks, strict deadlines, and rigid quality controls. Data labelers are also sometimes asked to read and label hate speech or other abusive language or material, which has been shown to have negative psychological effects.
Errors can lead to pay cuts or job losses. However, labelers often experience a lack of transparency on how their work is evaluated. They are often denied access to performance data, hindering their ability to improve or contest decisions.
Making AI supply chains ethical
As AI development becomes more complex and companies strive to maximize profits, the need for ethical AI supply chains is urgent.
One way companies can help ensure this is by applying a human rights-centered design, deliberation, and oversight approach to the entire AI supply chain. They must adopt fair wage policies, ensuring data labelers receive living wages that reflect the value of their contributions.
By embedding human rights into the supply chain, AI companies can foster a more ethical, sustainable industry, ensuring that both workers’ rights and corporate responsibility align with long-term success.
Governments should also create new regulations that mandate these practices, encouraging fairness and transparency. This includes transparency in performance evaluation and personal data processing, allowing workers to understand how they are assessed and to contest any inaccuracies.
Clear payment systems and recourse mechanisms will ensure workers are treated fairly. Instead of busting unions, as Scale AI did in Kenya in 2024, companies should also support the formation of digital labor unions or cooperatives. This will give workers a voice to advocate for better working conditions.
As users of AI products, we can all advocate for ethical practices by supporting companies that are transparent about their AI supply chains and commit to fair treatment of workers. Just as we reward green and fair trade producers of physical goods, we can push for change by choosing digital services or smartphone apps that adhere to human rights standards, promoting ethical brands through social media, and using our everyday spending to demand accountability from tech giants.
By making informed choices, we all can contribute to more ethical practices across the AI industry.
Ganna Pogrebna, Executive Director, AI and Cyber Futures Institute, Charles Sturt University
This article is republished from The Conversation under a Creative Commons license.