NASA’s Lucy spacecraft, named for an early human ancestor whose skeleton provided insights into our species’ muddled origins, has begun the first leg of its 12-year journey.
Lifting off from Cape Canaveral early Saturday morning on an Atlas V rocket, Lucy is headed to study asteroids in an area around Jupiter that’s been relatively unchanged since the solar system formed. It will venture farther from the sun than any other solar-powered spacecraft.
“Lucy will profoundly change our understanding of planetary evolution in our solar system,” Adriana Ocampo, a Lucy program executive at NASA, said during a science media briefing held on October 14.
Lucy’s mission is to fly by one asteroid in the jam-packed area that circles the sun between Mars and Jupiter—and then continue on to the Trojans, two swarms of rocky bodies far past the asteroid belt. These asteroid swarms, which travel just ahead of and behind Jupiter as it orbits, are celestial remnants from the solar system’s earliest days.
Lucy will take black-and-white and color images, and use a diamond beam splitter to shine far-infrared light at the asteroids to take their temperature and make maps of their surface. It will also collect other measurements as it flies by. This data could help scientists understand how the planets may have formed.
Sarah Dodson-Robinson, an assistant professor of physics and astronomy at the University of Delaware, says Lucy could offer a definitive time line for not only when the planets originally formed, but where.
“If you can nail down when the Trojan asteroids formed, then you have some information about when did Jupiter form, and can start asking questions like ‘Where did Jupiter go in the solar system?’” she says. “Because it wasn’t always where it is now. It’s moved around.”
And to determine the asteroids’ ages, the spacecraft will search for surface craters that may be no bigger than a football field.
“[The Trojans] haven’t had nearly as much colliding and breaking as some of the other asteroids that are nearer to us,” says Dodson-Robinson. “We’re potentially getting a look at some of these asteroids like they were shortly after they formed.”
On its 4-billion-mile journey, Lucy will receive three gravity assists from Earth, which will involve using the planet’s gravitational force to change the spacecraft’s trajectory without depleting its resources. Coralie Adam, deputy navigation team chief for the Lucy mission, says each push will increase the spacecraft’s velocity from 200 miles per hour to over 11,000 mph.
“If not for this Earth gravity assist, it would take five times the amount of fuel—or three metric tons—to reach Lucy’s target, which would make the mission unfeasible,” said Adam during an engineering media briefing also held on October 14.
Lucy’s mission is slated to end in 2033, but some NASA officials already feel confident that the spacecraft will last far longer. “There will be a good amount of fuel left onboard,” said Adam. “After the final encounter with the binary asteroids, as long as the spacecraft is healthy, we plan to propose to NASA to do an extended mission and explore more Trojans.”
The last 20 months turned every dog into an amateur epidemiologist and statistician. Meanwhile, a group of bona fide epidemiologists and statisticians came to believe that pandemic problems might be more effectively solved by adopting the mindset of an engineer: that is, focusing on pragmatic problem-solving with an iterative, adaptive strategy to make things work.
In a recent essay, “Accounting for uncertainty during a pandemic,” the researchers reflect on their roles during a public health emergency and on how they could be better prepared for the next crisis. The answer, they write, may lie in reimagining epidemiology with more of an engineering perspective and less of a “pure science” perspective.
Epidemiological research informs public health policy and its inherently applied mandate for prevention and protection. But the right balance between pure research results and pragmatic solutions proved alarmingly elusive during the pandemic.
“I always imagined that in this kind of emergency, epidemiologists would be useful people,” Jon Zelner, a coauthor of the essay, says. “But our role has been more complex and more poorly defined than I had expected at the outset of the pandemic.” An infectious disease modeler and social epidemiologist at the University of Michigan, Zelner witnessed an “insane proliferation” of research papers, “many with very little thought about what any of it really meant in terms of having a positive impact.”
“There were a number of missed opportunities,” Zelner says—caused by missing links between the ideas and tools epidemiologists proposed and the world they were meant to help.
Giving up on certainty
Coauthor Andrew Gelman, a statistician and political scientist at Columbia University, set out “the bigger picture” in the essay’s introduction. He likened the pandemic’s outbreak of amateur epidemiologists to the way war makes every citizen into an amateur geographer and tactician: “Instead of maps with colored pins, we have charts of exposure and death counts; people on the street argue about infection fatality rates and herd immunity the way they might have debated wartime strategies and alliances in the past.”
And along with all the data and public discourse—Are masks still necessary? How long will vaccine protection last?—came the barrage of uncertainty.
In trying to understand what just happened and what went wrong, the researchers (who also included Ruth Etzioni at the University of Washington and Julien Riou at the University of Bern) conducted something of a reenactment. They examined the tools used to tackle challenges such as estimating the rate of transmission from person to person and the number of cases circulating in a population at any given time. They assessed everything from data collection (the quality of data and its interpretation were arguably the biggest challenges of the pandemic) to model design to statistical analysis, as well as communication, decision-making, and trust. “Uncertainty is present at each step,” they wrote.
And yet, Gelman says, the analysis still “doesn’t quite express enough of the confusion I went through during those early months.”
One tactic against all the uncertainty is statistics. Gelman thinks of statistics as “mathematical engineering”—methods and tools that are as much about measurement as discovery. The statistical sciences attempt to illuminate what’s going on in the world, with a spotlight on variation and uncertainty. When new evidence arrives, it should generate an iterative process that gradually refines previous knowledge and hones certainty.
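One way to picture that iterative refinement is a simple Bayesian update, sketched below. This is an illustration only, not a method taken from the essay: it assumes a Beta prior over an unknown positivity rate, and the batch counts are invented for the example. The function names are hypothetical.

```python
# Illustrative sketch: Beta-Binomial updating of an unknown rate.
# The prior and the batch counts are assumptions made up for this example.

def update(prior_a, prior_b, positives, negatives):
    """Fold one batch of evidence into the Beta(a, b) belief."""
    return prior_a + positives, prior_b + negatives


def mean_and_variance(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


belief = (1, 1)                      # flat prior: no idea what the rate is
batches = [(3, 17), (12, 88)]        # (positives, negatives), hypothetical

history = [mean_and_variance(*belief)]
for positives, negatives in batches:
    belief = update(*belief, positives, negatives)
    history.append(mean_and_variance(*belief))

for mean, variance in history:
    print(f"estimate={mean:.3f}  variance={variance:.5f}")
```

Each batch shifts the estimate and shrinks the variance, but the variance never reaches zero: the picture sharpens without ever becoming certain.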
Susan Holmes, a statistician at Stanford who was not involved in this research, also sees parallels with the engineering mindset. “An engineer is always updating their picture,” she says—revising as new data and tools become available. In tackling a problem, an engineer offers a first-order approximation (blurry), then a second-order approximation (more focused), and so on.
Gelman, however, has previously warned that statistical science can be deployed as a machine for “laundering uncertainty”—deliberately or not, crappy (uncertain) data are rolled together and made to seem convincing (certain). Statistics wielded against uncertainties “are all too often sold as a sort of alchemy that will transform these uncertainties into certainty.”
We witnessed this during the pandemic. Drowning in upheaval and unknowns, epidemiologists and statisticians—amateur and expert alike—grasped for something solid as they tried to stay afloat. But as Gelman points out, wanting certainty during a pandemic is inappropriate and unrealistic. “Premature certainty has been part of the challenge of decisions in the pandemic,” he says. “This jumping around between uncertainty and certainty has caused a lot of problems.”
Letting go of the desire for certainty can be liberating, he says. And this, in part, is where the engineering perspective comes in.
A tinkering mindset
For Seth Guikema, co-director of the Center for Risk Analysis and Informed Decision Engineering at the University of Michigan (and a collaborator of Zelner’s on other projects), a key aspect of the engineering approach is diving into the uncertainty, analyzing the mess, and then taking a step back, with the perspective “We have to make practical decisions, so how much does the uncertainty really matter?” Because if there’s a lot of uncertainty—and if the uncertainty changes what the optimal decisions are, or even what the good decisions are—then that’s important to know, says Guikema. “But if it doesn’t really affect what my best decisions are, then it’s less critical.”
For instance, increasing SARS-CoV-2 vaccination coverage across the population is one scenario in which even if there is some uncertainty regarding exactly how many cases or deaths vaccination will prevent, the fact that it is highly likely to decrease both, with few adverse effects, is motivation enough to decide that a large-scale vaccination program is a good idea.
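Guikema’s question can be sketched as a small sensitivity check: run the decision across the plausible range of an uncertain quantity and see whether the best choice ever changes. Everything in the sketch below is hypothetical, including the function names, the cost figure, and the effect sizes. It is a toy illustration, not an epidemiological model.

```python
# Toy sensitivity check. All numbers are invented for illustration.

def net_benefit(action, cases_prevented_per_1000):
    """Net benefit of an action under one assumed effect size (arbitrary units)."""
    if action == "vaccinate":
        program_cost = 5              # assumed cost of running the program
        return cases_prevented_per_1000 - program_cost
    return 0                          # "do nothing": no cost, no cases prevented


def best_action(cases_prevented_per_1000):
    """Pick the action with the highest net benefit for one scenario."""
    actions = ["vaccinate", "do nothing"]
    return max(actions, key=lambda a: net_benefit(a, cases_prevented_per_1000))


def uncertainty_matters(scenarios):
    """True if the best action changes somewhere across the plausible range."""
    return len({best_action(s) for s in scenarios}) > 1


plausible_effects = [20, 50, 120]     # wide range, but every value favors vaccinating
print(best_action(plausible_effects[0]), uncertainty_matters(plausible_effects))
```

Here the estimate of cases prevented varies sixfold, yet the decision is the same in every scenario, so the uncertainty is less critical. If the plausible range dipped below the assumed cost, the check would flag that the uncertainty needs attention.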
Engineers, Holmes points out, are also very good at breaking problems down into critical pieces, applying carefully selected tools, and optimizing for solutions under constraints. With a team of engineers building a bridge, there is a specialist in cement and a specialist in steel, a wind engineer and a structural engineer. “All the different specialties work together,” she says.
For Zelner, the notion of epidemiology as an engineering discipline is something he picked up from his father, a mechanical engineer who started his own company designing health-care facilities. Drawing on a childhood full of building and fixing things, his engineering mindset involves tinkering—refining a transmission model, for instance, in response to a moving target.
“Often these problems require iterative solutions, where you’re making changes in response to what does or doesn’t work,” he says. “You continue to update what you’re doing as more data comes in and you see the successes and failures of your approach. To me, that’s very different—and better suited to the complex, non-stationary problems that define public health—than the kind of static one-and-done image a lot of people have of academic science, where you have a big idea, test it, and your result is preserved in amber for all time.”
Zelner and collaborators at the university spent many months building a covid mapping website for Michigan, and he was involved in creating data dashboards—useful tools for public consumption. But in the process, he saw a growing mismatch between the formal tools and what was needed to inform practical decision-making in a rapidly evolving crisis. “We knew a pandemic would happen one day, but I certainly had not given any thought to what my role would be, or could be,” he says. “We spent several agonizing months just inventing the thing—trying to do this thing we’d never done before and realizing that we had no expertise in doing it.”
He envisions research results that come not only with exhortations that “People should do this!” but also with accessible software allowing others to tinker with the tools. But for the most part, he says, epidemiologists do research, not development: “We write software, and it’s usually pretty bad, but it gets the job done. And then we write the paper, and then it’s up to somebody else—some imagined other person—to make it useful in the broader context. And then that never happens. We’ve seen these failures in the context of the pandemic.”
He imagines the equivalent of a national weather forecasting center for infectious disease. “There’s a world in which all the covid numbers go to one central place,” he says. “Where there is a model that is able to coherently combine that information, generate predictions accompanied by pretty accurate depictions of the uncertainty, and say something intelligible and relatively actionable in a fairly tight time line.”
At the beginning of the pandemic, that infrastructure didn’t exist. But recently, there have been signs of progress.
Fast-moving public health science
Marc Lipsitch, an infectious disease epidemiologist at Harvard, is the director of science at the US Centers for Disease Control’s new Center for Forecasting and Outbreak Analytics, which aims to improve decision-making and enable a coordinated, coherent response to a pandemic as it unfolds.
“We’re not very good at forecasting for infectious diseases right now. In fact, we are quite bad at it,” Lipsitch says. But we were quite bad at weather forecasting when it started in the ’50s, he notes. “And then technology improved, methodology improved, measurement improved, computation improved. With investment of time and scientific effort, we can get better at things.”
Getting better at forecasting is part of the center’s vision for innovation. Another goal is the capability to do specific studies to answer specific questions that arise during a pandemic, and then to produce custom-designed analytics software to inform timely responses on the national and local levels.
These efforts are in sync with the notion of an engineering approach—although Lipsitch would call it simply “fast-moving public health science.”
“Good science is humble and capable of refining itself in the face of uncertainty,” he says. “Scientists, usually over a longer time scale—years or decades—are quite used to the idea of updating our picture of truth.” But during a crisis, the updating needs to happen fast. “Outside of pandemics, scientists are not used to vastly changing our picture of the world each week or month,” he says. “But in this pandemic especially, with the speed of new developments and new information, we are having to do so.”
The philosophy of the new center, Lipsitch says, “is to improve decision-making under uncertainty, by reducing that uncertainty with better analyses and better data, but also by acknowledging what is not known, and communicating that and its consequences clearly.”
And he notes, “We’re gonna need a lot of engineers to make this function—and the engineering approach, for sure.”
Used optimally, data is nothing less than a critically important asset. Problem is, it’s not always easy to put data to work. The Seagate Rethink Data report, with research and analysis by IDC, found that only 32% of the data available to enterprises is ever used and the remaining 68% goes unleveraged. Executives aren’t fully confident in their current ability—nor in their long-range plans—to wring optimal levels of value out of the data they produce, acquire, manage, and use.
What’s the disconnect? If data is so important to a business’s health, why is it so hard to master?
In the best-run companies, the systems that connect data producers and data consumers are secure and easy to deploy. But they’re usually not. Companies are challenged with finding data and leveraging it for strategic purposes. Sources of data are hard to identify and even harder to evaluate. Datasets used to train AI models for the automation of tasks can be hard to validate. Hackers are always looking to steal or compromise data. And finding quality data is a challenge for even the savviest data scientists.
Communication gaps can also derail the process of delivering impactful insights. Executives who fund data projects and the data engineers and scientists who carry them out don’t always understand one another. These data practitioners can create a detailed plan, but if the practitioner doesn’t frame the results properly, the business executive who requested them may say they were looking for something different. The project will be labeled a failure, and the chance to generate value out of the effort will fall by the wayside.
Companies encounter data issues, no matter where they are in terms of data maturity. They’re trying to figure out ways to make data an important part of their future, but they’re struggling to put plans into practice.
If you’re in this position, what do you do?
Companies found themselves at a similar inflection point back in the 2010s, trying to sort out their places in the cloud. They took years developing their cloud strategies, planning their cloud migrations, choosing platforms, creating Cloud Business Offices, and structuring their organizations to best take advantage of cloud-based opportunities. Today, they’re reaping the benefits: Their moves to the cloud have enabled them to modernize their apps and IT systems.
Enterprises now have to make similar decisions about data. They need to consider many factors to make sure data is providing a foundation for their business going forward. They should ask questions such as:
Is the data the business needs readily available?
What types of sources of data are needed? Are there distributed and diverse sets of data you don’t know about?
Is the data clean, current, reliable, and able to integrate with existing systems?
Is the rest of the C-level on board with the chief data officer’s approach?
Are data scientists and end users communicating effectively about what’s needed and what’s being delivered?
How is data being shared?
How can I trust my data?
Does every person and organization that needs access to the data have the right to use it?
This is about more than just business intelligence. It’s about taking advantage of an opportunity that’s taking shape. Data use is exploding, tools to leverage it are becoming more efficient, and data scientists’ expertise is growing. But data is hard to master. Many companies aren’t set up to make the best use of the data they have at hand. Enterprises need to make investments in the people, processes, and technologies that will drive their data strategies.
With all of this in mind, here are 10 principles companies should follow when developing their data strategies:
1. Understand how valuable your data really is
How much is your data worth to you? This can be measured in a number of ways. There are traditional metrics to consider, such as the costs of acquiring the data, the cost to store and transmit it, the uniqueness of the data being acquired, and the opportunity to use it to generate additional revenue. Marketplace metrics, such as data quality, age of the data, and popularity of a data product, also affect the value of the data.
Your data could also be valuable to others. For example, suppose a hospital collects patient datasets. That data could be of interest to disease researchers, drug manufacturers, insurance companies, and other potential buyers. Is there a mechanism in place to anonymize, aggregate, and control your data, and to identify its potential users?
Opportunity, balanced by the cost it takes to deliver on it, is one way to determine the potential value of your data.
2. Determine what makes data valuable
While it may be hard to put an actual dollar value on your data, it’s easier to define the elements that contribute to data having a high degree of value. It can be reduced to a simple thought equation:
Completeness + Validity = Quality
Quality + Format = Usability
Usable Data + A Data Practitioner Who Uses it Well = VALUE
Your data project can’t proceed without good data. Is the quality of your data high enough to be worthwhile? That will depend, in part, on how complete the sample is that you’ve collected. Are data fields missing? Quality also depends on how valid the information is. Was it collected from a reliable source? Is the data current, or has time degraded its validity? Do you collect and store your data in accordance with industry and sector ontologies and standards?
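The completeness question (“Are data fields missing?”) is one of the few here that can be answered mechanically. A minimal sketch follows, assuming records arrive as dictionaries and that an empty string or missing value counts as absent. The sample records and field names are hypothetical.

```python
# Minimal completeness check. Records and field names are made up.

def completeness(records, required_fields):
    """Fraction of required fields that are present and non-empty."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


records = [
    {"id": 1, "age": 54, "collected": "2021-09-01"},
    {"id": 2, "age": None, "collected": "2021-09-03"},   # missing age
    {"id": 3, "age": 41, "collected": ""},               # missing date
]
score = completeness(records, ["id", "age", "collected"])
print(f"completeness: {score:.0%}")
```

A score like this covers only one term of the thought equation. Validity (reliable source, current data) still has to be judged separately.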
Your data has to be usable for it to be worthy of investment. Setting up systems for data practitioners to use and analyze the data well and connect it with business leaders who can leverage the insights closes the loop.
3. Establish where you are on your data journey
Positioning a business to take full advantage of cloud computing is a journey. The same thinking should apply to data.
The decisions companies make about their data strategies depend largely on where they happen to be on their data journeys. How far along are you on your data journey? Assessment tools and blueprints can help companies pinpoint their positions. Assessments should go beyond identifying which tools are in a company’s technology stack. They should look at how data is treated across an organization in many ways, taking into account governance, lifecycle management, security, ingestion and processing, data architectures, consumption and distribution, data knowledge, and data monetization.
Consumption and distribution alone can be measured in terms of an organization’s ability to apply services ranging from business intelligence to streaming data to self-service applications of data analytics. Has the company implemented support for data usage by individual personas? Is it supporting individual APIs? Looking at data knowledge as a category, how advanced are the company’s data dictionaries, business glossaries, catalogs, and master data management plans?
Scoring each set of capabilities reveals a company’s strengths and weaknesses in terms of data preparedness. Until the company takes a closer look, it may not realize how near or far it is from where it needs or wants to be.
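As a rough sketch of what such scoring might look like, the example below rates each capability area named above on a 0-to-5 scale and lists the areas that fall short of a target. The scale, the target, and the scores themselves are assumptions for illustration and do not come from any particular assessment tool.

```python
# Capability areas come from the text; the 0-5 scores are hypothetical.
scores = {
    "governance": 3,
    "lifecycle management": 2,
    "security": 4,
    "ingestion and processing": 3,
    "data architectures": 2,
    "consumption and distribution": 1,
    "data knowledge": 2,
    "data monetization": 0,
}


def summarize(scores, target=3):
    """Return the average score and the areas below target, weakest first."""
    average = sum(scores.values()) / len(scores)
    gaps = sorted(
        (area for area, score in scores.items() if score < target),
        key=lambda area: scores[area],
    )
    return average, gaps


average, gaps = summarize(scores)
print(f"average maturity: {average:.2f}")
print("weakest areas first:", gaps)
```

The average gives a single headline number, while the ordered gap list shows where investment would move the company closest to its target.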
4. Learn to deal with data from various sources
Data is coming into organizations from all directions: from inside the company, from IoT devices and video surveillance systems at the edge, and from partners, customers, social media, and the web. The hundreds of zettabytes of worldwide data will have to be selectively managed, protected, and optimized for convenient, productive use.
This is a challenge for enterprises that haven’t developed systems for data collection and data governance. Wherever the data comes from, there needs to be a mechanism for standardizing it so that the data will be usable for a greater benefit.
Different companies and different countries impose different rules on what and how information can be shared. Even individual departments within the same company can run afoul of corporate governance rules designating the paths certain datasets have to follow. That means enforcing data access and distribution policies. To seize these data opportunities, companies need to engineer pathways to discover new datasets and impose governance rules to manage them.
In manufacturing, companies on a supply chain line measure the quality of their parts and suppliers. Often, the machinery and the robotics they’re using are owned by the suppliers. Suppliers may want to set up contracts to see who has the right to use data to protect their own business interests, and manufacturers should define their data sharing requirements with their partners and suppliers up front.
5. Get a strategic commitment from the C-suite
Data benefits many levels of an organization, and personas at each of the affected levels will lobby for a particular aspect of the data value process. Data scientists want more high-powered, easy-to-use technology. Line-of-business leaders push for better, faster insights. At the top of the pyramid is the C-suite, which prioritizes the channeling of data into business value.
It’s critical to get C-level executives on board with a holistic data strategy. Doing it right, after all, can be disruptive. Extracting maximum value from data requires an organization to hire staff with new skill sets, realign its culture, reengineer old processes, and rearchitect the old data platform. It’s a transformation project that can’t be done without getting buy-in from the top levels of a company.
The C-suite is increasingly open to expanding organizations’ use of data. Next to customer engagement, the second highest strategic area of interest at the board level is leveraging data and improving decision-making to remain competitive and exploit changing market conditions, according to the IDC report “Market Analysis Perspective: Worldwide Data Integration and Intelligence Software, 2021.” In the same report, 83% of executives articulated the need to be more data driven than before the pandemic.
How should organizations ensure that the C-suite gets on board? If you’re a stakeholder without a C-level title, your job is to work with your peers to find an executive sponsor to carry the message to leaders who control the decision-making process. Data is a strategic asset that will determine a company’s success in the long run, but it won’t happen without endorsements at the highest levels.
6. In data we trust: Ensure your data is beyond reproach
As AI expands into almost every aspect of modern life, the risks of corrupt or faulty AI practices increase exponentially. This comes down to the quality of the data being used to train the AI models. How was the data produced? Was it based on a faulty sensor? Was bias built into the dataset at its origin? Was the data drawn from a single location instead of a statistically valid sample?
Trustworthy AI depends on having trustworthy data that can be used to build transparent, trustworthy, unbiased, and robust models. If you know how a model is trained and you suspect you’re getting faulty results, you can stop the process and retrain the model. Or, if someone questions the model, you can go back and explain why a particular decision was made, but you need to have clean, validated data to reference.
Governments are often asked by policy watchdogs to support how they’re using AI and to prove that their analyses are not built on biased data. The validity of the algorithms used has sparked debates about efforts to rely on machine learning to guide sentencing decisions and make decisions about welfare benefit claims or other government activities.
The training of the model takes place in steps. You build a model based on data. Then you test the model and gather additional data to retest it. If it passes, you turn it into a more robust production model. The journey continues by adding more data, massaging it, and establishing over time if your model stands up to scrutiny.
The lack of an end-to-end system for ensuring high-quality data and sharing it efficiently has indirectly delayed the adoption of AI. According to IDC, 52% of survey respondents believe that data quality, quantity, and access challenges are holding up AI deployments.
7. Seize upon the metadata opportunity
Metadata is defined elliptically as “data that provides information about other data.” It’s what gives data the context that users need to understand a piece of information’s characteristics, so they can determine what to do with it in the future.
Metadata standards are commonly used for niche purposes, specific industry applications like astronomical catalogs, or data types like XML files. But there’s also a case to be made for a stronger metadata framework where we can not only define data in common ways but also tag useful data artifacts along its journey. Where did this piece of data originate? Who has viewed it? Who has used it? What has it been used for? Who has added what piece of the dataset? Has the data been verified? Is it prohibited from use in certain situations?
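To make those questions concrete, here is a minimal sketch of a metadata record that travels with a dataset and logs who touched it and why. The class, its field names, and the sample values are hypothetical. This is not an existing metadata standard or any vendor’s format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names are illustrative, not a standard."""
    origin: str                                   # where the data originated
    verified: bool = False                        # has the data been verified?
    prohibited_uses: List[str] = field(default_factory=list)
    events: List[dict] = field(default_factory=list)

    def record(self, actor, action, purpose=""):
        """Append one lineage event: who did what, and for what purpose."""
        self.events.append({"actor": actor, "action": action, "purpose": purpose})

    def who(self, action):
        """List the actors who performed a given action, in order."""
        return [e["actor"] for e in self.events if e["action"] == action]

    def allowed_for(self, purpose):
        """Check whether a proposed use is prohibited for this dataset."""
        return purpose not in self.prohibited_uses


meta = DatasetMetadata(origin="sensor-17", prohibited_uses=["marketing"])
meta.record("alice", "viewed")
meta.record("bob", "used", purpose="quality audit")

print(meta.who("used"), meta.allowed_for("marketing"))
```

Because every viewer and user appends to the same record, the answers to “Who has viewed it?” and “What has it been used for?” accumulate along the data’s journey rather than being reconstructed afterward.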
Developing this kind of metadata mechanism requires a technology layer that is open to contributions from those viewing and touching a particular piece of data. It also requires a commitment from broad sets of stakeholders who see the value of being able to share data strategically and transparently.
Creating an additional open metadata layer would be an important step toward allowing the democratization of access to the data by enabling the transparent sharing of key data attributes necessary for access, governance, trust, and lineage. Hewlett Packard Enterprise’s approach to dataspaces is to open up a universal metadata standard that would remove the current complexities associated with sharing diverse datasets.
8. Embrace the importance of culture
Organizations want to make sure they’re getting the most out of the resources they’re nourishing—and to do that, they need to create cultures that promote best practices for information sharing.
Do you have silos? Are there cultural barriers inside your organization that get in the way of the proper dissemination of information to the right sources at the right times? Do different departments feel they own their data and don’t have to share it with others in the organization? Are individuals hoarding valuable data? Have you set up channels and procedures that promote frictionless data sharing? Have you democratized access to data, giving business stakeholders the ability to not only request data but participate in querying and sharing practices?
If any of these factors are blocking the free flow of data exchange, your organization needs to undergo a change management assessment focusing on its needs across people, processes, and technology.
9. Open things up, but trust no one
In all aspects of business, organizations balance the often conflicting concepts of promoting free and open sharing of resources and tightly controlled security. Achieving this balance is particularly important when dealing with data.
Data needs to be shared, but many data producers are uncomfortable doing so because they fear the loss of control and how their data could be used against them, or how their data could be changed or used inappropriately.
Security needs to be a top priority. Data is coming from so many sources—some you control, some you don’t—and being passed through so many hands. That means that security policies surrounding data need to be designed with a zero-trust model through every step of the process. Trust has to be established through the entire stack, from your infrastructure and operating systems to the workloads that sit on top of those systems, all the way down to the silicon level.
10. Create a fully functioning data services pipeline
Moving data among systems requires many steps, including moving data to the cloud, reformatting it, and joining it with other data sources. Each of these steps usually requires separate software.
Automating data pipelines is a critical best practice in the data journey. A fully automated data pipeline allows organizations to extract data at the source, transform it into a usable form, and integrate it with other sources.
The data pipeline is the sum of all these steps, and its job is to ensure that these steps happen reliably to all data. These processes should be automated, but most organizations need at least one or two engineers to maintain the systems, repair failures, and update according to the changing needs of the business.
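The extract, transform, and integrate steps described above can be sketched as three small functions chained together. This is a toy, in-memory illustration with hypothetical function names and sample data. A production pipeline would add scheduling, error handling, and monitoring.

```python
# Toy pipeline: extract -> transform -> integrate. All data is made up.

def extract(source):
    """Pull raw rows from a source (here, any iterable of dicts)."""
    return list(source)


def transform(rows):
    """Normalize keys to lowercase and strip stray whitespace from keys and values."""
    return [
        {key.strip().lower(): str(value).strip() for key, value in row.items()}
        for row in rows
    ]


def integrate(rows, lookup, key):
    """Join each row with matching attributes from a second source."""
    return [{**row, **lookup.get(row[key], {})} for row in rows]


def run_pipeline(source, lookup, key):
    """Run all three steps in order so every row gets the same treatment."""
    return integrate(transform(extract(source)), lookup, key)


raw_rows = [{" ID ": " 7 ", "Name": "pump "}]      # messy source data
site_lookup = {"7": {"site": "plant-a"}}           # second data source

result = run_pipeline(raw_rows, site_lookup, key="id")
print(result)
```

Wrapping the steps in a single `run_pipeline` call is the point of automation: the same sequence runs reliably on all data, and engineers maintain one path instead of several hand-run tools.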
Begin the data journey today
How well companies leverage their data—wherever it lives—will determine their success in the years to come. Constellation Research projects 90% of the current Fortune 500 will be merged, acquired, or bankrupt by 2050. If they don’t start now, they’ll be left behind. The clock is ticking.
We take it for granted that machines can recognize what they see in photos and videos. That ability rests on large data sets like ImageNet, a hand-curated collection of millions of photos used to train most of the best image-recognition models of the last decade.
But the images in these data sets portray a world of curated objects—a picture gallery that doesn’t capture the mess of everyday life as humans experience it. Getting machines to see things as we do will take a wholly new approach. And Facebook’s AI lab wants to take the lead.
It is kick-starting a project, called Ego4D, to build AIs that can understand scenes and activities viewed from a first-person perspective—how things look to the people involved, rather than to an onlooker. Think motion-blurred GoPro footage taken in the thick of the action, instead of well-framed scenes taken by someone on the sidelines. Facebook wants Ego4D to do for first-person video what ImageNet did for photos.
For the last two years, Facebook AI Research (FAIR) has worked with 13 universities around the world to assemble the largest ever data set of first-person video—specifically to train deep-learning image-recognition models. AIs trained on the data set will be better at controlling robots that interact with people, or interpreting images from smart glasses. “Machines will be able to help us in our daily lives only if they really understand the world through our eyes,” says Kristen Grauman at FAIR, who leads the project.
Such tech could support people who need assistance around the home, or guide people in tasks they are learning to complete. “The video in this data set is much closer to how humans observe the world,” says Michael Ryoo, a computer vision researcher at Google Brain and Stony Brook University in New York, who is not involved in Ego4D.
The business model of Facebook, and other Big Tech companies, is to wring as much data as possible from people’s online behavior and sell it to advertisers. The AI outlined in the project could extend that reach to people’s everyday offline behavior, revealing what objects are around your home, what activities you enjoyed, who you spent time with, and even where your gaze lingered—an unprecedented degree of personal information.
“There’s work on privacy that needs to be done as you take this out of the world of exploratory research and into something that’s a product,” says Grauman. “That work could even be inspired by this project.”
The biggest previous data set of first-person video consists of 100 hours of footage of people in the kitchen. The Ego4D data set consists of 3,025 hours of video recorded by 855 people in 73 different locations across nine countries (US, UK, India, Japan, Italy, Singapore, Saudi Arabia, Colombia, and Rwanda).
The participants had different ages and backgrounds; some were recruited for their visually interesting occupations, such as bakers, mechanics, carpenters, and landscapers.
Previous data sets typically consisted of semi-scripted video clips only a few seconds long. For Ego4D, participants wore head-mounted cameras for up to 10 hours at a time and captured first-person video of unscripted daily activities, including walking along a street, reading, doing laundry, shopping, playing with pets, playing board games, and interacting with other people. Some of the footage also includes audio, data about where the participants’ gaze was focused, and multiple perspectives on the same scene. It’s the first data set of its kind, says Ryoo.
FAIR has also launched a set of challenges that it hopes will focus other researchers’ efforts on developing this kind of AI. The team anticipates algorithms built into smart glasses, like Facebook’s recently announced Ray-Bans, that record and log the wearers’ day-to-day lives. It means that augmented- or virtual-reality “metaverse” apps could, in theory, answer questions like “Where are my car keys?” or “What did I eat and who did I sit next to on my first flight to France?” Augmented-reality assistants could understand what you’re trying to do and offer instructions or useful social cues.
It’s sci-fi stuff, but closer than you think, says Grauman. Large data sets accelerate the research. “ImageNet drove some big advances in a short time,” she says. “We can expect the same for Ego4D, but for first-person views of the world instead of internet images.”
Once the footage was collected, crowdsourced workers in Rwanda spent a total of 250,000 hours watching the thousands of video clips and writing millions of sentences that describe the scenes and activities filmed. These annotations will be used to train AIs to understand what they are watching.
Where this tech ends up and how quickly it develops remain to be seen. FAIR is planning a competition based on its challenges in June 2022. It is also important to note that FAIR, the research lab, is not the same as Facebook, the media megalodon. In fact, insiders say that Facebook has ignored technical fixes that FAIR has come up with for its toxic algorithms. But Facebook is paying for the research, and it is disingenuous to pretend the company is not very interested in its application.
Sam Gregory at Witness, a human rights organization that specializes in video technology, says this technology could be useful for bystanders documenting protests or police abuse. But he thinks those benefits are outweighed by concerns around commercial applications. He notes that it is possible to identify individuals from how they hold a video camera. Gaze data would be even more revealing: “It’s a very strong indicator of interest,” he says. “How will gaze data be stored? Who will it be accessible to? How might it be processed and used?”
“Facebook’s reputation and core business model ring a lot of alarm bells,” says Rory Mir at the Electronic Frontier Foundation. “At this point many are aware of Facebook’s poor track record on privacy, and their use of surveillance to influence users—both to keep users hooked and to sell that influence to their paying customers, the advertisers.” When it comes to augmented and virtual reality, Facebook is seeking a competitive advantage, says Mir: “Expanding the amount and types of data it collects is essential.”
When asked about its plans, Facebook was unsurprisingly tight-lipped: “Ego4D is purely research to promote advances in the broader scientific community,” says a spokesperson. “We don’t have anything to share today about product applications or commercial use.”
A warning: Conspiracy theories about covid are helping disseminate anti-Semitic beliefs to a wider audience, warns a new report by the antiracist advocacy group Hope not Hate. The report says that not only has the pandemic revived interest in the “New World Order” conspiracy theory of a secret Jewish-run elite that aims to run the world, but far-right activists have also worked to convert people’s anti-lockdown and anti-vaccine beliefs into active anti-Semitism.
Worst offenders: The authors easily managed to find anti-Semitism on all nine platforms they investigated, including TikTok, Instagram, Twitter, and YouTube. Some of it uses coded language to avoid detection and moderation by algorithms, but much of it is overt and easily discoverable. Unsurprisingly, the authors found a close link between the amount of anti-Semitism on a platform and how lightly or loosely it is moderated: the laxer the moderation, the bigger the problem.
Some specifics: The report warns that the messaging app Telegram has rapidly become one of the worst offenders, playing host to many channels that disseminate anti-Semitic content, some of them boasting tens of thousands of members. One channel that promotes the New World Order conspiracy theory has gained 90,000 followers since its inception in February 2021. However, it’s a problem on every platform. Jewish creators on TikTok have complained that they face a deluge of anti-Semitism on the platform, and they are often targeted by groups who mass-report their accounts in order to get them temporarily banned.
A case study: The authors point to one man who was radicalized during the pandemic as a typical example of how people can end up pushed into adopting more and more extreme views. At the start of 2020 Attila Hildmann was a successful vegan chef in Germany, but in the space of just a year he went from being ostensibly apolitical to “just asking some questions” as a social media influencer to spewing hate and inciting violence on his own Telegram channel.
What can be done: Many of the platforms investigated have had well over a decade to get a handle on regulating and moderating hate speech, and some progress has been made. However, while major platforms have become better at removing anti-Semitic organizations, they’re still struggling to remove anti-Semitic content produced by individuals, the report warns.
Welcome to I Was There When, a new oral history project from the In Machines We Trust podcast. It features stories of how breakthroughs in artificial intelligence and computing happened, as told by the people who witnessed them. In this first episode, we meet Joseph Atick, who helped create the first commercially viable face recognition system.
This episode was produced by Jennifer Strong, Anthony Green and Emma Cillekens with help from Lindsay Muscato. It’s edited by Michael Reilly and Mat Honan. It’s mixed by Garret Lang, with sound design and music by Jacob Gorski.
Jennifer: I’m Jennifer Strong, host of In Machines We Trust.
I want to tell you about something we’ve been working on for a little while behind the scenes here.
It’s called I Was There When.
It’s an oral history project featuring the stories of how breakthroughs in artificial intelligence and computing happened… as told by the people who witnessed them.
Joseph Atick: And as I entered the room, it spotted my face, extracted it from the background and it pronounced: “I see Joseph” and that was the moment where the hair on the back… I felt like something had happened. We were a witness.
Jennifer: We’re kicking things off with a man who helped create the first facial recognition system that was commercially viable… back in the ‘90s…
I am Joseph Atick. Today, I’m the executive chairman of ID for Africa, a humanitarian organization that focuses on giving people in Africa a digital identity so they can access services and exercise their rights. But I have not always been in the humanitarian field. After I received my PhD in mathematics, I, together with my collaborators, made some fundamental breakthroughs, which led to the first commercially viable face recognition. That’s why people refer to me as a founding father of face recognition and the biometric industry. The algorithm for how a human brain would recognize familiar faces became clear while we were doing research, mathematical research, while I was at the Institute for Advanced Study in Princeton. But it was far from having an idea of how you would implement such a thing.
It was a long period of months of programming and failure and programming and failure. And one night, early morning, actually, we had just finalized a version of the algorithm. We submitted the source code for compilation in order to get a run code. And we stepped out, I stepped out to go to the washroom. And then when I stepped back into the room, the source code had been compiled by the machine and had returned. And usually after you compile, it runs it automatically, and as I entered the room, it spotted a human moving into the room and it spotted my face, extracted it from the background and it pronounced: “I see Joseph.” And that was the moment where the hair on the back… I felt like something had happened. We were a witness. And I started to call on the other people who were still in the lab, and each one of them would come into the room.
And it would say, “I see Norman. I would see Paul, I would see Joseph.” And we would sort of take turns running around the room just to see how many it can spot in the room. It was, it was a moment of truth where I would say several years of work finally led to a breakthrough, even though theoretically, there wasn’t any additional breakthrough required. Just the fact that we figured out how to implement it and finally saw that capability in action was very, very rewarding and satisfying. We had developed a team which is more of a development team, not a research team, which was focused on putting all of those capabilities into a PC platform. And that was the birth, really the birth of commercial face recognition, I would put it, in 1994.
My concern started very quickly. I saw a future where there was no place to hide with the proliferation of cameras everywhere and the commoditization of computers and the processing abilities of computers becoming better and better. And so in 1998, I lobbied the industry and I said, we need to put together principles for responsible use. And I felt good for a while, because I felt we have gotten it right. I felt we’ve put in place a responsible use code to be followed by whatever is the implementation. However, that code did not live the test of time. And the reason behind it is we did not anticipate the emergence of social media. Basically, at the time when we established the code in 1998, we said the most important element in a face recognition system was the tagged database of known people. We said, if I’m not in the database, the system will be blind.
And it was difficult to build the database. At most we could build a thousand, 10,000, 15,000, 20,000, because each image had to be scanned and had to be entered by hand. In the world that we live in today, we are now in a regime where we have allowed the beast out of the bag by feeding it billions of faces and helping it by tagging ourselves. We are now in a world where any hope of controlling and requiring everybody to be responsible in their use of face recognition is difficult. And at the same time, there is no shortage of known faces on the internet because you can just scrape, as has happened recently by some companies. And so I began to panic in 2011, and I wrote an op-ed article saying it is time to press the panic button because the world is heading in a direction where face recognition is going to be omnipresent and faces are going to be everywhere available in databases.
And at the time people said I was an alarmist, but today they’re realizing that it’s exactly what’s happening today. And so where do we go from here? I’ve been lobbying for legislation. I’ve been lobbying for legal frameworks that make it a liability for you to use somebody’s face without their consent. And so it’s no longer a technological issue. We cannot contain this powerful technology through technological means. There has to be some sort of legal frameworks. We cannot allow the technology to go too much ahead of us. Ahead of our values, ahead of what we think is acceptable.
The issue of consent continues to be one of the most difficult and challenging matters when it comes to technology. Just giving somebody notice does not mean that it’s enough. To me consent has to be informed. They have to understand the consequences of what it means. And not just to say, well, we put a sign up and this was enough. We told people, and if they did not want to, they could have gone anywhere.
And I also find that there is, it is so easy to get seduced by flashy technological features that might give us a short-term advantage in our lives. And then down the line, we recognize that we’ve given up something that was too precious. And by that point in time, we have desensitized the population and we get to a point where we cannot pull back. That’s what I’m worried about. I’m worried about the fact that face recognition through the work of Facebook and Apple and others. I’m not saying all of it is illegitimate. A lot of it is legitimate.
We’ve arrived at a point where the general public may have become blasé and may become desensitized because they see it everywhere. And maybe in 20 years, you step out of your house and you will no longer have the expectation that you will not be recognized by dozens of people you cross along the way. I think at that point in time the public will be very alarmed, because the media will start reporting on cases where people were stalked, people were targeted, people were even selected based on their net worth in the street and kidnapped. I think that’s a lot of responsibility on our hands.
And so I think the question of consent will continue to haunt the industry. And until that question is resolved, and maybe it won’t be resolved, I think we need to establish limitations on what can be done with this technology.
My career also has taught me that being ahead too much is not a good thing because face recognition, as we know it today, was actually invented in 1994. But most people think that it was invented by Facebook and the machine learning algorithms, which are now proliferating all over the world. Basically, at some point in time, I had to step down as being a public CEO because I was curtailing the use of technology that my company was going to be promoting, because of the fear of negative consequences to humanity. So I feel scientists need to have the courage to project into the future and see the consequences of their work. I’m not saying they should stop making breakthroughs. No, you should go full force, make more breakthroughs, but we should also be honest with ourselves and basically alert the world and the policymakers that this breakthrough has pluses and has minuses. And therefore, in using this technology, we need some sort of guidance and frameworks to make sure it’s channeled for a positive application and not negative.
Jennifer: I Was There When… is an oral history project featuring the stories of people who have witnessed or created breakthroughs in artificial intelligence and computing.
Do you have a story to tell? Know someone who does? Drop us an email at email@example.com.
Jennifer: This episode was taped in New York City in December of 2020 and produced by me with help from Anthony Green and Emma Cillekens. We’re edited by Michael Reilly and Mat Honan. Our mix engineer is Garret Lang… with sound design and music by Jacob Gorski.
Load up the website This Person Does Not Exist and it’ll show you a human face, near-perfect in its realism yet totally fake. Refresh and the neural network behind the site will generate another, and another, and another. The endless sequence of AI-crafted faces is produced by a generative adversarial network (GAN)—a type of AI that learns to produce realistic but fake examples of the data it is trained on.
But such generated faces—which are starting to be used in CGI movies and ads—might not be as unique as they seem. In a paper titled This Person (Probably) Exists, researchers show that many faces produced by GANs bear a striking resemblance to actual people who appear in the training data. The fake faces can effectively unmask the real faces the GAN was trained on, making it possible to expose the identity of those individuals. The work is the latest in a string of studies that call into doubt the popular idea that neural networks are “black boxes” that reveal nothing about what goes on inside.
To expose the hidden training data, Ryan Webster and his colleagues at the University of Caen Normandy in France used a type of attack called a membership attack, which can be used to find out whether certain data was used to train a neural network model. These attacks typically take advantage of subtle differences between the way a model treats data it was trained on—and has thus seen thousands of times before—and unseen data.
For example, a model might identify a previously unseen image accurately, but with slightly less confidence than one it was trained on. A second, attacking model can learn to spot such tells in the first model’s behavior and use them to predict when certain data, such as a photo, is in the training set or not.
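The confidence gap described above can be illustrated with a deliberately simple stand-in for the attacking model: a single cutoff learned from examples whose membership is already known. This is a sketch under stated assumptions, not the method used in the paper. The confidence values are made up, and `fit_threshold` and `is_member` are hypothetical helper names.

```python
def fit_threshold(member_conf, nonmember_conf):
    """Pick the confidence cutoff that best separates known members from non-members."""
    candidates = sorted(set(member_conf + nonmember_conf))
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        # Count how many known examples this cutoff would label correctly.
        correct = sum(c >= t for c in member_conf) + sum(c < t for c in nonmember_conf)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def is_member(confidence, threshold):
    """Guess that an example was in the training set if the target model is confident enough."""
    return confidence >= threshold


# Made-up confidences that a target model assigned to its top prediction.
seen_in_training = [0.99, 0.97, 0.98, 0.95]
never_seen = [0.90, 0.93, 0.88, 0.96]

threshold = fit_threshold(seen_in_training, never_seen)
guess_for_new_photo = is_member(0.99, threshold)
```

A real attack would typically train a classifier on many such signals rather than one cutoff, but the principle is the same: the target model's behavior leaks whether it has seen an example before.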
Such attacks can lead to serious security leaks. For example, finding out that someone’s medical data was used to train a model associated with a disease might reveal that this person has that disease.
Webster’s team extended this idea so that instead of identifying the exact photos used to train a GAN, they identified photos in the GAN’s training set that were not identical but appeared to portray the same individual—in other words, faces with the same identity. To do this, the researchers first generated faces with the GAN and then used a separate facial-recognition AI to detect whether the identity of these generated faces matched the identity of any of the faces seen in the training data.
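The matching step can be sketched as a nearest-neighbor search over face embeddings. The sketch below assumes a facial-recognition model has already turned each photo into a numeric vector. The three-number vectors, the photo names, the 0.9 threshold, and the function `find_identity_matches` are all invented for illustration, and the researchers' actual pipeline may differ.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two embedding vectors, from -1 (opposite) to 1 (same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_identity_matches(generated, training, threshold=0.9):
    """Return (generated_id, training_id, similarity) for every pair above the threshold."""
    matches = []
    for gen_id, gen_vec in generated.items():
        for train_id, train_vec in training.items():
            sim = cosine_similarity(gen_vec, train_vec)
            if sim >= threshold:
                matches.append((gen_id, train_id, sim))
    return matches


# Toy embeddings: photo_a1 and photo_a2 stand for two photos of the same person.
training = {
    "photo_a1": [0.9, 0.1, 0.0],
    "photo_a2": [0.8, 0.2, 0.1],
    "photo_b1": [0.0, 0.1, 0.9],
}
generated = {
    "fake_1": [0.85, 0.15, 0.05],
    "fake_2": [0.3, 0.9, 0.3],
}

matches = find_identity_matches(generated, training)
matched_pairs = {(g, t) for g, t, _ in matches}
```

Here `fake_1` matches both training photos of person A, while `fake_2` matches no one.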
The results are striking. In many cases, the team found multiple photos of real people in the training data that appeared to match the fake faces generated by the GAN, revealing the identity of individuals the AI had been trained on.
The work raises some serious privacy concerns. “The AI community has a misleading sense of security when sharing trained deep neural network models,” says Jan Kautz, vice president of learning and perception research at Nvidia.
In theory this kind of attack could apply to other data tied to an individual, such as biometric or medical data. On the other hand, Webster points out that people could also use the technique to check whether their data has been used to train an AI without their consent.
Artists could find out whether their work had been used to train a GAN in a commercial tool, he says: “You could use a method such as ours for evidence of copyright infringement.”
The process could also be used to make sure GANs don’t expose private data in the first place. The GAN could check whether its creations resembled real examples in its training data, using the same technique developed by the researchers, before releasing them.
Yet this assumes that you can get hold of that training data, says Kautz. He and his colleagues at Nvidia have come up with a different way to expose private data, including images of faces and other objects, medical data, and more, that does not require access to training data at all.
Instead, they developed an algorithm that can re-create the data that a trained model has been exposed to by reversing the steps that the model goes through when processing that data. Take a trained image-recognition network: to identify what’s in an image, the network passes it through a series of layers of artificial neurons. Each layer extracts different levels of information, from edges to shapes to more recognizable features.
Kautz’s team found that they could interrupt a model in the middle of these steps and reverse its direction, re-creating the input image from the internal data of the model. They tested the technique on a variety of common image-recognition models and GANs. In one test, they showed that they could accurately re-create images from ImageNet, one of the best known image recognition data sets.
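The general idea of running a model in reverse can be shown on a toy example. The sketch below is not the Nvidia team's algorithm. It assumes a single linear layer with a known weight matrix `W` (hypothetical values) and uses plain gradient descent to search for an input that reproduces the observed internal activations.

```python
def layer(weights, x):
    """Forward pass of a toy linear layer: h = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def invert_layer(weights, target, steps=2000, lr=0.1):
    """Search for an input whose activations match `target`, using gradient descent."""
    x = [0.0] * len(weights[0])
    for _ in range(steps):
        residual = [h - t for h, t in zip(layer(weights, x), target)]
        # The gradient of 0.5 * ||W x - target||^2 with respect to x is W^T residual.
        grad = [
            sum(weights[i][j] * residual[i] for i in range(len(weights)))
            for j in range(len(x))
        ]
        x = [xj - lr * g for xj, g in zip(x, grad)]
    return x


# Hypothetical weights and a "private" input that the attacker never sees directly.
W = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0]]
secret_input = [0.2, 0.7, 0.4]

activations = layer(W, secret_input)   # the only thing the attacker observes
recovered = invert_layer(W, activations)
```

Real networks have many nonlinear layers, so inverting them is far harder than this. The sketch only shows why intermediate activations are not automatically safe to share.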
As in Webster’s work, the re-created images closely resemble the real ones. “We were surprised by the final quality,” says Kautz.
The researchers argue that this kind of attack is not simply hypothetical. Smartphones and other small devices are starting to use more AI. Because of battery and memory constraints, models are sometimes only half-processed on the device itself and sent to the cloud for the final computing crunch, an approach known as split computing. Most researchers assume that split computing won’t reveal any private data from a person’s phone because only the model is shared, says Kautz. But his attack shows that this isn’t the case.
Kautz and his colleagues are now working to come up with ways to prevent models from leaking private data. “We wanted to understand the risks so we can minimize vulnerabilities,” he says.
Even though they use very different techniques, he thinks that his work and Webster’s complement each other well. Webster’s team showed that private data could be found in the output of a model; Kautz’s team showed that private data could be revealed by going in reverse, re-creating the input. “Exploring both directions is important to come up with a better understanding of how to prevent attacks,” says Kautz.
Sometime in mid-2019, a police contractor in the Chinese city of Kuitun tapped a young college student from the University of Washington on the shoulder as she walked through a crowded market intersection. The student, Vera Zhou, didn’t notice the tapping at first because she was listening to music through her earbuds as she weaved through the crowd. When she turned around and saw the black uniform, the blood drained from her face. Speaking in Chinese, Vera’s native language, the police officer motioned her into a nearby People’s Convenience Police Station—one of more than 7,700 such surveillance hubs that now dot the region.
On a monitor in the boxy gray building, she saw her face surrounded by a yellow square. On other screens she saw pedestrians walking through the market, their faces surrounded by green squares. Beside the high-definition video still of her face, her personal data appeared in a black text box. It said that she was Hui, a member of a Chinese Muslim group that makes up around 1 million of the population of 15 million Muslims in Northwest China. The alarm had gone off because she had walked beyond the parameters of the policing grid of her neighborhood confinement. As a former detainee in a re-education camp, she was not officially permitted to travel to other areas of town without explicit permission from both her neighborhood watch unit and the Public Security Bureau. The yellow square around her face on the screen indicated that she had once again been deemed a “pre-criminal” by the digital enclosure system that held Muslims in place. Vera said at that moment she felt as though she could hardly breathe.
Kuitun is a small city of around 285,000 in Xinjiang’s Tacheng Prefecture, along the Chinese border with Kazakhstan. Vera had been trapped there since 2017 when, in the middle of her junior year as a geography student at the University of Washington (where I was an instructor), she had taken a spur-of-the-moment trip back home to see her boyfriend. After a night at a movie theater in the regional capital Ürümchi, her boyfriend received a call asking him to come to a local police station. There, officers told him they needed to question his girlfriend: they had discovered some suspicious activity in Vera’s internet usage, they said. She had used a virtual private network, or VPN, in order to access “illegal websites,” such as her university Gmail account. This, they told her later, was a “sign of religious extremism.”
It took some time for what was happening to dawn on Vera. Perhaps since her boyfriend was a non-Muslim from the majority Han group and they did not want him to make a scene, at first the police were quite indirect about what would happen next. They just told her she had to wait in the station.
When she asked if she was under arrest, they refused to respond.
“Just have a seat,” they told her. By this time she was quite frightened, so she called her father back in her hometown and told him what was happening. Eventually, a police van pulled up to the station: She was placed in the back, and once her boyfriend was out of sight, the police shackled her hands behind her back tightly and shoved her roughly into the back seat.
Vera Zhou didn’t think the war on terror had anything to do with her. She considered herself a non-religious fashionista who favored chunky earrings and dressing in black. She had gone to high school near Portland, Oregon, and was on her way to becoming an urban planner at a top-ranked American university. She had planned to reunite with her boyfriend after graduation and have a career in China, where she thought of the economy as booming. She had no idea that a new internet security law had been implemented in her hometown and across Xinjiang at the beginning of 2017, and that this was how extremist “pre-criminals,” as state authorities referred to them, were being identified for detention. She did not know that a newly appointed party secretary of the region had given a command to “round up everyone who should be rounded up” as part of the “People’s War.”
Now, in the back of the van, she felt herself losing control in a wave of fear. She screamed, tears streaming down her face, “Why are you doing this? Doesn’t our country protect the innocent?” It seemed to her like it was a cruel joke, like she had been given a role in a horror movie, and that if she just said the right things they might snap out of it and realize it was all a mistake.
For the next few months, Vera was held with 11 other Muslim minority women in a second-floor cell in a former police station on the outskirts of Kuitun. Like Vera, others in the room were also guilty of cyber “pre-crimes.” A Kazakh woman had installed WhatsApp on her phone in order to contact business partners in Kazakhstan. A Uyghur woman who sold smartphones at a bazaar had allowed multiple customers to register their SIM cards using her ID card.
Around April 2018, without warning, Vera and several other detainees were released on the provision that they report to local social stability workers on a regular basis and not try to leave their home neighborhoods.
Every Monday, her probation officer required that Vera go to a neighborhood flag-raising ceremony and participate by loudly singing the Chinese national anthem and making statements pledging her loyalty to the Chinese government. By this time, due to widely circulated reports of detention for cyber-crimes in the small town, it was known that online behavior could be detected by the newly installed automated internet surveillance systems. Like everyone else, Vera recalibrated her online behavior. Whenever the social stability worker assigned to her shared something on social media, Vera was always the first person to support her by liking it and posting it on her own account. Like everyone else she knew, she started to “spread positive energy” by actively promoting state ideology.
After she was back in her neighborhood, Vera felt that she had changed. She thought often about the hundreds of detainees she had seen in the camp. She feared that many of them would never be allowed out since they didn’t know Chinese and had been practicing Muslims their whole lives. She said her time in the camp also made her question her own sanity. “Sometimes I thought maybe I don’t love my country enough,” she told me. “Maybe I only thought about myself.”
But she also knew that what had happened to her was not her fault. It was the result of Islamophobia being institutionalized and focused on her. And she knew with absolute certainty that an immeasurable cruelty was being done to Uyghurs and Kazakhs because of their ethno-racial, linguistic, and religious differences.
“I just started to stay home all the time”
Like all detainees, Vera had been subjected to a rigorous biometric data collection that fell under the population-wide assessment process called “physicals for all,” before she was taken to the camps. The police had scanned Vera’s face and irises, recorded her voice signature, and collected her blood, fingerprints, and DNA—adding this precise high-fidelity data to an immense dataset that was being used to map the behavior of the population of the region. They had also taken her phone away to have it and her social media accounts scanned for Islamic imagery, connections to foreigners, and other signs of “extremism.” Eventually they gave it back, but without any of the US-made apps like Instagram.
For several weeks, she began to find ways around the many surveillance hubs that had been built every several hundred meters. Outside of high-traffic areas many of them used regular high-definition surveillance cameras that could not detect faces in real time. Since she could pass as Han and spoke standard Mandarin, she would simply tell the security workers at checkpoints that she forgot her ID and would write down a fake number. Or sometimes she would go through the exit of the checkpoint, “the green lane,” just like a Han person, and ignore the police.
One time, though, when going to see a movie with a friend, she forgot to pretend that she was Han. At a checkpoint at the theater she put her ID on the scanner and looked into the camera. Immediately an alarm sounded and the mall police contractors pulled her to the side. As her friend disappeared into the crowd, Vera worked her phone frantically to delete her social media account and erase the contacts of people who might be detained because of their association with her. “I realized then that it really wasn’t safe to have friends. I just started to stay at home all the time.”
Eventually, like many former detainees, Vera was forced to work as an unpaid laborer. The local state police commander in her neighborhood learned that she had spent time in the United States as a college student, so he asked Vera’s probation officer to assign her to tutor his children in English.
“I thought about asking him to pay me,” Vera remembers. “But my dad said I need to do it for free. He also sent food with me for them, to show how eager he was to please them.”
The commander never brought up any form of payment.
In October 2019, Vera’s probation officer told her that she was happy with Vera’s progress and she would be allowed to continue her education back in Seattle. She was made to sign vows not to talk about what she had experienced. The officer said, “Your father has a good job and will soon reach retirement age. Remember this.”
In the fall of 2019, Vera returned to Seattle. Just a few months later, across town, Amazon—the world’s wealthiest technology company—received a shipment of 1,500 heat-mapping camera systems from the Chinese surveillance company Dahua. Many of these systems, which were collectively worth around $10 million, were to be installed in Amazon warehouses to monitor the heat signatures of employees and alert managers if workers exhibited covid symptoms. Other cameras included in the shipment were distributed to IBM and Chrysler, among other buyers.
Dahua was just one of the Chinese companies that were able to capitalize on the pandemic. As covid began to move beyond the borders of China in early 2020, a group of medical research companies owned by the Beijing Genomics Institute, or BGI, radically expanded, establishing 58 labs in 18 countries and selling 35 million covid-19 tests to more than 180 countries. In March 2020, companies such as Russell Stover Chocolates and US Engineering, a Kansas City, Missouri–based mechanical contracting company, bought $1.2 million worth of tests and set up BGI lab equipment in University of Kansas Medical System facilities.
And while Dahua sold its equipment to companies like Amazon, Megvii, one of its main rivals, deployed heat-mapping systems to hospitals, supermarkets, campuses in China, and to airports in South Korea and the United Arab Emirates.
Yet, while the speed and intention of this response to protect workers in the absence of an effective national-level US response was admirable, these Chinese companies are also tied up in forms of egregious human rights abuses.
Dahua is one of the major providers of “smart camp” systems that Vera Zhou experienced in Xinjiang (the company says its facilities are supported by technologies such as “computer vision systems, big data analytics and cloud computing”). In October 2019, both Dahua and Megvii were among eight Chinese technology firms placed on a list that blocks US citizens from selling goods and services to them (the list, which is intended to prevent US firms from supplying non-US firms deemed a threat to national interests, prevents Amazon from selling to Dahua, but not buying from them). BGI’s subsidiaries in Xinjiang were placed on the US no-trade list in July 2020.
Amazon’s purchase of Dahua heat-mapping cameras recalls an older moment in the spread of global capitalism that was captured by historian Jason Moore’s memorable turn of phrase: “Behind Manchester stands Mississippi.”
What did Moore mean by this? In his rereading of Friedrich Engels’s analysis of the textile industry that made Manchester, England, so profitable, he saw that many aspects of the British Industrial Revolution would not have been possible without the cheap cotton produced by slave labor in the United States. In a similar way, the ability of Seattle, Kansas City, and Seoul to respond as rapidly as they did to the pandemic relies in part on the way systems of oppression in Northwest China have opened up a space to train biometric surveillance algorithms.
The protection of workers during the pandemic depends on forgetting about college students like Vera Zhou. It means ignoring the dehumanization of thousands upon thousands of detainees and unfree workers.
At the same time, Seattle also stands before Xinjiang.
Amazon has its own role in involuntary surveillance that disproportionately harms ethno-racial minorities given its partnership with US Immigration and Customs Enforcement to target undocumented immigrants and its active lobbying efforts in support of weak biometric surveillance regulation. More directly, Microsoft Research Asia, the so-called “cradle of Chinese AI,” has played an instrumental role in the growth and development of both Dahua and Megvii.
Chinese state funding, global terrorism discourse, and US industry training are three of the primary reasons why a fleet of Chinese companies now leads the world in face and voice recognition. This process was accelerated by a war on terror that centered on placing Uyghurs, Kazakhs, and Hui within a complex digital and material enclosure, but it now extends throughout the Chinese technology industry, where data-intensive infrastructure systems produce flexible digital enclosures throughout the nation, though not at the same scale as in Xinjiang.
China’s vast and rapid response to the pandemic has further accelerated this process by rapidly implementing these systems and making clear that they work. Because they extend state power in such sweeping and intimate ways, they can effectively alter human behavior.
The Chinese approach to the pandemic is not the only way to stop it, however. Democratic states like New Zealand and Canada, which have provided testing, masks, and economic assistance to those forced to stay home, have also been effective. These nations make clear that involuntary surveillance is not the only way to protect the well-being of the majority, even at the level of the nation.
In fact, numerous studies have shown that surveillance systems support systemic racism and dehumanization by making targeted populations detainable. The past and current US administrations’ use of the Entity List to halt sales to companies like Dahua and Megvii, while important, is also producing a double standard, punishing Chinese firms for automating racialization while funding American companies to do similar things.
Increasing numbers of US-based companies are attempting to develop their own algorithms to detect racial phenotypes, though through a consumerist approach that is premised on consent. By making automated racialization a form of convenience in marketing things like lipstick, companies like Revlon are hardening the technical scripts that are available to individuals.
As a result, in many ways race continues to be an unthought part of how people interact with the world. Police in the United States and in China think about automated assessment technologies as tools they have to detect potential criminals or terrorists. The algorithms make it appear normal that Black men or Uyghurs are disproportionately detected by these systems. They stop the police, and those they protect, from recognizing that surveillance is always about controlling and disciplining people who do not fit into the vision of those in power. The world, not China alone, has a problem with surveillance.
To counteract the increasing banality, the everydayness, of automated racialization, the harms of biometric surveillance around the world must first be made apparent. The lives of the detainable must be made visible at the edge of power over life. Then the role of world-class engineers, investors, and public relations firms in the unthinking of human experience, in designing for human reeducation, must be made clear. The webs of interconnection, the way Xinjiang stands behind and before Seattle, must be made thinkable.
This story is an edited excerpt from In The Camps: China’s High-Tech Penal Colony, by Darren Byler (Columbia Global Reports, 2021). Darren Byler is an assistant professor of international studies at Simon Fraser University, focused on the technology and politics of urban life in China.
The plummeting costs of renewables, the growing strength of the clean energy sector, and the rising influence of activists have begun to shift the politics of climate action in the US, panelists argued during MIT Technology Review’s annual EmTech conference last week.
Those forces allowed President Joe Biden to put climate change at the center of his campaign and helped build momentum behind the portfolio of clean energy policies and funding measures in the infrastructure and reconciliation packages under debate in the US Congress, said Bill McKibben, the climate author and founder of the environmental activist group 350.org, during the September 30 session.
The measures will mark the first major climate laws in the nation if they pass in something close to their current form. Most notably, they include the Clean Electricity Performance Program, which uses payments and penalties to encourage utilities to boost their share of electricity from carbon-free sources.
Other speakers on the panel, titled Cleaning Up the Power Sector, advised on the creation of that program. They included Leah Stokes, an associate professor focused on energy and climate policy at the University of California, Santa Barbara; and Jesse Jenkins, an assistant professor and energy systems researcher at Princeton University.
They argued during the session that the legislation, designed to ensure that 80% of the nation’s electricity comes from clean sources by 2030, is more effective and politically feasible than competing approaches, including the carbon taxes favored by many economists.
“When … we say to people, ‘We’re going to make it more expensive for you to use an essential good, which is energy,’ that isn’t very popular,” Stokes said. “That theory of political change has run up against the reality of income inequality in this country.”
“The different paradigm is to say, ‘Rather than making it more expensive to use fossil fuels, let’s help make it cheaper to use the clean stuff,’” she added.
But it remains to be seen whether the clean electricity measure and the other climate provisions will pass, and in what form. Even some Democratic senators in the narrowly divided Congress have pushed back on what they portray as excessive spending in the bills.
For all the progress on climate issues, well-funded and politically influential utility and fossil-fuel interests continue to impede efforts to overhaul energy systems at the speed and scale required, stressed Julian Brave NoiseCat, vice president of policy and strategy at Data for Progress, who moderated the session.
“These interests are remarkably entrenched and remain so despite significant grassroots opposition,” he said.
If legislators defang the key climate provisions, it will slow the shift to clean energy in the US and undermine the negotiating power of Biden’s climate czar, John Kerry, at the UN climate conference early next month, McKibben said. “Rest assured that will limit everybody else’s ambition, too,” he said.
The moon may have been more volcanically active than we realized.
Lunar samples that China’s Chang’e 5 spacecraft brought to Earth are revealing new clues about volcanoes and lava plains on the moon’s surface. In a study published today in Science, researchers describe the youngest lava samples ever collected on the moon.
The samples were taken from Oceanus Procellarum, a region known for having had huge lakes of lava that have since solidified into basalt rock. The sample they analyzed most closely indicates that the moon experienced an era of volcanic excitement that lasted longer than scientists previously thought.
Researchers compared fragments from within that same sample to determine when molten magma had crystallized. The results surprised them. In their early lives, small, rocky bodies like the moon typically cool faster than larger ones. But their observations showed that wasn’t necessarily the case for our closest heavenly neighbor.
“The expectation is that the moon is so small that it will probably be dead very quickly after formation,” says Alexander Nemchin, a professor of geology at Curtin University in Perth, Australia, and a co-author of the study. “This young sample contradicts this concept, and in some way, we need to rethink our view of the moon a little bit, or maybe quite a lot.”
Using isotope dating and a technique based on lunar crater chronology, which involves estimating the age of an object in space in part by counting the craters on its surface, the team determined that lava flowed in Oceanus Procellarum as recently as 2 billion years ago.
Nemchin says there’s no evidence that radioactive elements that generate heat (such as potassium, thorium, and uranium) exist in high concentrations below the moon’s mantle. That means those elements probably didn’t cause these lava flows, as scientists had thought. Now, they will have to look for other explanations for how the flows formed.
The moon’s volcanic history could teach us more about the Earth’s. According to the giant impact theory, the moon may just be a chunk of Earth that got knocked loose when our planet collided with another one.
“Anytime we get new or improved information about the age of the stuff on the moon, that has a knock-on effect for not just understanding the universe, but volcanism and even just general geology on other planets,” says Paul Byrne, an associate professor of earth and planetary sciences at Washington University in St. Louis, who was not involved in the study.
Volcanic activity not only shaped how the moon looks—those old lava beds are visible to the naked eye today as huge dark patches on the moon’s surface—but may even help answer the question of whether we’re alone in the universe, Byrne says.
“The search for extraterrestrial life in part requires understanding habitability,” Byrne says. Volcanic activity plays a role in the cultivation of atmospheres and oceans, key components for life. But what exactly these new findings tell us about potential life elsewhere remains to be seen.