Jan. 15, 2021 | BY C. Todd Lopez , DOD News
While the defense industrial base is healthy, there are single points of failure and dependencies on overseas suppliers that must be addressed, the undersecretary of defense for acquisition and sustainment said.
“Over a period of years, we have offshored many, many sources of supply,” Ellen M. Lord said during an online discussion Thursday with the Hudson Institute. “It’s not for one reason; it’s for a variety of reasons, whether it be regulations, whether it be labor costs, whether it be government support of different industries.”
The deindustrialization of the U.S. over the last 50 years, the end of the Cold War and the loss of the focus that defeating the Soviet Union had given the U.S., digital technology and the rise of China have all created challenges for national defense.
In the newly released Fiscal Year 2020 Industrial Capabilities Report to Congress, Lord said the department looked into those challenges and their effects on the defense industrial base and proposed key actions to address them.
“What we did in this report was try to really capture those risks, look at the opportunities and come up with some specific steps that we can really take to reform how we go about looking at that supply chain and, in the endgame, really get capability downrange to the warfighter as quickly and cost-effectively as possible,” she said.
First, Lord said, the U.S. must re-shore more of its industrial base — bring it back to the U.S. and U.S. allies.
“There are a couple [of] key areas there with shipbuilding, as well as microelectronics — fundamental to our capability,” she said.
Development of a modern manufacturing and engineering workforce along with a more robust research and development base is also critical. Declines in U.S. science, technology, engineering and mathematics education and industrial jobs hurt the ability of the defense industrial base to innovate, Lord said.
“We want to make sure that we have modern manufacturing and engineering expertise,” she said. “We do not have nearly the number of scientists and engineers as China has. We need to make sure that we develop our talent to be able to leverage on these critical areas.”
The department must also reform and modernize the defense acquisition process to better meet the realities of the 21st century, Lord said.
“We’ve started with a number of those, but there’s much further to go,” she said. “We want to make sure that our traditional defense industrial base is widened to get all of those creative, innovative companies. We know the small companies are where most of our innovation comes from, and the barriers to entry — sometimes to getting into the Department of Defense — are rather onerous.”
Lord said part of modernizing and reforming defense acquisition is the recently announced Trusted Capital Marketplace, which will match potential defense suppliers — many of them small companies that have never done business with DOD — with the investors they need to keep operating and innovating. The Trusted Capital Marketplace will vet investors to ensure foreign ownership, control and influence is nonexistent.
Finally, Lord said, the department must find new ways to partner private sector innovation with public sector resources and demand.
“We, as the government, I believe, need to work with industry to make sure that we diversify that industrial base and, also, that we much more quickly translate technological capability into features of current platforms and weapon systems, as well as incorporate it in new ones,” Lord said.
After Amazon’s three-week re:Invent conference, companies building AI applications may have the impression that AWS is the only game in town. Amazon announced improvements to SageMaker, its machine learning (ML) workflow service, and to Edge Manager — improving AWS’ ML capabilities on the edge at a time when serving the edge is considered increasingly critical for enterprises. Moreover, the company touted big customers like Lyft and Intuit.
But Mohammed Farooq believes there is a better alternative to the Amazon hegemon: an open AI platform that doesn’t have any hooks back to the Amazon cloud. Until earlier this year, Farooq led IBM’s hybrid multi-cloud strategy, but he recently left to join the enterprise AI company Hypergiant.
Here is our Q&A with Farooq, who is Hypergiant’s chair, global chief technology officer, and general manager of products. He has skin in the game and makes an interesting argument for open AI.
VentureBeat: With Amazon’s momentum, isn’t it game over for any other company hoping to be a significant service provider of AI services, or at the least for any competitor not named Google or Microsoft?
Mohammed Farooq: On the one hand, for the last three to five-plus years, AWS has delivered outstanding capabilities with SageMaker (Autopilot, Data Wrangler) to enable accessible analytics and ML pipelines for technical and nontechnical users. Enterprises have built strong-performing AI models with these AWS capabilities.
On the other hand, the enterprise production throughput of performing AI models is very low. The low throughput is a result of the complexity of deployment and operations management of AI models within consuming production applications that are running on AWS and other cloud/datacenter and software platforms.
Enterprises have not established an operations management system — something the industry refers to as ModelOps. ModelOps is required and should include lifecycle processes, best practices, and business management controls. These are necessary to evolve AI models and absorb data changes in the context of the heterogeneous software and infrastructure stacks currently in operation.
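The ModelOps idea — lifecycle processes with controls layered over how models evolve — can be illustrated with a minimal sketch. This is a hypothetical illustration, not any vendor's implementation; the stage names and record fields are assumptions chosen for the example:

```python
from dataclasses import dataclass, field
from enum import Enum

class Stage(Enum):
    REGISTERED = "registered"
    VALIDATED = "validated"
    DEPLOYED = "deployed"
    RETIRED = "retired"

# Allowed lifecycle transitions; a real ModelOps system would attach
# approval gates, bias/robustness checks, and audit logging to each edge.
TRANSITIONS = {
    Stage.REGISTERED: {Stage.VALIDATED, Stage.RETIRED},
    Stage.VALIDATED: {Stage.DEPLOYED, Stage.RETIRED},
    Stage.DEPLOYED: {Stage.RETIRED},
    Stage.RETIRED: set(),
}

@dataclass
class ModelRecord:
    name: str
    version: str
    stage: Stage = Stage.REGISTERED
    history: list = field(default_factory=list)

    def advance(self, target: Stage) -> None:
        # Enforce the lifecycle: every stage change is checked and recorded.
        if target not in TRANSITIONS[self.stage]:
            raise ValueError(f"illegal transition {self.stage} -> {target}")
        self.history.append((self.stage, target))
        self.stage = target

model = ModelRecord("churn-classifier", "1.3.0")
model.advance(Stage.VALIDATED)
model.advance(Stage.DEPLOYED)
```

The point of the sketch is the control structure, not the stages themselves: a ModelOps system is a system of record that refuses unreviewed stage changes and keeps an audit trail.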
AWS does a solid job of automating an AI ModelOps process within the AWS ecosystem. However, running enterprise ModelOps, as well as DevOps and DataOps, will need not only AWS, but multiple other cloud, network, and edge architectures. AWS is great as far as it goes, but what is required is seamless integration with enterprise ModelOps, hybrid/multi-cloud infrastructure architecture, and IT operations management system.
Failures in experimentation are also a result of the average time needed to create a model. Today, successful AI models that deliver value and that business leaders trust take 6-12 months to build. According to the Deloitte MLOps Industrialized AI Report (released in December 2020), an average AI team can build and deploy, at best, two AI models in a year. At this rate, industrializing and scaling AI in the enterprise will be a challenge. An enterprise ModelOps process integrated with the rest of enterprise IT is required to speed up and scale AI solutions in the enterprise.
I would argue that we are on the precipice of a new era in artificial intelligence — one where AI will not only predict but recommend and take autonomous actions. But machines are still taking actions based on AI models that are poorly experimented with and fail to meet defined business goals (key performance indicators).
VentureBeat: So what is it that holds the industry back? Or, asked a different way, what is it that holds Amazon back from doing this?
Farooq: To improve development and performance of AI models, I believe we must address three challenges that are slowing down the AI model development, deployment, and production management in the enterprise. Amazon and other big players haven’t been able to address these challenges yet. They are:
AI data: This is where everything starts and ends in performant AI models. Microsoft [Azure] Purview is a direct attempt to solve the data problems of the enterprise data governance umbrella. This will provide AI solution teams (consumers) valuable and trustworthy data.
AI operations processes: These are enabled for development and deployment in the cloud (AWS) and do not extend or connect to the enterprise DevOps, DataOps, and ITOps processes. AIOps processes to deploy, operate, manage, and govern need to be automated and integrated into enterprise IT processes. This will industrialize AI in the enterprise. It took DevOps 10 years to establish CI/CD processes and automation platforms. AI needs to leverage the assets in CI/CD and overlay the AI model lifecycle management on top of it.
AI architecture: Enterprises with native cloud and containers are accelerating on the path to hybrid and multi-cloud architectures. With edge adoption, we are moving to a purely distributed architecture, which will connect the cloud and edge ecosystems. AI architecture will have to operate on distributed architectures across hybrid and multi-cloud infrastructure and data environments. AWS, Azure, Google, and VMware are effectively moving toward that paradigm.
To develop the next phase of AI, which I am calling “industrialized AI in the enterprise,” we need to address all of these. They can only be met with an open AI platform that has an integrated operations management system.
VentureBeat: Explain what you mean by an “open” AI platform.
Farooq: An open AI platform for ModelOps lets enterprise AI teams mix and match required AI stacks, data services, AI tools, and domain AI models for different providers. Doing so will result in powerful business solutions at speed and scale.
AWS, with all of its powerful cloud, AI, and edge offerings, has still not stitched together a ModelOps capability that can industrialize AI and cloud. Enterprises today are using a combination of ServiceNow, legacy systems management, DevOps tooling, and containers to bring this together. AI operations adds another layer of complexity to an already increasingly complex model.
An enterprise AI operations management system should be the master control point and system of record, intelligence, and security for all AI solutions in a federated model (AI models and data catalogs). AWS, Azure, or Google can provide data, process, and tech platforms and services to be consumed by enterprises.
But lock-in models, like those currently being offered, harm enterprises’ ability to develop core AI capabilities. Companies like Microsoft, Amazon, and Google are hampering our ability to build high-caliber solutions by constructing moats around their products and services. The path to the best technology solutions, in the service of both AI providers and consumers, is one where choice and openness are prized as pathways to innovation.
You have seen companies articulate a prominent vision for the future of AI. But I believe they are limited because they are not going far enough to democratize AI access and usage with the current enterprise IT Ops and governance process. To move forward, we need an enterprise ModelOps process and an open AI services integration platform that industrializes AI development, deployment, operations, and governance.
Without these, enterprises will be forced to choose vertical solutions that fail to integrate with enterprise data technology architectures and IT operations management systems.
VentureBeat: Has anyone tried to build this open AI platform?
Farooq: Not really. To manage AI ModelOps, we need a more open and connected AI services ecosystem, and to get there, we need an AI services integration platform. This essentially means that we need cloud provider operations management integrated with enterprise AI operations processes and a reference architecture framework (led by CTO and IT operations).
There are two options for enterprise CIOs, CTOs, CEOs, and architects. One is vertical, and the other one is horizontal.
Dataiku, Databricks, Snowflake, C3.AI, Palantir, and many others are building these horizontal AI stack options for the enterprise. Their solutions operate on top of AWS, Google, and Azure AI. It’s a great start. However, C3.AI and Palantir are also moving towards lock-in options by using model-driven architectures.
VentureBeat: So how is the vision of what you’re building at Hypergiant different from these efforts?
Farooq: The choice is clear: We have to enable an enterprise AI stack, ModelOps tooling, and governance capabilities enabled by an open AI services integration platform. This will integrate and operate customer ModelOps and governance processes internally that can work for each business unit and AI project.
What we need is not another AI company, but rather an AI services integrator and operator layer that improves how these companies work together for enterprise business goals.
A customer should be able to use Azure solutions, MongoDB, and Amazon Aurora, depending on what best suits their needs, price points, and future agenda. What this requires is a mesh layer for AI solution providers.
VentureBeat: Can you further define this “mesh layer”? Your figure shows it is a horizontal layer, but how does it work in practice? Is it as simple as plugging in your AI solution on top, and then having access to any cloud data source underneath? And does it have to be owned by a single company? Can it be open-sourced, or somehow shared, or at least competitive?
Farooq: The data mesh layer is the core component, not only for executing the ModelOps processes across cloud, edge, and 5G, but it is also a core architectural component for building, operating, and managing autonomous distributed applications.
Currently we have cloud data lakes and data pipelines (batch or streaming) as inputs to build and train AI models. However, in production, data needs to be dynamically orchestrated across datacenters, cloud, 5G, and edge endpoints. This will ensure that AI models and the consuming apps always have the required data feeds to execute in production.
AI/cloud developers and ModelOps teams should have access to data orchestration rules and policy APIs as a single interface to design, build, and operate AI solutions across distributed environments. This API should hide the complexity of the underlying distributed environments (i.e., cloud, 5G, or edge).
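The single-interface idea described above can be sketched as a small policy-driven facade. This is a hypothetical illustration of the pattern, not any real provider API: the class names, backends, and policy scheme are all assumptions made for the example.

```python
from abc import ABC, abstractmethod

class Backend(ABC):
    """One execution environment (cloud region, 5G zone, edge site)."""
    @abstractmethod
    def fetch(self, feed: str) -> str: ...

class CloudBackend(Backend):
    def fetch(self, feed: str) -> str:
        return f"cloud:{feed}"

class EdgeBackend(Backend):
    def fetch(self, feed: str) -> str:
        return f"edge:{feed}"

class DataOrchestrator:
    """Single policy-driven interface; callers never name an environment."""
    def __init__(self):
        self.backends = {}
        self.policies = {}  # feed name -> backend key

    def register(self, key: str, backend: Backend) -> None:
        self.backends[key] = backend

    def set_policy(self, feed: str, key: str) -> None:
        self.policies[feed] = key

    def fetch(self, feed: str) -> str:
        # Policy, not the caller, decides where the feed is served from.
        return self.backends[self.policies[feed]].fetch(feed)

orch = DataOrchestrator()
orch.register("cloud", CloudBackend())
orch.register("edge", EdgeBackend())
orch.set_policy("telemetry", "edge")        # low-latency feed stays at the edge
orch.set_policy("training-batch", "cloud")  # bulk training data comes from the cloud
```

The design choice Farooq is describing is visible here: because the caller only names a feed, operators can reroute data between cloud, 5G, and edge by changing policy, without touching AI application code.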
In addition, we need packaging and container specs that will help DevOps and ModelOps professionals use the portability of Kubernetes to quickly deploy and operate AI solutions at scale.
These data mesh APIs and packaging technologies need to be open sourced to ensure that we establish an open AI and cloud stack architecture for enterprises and not walled gardens from big providers.
By analogy, look at what Twilio has done for communications: Twilio strengthened customer relationships across businesses by integrating many technologies in one easy-to-manage interface. Examples in other industries include HubSpot in marketing and Squarespace for website development. These companies work by providing infrastructure that simplifies the experience of the user across the tools of many different companies.
VentureBeat: When are you launching this?
Farooq: We are planning to launch a beta version of the first step of that roadmap early next year [Q1 2021].
VentureBeat: AWS has a reseller policy. Could it crack down on any mesh layer if it wanted to?
Farooq: AWS could build and offer their own mesh layer that is tied to its cloud and that interfaces with 5G and edge platforms of its partners. But this will not help its enterprise customers accelerate the development, deployment, and management of AI and hybrid/multi-cloud solutions at speed and scale. However, collaborating with the other cloud and ISV providers, as it has done with Kubernetes (CNCF-led open source project), will benefit AWS significantly.
As further innovation on centralized cloud computing models has stalled (based on current functionality and incremental releases across AWS, Azure, and Google), data mesh and edge-native architectures are where innovation will need to happen, and a distributed (declarative and runtime) data mesh architecture is a great place for AWS to contribute and lead the industry.
The digital enterprise will be the biggest beneficiary of a distributed data mesh architecture, and this will help industrialize AI and digital platforms faster — thereby creating new economic opportunities and in return more spend on AWS and other cloud provider technologies.
VentureBeat: What impact would such a mesh-layer solution have on the leading cloud companies? I imagine it could influence user decisions on what underlying services to use. Could that middle mesh player reduce pricing for certain bundles, undercutting marketing efforts by the cloud players themselves?
Farooq: The data mesh layer will trigger massive innovation on the edge and 5G native (not cloud native) applications, middleware, and infra-architectures. This will drive the large providers to rethink their product roadmaps, architecture patterns, go-to-market offerings, partnerships, and investments.
VentureBeat: If the cloud companies see this coming, do you think they’ll be more inclined to move toward an open ecosystem more rapidly and squelch you?
Farooq: The big providers, in a first or second cycle of evolution of a technology or business model, will always want to build a moat and lock in enterprise clients. For example, AWS never accepted that hybrid or multi-cloud was needed. But in the second cycle of cloud adoption by VMware clients, VMware started to preach an enterprise-outward hybrid cloud strategy connecting to AWS, Azure, and Google.
This led AWS to launch a private cloud offering (called Outposts), which is a replica for the AWS footprint on a dedicated hardware stack that has the same offerings. AWS executes its API across AWS public and Outposts. In short, they came around.
The same will happen to edge, 5G, and distributed computing. Right now, AWS, Google, and Azure are building their distributed computing platforms. However, the power of the open source community and the innovation speed is so great, the distributed computing architecture in the next cycle and beyond will have to move to an open ecosystem.
VentureBeat: What about lock-in at the mesh-layer level? If I choose to go with Hypergiant so I can access services across clouds, and then a competing mesh player emerges that offers better prices, how easy is it to move?
Farooq: We at Hypergiant believe in an open ecosystem, and our go-to-market business model depends on being at the intersection of enterprise consumption and provider offerings. We drive consumption economics, not provider economics. This will require us to support multiple data mesh technologies and create a fabric for interoperation with a single interface to our clients. The final goal is to ensure an open ecosystem, developer, and operator ease, and value to enterprise clients so that they are able to accelerate their business and revenue strategies by leveraging the best value and the best breed of technologies. We are looking at this from the point of view of the benefits to the enterprise, not the provider.
Maj. Chuck Suslowicz , Jan Kallberg , and LTC Todd Arnold
The SolarWinds breach points out the importance of having both offensive and defensive cyber force experience.
The breach is under ongoing investigation, and we will not comment on that investigation. Still, in general terms, we want to point out the exploitable weaknesses created by splitting the force into two silos — offensive cyber operations (OCO) and defensive cyber operations (DCO).
The separation of OCO and DCO, through the specialization of formations and leadership, undermines the broader understanding and value of threat intelligence. The growing demarcation between OCO and DCO also has operational and tactical implications. The Multi-Domain Operations (MDO) concept emphasizes the competitive advantages that the Army — and the greater Department of Defense — can bring to bear by leveraging the unique and complementary capabilities of each service.
It requires that leaders understand the capabilities their organization can bring to bear in order to achieve the maximum effect from the available resources. Cyber leaders must have exposure to a depth and the breadth of their chosen domain to contribute to MDO.
Unfortunately, within the Army’s operational cyber forces, there is a tendency to designate officers as either offensive cyber operations (OCO) or defensive cyber operations (DCO) specialists. The shortsighted nature of this categorization is detrimental to the Army’s efforts in cyberspace and stymies the development of the cyber force, affecting all soldiers.
The Army will suffer in its planning and ability to operationally contribute to MDO from a siloed officer corps unexposed to the domain’s inherent flexibility.
We consider the assumption that there is a distinction between OCO and DCO to be flawed. It perpetuates the idea that the two operational types are doing unrelated tasks with different tools, and that experience in one will not improve performance in the other. We do not see such a rigid distinction between OCO and DCO competencies. In fact, most concepts within the cyber domain apply directly to both types of operations.
The argument that OCO and DCO share competencies is not new; the iconic cybersecurity expert Dan Geer first pointed out that cyber tools are dual-use nearly two decades ago, and continues to do so. A tool that is valuable to a network defender can prove equally valuable during an offensive operation, and vice versa.
For example, a tool that maps a network’s topology is critical for the network owner’s situational awareness. The tool could also be effective for an attacker to maintain situational awareness of a target network. The dual-use nature of cyber tools requires cyber leaders to recognize both sides of their utility.
So, a tool that does an effective job of visualizing key terrain to defend will also create a high-quality roadmap for a devastating attack. Limiting officer experiences to only one side of cyberspace operations (CO) will limit their vision, handicap their input as future leaders, and risk squandering effective use of the cyber domain in MDO.
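The dual-use point can be made concrete with a minimal sketch. This is hypothetical illustration only — the hosts, services, and function names are invented for the example and do not represent any Army or DOD tooling:

```python
from collections import defaultdict

def build_topology(observations):
    """Fold (host, service) observations into a service map.

    The same map is defensive situational awareness for the network
    owner and a target list for an attacker: the tool has no side.
    """
    topo = defaultdict(set)
    for host, service in observations:
        topo[host].add(service)
    return dict(topo)

def key_terrain(topology, min_services=2):
    """Hosts exposing many services: what to harden first, or hit first."""
    return sorted(h for h, s in topology.items() if len(s) >= min_services)

# Notional scan results (invented addresses and services).
scan = [
    ("10.0.0.5", "ssh"), ("10.0.0.5", "https"), ("10.0.0.5", "smb"),
    ("10.0.0.9", "https"),
    ("10.0.0.12", "rdp"), ("10.0.0.12", "smb"),
]
topo = build_topology(scan)
```

Nothing in either function encodes intent; only the operator's purpose makes the output defensive or offensive, which is exactly why experience on one side transfers to the other.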
An argument will be made that “deep expertise is necessary for success” and that officers should be chosen for positions based on their previous exposure. This argument fails on two fronts. First, the Army’s decades of experience in officer development have shown the value of diverse exposure in officer assignments. Other branches already ensure officers experience a breadth of assignments to prepare them for senior leadership.
Second, this argument ignores the reality of “challenging technical tasks” within the cyber domain. As cyber tasks grow more technically challenging, the tools become more common between OCO and DCO, not less common. For example, two of the most technically challenging tasks, reverse engineering of malware (DCO) and development of exploits (OCO), use virtually identical toolkits.
An identical argument can be made for network defenders preventing adversarial access and offensive operators seeking to gain access to adversary networks. Ultimately, the types of operations differ in their intent and approach, but significant overlap exists within their technical skillsets.
Experience within one fragment of the domain directly translates to the other and provides insight into an adversary’s decision-making processes. This combined experience provides critical knowledge for leaders, and lack of experience will undercut the Army’s ability to execute MDO effectively. Defenders with OCO experience will be better equipped to identify an adversary’s most likely and most devastating courses of action within the domain. Similarly, OCO planned by leaders with DCO experience are more likely to succeed as the planners are better prepared to account for potential adversary countermeasures.
In both cases, the cross-pollination of experience improves the Army’s ability to leverage the cyber domain and improve its effectiveness. Single tracked officers may initially be easier to integrate or better able to contribute on day one of an assignment. However, single-tracked officers will ultimately bring far less to the table than officers experienced in both sides of the domain due to the multifaceted cyber environment in MDO.
Maj. Chuck Suslowicz is a research scientist in the Army Cyber Institute at West Point and an instructor in the U.S. Military Academy’s Department of Electrical Engineering and Computer Science (EECS). Dr. Jan Kallberg is a research scientist at the Army Cyber Institute at West Point and an assistant professor at the U.S. Military Academy. LTC Todd Arnold is a research scientist in the Army Cyber Institute at West Point and an assistant professor in the U.S. Military Academy’s Department of Electrical Engineering and Computer Science (EECS). The views expressed are those of the authors and do not reflect the official policy or position of the Army Cyber Institute at West Point, the U.S. Military Academy or the Department of Defense.
The letter, signed by nine members of Congress, sends an important signal about how regulators will scrutinize tech giants.
by Karen Hao December 17, 2020
Nine members of the US Congress have sent a letter to Google asking it to clarify the circumstances around its former ethical AI co-lead Timnit Gebru’s forced departure. Led by Representative Yvette Clarke and Senator Ron Wyden, and co-signed by Senators Elizabeth Warren and Cory Booker, the letter sends an important signal about how Congress is scrutinizing tech giants and thinking about forthcoming regulation.
Gebru, a leading voice in AI ethics and one of a small handful of Black women at Google, was unceremoniously dismissed two weeks ago, after a protracted disagreement over a research paper. The paper detailed the risks of large AI language models trained on enormous amounts of text data, which are a core line of Google’s research, powering various products including its lucrative Google Search.
Citing MIT Technology Review’s coverage, the letter raises three issues: the potential for bias in large language models, the growing corporate influence over AI research, and Google’s lack of diversity. It asks Google CEO Sundar Pichai for a concrete plan on how it will address each of these, as well as for its current policy on reviewing research and details on its ongoing investigation into Gebru’s exit (Pichai committed to this investigation in an internal memo, first published by Axios). “As Members of Congress actively seeking to enhance AI research, accountability, and diversity through legislation and oversight, we respectfully request your response to the following inquiries,” the letter states.
In April 2019, Clarke and Wyden introduced a bill, the Algorithmic Accountability Act, that would require big companies to audit their machine-learning systems for bias and take corrective action in a timely manner if such issues were identified. It would also require those companies to audit all processes involving sensitive data—including personally identifiable, biometric, and genetic information—for privacy and security risks. At the time, many legal and technology experts praised the bill for its nuanced understanding of AI and data-driven technologies. “Great first step,” wrote Andrew Selbst, an assistant professor at the University of California, Los Angeles School of Law, on Twitter. “Would require documentation, assessment, and attempts to address foreseen impacts. That’s new, exciting & incredibly necessary.”
The latest letter doesn’t tie directly to the Algorithmic Accountability Act, but it is part of the same move by certain congressional members to craft legislation that would mitigate AI bias and the other harms of data-driven, automated systems. Notably, it comes amid mounting pressure for antitrust regulation. Earlier this month, the US Federal Trade Commission filed an antitrust lawsuit against Facebook for its “anticompetitive conduct and unfair methods of competition.” Over the summer, House Democrats published a 449-page report on Big Tech’s monopolistic practices.
The letter also comes in the context of rising geopolitical tensions. As US-China relations have reached an all-time low during the pandemic, US officials have underscored the strategic importance of emerging technologies like AI and 5G. The letter also raises this dimension, acknowledging Google’s leadership in AI and its role in maintaining US leadership. But it makes clear that this should not undercut regulatory action, a line of argument popularized by Facebook CEO Mark Zuckerberg. “To ensure America wins the AI race,” the letter says, “American technology companies must not only lead the world in innovation; they must also ensure such innovation reflects our nation’s values.”
“Our letter should put everyone in the technology sector, not just Google, on notice that we are paying attention,” said Clarke in a statement to MIT Technology Review. “Ethical AI is the battleground for the future of civil rights. Our concerns about recent developments aren’t just about one person; they are about what the 21st century will look like if academic freedom and inclusion take a back seat to other priorities. We can’t mitigate algorithmic bias if we impede those who seek to research and study it.”
Tech giants dominate research but the line between real breakthrough and product showcase can be fuzzy. Some scientists have had enough.
by Will Douglas Heaven
November 12, 2020
Last month Nature published a damning response written by 31 scientists to a study from Google Health that had appeared in the journal earlier this year. The study described successful trials of an AI that looks for signs of breast cancer in medical images. But according to its critics, the Google team provided so little information about its code and how it was tested that the study amounted to nothing more than a promotion of proprietary tech.
“We couldn’t take it anymore,” says Benjamin Haibe-Kains, the lead author of the response, who studies computational genomics at the University of Toronto. “It’s not about this study in particular—it’s a trend we’ve been witnessing for multiple years now that has started to really bother us.”
Haibe-Kains and his colleagues are among a growing number of scientists pushing back against a perceived lack of transparency in AI research. “When we saw that paper from Google, we realized that it was yet another example of a very high-profile journal publishing a very exciting study that has nothing to do with science,” he says. “It’s more an advertisement for cool technology. We can’t really do anything with it.”
Science is built on a bedrock of trust, which typically involves sharing enough details about how research is carried out to enable others to replicate it, verifying results for themselves. This is how science self-corrects and weeds out results that don’t stand up. Replication also allows others to build on those results, helping to advance the field. Science that can’t be replicated falls by the wayside.
At least, that’s the idea. In practice, few studies are fully replicated because most researchers are more interested in producing new results than reproducing old ones. But in fields like biology and physics—and computer science overall—researchers are typically expected to provide the information needed to rerun experiments, even if those reruns are rare.
AI is feeling the heat for several reasons. For a start, it is a newcomer. It has only really become an experimental science in the past decade, says Joelle Pineau, a computer scientist at Facebook AI Research and McGill University, who coauthored the complaint. “It used to be theoretical, but more and more we are running experiments,” she says. “And our dedication to sound methodology is lagging behind the ambition of our experiments.”
The problem is not simply academic. A lack of transparency prevents new AI models and techniques from being properly assessed for robustness, bias, and safety. AI moves quickly from research labs to real-world applications, with direct impact on people’s lives. But machine-learning models that work well in the lab can fail in the wild—with potentially dangerous consequences. Replication by different researchers in different settings would expose problems sooner, making AI stronger for everyone.
AI already suffers from the black-box problem: it can be impossible to say exactly how or why a machine-learning model produces the results it does. A lack of transparency in research makes things worse. Large models need as many eyes on them as possible, more people testing them and figuring out what makes them tick. This is how we make AI in health care safer, AI in policing more fair, and chatbots less hateful.
What’s stopping AI replication from happening as it should is a lack of access to three things: code, data, and hardware. According to the 2020 State of AI report, a well-vetted annual analysis of the field by investors Nathan Benaich and Ian Hogarth, only 15% of AI studies share their code. Industry researchers are bigger offenders than those affiliated with universities. In particular, the report calls out OpenAI and DeepMind for keeping code under wraps.
Then there’s the growing gulf between the haves and have-nots when it comes to the two pillars of AI, data and hardware. Data is often proprietary, such as the information Facebook collects on its users, or sensitive, as in the case of personal medical records. And tech giants carry out more and more research on enormous, expensive clusters of computers that few universities or smaller companies have the resources to access.
To take one example, training the language generator GPT-3 is estimated to have cost OpenAI $10 to $12 million—and that’s just the final model, not including the cost of developing and training its prototypes. “You could probably multiply that figure by at least one or two orders of magnitude,” says Benaich, who is founder of Air Street Capital, a VC firm that invests in AI startups. Only a tiny handful of big tech firms can afford to do that kind of work, he says: “Nobody else can just throw vast budgets at these experiments.”
The rate of progress is dizzying, with thousands of papers published every year. But unless researchers know which ones to trust, it is hard for the field to move forward. Replication lets other researchers check that results have not been cherry-picked and that new AI techniques really do work as described. “It’s getting harder and harder to tell which are reliable results and which are not,” says Pineau.
What can be done? Like many AI researchers, Pineau divides her time between university and corporate labs. For the last few years, she has been the driving force behind a change in how AI research is published. For example, last year she helped introduce a checklist of things that researchers must provide, including code and detailed descriptions of experiments, when they submit papers to NeurIPS, one of the biggest AI conferences.
Replication is its own reward
Pineau has also helped launch a handful of reproducibility challenges, in which researchers try to replicate the results of published studies. Participants select papers that have been accepted to a conference and compete to rerun the experiments using the information provided. But the only prize is kudos.
This lack of incentive is a barrier to such efforts throughout the sciences, not just in AI. Replication is essential, but it isn’t rewarded. One solution is to get students to do the work. For the last couple of years, Rosemary Ke, a PhD student at Mila, a research institute in Montreal founded by Yoshua Bengio, has organized a reproducibility challenge where students try to replicate studies submitted to NeurIPS as part of their machine-learning course. In turn, some successful replications are peer-reviewed and published in the journal ReScience.
“It takes quite a lot of effort to reproduce another paper from scratch,” says Ke. “The reproducibility challenge recognizes this effort and gives credit to people who do a good job.” Ke and others are also spreading the word at AI conferences via workshops set up to encourage researchers to make their work more transparent. This year Pineau and Ke extended the reproducibility challenge to seven of the top AI conferences, including ICML and ICLR.
Another push for transparency is the Papers with Code project, set up by AI researcher Robert Stojnic when he was at the University of Cambridge. (Stojnic is now a colleague of Pineau’s at Facebook.) Launched as a stand-alone website where researchers could link a study to the code that went with it, this year Papers with Code started a collaboration with arXiv, a popular preprint server. Since October, all machine-learning papers on arXiv have come with a Papers with Code section that links directly to code that authors wish to make available. The aim is to make sharing the norm.
Do such efforts make a difference? Pineau found that last year, when the checklist was introduced, the number of researchers including code with papers submitted to NeurIPS jumped from less than 50% to around 75%. Thousands of reviewers say they used the code to assess the submissions. And the number of participants in the reproducibility challenges is increasing.
Sweating the details
But it is only a start. Haibe-Kains points out that code alone is often not enough to rerun an experiment. Building AI models involves making many small changes—adding parameters here, adjusting values there. Any one of these can make the difference between a model working and not working. Without metadata describing how the models are trained and tuned—random seeds, hyperparameters, software versions—the code can be useless. “The devil really is in the detail,” he says.
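The kind of training metadata Haibe-Kains is describing can be captured in a few lines. As a minimal sketch—the field names and values below are illustrative examples, not drawn from any study mentioned here—a lab might save its hyperparameters and environment details alongside each model checkpoint:

```python
import json
import platform
import random

# Hypothetical hyperparameters for one training run. The exact
# fields any given lab needs will differ; these are examples.
config = {
    "seed": 42,
    "learning_rate": 3e-4,
    "batch_size": 128,
    "epochs": 30,
    "optimizer": "adam",
    "python_version": platform.python_version(),
}

# Fixing the random seed makes the run repeatable on the same setup.
random.seed(config["seed"])

# Saving the config next to the model means a later reader gets the
# tuning details, not just the code.
with open("run_config.json", "w") as f:
    json.dump(config, f, indent=2)

# Anyone rerunning the experiment can restore the exact settings.
with open("run_config.json") as f:
    restored = json.load(f)
print(restored["seed"])
```

This is the sort of record the NeurIPS checklist asks for in spirit: enough detail that a stranger can reconstruct the run, not just read the algorithm.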
It’s also not always clear exactly what code to share in the first place. Many labs use special software to run their models; sometimes this is proprietary. It is hard to know how much of that support code needs to be shared as well, says Haibe-Kains.
Pineau isn’t too worried about such obstacles. “We should have really high expectations for sharing code,” she says. Sharing data is trickier, but there are solutions here too. If researchers cannot share their data, they might give directions so that others can build similar data sets. Or you could have a process where a small number of independent auditors were given access to the data, verifying results for everybody else, says Haibe-Kains.
Hardware is the biggest problem. But DeepMind claims that big-ticket research like AlphaGo or GPT-3 has a trickle-down effect, where money spent by rich labs eventually leads to results that benefit everyone. AI that is inaccessible to other researchers in its early stages, because it requires a lot of computing power, is often made more efficient—and thus more accessible—as it is developed. “AlphaGo Zero surpassed the original AlphaGo using far less computational resources,” says Koray Kavukcuoglu, vice president of research at DeepMind.
In theory, this means that even if replication is delayed, at least it is still possible. Kavukcuoglu notes that Gian-Carlo Pascutto, a Belgian coder at Mozilla who writes chess and Go software in his free time, was able to re-create a version of AlphaGo Zero called Leela Zero, using algorithms outlined by DeepMind in its papers. Pineau also thinks that flagship research like AlphaGo and GPT-3 is rare. The majority of AI research is run on computers that are available to the average lab, she says. And the problem is not unique to AI. Pineau and Benaich both point to particle physics, where some experiments can only be done on expensive pieces of equipment such as the Large Hadron Collider.
In physics, however, university labs run joint experiments on the LHC. Big AI experiments are typically carried out on hardware that is owned and controlled by companies. But even that is changing, says Pineau. For example, a group called Compute Canada is putting together computing clusters to let universities run large AI experiments. Some companies, including Facebook, also give universities limited access to their hardware. “It’s not completely there,” she says. “But some doors are opening.”
Haibe-Kains is less convinced. When he asked the Google Health team to share the code for its cancer-screening AI, he was told that it needed more testing. The team repeats this justification in a formal reply to Haibe-Kains’s criticisms, also published in Nature: “We intend to subject our software to extensive testing before its use in a clinical environment, working alongside patients, providers and regulators to ensure efficacy and safety.” The researchers also said they did not have permission to share all the medical data they were using.
It’s not good enough, says Haibe-Kains: “If they want to build a product out of it, then I completely understand they won’t disclose all the information.” But he thinks that if you publish in a scientific journal or conference, you have a duty to release code that others can run. Sometimes that might mean sharing a version that is trained on less data or uses less expensive hardware. It might give worse results, but people will be able to tinker with it. “The boundaries between building a product versus doing research are getting fuzzier by the minute,” says Haibe-Kains. “I think as a field we are going to lose.”
Research habits die hard
If companies are going to be criticized for publishing, why do it at all? There’s a degree of public relations, of course. But the main reason is that the best corporate labs are filled with researchers from universities. To some extent the culture at places like Facebook AI Research, DeepMind, and OpenAI is shaped by traditional academic habits. Tech companies also win by participating in the wider research community. All big AI projects at private labs are built on layers and layers of public research. And few AI researchers haven’t made use of open-source machine-learning tools like Facebook’s PyTorch or Google’s TensorFlow.
As more research is done in house at giant tech companies, certain trade-offs between the competing demands of business and research will become inevitable. The question is how researchers navigate them. Haibe-Kains would like to see journals like Nature split what they publish into separate streams: reproducible studies on one hand and tech showcases on the other.
But Pineau is more optimistic. “I would not be working at Facebook if it did not have an open approach to research,” she says.
Other large corporate labs stress their commitment to transparency too. “Scientific work requires scrutiny and replication by others in the field,” says Kavukcuoglu. “This is a critical part of our approach to research at DeepMind.”
“OpenAI has grown into something very different from a traditional laboratory,” says Kayla Wood, a spokesperson for the company. “Naturally that raises some questions.” She notes that OpenAI works with more than 80 industry and academic organizations in the Partnership on AI to think about long-term publication norms for research.
Pineau believes there’s something to that. She thinks AI companies are demonstrating a third way to do research, somewhere between Haibe-Kains’s two streams. She contrasts the intellectual output of private AI labs with that of pharmaceutical companies, for example, which invest billions in drugs and keep much of the work behind closed doors.
The long-term impact of the practices introduced by Pineau and others remains to be seen. Will habits be changed for good? What difference will it make to AI’s uptake outside research? A lot hangs on the direction AI takes. The trend for ever larger models and data sets—favored by OpenAI, for example—will continue to make the cutting edge of AI inaccessible to most researchers. On the other hand, new techniques, such as model compression and few-shot learning, could reverse this trend and allow more researchers to work with smaller, more efficient AI.
Either way, AI research will still be dominated by large companies. If it’s done right, that doesn’t have to be a bad thing, says Pineau: “AI is changing the conversation about how industry research labs operate.” The key will be making sure the wider field gets the chance to participate. Because the trustworthiness of AI, on which so much depends, begins at the cutting edge.
Technology companies have taken many aspects of tech governance from democratically elected leaders. It will take an international effort to fight back.
September 29, 2020
Should Twitter censor lies tweeted by the US president? Should YouTube take down covid-19 misinformation? Should Facebook do more against hate speech? Such questions, which crop up daily in media coverage, can make it seem as if the main technologically driven risk to democracies is the curation of content by social-media companies. Yet these controversies are merely symptoms of a larger threat: the depth of privatized power over the digital world.
Every democratic country in the world faces the same challenge, but none can defuse it alone. We need a global democratic alliance to set norms, rules, and guidelines for technology companies and to agree on protocols for cross-border digital activities including election interference, cyberwar, and online trade. Citizens are better represented when a coalition of their governments—rather than a handful of corporate executives—define the terms of governance, and when checks, balances, and oversight mechanisms are in place.
There’s a long list of ways in which technology companies govern our lives without much regulation. In areas from building critical infrastructure and defending it—or even producing offensive cyber tools—to designing artificial intelligence systems and government databases, decisions made in the interests of business set norms and standards for billions of people.
Increasingly, companies take over state roles or develop products that affect fundamental rights. For example, facial recognition systems that were never properly regulated before being developed and deployed are now so widely used as to rob people of their privacy. Similarly, companies systematically scoop up private data, often without consent—an industry norm that regulators have been slow to address.
Since technologies evolve faster than laws, discrepancies between private agency and public oversight are growing. Take, for example, “smart city” companies, which promise that local governments will be able to ease congestion by monitoring cars in real time and adjusting the timing of traffic lights. Unlike, say, a road built by a construction company, this digital infrastructure is not necessarily in the public domain. The companies that build it acquire insights and value that may not flow back to the public.
This disparity between the public and private sectors is spiraling out of control. There’s an information gap, a talent gap, and a compute gap. Together, these add up to a power and accountability gap. An entire layer of control of our daily lives thus exists without democratic legitimacy and with little oversight.
Why should we care? Because decisions that companies make about digital systems may not adhere to essential democratic principles such as freedom of choice, fair competition, nondiscrimination, justice, and accountability. Unintended consequences of technological processes, wrong decisions, or business-driven designs could create serious risks for public safety and national security. And power that is not subject to systematic checks and balances is at odds with the founding principles of most democracies.
Today, technology regulation is often characterized as a three-way contest between the state-led systems in China and Russia, the market-driven one in the United States, and a values-based vision in Europe. The reality, however, is that there are only two dominant systems of technology governance: the privatized one described above, which applies in the entire democratic world, and an authoritarian one.
To bring globe-spanning technology firms to heel, we need something new: a global alliance that puts democracy first.
The laissez-faire approach of democratic governments, and their reluctance to rein in private companies at home, also plays out on the international stage. While democratic governments have largely allowed companies to govern, authoritarian governments have taken to shaping norms through international fora. This unfortunate shift coincides with a trend of democratic decline worldwide, as large democracies like India, Turkey, and Brazil have become more authoritarian. Without deliberate and immediate efforts by democratic governments to win back agency, corporate and authoritarian governance models will erode democracy everywhere.
Does that mean democratic governments should build their own social-media platforms, data centers, and mobile phones instead? No. But they do need to urgently reclaim their role in creating rules and restrictions that uphold democracy’s core principles in the technology sphere. Up to now, these governments have slowly begun to do that with laws at the national level or, in Europe’s case, at the regional level. But to bring globe-spanning technology firms to heel, we need something new: a global alliance that puts democracy first.
Global institutions born in the aftermath of World War II, like the United Nations, the World Trade Organization, and the North Atlantic Treaty Organization, created a rules-based international order. But they fail to take the digital world fully into account in their mandates and agendas, even if many are finally starting to focus on digital cooperation, e-commerce, and cybersecurity. And while digital trade (which requires its own regulations, such as rules for e-commerce and criteria for the exchange of data) is of growing importance, WTO members have not agreed on global rules covering services for smart manufacturing, digital supply chains, and other digitally enabled transactions.
What we need now, therefore, is a large democratic coalition that can offer a meaningful alternative to the two existing models of technology governance, the privatized and the authoritarian. It should be a global coalition, welcoming countries that meet democratic criteria.
The Community of Democracies, a coalition of states that was created in 2000 to advance democracy but never had much impact, could be revamped and upgraded to include an ambitious mandate for the governance of technology. Alternatively, a “D7” or “D20” could be established—a coalition akin to the G7 or G20 but composed of the largest democracies in the world.
Such a group would agree on regulations and standards for technology in line with core democratic principles. Then each member country would implement them in its own way, much as EU member states do today with EU directives.
What problems would such a coalition resolve? The coalition might, for instance, adopt a shared definition of freedom of expression for social-media companies to follow. Perhaps that definition would be similar to the broadly shared European approach, where expression is free but there are clear exceptions for hate speech and incitements to violence.
Or the coalition might limit the practice of microtargeting political ads on social media: it could, for example, forbid companies from allowing advertisers to tailor and target ads on the basis of someone’s religion, ethnicity, sexual orientation, or collected personal data. At the very least, the coalition could advocate for more transparency about microtargeting to create more informed debate about which data collection practices ought to be off limits.
The democratic coalition could also adopt standards and methods of oversight for the digital operations of elections and campaigns. This might mean agreeing on security requirements for voting machines, plus anonymity standards, stress tests, and verification methods such as requiring a paper backup for every vote. And the entire coalition could agree to impose sanctions on any country or non-state actor that interferes with an election or referendum in any of the member states.
Another task the coalition might take on is developing trade rules for the digital economy. For example, members could agree never to demand that companies hand over the source code of software to state authorities, as China does. They could also agree to adopt common data protection rules for cross-border transactions. Such moves would allow a sort of digital free-trade zone to develop across like-minded nations.
China already has something similar to this in the form of eWTP, a trade platform that allows global tariff-free trade for transactions under a million dollars. But eWTP, which was started by e-commerce giant Alibaba, is run by private-sector companies based in China. The Chinese government is known to have access to data through private companies. Without a public, rules-based alternative, eWTP could become the de facto global platform for digital trade, with no democratic mandate or oversight.
Another matter this coalition could address would be the security of supply chains for devices like phones and laptops. Many countries have banned smartphones and telecom equipment from Huawei because of fears that the company’s technology may have built-in vulnerabilities or backdoors that the Chinese government could exploit. Proactively developing joint standards to protect the integrity of supply chains and products would create a level playing field between the coalition’s members and build trust in companies that agree to abide by them.
The next area that may be worthy of the coalition’s attention is cyberwar and hybrid conflict (where digital and physical aggression are combined). Over the past decade, a growing number of countries have identified hybrid conflict as a national security threat. Any nation with highly skilled cyber operations can wreak havoc on countries that fail to invest in defenses against them. Meanwhile, cyberattacks by non-state actors have shifted the balance of power between states.
Right now, though, there are no international criteria that define when a cyberattack counts as an act of war. This encourages bad actors to strike with many small blows. In addition to their immediate economic or (geo)political effect, such attacks erode trust that justice will be served.
A democratic coalition could work on closing this accountability gap and initiate an independent tribunal to investigate such attacks, perhaps similar to the Hague’s Permanent Court of Arbitration, which rules on international disputes. Leaders of the democratic alliance could then decide, on the basis of the tribunal’s rulings, whether economic and political sanctions should follow.
These are just some of the ways in which a global democratic coalition could advance rules that are sorely lacking in the digital sphere. Coalition standards could effectively become global ones if its members represent a good portion of the world’s population. The EU’s General Data Protection Regulation provides an example of how this could work. Although GDPR applies only to Europe, global technology firms must follow its rules for their European users, and this makes it harder to object as other jurisdictions adopt similar laws. Similarly, non-members of the democratic coalition could end up following many of its rules in order to enjoy the benefits.
If democratic governments do not assume more power in technology governance as authoritarian governments grow more powerful, the digital world—which is a part of our everyday lives—will not be democratic. Without a system of clear legitimacy for those who govern—without checks, balances, and mechanisms for independent oversight—it’s impossible to hold technology companies accountable. Only by building a global coalition for technology governance can democratic governments once again put democracy first.
Marietje Schaake is the international policy director at Stanford University’s Cyber Policy Center and an international policy fellow at Stanford’s Institute for Human-Centered Artificial Intelligence. Between 2009 and 2019, she served as a member of the European Parliament for the Dutch liberal democratic party.
Tech companies are setting norms and standards of all kinds that used to be set by governments.
Technology companies provide much of the critical infrastructure of the modern state and develop products that affect fundamental rights. Search and social media companies, for example, have set de facto norms on privacy, while facial recognition and predictive policing software used by law enforcement agencies can contain racial bias.
In this episode of Deep Tech, Marietje Schaake argues that national regulators aren’t doing enough to enforce democratic values in technology, and it will take an international effort to fight back. Schaake—a Dutch politician who used to be a member of the European parliament and is now international policy director at Stanford University’s Cyber Policy Center—joins our editor-in-chief, Gideon Lichfield, to discuss how decisions made in the interests of business are dictating the lives of billions of people.
Also this week, we get the latest on the hunt to locate an air leak aboard the International Space Station—which has grown larger in recent weeks. Elsewhere in space, new findings suggest there is even more liquid water on Mars than we thought. It’s located in deep underground lakes and there’s a chance it could be home to Martian life. Space reporter Neel Patel explains how we might find out.
Back on Earth, the US election is heating up. Data reporter Tate Ryan-Mosley breaks down how technologies like microtargeting and data analytics have improved since 2016.
Check out more episodes of Deep Tech here.
Show notes and links:
- How democracies can claim back power in the digital world September 29, 2020
- The technology that powers the 2020 campaigns, explained September 28, 2020
- There might be even more underground reservoirs of liquid water on Mars September 28, 2020
- Astronauts on the ISS are hunting for the source of another mystery air leak September 30, 2020
Full episode transcript:
Gideon Lichfield: There’s a situation playing out onboard the International Space Station that sounds like something out of Star Trek…
Computer: WARNING. Hull breach on deck one. Emergency force fields inoperative.
Crewman: Everybody out. Go! Go! Go!
Gideon Lichfield: Well, it’s not quite that bad. But there is an air leak in the space station. It was discovered about a year ago, but in the last few weeks, it’s gotten bigger. And while NASA says it’s still too small to endanger the crew… well… they also still can’t quite figure out where the leak is.
Elsewhere in space, new findings suggest there is even more liquid water on Mars than we thought. It’s deep in underground lakes. There might even be life in there. The question is—how will we find out?
Here on Earth, meanwhile, the US election is heating up. We’ll look at how technologies like microtargeting and data analytics have improved since 2016. That means campaigns can tailor messages to voters more precisely than ever.
And, finally, we’ll talk to one of Europe’s leading thinkers on tech regulation, who argues that democratic countries need to start approaching it in an entirely new way.
I’m Gideon Lichfield, editor-in-chief of MIT Technology Review, and this is Deep Tech.
The International Space Station always loses a tiny bit of air, and it’s had a small leak for about a year. But in August, Mission Control noticed air pressure on board the station was dropping—a sign the leak was expanding.
The crew were told to hunker down in a single module and shut the doors between the others. Mission Control would then have a go at pressurizing each sealed module to determine where the leak was.
As our space reporter Neel Patel writes, this process went on for weeks. And they didn’t find the leak. Until, one night…
Neel Patel: On September 28th, in the middle of the night, the astronauts are woken up. Two cosmonauts and one astronaut that are currently on the ISS. And mission control tells them, “Hey, we think we know where the leak is, finally. You guys have to go to the Russian side of the station in the Zvezda module and start poking around and seeing if you can find it.”
Gideon Lichfield: Okay. And so they got up and they got in the, in the module and they went and poked around. And did they find it?
Neel Patel: No, they have still not found that leak yet. These things take a little bit of time. It’s, you know, you can’t exactly just run around searching every little wall in the module and, you know, seeing if there’s a little bit of cool air that’s starting to rush out.
The best way for the astronauts to look for the leak is a little ultrasonic leak detector. That kind of spots frequencies that air might be rushing out. And that’s an indication of where there might be some airflow where there shouldn’t be. And it’s really just a matter of holding that leak detector up to sort of every little crevice and determining if things are, you know, not the way they should be.
Gideon Lichfield: So as I mentioned earlier, the space station always leaks a little bit. What made this one big enough to be worrying?
Neel Patel: So... the... you know, like I said before, the air pressure was dropping a little bit. That’s an indication that the hole is not stable, that there might be something wrong, that there could allegedly be some kind of cracks that had been growing.
And if that’s the case, it means that the hull of the spacecraft at that point is a little bit unstable. And if the leak is not taken care of as soon as possible, if the cracks are not repaired, as soon as possible, things could grow and grow and eventually reach a point where something might break. Now, that’s a pretty distant possibility, but you don’t take chances up in space.
Gideon Lichfield: Right. And also you’re losing air and air is precious…
Neel Patel: Right. And in this instance, there was enough air leaking that there started to be concerns from both the Russian and US sides that they may need to send in more oxygen sooner than later.
And, you know, the way space operations work, you have things planned over for years in advance. And of course, you know, you still have a leak to worry about.
Gideon Lichfield: So how do leaks actually get started on something like the ISS?
Neel Patel: So that’s a good question. And there are a couple ways for this to happen. Back in 2018, there was a two millimeter hole found on the Russian Soyuz spacecraft.
That was very worrisome and no one understood initially how that leak might’ve formed. Eventually it was determined that a drilling error during manufacturing probably caused it. That kind of leak was actually sort of good news because it meant that, with a drilling hole, things are stable. There aren’t any kind of like aberrant cracks that could, you know, get bigger and start to lead to a bigger destruction in the hull. So that was actually good news then, but other kinds of leaks are mostly thought to be caused by micrometeoroids.
Things in space are flying around at over 20,000 miles per hour, which means even the tiniest little object, even the tiniest little grain of dust could, you know, just whip a very massive hole inside the hull of the space station.
Gideon Lichfield: Ok so those are micrometeoroids that are probably causing those kinds of leaks, but obviously there’s also a growing problem of space debris. Bits of spacecraft and junk that we’ve thrown up into orbit that pose a threat.
Neel Patel: Absolutely, space debris is a problem. It’s only getting worse and worse with every year. Probably the biggest, most high-profile incident that caused the most space debris in history was the 2009 crash between two satellites, Iridium 33 and Cosmos 2251. That was the first and only crash between two operational satellites that we know of so far. And the problem with that crash is it ended up creating tons and tons of debris that were less than 10 centimeters in length. Now objects greater than 10 centimeters are tracked by the Air Force, but anything smaller than 10 centimeters is virtually undetectable so far. That means that, you know, any of these little objects that are under 10 centimeters, which is, you know, a lot of different things, are threats to the ISS. And as I mentioned before, at the speed that these things are moving, they could cause big destruction for the ISS or any other spacecraft in orbit.
Gideon Lichfield: So it’s basically a gamble? Yeah? They’re just hoping that none of these bits crashes into it, because if it does, there’s nothing they can do to spot it or stop it.
Neel Patel: No, our radar technologies are getting better. So we’re able to spot smaller and smaller objects, but this is still a huge problem that so many experts have been trying to raise alarms about.
And unfortunately, the officials who control how we manage the space environment still haven’t come to a consensus about what we want to do about this, what kind of standards we want to implement, and how we can reduce the problem.
Gideon Lichfield: So… They still haven’t found this leak. So what’s going on now?
Neel Patel: Okay. So according to a NASA spokesperson, quote: “There have been no significant updates on the leak since September 30th. Roscosmos, the Russian space agency, released information that further isolated the leak to the transfer chamber of the Zvezda service module. The investigation is still ongoing and poses no immediate danger to the crew.”
Gideon Lichfield: All right, leaving Earth orbit for a bit. Let’s go to Mars. People have been looking for water on Mars for a long time, and you recently reported that there might be more liquid water on Mars than we originally thought. Tell us about this discovery.
Neel Patel: So in 2018, a group of researchers used radar observations made by the European Space Agency’s Mars Express orbiter to determine that there was a giant subsurface lake sitting 1.5 kilometers below the surface of Mars, underneath the glaciers near the south pole. The lake is huge. It’s almost 20 kilometers long, and it’s liquid water. We’re not talking about the frozen stuff that’s sitting on the surface; we’re talking about liquid water. Two years later, the researchers have gone back through even more of that radar data, and what they found is that neighboring that body of water there might be three other lakes. Also nearby, also sitting about a kilometer underground.
Gideon Lichfield: So how does this water stay liquid? I mean Mars is pretty cold, especially around the poles.
Neel Patel: So the answer is salt. It’s suspected that these bodies of water have been able to exist in liquid form for so long, despite the frigid temperatures, because they’re just caked in a lot of salt. Salts, as you might know, can significantly lower the freezing point of water. On Mars it’s thought that there might be calcium, magnesium, sodium, and other salt deposits.
These salts have been found around the globe, and it’s probable that they also exist inside the lakes. That’s what has allowed the water to stay liquid instead of solid for so long.
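The effect Patel describes can be sketched with the standard colligative formula for freezing-point depression, ΔTf = i · Kf · m. That formula is an ideal-dilute-solution approximation, and real Martian brines (likely perchlorates) are far more concentrated, so treat this as an order-of-magnitude sketch; the NaCl numbers below are just an illustrative Earth example.

```python
# Ideal-solution freezing-point depression: dT = i * Kf * m, where
# i is the van 't Hoff factor (ions per formula unit), Kf is water's
# cryoscopic constant, and m is molality. Only accurate for dilute
# solutions; concentrated Martian brines deviate from this.
KF_WATER = 1.86  # K*kg/mol, cryoscopic constant of water

def freezing_point_c(molality, van_t_hoff_i):
    """Freezing point (deg C) of an ideal aqueous salt solution."""
    return 0.0 - van_t_hoff_i * KF_WATER * molality

# Near-saturated NaCl (~6.1 mol/kg, i = 2): the ideal formula predicts
# about -23 C, reasonably close to the real NaCl-water eutectic of ~-21 C.
print(f"{freezing_point_c(6.1, 2):.1f} C")
```

Mars-relevant salts like calcium perchlorate can push the freezing point far lower still, which is part of why liquid brines a kilometer underground are considered plausible.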
Gideon Lichfield: So what would it take to get to these underground lakes, if we could actually be on Mars? And what might we find when we got there?
Neel Patel: These lakes, as I’ve mentioned, are sitting at least one kilometer, sometimes deeper, underground. There’s not really a chance that any future Martian explorers in the next generation or two are going to have the type of equipment that would allow them to drill that deep.
Which is not really a problem for these future colonists. There’s plenty of surface ice at the Martian poles that’s easier to harvest in case they want to create drinking water or turn it into hydrogen-oxygen rocket fuel.
The important thing to think about is whether these underground lakes perhaps possess Martian life. As we know on Earth, life can exist in some very extreme conditions, and there’s at least a nonzero chance that these lakes also possess the same sort of extreme microbes that can survive these kinds of frigid temperatures and salty environments.
Gideon Lichfield: Alright, so maybe we don’t want to try to drink this water, but it would be great if we could explore it to find out if there is in fact life there. So is there any prospect that any current or future space mission could get to those lakes and find that out?
Neel Patel: No, not anytime soon. Drilling equipment is very big and very heavy. There’s no way you’re going to be able to fit something like that on a spacecraft that’s going to Mars. But one way we might be able to study the lakes is by measuring the seismic activity around the south pole.
If we were to place a small lander on the surface of Mars and have it drill just a little way into the ground, it could measure the vibrations coming out of Mars. It could use that data to characterize how big the lakes are and what their shape might be. And by extension, we may be able to use that data to determine in what locations of the lakes life might exist, and figure out where we want to probe next for further study.
Gideon Lichfield: Technology has been an increasingly important part of political campaigns in the US, particularly since Barack Obama used micro-targeting and big data to transform the way that he campaigned. With every election since then, the techniques have gotten more and more sophisticated. And in her latest story for MIT Technology Review, Tate Ryan-Mosley looks at some of the ways in which the campaigns this time round are segmenting and targeting voters even more strategically than before. So Tate, can you guide us through what is new and improved and how things have changed since the 2016 election?
Tate Ryan-Mosley: Yeah. So I’ve identified four key continuations of trends that started in prior presidential elections, and all of the trends are pushing toward this new era of campaigning where all of the messaging, the positioning, the presentation of the candidates is really being personalized for each individual person in the United States. And the key things driving that are really data acquisition, so the amount of data that these campaigns have on every person in the United States. Another new thing is data exchanges, which are kind of the structural mechanism by which all of this data is aggregated, shared, and used.
And then the way that data gets pushed into the field and into strategy is, of course, microtargeting. And this year we’re seeing campaigns employ things with much more granularity, like using SMS as one of the main communication tools to reach prospective voters, and actually uploading lists of profile names into social media websites. And lastly, a big shift in 2020 is a clearer move away from traditional opinion polling toward AI modeling. So instead of having these big polling companies call a bunch of people and try to get a sense of the pulse of the election, you’re really seeing AI being leveraged to predict the outcomes of elections and of particular segments.
Gideon Lichfield: So let’s break a couple of those things down. One of the areas that you talked about is data exchanges, and there’s a company that you write about in your story called Data Trust. Can you tell us a bit about who they are and what they do?
Tate Ryan-Mosley: So Data Trust is the Republicans’ main data-aggregation technology. What it enables them to do is collect data on all prospective voters, host that data, analyze it, and actually share it with politically aligned PACs, 501(c)(3)s, and 501(c)(4)s. Previously, because of FEC regulations, you weren’t allowed to cross that wall between campaigns and 501(c)(3)s, 501(c)(4)s, and PACs. The way these data exchanges are set up is enabling data sharing between those groups.
Gideon Lichfield: How does that not cross the wall?
Tate Ryan-Mosley: Right. So basically, what they say is the data is anonymized to the point that you don’t know where it’s coming from. And that’s how they’ve been able to skirt the rules. The Democrats actually sued the Republicans after the 2016 election, and they lost. What’s really notable is that this year the Democrats have created their own data exchange, which is called DDX. So this is the first year the Democrats will have any similar technology. And since it came online, they’ve actually collected over 1 billion data points, which is a lot of data.
Gideon Lichfield: So these data exchanges allow basically a campaign and everyone that is aligned with it, supporting it, to share all the same data. And what is that enabling them to do that they couldn’t do before?
Tate Ryan-Mosley: Yeah, that’s a good question. What it’s really doing is enabling a lot of efficiency in the way voters are being reached. So there’s a lot of double spend on voters who are already decided. For example, the Trump campaign might be reaching out to a particular voter that a group like the NRA has already determined to be conservatively aligned and very likely to vote for Trump. But the Trump campaign doesn’t know that in their data set. So this would enable the Trump campaign to not spend money reaching out to that person. And it makes the efficiency and the comprehensiveness of their outreach kind of next level.
Gideon Lichfield: So let’s talk about micro-targeting. The famous example of micro-targeting of course, is Cambridge Analytica, which illicitly acquired a bunch of people’s data from Facebook in the 2016 campaign, and then claimed that it could design really specific messages aimed at millions of American voters. And a lot of people, I think called that ability into question, right. But where are we now with microtargeting?
Tate Ryan-Mosley: There’s kind of a misconception around the way in which microtargeting is impactful. What Cambridge Analytica claimed to do was use data about people’s opinions and personalities to profile them and create messages that were really likely to persuade a person about a specific issue at a particular time. And that’s what’s been debunked. Political ads, political messages, are not actually significantly more persuasive now than they’ve ever been. And really, you can’t prove it. There’s no way to attribute a vote to a particular message or a particular ad campaign.
Tate Ryan-Mosley: So what’s really become the consensus about why microtargeting is important is that it increases the polarization of the electorate, or the potential electorate. Basically, it’s really good at identifying already-decided voters and making them more mobilized, so more vocal about their cause and their position, or bringing them increasingly into the hard line, and even getting them to donate. We saw this pretty clearly with the Trump campaign’s app that they put out this year.
There’s a lot of surveillance built into the structure of the app that is meant to microtarget their own supporters. And the reason they’re doing that is that it’s seen as the number one fundraising mechanism. If you can convince somebody who agrees with Trump to get really impassioned about Trump, that means money.
Gideon Lichfield: Let’s talk about another thing, which is polling. Of course, the difficulty with polling that we saw in the 2016 election was people don’t answer their phones anymore and doing an accurate opinion poll is getting harder and harder. So how is technology helping with that problem?
Tate Ryan-Mosley: So what’s being used now is AI modeling, which basically takes a bunch of data and spits out a prediction about how likely a person is to show up to vote, to vote in a particular way, or to feel a certain way about a particular issue. These AI models were also used in 2016, and it’s worth noting that in 2016, AI models were about as accurate as traditional opinion polls, in terms of really not predicting that Trump was going to win. But as the data richness gets better, as data becomes more real-time, as the quality improves, we’re seeing increased accuracy in AI modeling, which suggests it’s likely to become a bigger and bigger part of how polling is done.
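The kind of voter model Ryan-Mosley describes is, at its core, a classifier that maps a voter’s attributes to a probability. Here is a minimal sketch of that idea. The features and weights are entirely made up for illustration; real campaign models are trained on voter-file and consumer data with far richer features and learned parameters.

```python
import math

# Hypothetical, hand-set weights for a toy turnout model -- illustrative
# only, not from any real campaign. Features: age in decades, how many of
# the last 4 elections the person voted in, and prior-donor status.
WEIGHTS = {"intercept": -2.0, "age_decades": 0.3, "past_votes": 0.8, "donor": 1.2}

def turnout_probability(age_decades, past_votes, donor):
    """Logistic model: estimated probability a voter shows up on election day."""
    z = (WEIGHTS["intercept"]
         + WEIGHTS["age_decades"] * age_decades
         + WEIGHTS["past_votes"] * past_votes
         + WEIGHTS["donor"] * (1 if donor else 0))
    return 1 / (1 + math.exp(-z))  # sigmoid squashes the score into (0, 1)

# A frequent voter who has donated scores far higher than a new voter.
frequent = turnout_probability(age_decades=6, past_votes=4, donor=True)
new_voter = turnout_probability(age_decades=2, past_votes=0, donor=False)
print(f"frequent voter: {frequent:.2f}, new voter: {new_voter:.2f}")
```

In practice a campaign would fit the weights on historical data rather than setting them by hand, and would run separate models for turnout, candidate support, and issue positions to decide whom to contact and whom to skip.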
Gideon Lichfield: So what we’re seeing is that this election represents a new level in the use of technologies that we’ve seen over the past decade or more, that are giving campaigns the ability to target people ever more precisely, to share data about people more widely and use it more efficiently, as well as to predict which way voters are going to go much more reliably. So what does all this add up to? What are the consequences for our politics?
Tate Ryan-Mosley: What we’re really seeing is kind of a fragmentation of campaign messaging, and the ability to scale those fragments and those silos up. What’s happening is it’s becoming significantly easier for campaigns to say different things to different groups of people, and that skirts some of the norms we have in public opinion and civic discourse, around lying, around switching positions, around distortion, that have in the past really been able to check public figures.
Gideon Lichfield: Because politicians can say one thing to one group of people, a completely different thing to a different group. And the two groups don’t know that they’re being fed different messages.
Tate Ryan-Mosley: Exactly. So the Biden campaign can easily send out a text message to a small group of, say, 50 people in a swing county that says something really specific to their local politics. And most people would never know, or really be able to fact-check them, because they just don’t have access to the messages that campaigns are giving to really specific groups of people.
And so that’s really changing the way we have civic discourse. It even allows some campaigns to manufacture cleavages in the public. A campaign can game out how it wants to be viewed by a specific group of people and hit those messages home, creating a cleavage that previously wasn’t there or wouldn’t be there organically.
Gideon Lichfield: Does that mean that American politics is just set to become irretrievably fragmented?
Tate Ryan-Mosley: I mean, that’s absolutely the concern. What’s interesting is I’ve talked to some experts who actually feel that this might indeed be the pinnacle of campaign technology and personalized campaigns, because public opinion is really shifting on this. The Pew Research Center actually just did a survey, which came out this month, showing that the majority of the American public does not think social media platforms should allow any political advertisement at all.
And a large majority of Americans believe that political microtargeting, especially on social media, should be disallowed. We’re starting to see that reflected in Congress. There are a handful of bills with bipartisan support that have been introduced in both the House and the Senate seeking to address some of these issues. Obviously we won’t see the impact of that before the 2020 election, but a lot of experts are pretty hopeful that we’ll see some legitimate regulation before the upcoming presidential election in 2024.
Gideon Lichfield: Tech companies are setting norms and standards of all kinds that used to be set by governments. That’s the view of Marietje Schaake, who wrote an essay for us recently. Marietje is a Dutch politician who used to be a member of the European Parliament and is now international policy director at Stanford University’s Cyber Policy Center. Marietje, what’s a specific example of the way in which the decisions that tech companies have made end up effectively setting the norms for the rest of us?
Marietje Schaake: Well, I think a good example is how facial recognition systems, and even the whole surveillance model of social media and search companies, have set de facto norms compromising the right to privacy. If you look at how much data is collected across a number of services, the fact that there are data brokers renders the right to privacy very, very fragile, if not compromised as such. And so I think that is an example, especially where there are no laws to begin with, of how a de facto standard is very, very hard to roll back once it’s set by the companies.
Gideon Lichfield: Right. So how did we get to this?
Marietje Schaake: Yeah, that’s the billion-dollar question. I think we have to go back to the culture that went along with the rise of companies coming out of Silicon Valley, which was essentially quite libertarian. These companies, these entrepreneurs, these innovators, may have had good intentions, may have hoped that their inventions and their businesses would have a liberating effect, and they convinced lawmakers that the best support they could give this liberating technology was to do nothing in the form of regulation. And effectively, in the US and in the EU, even if the EU is often called a super-regulator, there has been very, very little regulation to preserve core principles like non-discrimination or antitrust in light of the massive digital disruptions. So the success of the libertarian culture from Silicon Valley, and the power of big tech companies that can now lobby against regulatory proposals, explains why we are where we are.
Gideon Lichfield: One of the things that you say in your essay is that there are actually two kinds of regulatory regimes in the world, for tech. There’s the privatized one, in other words, in Western countries the tech companies are really the ones setting a lot of the rules for how the digital space works. And then there’s an authoritarian one which is China, Russia, and other countries where governments are taking a very heavy handed approach to regulation. What are the consequences then of having a world in which it’s a choice between these two regimes?
Marietje Schaake: I think the net result is that the resilience of democracy, and actually the articulation of democratic values, the safeguarding of democratic values, the building of institutions, has lagged behind. And this comes at a time when democracy is under pressure globally anyway. We can see it in our societies. We can see it on the global stage, where in multilateral organizations it is not a given that the democracies have the majority of votes or voices. So all in all, it makes democracy, and, projected out into the future, the democratic mark on the digital world, very fragile. And that’s why I think there’s reason for concern.
Gideon Lichfield: Okay. So in your essay, you’re proposing a solution to all of this, which is a kind of democratic alliance of nations to create rules for tech governance. Why is that necessary?
Marietje Schaake: Right. I think it’s necessary for democracies to work together much more effectively, and to step up their role in developing a democratic governance model for technology. And I think it’s necessary because, on the one hand, there’s the growing power of corporations and their ability to set standards and effectively govern the digital world.
And then on the other hand, there’s a much more top-down, control-oriented, state-led model that we see in states like China and Russia. There’s just too much of a vacuum on the part of democracies. And I think if they work together, they’re in the best position to handle cross-border companies and to have an effective way of working together to make sure they leverage their collective scale, essentially.
Gideon Lichfield: Can you give an example of how this democratic coalition would work? What sorts of decisions might it take or where might it set rules?
Marietje Schaake: Well, let me focus on one area that I think needs a lot of work and attention. And that is the question of how to interpret laws of war and armed conflict but also the preservation of peace and accountability after cyber attacks.
So right now, because there is a vacuum in the understanding of how the laws of armed conflict and thresholds of war apply in the digital world, attacks happen every day, but often without consequences. And the notion of accountability, I think, is very important as part of the rule of law, to ensure that there is a sense of justice in the digital world as well. So I can very well imagine that in this space, which really needs to be articulated and shaped now with institutions and mechanisms, the democracies could really focus on that area of war, of peace, of accountability.
Gideon Lichfield: So when you say an attack happens without consequences, you mean some nation state or some actor launches a cyber attack and nobody can agree that it should be treated as an act of war?
Marietje Schaake: Exactly. I think that is happening far more often than people might realize. And in fact, because there is such a legal vacuum, it’s easy for attackers to stay in a zone where they can almost anticipate that they will not face any consequences. Part of this is political: how willing are countries to come forward and point to a perpetrator? But it’s also that there’s currently a lack of proper investigation to ensure that there might be something like a trial, you know, a court of arbitration where different parties can speak about their side of the conflict, and that there would be a ruling by an independent, judiciary-type organization, to make sure that there is an analysis of what happened, but also consequences for clearly escalatory behavior.
And if the lack of accountability continues, I fear it will play into the hands of nations and their proxies. So addressing the current failure to hold to account perpetrators who launch cyber attacks to achieve their geopolitical or even economic goals is very urgent. I would imagine that a kind of tribunal or a mechanism of arbitration could really help close this accountability gap.
Gideon Lichfield: That’s it for this episode of Deep Tech. This is a podcast just for subscribers of MIT Technology Review, to bring alive the issues our journalists are thinking and writing about.
Before we go, I want to quickly tell you about EmTech MIT, which runs from October 19th through the 22nd. It’s our flagship annual conference on the most exciting trends in emerging technology.
This year, it’s all about how we can build technology that meets the biggest challenges facing humanity, from climate change and racial inequality to pandemics and cybercrime.
Our speakers include the CEOs of Salesforce and Alphabet X, the CTOs of Facebook and Twitter, the head of cybersecurity at the National Security Agency, the head of vaccine research at Eli Lilly, and many others. And because of the pandemic, it’s an online event, which means it’s both much cheaper than in previous years and much, much easier to get to.
You can find out more and reserve your spot by visiting EmTechMIT.com – that’s E-M…T-E-C-H…M-I-T dot com – and use the code DeepTech50 for $50 off your ticket. Again, that’s EmTechMIT.com with the discount code DeepTech50.
Deep Tech is written and produced by Anthony Green and edited by Jennifer Strong and Michael Reilly. I’m Gideon Lichfield. Thanks for listening.