Wave 7: Robotics and Physical AI

Wave 7 asked panelists to consider the physical impacts of AI and robotics. Respondents were asked to forecast global industrial robot supply, the U.S. labor share, when an autonomous robot will first pass the Coffee Test and perform an appendectomy without human help, and Amazon's real e-commerce sales per distribution worker.

First released on: 19 April 2026

The following report summarizes responses from 214 experts, 55 superforecasters, and 630 members of the public, collected between Mar 09, 2026 and Mar 31, 2026. The expert respondents comprised 48 computer scientists, 44 industry professionals, 52 economists, and 70 research staff at policy think tanks.

Our wider website contains more information about LEAP, our Panel, and our Methodology, as well as reports from other waves.

Insights

  1. Experts expect robots to be able to reliably make coffee in an unfamiliar environment by 2034. The "Coffee Test" requires a robot to make a cup of coffee in three different unstructured, unfamiliar environments of equivalent complexity to an ordinary home, with no mistakes. The median expert forecasts that an autonomous robot will first successfully complete the Coffee Test in 2034, with superforecasters predicting 2032 and the public giving a substantially later forecast of 2039. Forecasters' rationales focused on the challenges of fine-motor manipulations in varied, unstructured environments and on incentives to solve these problems.
  2. Respondents expect autonomous robotic surgery to arrive between 2035 and 2040. Experts forecast that a robot will first successfully perform an appendectomy without physical human help around 2040, with superforecasters predicting 2035 and the public 2039.¹ The later timeline for surgery is notable given that, unlike the Coffee Test, the appendectomy test explicitly permits verbal coaching by a human. In rationales, forecasters expecting a later resolution date cited regulatory barriers, malpractice liability, and surgeon lobbying as potential multi-decade blockers of an AI appendectomy.
  3. Despite forecasting a world with far more industrial robots, experts forecast labor's share of the economy to fall only modestly. From a 2024 baseline of 58.3%, the median expert forecasts U.S. labor share declining to 51% by 2040, a roughly 7 percentage point drop over 16 years, compared to the roughly 10 percentage point decline that occurred over the previous 23 years.² Over the same period, experts predict industrial robot stocks will grow from 4.7 million to 16 million units. Forecaster rationales suggest that experts see the growth in industrial robots as important, but many note that labor share changes slowly, embedded in institutional and demographic structures that change over decades rather than years. As one moderate-decline respondent put it: "labor share moves like a supertanker." Even respondents who cite advances in AI in their rationales tend to anchor to the current downward trend in labor share rather than forecast an acceleration.
  4. Experts forecast Amazon's e-commerce sales per distribution employee to rise, citing warehouse automation as a key factor. The median expert forecasts Amazon's real e-commerce net sales per distribution employee to rise from its 2024 baseline of $464,771 to $480,000 by 2026, $549,842 by 2030, and $800,000 by 2040.³ Superforecasters are more bullish, predicting over $1,000,000 by 2040. Forecasters cited the suitability of warehouses for robotic deployment, Amazon's track record in automation, and a decoupling of sales from labor as key factors driving their predictions.

Questions

  • Industrial Robotics Supply: How many industrial robots will be in use globally in the following resolution years?

  • Labor Share: What will the U.S. labor share be in the following resolution years for the private, nonfarm sector?

  • Coffee Test: In what year will an autonomous robot first successfully perform the “Coffee Test” (see Background and Resolution Criteria for precise definitions) in 3 consecutive trials, each in distinct homes?

  • Robotic Surgeon: In what year will a robot first successfully perform an appendectomy—surgery to remove an appendix—on a live human, without physical human help?

  • Amazon Distribution Workers: What will Amazon’s real e-commerce net sales per distribution employee be in the fiscal years ending in or before each of the following resolution years (in 2025 USD, i.e. adjusted for inflation)?

For full question details and resolution criteria, see below.

Results

In this section, we present each question, and summarize the forecasts made and the reasoning underlying those forecasts. More concretely, we present background material, historical baselines, and resolution criteria; graphs, results summaries, results tables; as well as rationale analyses and rationale examples. In the first three waves, experts and superforecasters wrote over 600,000 words supporting their beliefs. We analyse these rationales alongside predictions to provide significantly more context on why experts believe what they believe, and the drivers of disagreement, than the forecasts alone.

Industrial Robotics Supply

Question. How many industrial robots will be in use globally in the following resolution years?

Industrial Robotics Supply. The figure above shows the distribution of forecasts by participant group, illustrating the median (50th percentile) and interquartile range (25th–75th percentiles) of each forecast.

Results. All participant groups forecast continued growth in the number of industrial robots from the 2024 estimate of approximately 4.66 million. However, the pace of this growth diverges substantially among the groups, especially by 2040. For 2026, experts, superforecasters, and the public give median forecasts of roughly 5.75 million, 5.70 million, and 5.40 million units, respectively, representing agreement on modest near-term growth. By 2030, the public's forecasts begin to diverge: the median expert forecasts around 8.1 million robots, superforecasters forecast a similar 8.0 million, while the public is more conservative at 7.0 million. This divergence becomes most pronounced by 2040, where experts forecast a median of 16 million robots and superforecasters a comparable 16.5 million, versus just 10 million for the public. Uncertainty is wide and grows substantially over time: by 2040, the median 25th and 75th percentile forecasts span roughly 12 to 22 million for both experts and superforecasters.
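
The median forecasts can be put on a common footing by converting them into implied compound annual growth rates. The sketch below is illustrative arithmetic only and is not part of the survey methodology. It assumes a constant growth rate from the 2024 estimate, and the helper name `implied_cagr` is hypothetical.

```python
# Median expert forecasts quoted in the Results paragraph (millions of units).
BASELINE_YEAR = 2024
BASELINE_STOCK = 4.66
EXPERT_MEDIANS = {2026: 5.75, 2030: 8.1, 2040: 16.0}


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns start_value into end_value.

    Assumption: growth compounds at the same rate every year, which the
    forecasts themselves do not claim.
    """
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    for year, stock in EXPERT_MEDIANS.items():
        rate = implied_cagr(BASELINE_STOCK, stock, year - BASELINE_YEAR)
        print(f"{BASELINE_YEAR} -> {year}: {rate:.1%} per year")
```

Under this constant-rate assumption, the expert medians imply growth of roughly 11% per year to 2026, 10% per year to 2030, and 8% per year to 2040.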

Rationale analysis

  • Physical and capital constraints on deployment: Moderate-growth respondents emphasize that industrial robot adoption is governed by slow capital replacement cycles, integration costs, and manufacturing capacity rather than technological capabilities. One writes: "Factories are not software. A steel mill or automotive stamping line might operate for thirty years before its core architecture changes." But high-growth respondents tend to argue that more versatile robots, produced at scale, mean these traditional constraints are unlikely to apply: "The big step change is the possibility of operating autonomously in changing environments, which would mean that all these robots can be deployed without substantial upfront investments in customizing their development for deployment. This would make deployment faster, and the number of places where they can be deployed larger."
  • AI's impact on industrial robotics adoption: Many high-growth respondents argue that breakthroughs in AI's cognitive abilities "are likely to expand the range of tasks robots can perform, including [in] semi-structured environments like warehouses and light manufacturing, accelerating adoption," whereas several moderate-growth respondents question the "correlation between the types of AI advances that are occurring and the deployment of industrial robots." One argues that growth is likely to be "largely linear and unaffected by technologies like AI…AI makes no sense for precision movements that need tight coordination with PLCs [programmable logic controllers] for reliable real-time execution."
  • Demographic drivers: Many high-growth forecasters see aging populations in China, Japan, South Korea, Germany, and elsewhere in Europe as a major driver of automation: "Long-term trends such as aging populations and labor shortages in major economies will further push industries to replace human labor with robots, especially in automotive, electronics, logistics, and metalworking." Several moderate-growth forecasters, however, point to countervailing forces in the developing world: "India, Southeast Asia, Latin America are largely untapped but have much lower manufacturing wages, which slows adoption."; "So much of the world is younger with human labor in industrial environments and those regions have growing, younger populations."
  • Linear trend continuation vs. exponential inflection: Many high-growth respondents anticipate an acceleration well beyond historical trends, often invoking exponential or hockey-stick dynamics. One highlights "two factors for this: 1) Technological development will continue, expanding the realm for robot performance and suppress[ing] per-unit costs (supply effects). 2) Globalization of markets and economic growth in low-/middle-income countries will expand the market for robot sales across the globe (demand effect)." Another writes, "in order to see truly explosive growth in robotics, robots should be able to make new robots that in turn make new robots. Given the recent progress in robots…and the possibility of software-only AI speeding up progress in robotics in the coming years, I think this is a perfectly conceivable possibility by 2040." Moderate-growth respondents instead tend to anchor heavily on trend extrapolation, particularly for near-term projections: "My median is mostly following the largely linear trend since 2020."; "Linear trend extrapolation is the safe bet here."
  • Potential for market saturation: Some moderate-growth respondents argue saturation in specific industries could be a factor. One writes: "China, by far the biggest market, is getting closer to saturation in mature segments like automotive welding and electronics assembly, even though earlier-stage segments still have a lot of room." But another forecaster notes that "South Korea has more than double the number of robots per 10,000 workers as China (roughly 1,000 to 500), and so there is scope for catch up in China as well as…better robots doing more tasks, expanding the scope of robots," and others point to untapped sectors—food processing, construction, logistics, textiles, and SMEs [small and medium enterprises]—as sources of future demand.
  • Humanoid robots as a wildcard: Many high-growth respondents assign significant probability to humanoid robots upending the existing dynamic: "I expect humanoid robots (e.g., Tesla Optimus, Figure 01, 1X NEO) to be classified within IFR's 'industrial robot' operational stock by the mid-2030s as they are deployed at scale in factory and warehouse settings." Moderate-growth respondents either ignore humanoids or express skepticism. One writes: "There may not be that many industrial jobs for humanoid robots to do by the time they are cheap enough to deploy effectively."

Rationale examples, moderate-growth respondents:

The inertia in industrial capital cycles is enormous…industrial change is governed far more by capital replacement cycles (replacement of outdated equipment, new technology lines, and retrofit) than by technological possibility. A robot arm itself is relatively cheap compared to the cost of redesigning the production cell around it. Conveyor geometry, safety systems, sensor integration, tooling tolerances, and workforce retraining all have to change. Those costs dominate the economics of automation, which is why adoption tends to proceed as existing lines age out rather than through wholesale replacement…The central uncertainty in all of this is not whether robots work. They clearly do. The question is whether the global labor market exerts enough pressure to justify the enormous cost of retooling existing production systems. Many industries continue operating profitably with hybrid labor models, particularly in regions where wages remain low and labor pools remain large. As long as that remains true the economic incentive to replace entire production lines remains limited.

I am skeptical of forecasts that treat advances in AI as translating directly into very rapid growth in industrial robot stock. What seems more plausible to me is a slower and more uneven translation process, where gains in capability still have to pass through the stubborn material conditions of production and use. Industrial robot stock does not expand simply because more tasks become technically tractable. It grows when manufacturing capacity, component supply, installation, factory redesign, maintenance, and replacement cycles can be coordinated at scale, and when firms decide that those investments are worth making.

Industrial robotics applications are generally expensive, and the number of installations that the average manufacturing company completes within a given year is limited by two factors: 1) budget, and; 2) internal resources to manage the process. Even with significant advances in robotics, I would expect most new installations in the coming years to be standard articulated and SCARA [selective compliance articulated robot arm] robots. Despite claims of deployment at scale from 2026 onward, I don't think it's reasonable to expect humanoid robots (Optimus-style) to be manufactured and deployed at a massive scale before 2030.

Rationale examples, high-growth respondents:

By 2040 we probably have superhuman AGI, an industrial explosion is underway and robots are a critical element. There's no reason why they can't be mass-manufactured like cars are.

With both China and the US racing to develop humanoid robots controlled by ever more capable AI, the state of the art for all robot devices could dramatically increase. The development of actuators for arms and fingers can be applied to static robot arms increasing their dexterity and usefulness. The development of AI for 3D environments coupled with LLM text manipulators should greatly increase the speed of deployment and the types of problems addressed. Deployment may not be a hockey stick but will have a dramatically positive slope for the foreseeable future.

Major manufacturing hubs (China, Japan, Germany, South Korea) are facing shrinking working-age populations, making automation an existential necessity rather than just a cost-saving measure…Geopolitical tensions are driving 'friend-shoring' and the construction of new manufacturing facilities (e.g., semiconductor fabs, EV battery plants) in Western countries, which require high levels of automation to offset higher domestic labor costs.

Robotic manufacturing is going to boom in a very non linear fashion, similar to chatgpt, once we have one robot that is highly skilled we will be able to copy its weights on many different machines which will push the production very quickly.

Labor Share

Question. What will the U.S. labor share be in the following resolution years for the private, nonfarm sector?

Labor Share. The figure above shows the distribution of forecasts by participant group, illustrating the median (50th percentile) and interquartile range (25th–75th percentiles) of each forecast.

Results. All participant groups forecast that U.S. labor share will remain close to its current value in the near term. From a 2024 baseline of 58.3%, the median expert, public participant, and superforecaster each forecast labor share will edge down to 58% in 2026 and to 56% by 2030. By 2040, forecasts diverge modestly, with the median public participant and the median superforecaster both forecasting labor share falling to 53%, compared with 51% for experts. The expert distribution is shifted lower than the public's by 2040, especially at the 25th and 50th percentiles, where the differences are statistically significant. Even so, the central forecast in all three groups remains above 50%.
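
One way to read these medians is as an average pace of decline in percentage points per year, set against the historical decline cited elsewhere in this report (roughly 68% in 2001 to 58% in 2024). The sketch below is illustrative arithmetic under that framing, not part of the survey methodology. The helper name `pp_per_year` is hypothetical, and the historical endpoints are the rounded values quoted in the report.

```python
def pp_per_year(start_share, end_share, start_year, end_year):
    """Average change in percentage points per year (negative means decline)."""
    return (end_share - start_share) / (end_year - start_year)


# Historical decline as quoted in the report: roughly 68% (2001) to 58% (2024).
historical = pp_per_year(68.0, 58.0, 2001, 2024)

# Median 2040 forecasts from the 2024 baseline of 58.3%.
expert_2040 = pp_per_year(58.3, 51.0, 2024, 2040)
public_and_super_2040 = pp_per_year(58.3, 53.0, 2024, 2040)

if __name__ == "__main__":
    print(f"Historical, 2001-2024:          {historical:+.2f} pp/year")
    print(f"Experts, 2024-2040:             {expert_2040:+.2f} pp/year")
    print(f"Public and superforecasters:    {public_and_super_2040:+.2f} pp/year")
```

On these figures, the median expert forecast implies a decline of about 0.46 percentage points per year, close to the historical pace of about 0.43, while the 53% median of the other two groups implies about 0.33.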

Rationale analysis

  • Historical precedent vs. transformative AI: Nearly all forecasters expect the decline in the U.S. labor share—"from roughly 68% in 2001 to 58% in 2024"—to continue. The core difference among forecasters is by how much. Moderate-decline respondents tend to anchor on the existing trendline, particularly for near-term outcomes. One notes, "labor share moves like a supertanker," and another that, "labor share tends to change slowly because it is embedded in deep institutional and economic structures…labor markets adjust with demographic cycles measured in decades." These forecasters see AI as contributing to future declines, but in a way that fits neatly into the existing, long-term drive toward automation and that also includes other major drivers, such as "globalization, capital deepening, and structural shifts in bargaining power." Major-decline respondents argue that advances in AI will significantly accelerate the existing trend, particularly when it comes to long-term outcomes: "The U.S. labor share has decreased by roughly 10 p.p over the past 40 years…transformative AI will produce a similar decline in the labor share, but in less than half the time."
  • AI as substitute for labor: Major-decline respondents view AI as primarily substituting for labor across expanding domains. One writes: "By [2040] it is likely that we will have made such significant gains in AI developments that AI systems can outperform humans at: a) nearly [all] cognitive tasks and; b) a significant share of physical tasks. This means that we will see an unprecedented decline in the labor share for nonfarm sectors such as finance, insurance, information technology and others where a small remaining share of employees occupy managerial positions or those around maintenance of AI systems and infrastructure." Moderate-decline respondents typically envision a slower AI-diffusion timeline and stress that AI is likely to change the nature of, but not eliminate, human labor: "Near-term AI adoption is more likely to reconfigure tasks and increase capital intensity within occupations rather than fully displace labor at scale," writes one, and another that "if AI and automation increases labor productivity in manufacturing…and a bigger share of labor is used in the provision of personal services that are not really suitable to automate (e.g. personal care), then it's possible that the labor share will increase."
  • Policy responses: Many moderate-decline respondents cite the potential for government policies to act as a stabilizing force. "Robots don't vote humans do," writes one, adding, "governments could decide that some classes of jobs should be reserved for humans…many non-traditional professions require a licence, which is often just a job protection measure." Another notes the question outcome "does not just depend on the technology but also on how regulation will react to a decline in labor share. Such a reaction could be to enforce a higher labor share, to implement something closer to universal basic income, or to do neither." Major-decline respondents express skepticism. One writes, "It is possible to stem this drop but it will require greater political will and cooperation than has been apparent in the last 25 years." Another notes that not all mitigating responses would stem the labor share decline: "I expect mounting political discontent and calls for windfall taxes or UBI by the early 2030s, government transfers generally do not count as 'labour share' in BLS methodology, so even if society adapts to keep humans materially comfortable, the technical metric will still crater."
  • Capital capture of productivity gains: Many major-decline respondents argue that most AI-generated value will flow to owners of capital: "The rise in AI will increase productivity and profits, but those yields will go to shareholders, not the actual workers."; "Labor share will tend to go down, unless we do something about the inequality between labor and capital. Given the current trends it is unlikely to happen." Although a minority, several forecasters argue countervailing wage pressures exist, particularly from demographics: "Perhaps more important in the short term is the aging population in the U.S. and the decline of immigration. That will put pressure on the labor market and might increase pressure to drive this upward."
  • Existence of a structural floor: Moderate-decline respondents argue labor share cannot fall indefinitely without systemic breakdown: "It obviously can't go to zero, because then no one is earning a living...otherwise there will be no income earned by people to spend on the goods that the AI-assisted businesses are selling." Major-decline respondents either don't engage with this argument or tend to point to the possibility of an exceptionally low floor, with one writing: "My median of 25% assumes a post-scarcity or ASI-driven economy where biological human labour has been relegated to a niche prestige market."

Rationale examples, moderate-decline respondents:

The decades-long decline in U.S. labor share from roughly 68% in 2001 to 58% in 2024 is striking, but the drivers appear to be largely structural and slow-moving, including increased market concentration, the shift toward asset-light and platform-based business models, and long-run changes in bargaining power, rather than primarily automation. Given that the trend has already decelerated somewhat in recent years and seems to be approaching a floor, I've projected a modest continued decline to around 57% by 2026 and 2030.

The main upward forces are: demographic labor scarcity (shrinking prime-age workforce bidding up wages), the reinstatement effect (automation historically creating new human task categories), and possible policy responses such as antitrust enforcement and labor market regulation.

The future of labor will not be like 'The Jetsons' cartoon, in which labor is a dumb widget pushing a button—rather labor will bifurcate between a more productive, higher paid skilled workforce boasted by mastery of LLM and robotics versus lower productive, lower paid non-tech workforce whose productivity is essentially unchanged. A key piece to slowing this trend will be enacting policies to protect labor (workers) and this has historically been difficult until a crisis is at hand (e.g., the horrific working conditions present in the industry during the late 1890s and early 1900s finally spurred labor reform). This time, protections for workers displaced by automation will need to be enacted at potentially considerable expense.

Following Acemoglu's recent work (e.g., The Simple Macroeconomics of AI), I expect AI's macro effects over the next decade to be relatively modest, with the effect on labor share depending on the balance between automation and the creation of new labor-using tasks. This leads me to predict a gradual further decline in labor share by 2026 and 2030, and a wider range of possible outcomes by 2040.

Rationale examples, major-decline respondents:

The primary driver is the race between AI automating existing tasks (lowering labor share) and AI creating new, human-centric tasks (raising it). I assume automation will outpace reinstatement. AI technologies require massive capital investment (compute, energy, robotics). The returns on these investments will naturally flow to capital owners rather than workers…This historical trajectory demonstrates a consistent shift of economic output toward capital, heavily influenced by software automation and globalization. The downward pressure reported in Q3 2025 suggests that current macroeconomic conditions and early AI efficiencies are continuing this established trend.

AI will tremendously accelerate this rate of decline, most especially by 2040. This will happen for two reasons, both heavily AI driven. The first is that AI automation of both physical and cognitive work is going to eventually wipe out as much as 80% of jobs people now do. The second is that AI from 2030 to 2040 will cause a tremendous boom in productivity, which will significantly increase gross domestic product. So the…denominator will significantly increase.

The industrial revolution replaced manual tasks and provided opportunities in the cognitive space, [but] AI is now targeting both. The speed at which this is happening is also exponential, which means that there is less time to adapt & create new areas/industries. It could take decades before we can discover new areas…to reskill into.

Coffee Test

Question. In what year will an autonomous robot first successfully perform the “Coffee Test” (see Background and Resolution Criteria for precise definitions) in 3 consecutive trials, each in distinct homes?

Coffee Test. The figure above shows the distribution of forecasts by participant group, illustrating the median (50th percentile) and interquartile range (25th–75th percentiles) of each forecast.

Results. Experts forecast that an autonomous robot will first successfully complete the Coffee Test in the mid-2030s, with a median forecast of 2034. Superforecasters give an earlier median forecast of 2032, while the public forecast is substantially later, at 2039. Both experts and superforecasters forecast earlier resolution than the public across the full distribution: expert-public differences are statistically significant at the 5th, 25th, 50th, 75th, and 95th percentiles, while experts and superforecasters do not differ significantly on those percentiles. Uncertainty is substantial; the median expert gives only a 5% chance of resolution by 2028 and a 95% chance only by 2050.
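
The three quoted points (5% by 2028, 50% by 2034, 95% by 2050) can be turned into a rough cumulative probability for any intermediate year. The sketch below is a simplification under stated assumptions, not the report's methodology: it treats the three points as one distribution and spreads probability evenly between them, which the underlying forecasts do not necessarily do. The names `COFFEE_TEST_QUANTILES` and `prob_resolved_by` are hypothetical.

```python
from bisect import bisect_right

# (year, cumulative probability) pairs quoted for the median expert.
COFFEE_TEST_QUANTILES = [(2028, 0.05), (2034, 0.50), (2050, 0.95)]


def prob_resolved_by(year, quantiles):
    """Linearly interpolate the cumulative probability of resolution by `year`.

    Assumption: probability is spread evenly between quoted quantiles.
    Outside the quoted range the function clamps to the nearest quoted
    value instead of extrapolating, so early years return an upper bound
    and late years return a lower bound.
    """
    years = [y for y, _ in quantiles]
    probs = [p for _, p in quantiles]
    if year <= years[0]:
        return probs[0]
    if year >= years[-1]:
        return probs[-1]
    i = bisect_right(years, year)  # index of the first quoted year > `year`
    y0, y1 = years[i - 1], years[i]
    p0, p1 = probs[i - 1], probs[i]
    return p0 + (p1 - p0) * (year - y0) / (y1 - y0)


if __name__ == "__main__":
    for year in (2030, 2034, 2040):
        p = prob_resolved_by(year, COFFEE_TEST_QUANTILES)
        print(f"P(resolved by {year}) is about {p:.0%}")
```

With these inputs the interpolation gives about 20% by 2030 and about 67% by 2040. Both values depend entirely on the even-spread assumption.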

Rationale analysis

  • Perception of current capabilities: This was a large divide. Respondents who predict an early resolution tend to believe the ingredients (beyond coffee!) for success largely exist today, and the challenge is primarily one of integration and engineering: "This is a case of 'it will happen whenever a team decides to invest a few months to do it' [given that] the technical capability is already here."; "I've already observed robots that come remarkably close to this, so I don't think this is very far away." Late-resolution respondents, by contrast, typically see a wide chasm between current systems and the test's requirements, with one writing, "We do not have robots that are even close to that level of capability, not even in laboratory settings. We are only beginning to develop robotic hands with enough tactile sensitivity to avoid breaking the objects they handle and to adapt to different shapes and materials."
  • Dexterity: Late-resolution respondents often treat fine-motor-skill manipulation as a significant, and unsolved, obstacle that will need to be overcome before the test can be successfully performed: "Humans (hands) making coffee combine strength, fine control, tactical feedback, and reflexive adjustment in ways that are deeply integrated into our hardware and are generally subconscious. Because AI systems still find it difficult to learn 'from touch' I sense progress on dexterous manipulation will continue to lag." Another adds that "grinding coffee beans, inserting paper filters, scooping loose grounds, and controlling pours all involve different grasps and forces, often under visual occlusion where the robot's own gripper blocks its camera view." Many early-resolution respondents push back on that narrative, with one writing that "coffee beans, filters, and mugs are rigid or semi-rigid objects that obey predictable physics, completely bypassing the adversarial chaos of manipulating deformable or living things…The hard part is perception and planning rather than dexterity."
  • Generalization capabilities: Many early-resolution respondents believe Vision-Language-Action foundation models are close to being able to navigate new environments well. One argues that "current LLMs can already handle most of the reasoning and planning required," and adds that "there's no requirement that all processing happen on-device, so the robot can offload reasoning to powerful frontier models," while another points to sim2real transfer methods as a technology that is enabling the type of generalization capabilities that will be needed to pass the test. Other early-resolution respondents discount the need for robust generalization: "The task really doesn't seem that hard considering it should be around ~10 steps which should be trainable." Late-resolution respondents, however, tend to see generalization as another core, unsolved challenge: "Humans put the coffee filters in the wrong drawer, hide mugs in bizarre cabinets, leave the machine unplugged, use unfamiliar brewers, and create cluttered geometry that was clearly designed by a raccoon with a grudge. Passing this test three times in three different homes means the robot has to survive that entire carnival of entropy."
  • Incentives: Several early-resolution respondents argue that the Coffee Test's status as a celebrated challenge might accelerate timelines. One speculates that "there may already be a company that is focused on developing a coffee making robot just for the sake of saying they were the company that created the coffee making robot, to gain publicity." Late-resolution respondents tend to point out that, prominence of the test notwithstanding, economic incentives point elsewhere: "This is also a use case that has marginal commercial applicability for an adequate return on investment. Why would a company that wants to make money seriously focus on this use case?"

Rationale examples, early-resolution respondents:

Presumption that the fusion of advancements in sensors, mobile computing processing, software programming, and mechanical dexterity will make a successful test possible in the near-future—particularly as multiple teams have publicly committed to projects specific to the Coffee Test challenge.

Unlike autonomous vehicles, this test is conducted in a pre-consented experimental setting, removing social and regulatory barriers. Current humanoid robots (Tesla Optimus, Figure) have already demonstrated coffee-pouring in structured settings.

The challenge is complex—but it's essentially pattern recognition. For that reason, I expect it to be solvable with current state software technology (the brains), coupled with some further developments in the field of robotics (the muscle).

The largest challenge is the manual dexterity to turn on various water faucets the robot might face, opening cabinets, measuring coffee and pouring water. These fine motor skills have continued to be a large challenge in robotics but there is plenty of incentive to solve these skills as they would be broadly applicable in a large number of roles from short order cooks, to fulfillment centers (pickers), to basic home repair.

Rationale examples, late-resolution respondents:

Tiny, imperceptible changes in the environment are a form of noise that complicates the robots' tasks exponentially. Software alone cannot overcome all of these physical limitations. Multiple companies recently claimed they would use AI to overcome all obstacles and have the coffee test mastered in the next few years. It's easy to classify those comments as highly speculative by looking at what recognized industry leaders are currently doing: Boston Dynamics routinely repeats that tasks such as making coffee are still years away, and even makes fun of companies with that focus.

As of today, foundation models for robotics consistently lose 40–80% of their success rate in real-world transfer, with the worst degradation on contact-rich, long-horizon tasks, which is exactly what brewing coffee in a stranger's kitchen requires.

The strict form of Wozniak's Coffee Test is not 'robot touches a Keurig after a choreographed demo.' It is a robot entering an unfamiliar home, finding the kitchen, finding the machine, coffee, water, and mug, inferring how this particular setup works, and then successfully making coffee. The stronger version you are asking for is even nastier: three consecutive successes in three distinct homes. That is not just manipulation. That is open world navigation, object search, contextual inference, long horizon planning, fault recovery, and robust action under household weirdness and general human living system chaos. Factoring in the myriad ways to create a cup of coffee and this becomes [akin to an] NP-Hard problem.

Robotic Surgeon

Question. In what year will a robot first successfully perform an appendectomy—surgery to remove an appendix—on a live human, without physical human help?

Robotic Surgeon. The figure above shows the distribution of forecasts by participant group, illustrating the median (50th percentile) and interquartile range (25th–75th percentiles) of each forecast.

Results. Experts forecast that a robot will first successfully perform an appendectomy without physical human help around 2040. The median public participant forecasts a similar date in 2039, while superforecasters are earlier at 2035. Broadly, experts and the public give similar predictions, with little evidence of systematic difference outside the earliest tail of the distribution. By contrast, superforecasters are consistently earlier than experts, especially in the middle and upper parts of the distribution. Uncertainty remains wide: the median expert gives a 5% chance of resolution by 2030, rising to 95% only by 2060.
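As a rough illustration of how wide this uncertainty is, the three expert quantile points reported above (5% by 2030, the median at 2040, and 95% by 2060) can be joined into an approximate cumulative distribution. The sketch below assumes straight-line interpolation between those points, which is an illustrative simplification and not the report's method. The function name `prob_by_year` is hypothetical.

```python
from bisect import bisect_right

# Quantile points reported for the median expert in the Results paragraph:
# (year, cumulative probability that the milestone has occurred by that year).
EXPERT_QUANTILES = [(2030, 0.05), (2040, 0.50), (2060, 0.95)]


def prob_by_year(year, points=EXPERT_QUANTILES):
    """Linearly interpolate cumulative probability between reported quantiles.

    Assumption: probability is spread evenly between adjacent reported points.
    Outside the reported range the function returns None rather than extrapolate.
    """
    years = [y for y, _ in points]
    if year < years[0] or year > years[-1]:
        return None
    if year == years[-1]:
        return points[-1][1]
    # Index of the reported point at or just before the requested year.
    i = bisect_right(years, year) - 1
    (y0, p0), (y1, p1) = points[i], points[i + 1]
    return p0 + (p1 - p0) * (year - y0) / (y1 - y0)


if __name__ == "__main__":
    for y in (2030, 2035, 2040, 2050, 2060):
        print(y, prob_by_year(y))
```

Under this assumption, the probability climbs faster in the decade before the median than in the two decades after it, which is one way to read the long upper tail in the expert forecasts.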

Rationale analysis

  • Technical readiness—surgery vs. coffee: Many early-resolution respondents argue that surgery is a more tractable robotics problem than general-purpose tasks like the previously considered coffee test because the operating room is standardized and the procedure well-defined: "Surgery takes place in a constrained environment with controlled lighting, sterile instrumentation, known anatomy, and well understood procedural steps…From a robotics perspective that looks much more like an industrial manipulation problem than like navigating the chaos of someone's kitchen while searching drawers for coffee filters." But many late-resolution respondents argue that the structured-environment advantage is illusory once live tissue enters the picture: "An appendectomy requires manipulating wet, deformable, actively bleeding tissue that varies wildly in geometry from patient to patient. While the 2025 JHU demonstration on ex vivo pig organs was an impressive proof of concept, the gap between dead isolated tissue and a live human abdomen introduces a level of adversarial physical complexity." Another emphasizes that "the survey explicitly requires that no human steps in to tidy up, such as closing the wound. Performing perfect, adaptive suturing or knot-tying on live, elastic skin and fat layers demands an incredibly high level of haptic feedback and non-rigid body manipulation from the robot."
  • Jurisdiction: Another common point of divergence between forecasters concerns assumptions about where the first procedure will occur and what the accompanying regulations will be. Early-resolution respondents often argue that a less restrictive jurisdiction, "likely outside the United States and Europe," will host the first success, dramatically compressing timelines: "I suspect it will happen in China (advanced AI, more risk tolerant) or a country with limited medical care after the equivalent technical capability has been proven on animals." Late-resolution respondents tend to focus more on regulatory and institutional hurdles in the West, particularly given that "the field faces a 'zero-fail' requirement": "A decade-long timeline seems at least necessary to reflect the significant regulatory and ethical hurdles," writes one. Others point to malpractice exposure, insurance frameworks, and surgeon lobbying as potential multi-decade blockers, with one writing, "I don't think any legal framework for liability exists today. That by itself will take forever." One late-resolution respondent argues that jurisdictions with weaker regulation are "unlikely to be the countries where these machines are being developed or deployed."
  • Economic incentive: Some late-resolution respondents emphasize that appendectomies are already safe, fast, and relatively inexpensive. One writes: "The autonomous system therefore has to be functionally perfect to justify the switch, which is a much harder bar than 'better than the alternative.'" Another adds, "It isn't clear the cost of such a system would make sense for most use cases. Appendectomy is a routine surgery that can be handled easily by many surgeons. The procedure itself is usually short and replacing the surgeon doesn't eliminate the pre and post care costs." Several early-resolution respondents, however, point to surgeon shortages and cost pressures: "Autonomous surgery has strong…support because of the high cost of surgical labor and the capacity constraints in the healthcare system." One speculates that "environments without doctors like the ISS or Antarctica" would be prime candidates for such robots.

Rationale examples, early-resolution respondents:

Existing systems like da Vinci already perform appendectomies with remote guidance. Since verbal commands are permitted, the main difference from current teleoperation is replacing the joystick with voice instructions. The remaining barriers are almost entirely regulatory, not technical. The key advantage of this approach is enabling surgery in remote locations (military, space, rural areas). ARPA-H [Advanced Research Projects Agency for Health] is already soliciting proposals for autonomous surgical robotic systems. I expect this milestone by 2030, likely first in a military or research context where regulatory approval is faster.

Capability-wise, this problem is actually easier than the Coffee Test by an extreme margin. Surgery takes place in a constrained environment with controlled lighting, sterile instrumentation, known anatomy, and well understood procedural steps. An appendectomy in particular is one of the most standardized operations in modern medicine. Laparoscopic appendectomies are already highly instrument-mediated procedures where the surgeon manipulates tools through fixed ports while viewing a camera feed…The workspace is known, the tools are known, and the task sequence is well characterized.

When this milestone is achieved, it is very likely that it will be achieved in a jurisdiction like China, which has fewer of these adoption blockers and where one can imagine a Chinese research hospital documenting an autonomous appendectomy on a consented patient under an IRB [institutional review board] protocol [and that this] would fully resolve this question.

Rationale examples, late-resolution respondents:

Across the levels of autonomy for surgical robotics (LASR), with levels spanning from 0 (no autonomy) to 5 (full autonomy)...not a single system designated as L4 or L5 has been cleared anywhere, and my overall expectation here is that capabilities are likely to get there much faster than the (western) regulatory landscape due to adoption blockers from surgeon experts, hesitance around malpractice liability, a generally sclerotic medical device approval system, and a widespread cultural hesitance against AI being used in such high-stakes settings.

The threshold for letting a robot operate on a live human without physical intervention is shaped by much more than capability alone. It depends on whether the system can remain reliable through anatomical variation, minor complications, and the contingent texture of surgery, but it also depends on ethics review, liability, regulation, clinical trust, and the willingness of institutions to authorize a first case.

JHU [Johns Hopkins University] performed the procedure on an animal ex-vivo. In-vivo animal surgery needs to be met first before an IRB will approve it as an experimental procedure in a human. Little data exists to perform such a procedure on live animals and the in-vivo milestone will be difficult. Generalization from animal to human is uncertain which likely will make IRB approval difficult. Another IRB hurdle is that an appendectomy is mostly an emergency procedure and IRB decisions are difficult to be made in such tight timeline scenarios.

First, robotics will need to improve substantially to allow this type of highly dexterous operation to occur. Considering robotics are still struggling with 'picking' in fulfillment centers this seems to be at least 5-7 years off. Moreover, LLMs have not aided significantly in the advancement of robotic training where human demonstrations and RL techniques continue to dominate. Second, after sufficiently capable robotic hands are developed, rigorous testing and approval will be required, likely taking additional years until a human demonstration occurs.

Amazon Distribution Workers

Question. What will Amazon’s real e-commerce net sales per distribution employee be in the fiscal years ending in or before each of the following resolution years (in 2025 USD, i.e. adjusted for inflation)?

Amazon Distribution Workers. The figure above shows the distribution of forecasts by participant group, illustrating the median (50th percentile) and interquartile range (25th–75th percentiles) of each forecast.

Results. All participant groups expect Amazon's real e-commerce net sales per distribution employee to rise over time from the 2024 baseline of $464,771 (see footnote 3). The median expert forecasts $480,000 by 2026, $549,842 by 2030, and $800,000 by 2040. The public is consistently less bullish, forecasting $470,280, $484,978, and $506,819, respectively, while superforecasters are more bullish than both experts and the public, forecasting $510,000, $650,000, and over $1,000,000 by 2040. Differences between experts and the public are already statistically significant by 2026 and widen substantially by 2030 and 2040. Uncertainty also grows markedly over time, especially among experts and superforecasters; by 2040, the 75th percentile forecast reaches $1,000,000 for experts and $1,600,000 for superforecasters.
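To put the gap between groups in annual terms, the median forecasts can be converted into implied compound annual growth rates from the 2024 baseline. The sketch below is illustrative only. It assumes the horizon is simply the resolution year minus 2024, and it treats the superforecasters' "over $1,000,000" figure for 2040 as exactly $1,000,000, so that rate is a lower bound. The function name `implied_annual_growth` is hypothetical.

```python
BASELINE_YEAR = 2024
BASELINE = 464_771  # 2024 baseline from the report, in 2025 USD

# Median forecasts by group and resolution year, as reported in the Results paragraph.
MEDIANS = {
    "experts": {2026: 480_000, 2030: 549_842, 2040: 800_000},
    "public": {2026: 470_280, 2030: 484_978, 2040: 506_819},
    # The 2040 figure is reported as "over $1,000,000"; used here as a lower bound.
    "superforecasters": {2026: 510_000, 2030: 650_000, 2040: 1_000_000},
}


def implied_annual_growth(value, year, base=BASELINE, base_year=BASELINE_YEAR):
    """Compound annual growth rate that takes `base` to `value` by `year`.

    Assumption: the horizon is (year - base_year) full years.
    Values are already inflation-adjusted, so the result is a real growth rate.
    """
    years = year - base_year
    if years <= 0:
        raise ValueError("year must be after the baseline year")
    return (value / base) ** (1 / years) - 1


# Implied real annual growth through 2040 for each group.
growth_2040 = {
    group: implied_annual_growth(forecasts[2040], 2040)
    for group, forecasts in MEDIANS.items()
}

if __name__ == "__main__":
    for group, rate in growth_2040.items():
        print(f"{group}: {rate:.2%} per year through 2040")
```

On these assumptions, the public median implies real growth of well under 1% per year through 2040, while the expert median implies roughly 3% to 4% per year and the superforecaster median somewhat more.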

Rationale analysis

  • Amazon's intent: Fast-automation respondents often point to Amazon's profit motives, along with leaked internal plans and public statements about automation goals, as likely to prove determinative: "More robots, [fewer] workers…greater return then per worker…Amazon will do everything in its power to accelerate this transformation"; "The question is how quickly Amazon is actually planning to meet its 2033 goal to avoid hiring more than 600,000 U.S. workers per the internal documents [leaked by the New York Times]." While slow-automation respondents generally don't dispute that these factors will play a meaningful role, a few note that political backlash may ensue and that Amazon "may decide that maintaining minimal levels of employment may be a political price worth paying for to support its overall business model."
  • Suitability of warehouse environments: Many fast-automation respondents characterize warehouses as nearly ideal for robotic deployment, especially given anticipated advances in robotic vision and haptics. One writes that warehouses are "highly structured, and the task very repeatable. Moreover, the massive scale of their operations means the training data can be substantial and the cost savings substantial enough to justify the training expenses." Another adds that "fulfillment work is often repetitive and tedious, and humans do not have a strong advantage over robots in performing these tasks." Several slow-automation respondents, however, note that the massive number of products sold creates challenges: "Amazon has [millions of] SKUs. The long tail of products (weird shapes, fragile items, variable packaging) is genuinely hard for robots. The last 20% of SKUs might take 10 years longer than the first 80%."
  • Numerator vs. denominator dynamics: Fast-automation respondents commonly expect the ratio to skyrocket because they anticipate a decoupling of sales from labor as sales rise and worker headcounts plummet: "As Amazon aggressively unbundles physical logistics from biological labour, the denominator [workers] will shrink substantially while sales [the numerator] continue compounding." But several slow-automation respondents emphasize that Amazon historically grows into its automation, meaning headcount is likely to stay flat or increase as the company "increases robots, but retains [the] bulk of [its] current workforce," thereby limiting the collapse of the denominator. One points to Amazon's expansion into groceries and the "unique handling needs of food" as a specific reason to retain headcount. A few others question the assumption that Amazon's sales will keep rising indefinitely, with one writing, "I expect Amazon to actually lose market value as disruptive companies are coming in. So even if they focus on robotics, the average income per employee will go down."
  • Deployment constraints: Fast-automation respondents tend to acknowledge "Amazon [as] a world leader in the automation of its distribution centres," and view deployment as a smooth continuation of current trends. Several slow-automation respondents instead emphasize that the deployment of physical infrastructure typically requires considerable time and money, constraints which will limit the pace of change: "Retrofitting existing fulfilment centres will be expensive, and disruptive, so this will take a while to start changing the metrics." Another speculates that Amazon may already be "nearing peak capacity in terms of the number of automation projects [it] can deliver every year."
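The numerator and denominator arguments above come down to simple ratio arithmetic: sales per employee grows by (1 + sales growth) / (1 + headcount growth) - 1. The sketch below makes that explicit. The function name `ratio_growth` and the growth rates in the usage example are hypothetical illustrations, not figures from the survey.

```python
def ratio_growth(sales_growth, headcount_growth):
    """Growth in sales per employee implied by growth in each component.

    Both arguments are fractional changes over the same period
    (e.g. 0.05 for +5%, -0.05 for -5%).
    """
    if headcount_growth <= -1:
        raise ValueError("headcount cannot shrink by 100% or more")
    return (1 + sales_growth) / (1 + headcount_growth) - 1


if __name__ == "__main__":
    # Hypothetical scenarios, chosen only to contrast the two camps' reasoning.
    scenarios = {
        "sales up, headcount down": (0.05, -0.05),
        "sales up, headcount up equally": (0.05, 0.05),
        "sales flat, headcount halves": (0.0, -0.5),
    }
    for name, (s, h) in scenarios.items():
        print(f"{name}: ratio changes by {ratio_growth(s, h):+.1%}")
```

When headcount grows in step with sales, as the slow-automation respondents describe, the ratio does not move at all. A shrinking denominator compounds with rising sales, which is the dynamic the fast-automation respondents expect.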

Rationale examples, fast-automation respondents:

Amazon has every financial incentive to automate distribution work, the technology is already being deployed at scale, and unlike the Coffee Test or robotic surgery, there's no meaningful regulatory or societal friction stopping them from replacing warehouse workers with robots. Sales are going to keep climbing as e-commerce continues to grow, and headcount will either stay flat or actively shrink, so the ratio improves on both ends simultaneously.

Warehouse operations seem almost perfect for cheap automation, not politically protected (nobody really cares that much for warehouse work, there will probably be better uses for humans once anyone can follow instructions from AI and work in HVAC maintenance or whatever). So I'm guessing warehouses keep a tiny crew to maintain the automation. Using a small human crew is probably cheaper than using smart general-purpose robots that could be doing something else.

Amazon [is] pretty ruthless when it comes to modernizing and efficiency. I think they will aggressively reduce headcount. There is a good case that they will be able to become more efficient and capture ever more sales, by offering attractive cheap prices, even to a population with less disposable income.

Amazon's high warehouse turnover rates offer a natural pathway for headcount reduction through attrition rather than direct layoffs…

Rationale examples, slow-automation respondents:

The starting point of ~465k already reflects two decades of operational optimization inside Amazon's logistics system. The early gains came from warehouse layout optimization, better routing algorithms, and incremental mechanization. More recently the gains have come from denser robotic picking systems, software driven inventory placement, and improvements in fulfillment network topology. The important thing for forecasting is that productivity improvements in physical logistics tend to compound gradually rather than explosively. Warehouses are capital intensive systems with long equipment lifetimes, and changes propagate through the network only as new facilities are built or existing ones are retooled.

This is not only a factor of automation, but also political stability and health of the economy. Large shifts in labor create risks along these lines.

Amazon is a company that heavily invests in robotic development and deployment, often developing its own integrated custom solutions. It's therefore most likely that they are already nearing peak capacity in terms of the number of automation projects Amazon can deliver every year. A massive uptick could happen only if a new generation of robots capable of dealing with the complexity of packaging and sorting tasks becomes available. Even then, the costs associated with deploying such automations could provide a significant limiting factor, particularly if these automations are slower than their human counterparts.

Footnotes

  1. Some respondents appear to have assumed (incorrectly) that no human coaching was permitted for this test, which likely also shifted forecasts later.

  2. https://fred.stlouisfed.org/series/MPU4910141

  3. Some respondents appear to have misinterpreted this question as including last-mile delivery drivers, who typically are not Amazon employees, which may have introduced a small downward bias to the forecasts.

  4. In some cases, the "aggregate" refers to the mean; in others, the median is used, depending on which is more appropriate for the distribution of responses.

  5. We occasionally elicit participants' quantile forecasts (estimates of specific percentiles of a continuous outcome) to illustrate the range and uncertainty of their predictions.

Cite Our Work

Please use one of the following citation formats to cite this work.

APA Format

Murphy, C., Rosenberg, J., Canedy, J., Jacobs, Z., Flechner, N., Britt, R., Pan, A., Rogers-Smith, C., Mayland, D., Buffington, C., Kučinskas, S., Coston, A., Kerner, H., Pierson, E., Rabbany, R., Salganik, M., Seamans, R., Su, Y., Tramèr, F., Hashimoto, T., Narayanan, A., Tetlock, P. E., & Karger, E. (2025). The Longitudinal Expert AI Panel: Understanding Expert Views on AI Capabilities, Adoption, and Impact (Working paper No. 5). Forecasting Research Institute. Retrieved 2026-04-22, from https://leap.forecastingresearch.org/reports/wave7

BibTeX

@techreport{leap2025,
    author = {Murphy, Connacher and Rosenberg, Josh and Canedy, Jordan and Jacobs, Zach and Flechner, Nadja and Britt, Rhiannon and Pan, Alexa and Rogers-Smith, Charlie and Mayland, Dan and Buffington, Cathy and Kučinskas, Simas and Coston, Amanda and Kerner, Hannah and Pierson, Emma and Rabbany, Reihaneh and Salganik, Matthew and Seamans, Robert and Su, Yu and Tramèr, Florian and Hashimoto, Tatsunori and Narayanan, Arvind and Tetlock, Philip E. and Karger, Ezra},
    title = {The Longitudinal Expert AI Panel: Understanding Expert Views on AI Capabilities, Adoption, and Impact},
    institution = {Forecasting Research Institute},
    type = {Working paper},
    number = {5},
    url = {https://leap.forecastingresearch.org/reports/wave7},
    urldate = {2026-04-22},
    year = {2025}
  }