healthcarereimagined

Envisioning healthcare for the 21st century

A new chip architecture points to faster, more energy-efficient AI – IBM

Posted by timmreardon on 11/17/2023
Posted in: Uncategorized.

A new chip prototype from IBM Research’s lab in California, long in the making, has the potential to upend how and where AI is used efficiently.

We’re in the midst of a Cambrian explosion in AI. Over the last decade, AI has gone from theory and small tests to enterprise-scale use cases. But the hardware used to run AI systems, although increasingly powerful, was not designed with today’s AI in mind. As AI systems scale, the costs skyrocket. And Moore’s Law, the theory that the density of circuits in processors would double every two years, has slowed.

But new research out of IBM Research’s lab in Almaden, California, nearly two decades in the making, has the potential to drastically shift how we can efficiently scale up powerful AI hardware systems.

Since the birth of the semiconductor industry, computer chips have primarily followed the same basic structure, where the processing units and the memory storing the information to be processed are kept separate. While this structure has allowed for simpler designs that have scaled well over the decades, it has created what’s called the von Neumann bottleneck, where it takes time and energy to continually shuffle data back and forth between memory, processing, and any other devices within a chip. The work by IBM Research’s Dharmendra Modha and his colleagues aims to change this, taking inspiration from how the brain computes. “It forges a completely different path from the von Neumann architecture,” according to Modha.

Over the last eight years, Modha has been working on a new type of digital AI chip for neural inference, which he calls NorthPole. It’s an extension of TrueNorth, the last brain-inspired chip that Modha worked on prior to 2014. In tests on the popular ResNet-50 image recognition and YOLOv4 object detection models, the new prototype device has demonstrated higher energy efficiency, higher space efficiency, and lower latency than any other chip currently on the market, and is roughly 4,000 times faster than TrueNorth.

The first promising set of results from NorthPole chips was published today in Science. NorthPole is a breakthrough in chip architecture that delivers massive improvements in energy, space, and time efficiencies, according to Modha. Using the ResNet-50 model as a benchmark, NorthPole is considerably more efficient than common 12-nm GPUs and 14-nm CPUs. (NorthPole itself is built on 12-nm node process technology.) In both cases, NorthPole is 25 times more energy efficient, measured as the number of frames interpreted per joule of energy consumed. NorthPole also outperformed in latency, as well as in space efficiency, measured as frames interpreted per second per billion transistors. According to Modha, on ResNet-50, NorthPole outperforms all major prevalent architectures, even those that use more advanced technology processes, such as a GPU implemented using a 4-nm process.
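To make the two efficiency metrics concrete, here is a minimal Python sketch of how they could be computed from a benchmark run. The `InferenceRun` class and its field names are illustrative, and the measurements in the example are made-up placeholders, not results from the Science paper; only the 22-billion transistor count comes from this article.

```python
from dataclasses import dataclass


@dataclass
class InferenceRun:
    """One benchmark run on one chip (hypothetical record layout)."""

    frames: int          # images interpreted during the run
    seconds: float       # wall-clock duration of the run
    joules: float        # energy consumed during the run
    transistors: float   # transistor count of the chip

    def frames_per_joule(self) -> float:
        """Energy efficiency: frames interpreted per joule of energy."""
        return self.frames / self.joules

    def fps_per_billion_transistors(self) -> float:
        """Space efficiency: frames per second per billion transistors."""
        frames_per_second = self.frames / self.seconds
        return frames_per_second / (self.transistors / 1e9)


# Placeholder measurements, chosen only to exercise the arithmetic.
chip_a = InferenceRun(frames=1_000, seconds=0.5, joules=10.0, transistors=22e9)
chip_b = InferenceRun(frames=1_000, seconds=1.0, joules=100.0, transistors=40e9)

energy_ratio = chip_a.frames_per_joule() / chip_b.frames_per_joule()
space_ratio = (
    chip_a.fps_per_billion_transistors() / chip_b.fps_per_billion_transistors()
)
print(f"energy efficiency ratio (a vs. b): {energy_ratio:.1f}x")
print(f"space efficiency ratio (a vs. b): {space_ratio:.1f}x")
```

Because a joule is a watt-second, frames per joule is the same quantity as frames per second divided by average power in watts, which is why the metric rewards both speed and low power draw.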

How does it manage to compute with so much more efficiency than existing chips? One of the biggest differences with NorthPole is that all of the memory for the device is on the chip itself, rather than connected separately. Without that von Neumann bottleneck, the chip can carry out AI inferencing considerably faster than other chips already on the market. NorthPole was fabricated with a 12-nm node process, and contains 22 billion transistors in 800 square millimeters. It has 256 cores and can perform 2,048 operations per core per cycle at 8-bit precision, with potential to double and quadruple the number of operations with 4-bit and 2-bit precision, respectively. “It’s an entire network on a chip,” Modha said.
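The figures quoted above imply a few aggregate numbers. The following back-of-the-envelope Python sketch works them out; the function names are illustrative, and it assumes the doubling and quadrupling at lower precision applies uniformly to every core. The clock frequency is left out because this article does not state it, so the result is operations per cycle rather than operations per second.

```python
# Figures quoted in the article.
CORES = 256
OPS_PER_CORE_PER_CYCLE_8BIT = 2_048
TRANSISTORS = 22_000_000_000
DIE_AREA_MM2 = 800


def ops_per_cycle(precision_bits: int) -> int:
    """Chip-wide operations per cycle at 8-, 4-, or 2-bit precision."""
    if precision_bits not in (8, 4, 2):
        raise ValueError("the article only mentions 8-, 4-, and 2-bit precision")
    scale = 8 // precision_bits  # 1x at 8-bit, 2x at 4-bit, 4x at 2-bit
    return CORES * OPS_PER_CORE_PER_CYCLE_8BIT * scale


def transistors_per_mm2() -> float:
    """Average transistor density over the whole die."""
    return TRANSISTORS / DIE_AREA_MM2


for bits in (8, 4, 2):
    print(f"{bits}-bit: {ops_per_cycle(bits):,} operations per cycle")
print(f"{transistors_per_mm2():,.0f} transistors per square millimeter")
```

At 8-bit precision this comes to 524,288 operations per cycle across the chip, and the die averages 27.5 million transistors per square millimeter.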

“Architecturally, NorthPole blurs the boundary between compute and memory,” Modha said. “At the level of individual cores, NorthPole appears as memory-near-compute and from outside the chip, at the level of input-output, it appears as an active memory.” This makes NorthPole easy to integrate in systems and significantly reduces load on the host machine.

But the biggest advantage of NorthPole is also a constraint: it can only easily pull from the memory it has onboard. All of the speedups that are possible on the chip would be undercut if it had to access information from another place. Via an approach called scale-out, NorthPole can actually support larger neural networks by breaking them down into smaller sub-networks that fit within NorthPole’s model memory, and connecting these sub-networks together on multiple NorthPole chips. So while there is ample memory on a NorthPole (or collectively on a set of NorthPoles) for many of the models that would be useful for specific applications, this chip is not meant to be a jack of all trades. “We can’t run GPT-4 on this, but we could serve many of the models enterprises need,” Modha said. “And, of course, NorthPole is only for inferencing.”
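The article does not describe how the team actually partitions a network for scale-out. Purely as an illustration of the idea, the Python sketch below uses a generic greedy heuristic: it walks the layers in order and starts a new chip whenever the next layer would overflow the current chip's memory. The function name, the layer sizes, and the per-chip memory budget are all hypothetical and in arbitrary units.

```python
from typing import List


def partition_layers(layer_sizes: List[int], chip_memory: int) -> List[List[int]]:
    """Greedily group consecutive layers so each group fits on one chip.

    Returns a list of groups, where each group is a list of layer indices
    assigned to the same chip. This is a generic heuristic, not IBM's method.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(layer_sizes):
        if size > chip_memory:
            raise ValueError(f"layer {index} does not fit on a single chip")
        if used + size > chip_memory:
            # The next layer would overflow this chip, so close the group.
            groups.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        groups.append(current)
    return groups


# Hypothetical layer footprints and memory budget, in arbitrary units.
layers = [40, 50, 30, 80, 20, 20]
assignment = partition_layers(layers, chip_memory=100)
for chip, group in enumerate(assignment):
    print(f"chip {chip}: layers {group}")
```

With these placeholder sizes the six layers are spread across four chips. Keeping each group contiguous means activations only have to flow from one chip to the next, which fits the description of sub-networks connected together across multiple chips.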

This efficiency means that the device also doesn’t need bulky liquid-cooling systems to run (fans and heat sinks are more than enough), so it could be deployed in some rather small spaces.

Potential applications for NorthPole

While research into the NorthPole chip is still ongoing, its structure lends itself to emerging AI use cases, as well as more well-established ones.

In testing, the NorthPole team focused primarily on computer vision-related uses, in part because funding for the project came from the U.S. Department of Defense. Some of the primary applications in consideration were detection, image segmentation, and video classification. But it was also tested in other arenas, such as natural language processing (on the encoder-only BERT model) and speech recognition (on the DeepSpeech2 model). The team is currently exploring mapping decoder-only large language models to NorthPole scale-out systems.

When you think of these AI tasks, all sorts of fantastical use cases spring to mind, from autonomous vehicles to robotics, digital assistants, or spatial computing. Many sorts of edge applications that require massive amounts of data processing in real time could be well-suited for NorthPole. For example, it could potentially be the sort of device that’s needed to move autonomous vehicles from machines that require set maps and routes to operate on a small scale, to ones that can think and react to the rare edge-case situations that make navigating in the real world so challenging even for proficient human drivers. These sorts of edge cases are the exact sweet spot for future NorthPole applications. NorthPole could enable satellites that monitor agriculture and manage wildlife populations, systems that monitor vehicles and freight for safer and less congested roads, robots that operate safely, and tools that detect cyber threats for safer businesses.

What’s next

This is just the start of the work for Modha on NorthPole. The current state of the art for CPUs is 3 nm — and IBM itself is already years into research on 2 nm nodes. That means there’s a handful of generations of chip processing technologies NorthPole could be implemented on, in addition to fundamental architectural innovations, to keep finding efficiency and performance gains.

But for Modha, this is just one important milestone along a continuum that has dominated the last 19 years of his professional career. He’s been working on digital brain-inspired chips throughout that time, knowing that the brain is the most energy-efficient processor we know, and searching for ways to replicate that digitally. TrueNorth was fully inspired by the structures of neurons in the brain — and had as many digital “synapses” in it as the brain of a bee. But sitting on a park bench in 2015 in San Francisco, Modha said he was thinking through his work to date. He had the belief that there was something in marrying the best of traditional processing devices with the structure of processing in the brain, where memory and processing are interspersed throughout the brain. The answer was “brain-inspired computing, with silicon speed,” according to Modha.

Over the next eight years, Modha and his colleagues were single-minded and hermetic in their goal of turning this vision into a reality. Toiling inconspicuously in Almaden, the team didn’t give any lectures or publish any papers on their work until this year. Each person brought different skills and perspectives, yet everyone collaborated so that, as a whole, the team’s contribution was much greater than the sum of its parts. Now, the plan is to show what NorthPole could do, while exploring how to translate the designs into smaller chip production processes and further exploring the architectural possibilities.

This work stemmed from a simple question: how can we make computers that work like the brain? After years of fundamental research, it has come up with an answer, something that is really only possible today at a place like IBM Research, where there is the time and space to explore the big questions in computing and where they can take us. “NorthPole is a faint representation of the brain in the mirror of a silicon wafer,” Modha said.

Article link: https://research.ibm.com/blog/northpole-ibm-ai-chip
