Author: Tronserve admin
Saturday 24th July 2021 09:32 PM
Endpoint AI in a Matter of Microwatts
Eta Compute now has production silicon for ‘world’s most energy-efficient’ endpoint AI processor
The ultra-low-power AI silicon startup Eta Compute has production silicon of its first product. The ECM3532 promises to enable AI and ML computation in IoT applications such as sensor nodes for a power budget of just microwatts.
The ECM3532 is a dual-core (Arm Cortex-M3 plus NXP CoolFlux DSP) SoC which can support sensor fusion applications in the microwatt range for battery-powered or energy-harvesting designs. Always-on image processing and sensor applications can be achieved with a power budget of 100µW.
“We believe that power consumption, latency and data generation combined with RF transmission are all factors limiting many sensing applications,” said Ted Tewksbury, CEO, Eta Compute. The new chip “essentially eliminates battery capacity as a barrier to thousands of IoT consumer and industrial applications,” he said.
Eta Compute calls its device “the world’s most energy-efficient edge AI processor” and is targeting it squarely at the AIoT, or artificial intelligence in internet of things devices. Typical applications are performing sensor fusion, sound classification, image classification or person detection without sending data to the cloud, to minimize power spent on wireless transmission. But with the limited power budgets these IoT endpoints have, power consumption of the chip really has to be less than a milliwatt to make sense, Tewksbury said.
“By virtue of the fact that we have a hundred to a thousand times greater energy efficiency [than competitors], we can do a hundred to a thousand times more inferences for a given battery life, or for the same level of functionality we can extend the battery life by that same factor,” Tewksbury said.
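To put those power figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a hypothetical coin cell of about 225 mAh at a nominal 3 V (typical of a CR2032, but not a figure from Eta Compute), and it ignores self-discharge, regulator losses and radio activity. The 100 µW and 1 mW loads are the budgets mentioned in the article.

```python
def battery_energy_joules(capacity_mah, voltage_v):
    """Nominal stored energy: mAh -> Ah -> coulombs, times voltage."""
    return capacity_mah / 1000.0 * 3600.0 * voltage_v


def battery_life_days(capacity_mah, voltage_v, load_w):
    """Idealised runtime at a constant load, ignoring all losses."""
    seconds = battery_energy_joules(capacity_mah, voltage_v) / load_w
    return seconds / 86400.0


# Assumed cell: 225 mAh at 3 V (hypothetical, CR2032-class).
CAPACITY_MAH = 225.0
CELL_VOLTAGE = 3.0

life_at_1mw = battery_life_days(CAPACITY_MAH, CELL_VOLTAGE, 1e-3)
life_at_100uw = battery_life_days(CAPACITY_MAH, CELL_VOLTAGE, 100e-6)

print(f"1 mW load:   about {life_at_1mw:.0f} days")
print(f"100 uW load: about {life_at_100uw:.0f} days")
```

Under these assumptions, runtime scales inversely with average power, so a tenfold cut in power buys roughly ten times the battery life, which is the kind of trade-off Tewksbury describes.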
Eta Compute attributes that efficiency to several ingredients. The first is a proprietary voltage and frequency scaling technique on which Eta Compute holds seven patents (the company also has eight more pending). Continuous voltage and frequency scaling (CVFS) allows the voltage and clock frequency of both the DSP and the MCU core to be adjusted to meet the variable workloads of IoT devices.
“The internal supply voltage [can be adjusted] commensurate with that clock rate. So when the clock rate is low, we can reduce the voltage all the way down to the minimum required to sustain that clock rate, and when frequency goes up, we increase the voltage. Since power goes as voltage squared, we get an enormous reduction in the power consumption,” Tewksbury said.
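The "voltage squared" relationship Tewksbury cites matches the standard first-order model of CMOS switching power, P ≈ C·V²·f. The Python sketch below works through that textbook model only; it is not Eta Compute's own characterisation. The effective capacitance and the two clock rates are hypothetical placeholders, the 0.54 V and 1.2 V endpoints are the supply range the company quotes, and leakage power is ignored.

```python
def dynamic_power_w(c_eff_farads, voltage_v, freq_hz):
    """First-order CMOS switching power: P = C_eff * V^2 * f."""
    return c_eff_farads * voltage_v ** 2 * freq_hz


def energy_per_cycle_j(c_eff_farads, voltage_v):
    """Switching energy per clock cycle: E = C_eff * V^2 (frequency cancels)."""
    return c_eff_farads * voltage_v ** 2


C_EFF = 10e-12          # hypothetical effective switched capacitance, farads
V_LOW, V_HIGH = 0.54, 1.2   # supply range quoted in the article, volts
F_LOW, F_HIGH = 10e6, 100e6  # hypothetical clock rates, hertz

# Energy per cycle depends only on voltage in this model.
energy_ratio = energy_per_cycle_j(C_EFF, V_LOW) / energy_per_cycle_j(C_EFF, V_HIGH)

# Power also falls with the clock rate, so the two effects multiply.
power_ratio = (dynamic_power_w(C_EFF, V_LOW, F_LOW)
               / dynamic_power_w(C_EFF, V_HIGH, F_HIGH))

print(f"Energy per cycle at {V_LOW} V vs {V_HIGH} V: {energy_ratio:.4f}")
print(f"Power at low vs high operating point: {power_ratio:.5f}")
```

In this simplified model, dropping the supply from 1.2 V to 0.54 V cuts the switching energy per cycle to roughly one-fifth, before any saving from the lower clock rate is counted.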
Traditional dynamic voltage and frequency scaling (DVFS) works by changing the state of a PLL (phase-locked loop), which takes time. Eta Compute’s CVFS technique needs no PLL, because the clock frequency is determined internally via a self-timed architecture.
“Since we don’t have PLLs… we can do this very quickly and continuously, both in terms of time as well as in voltage. So every single clock cycle we’re monitoring the workload and adjusting that clock in such a way that we minimize the energy per inference,” Tewksbury said. “We’re also continuously changing that voltage, so that it’s not just a discrete number of voltages as some of our competitors have, but it can change anywhere from 0.54V all the way up to 1.2V in a continuous manner.”
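The advantage of continuous over discrete voltage steps can be illustrated with a small Python sketch. Everything here except the 0.54 V to 1.2 V range is an assumption made for illustration: the linear voltage-versus-frequency curve, the 100 MHz maximum clock and the four discrete operating points are hypothetical and do not describe the ECM3532 or any competitor. A discrete scheme has to round up to the next available voltage, and since energy goes as voltage squared, that rounding costs energy.

```python
import bisect

V_MIN, V_MAX = 0.54, 1.2      # supply range quoted in the article, volts
F_MAX_HZ = 100e6              # hypothetical maximum clock rate
DISCRETE_LEVELS = [0.6, 0.8, 1.0, 1.2]  # hypothetical fixed operating points


def continuous_voltage(freq_hz):
    """Hypothetical linear model of the minimum voltage for a clock rate."""
    frac = min(max(freq_hz / F_MAX_HZ, 0.0), 1.0)
    return V_MIN * (1.0 - frac) + V_MAX * frac


def discrete_voltage(freq_hz):
    """Lowest fixed level that is at least the required voltage."""
    needed = continuous_voltage(freq_hz)
    idx = bisect.bisect_left(DISCRETE_LEVELS, needed)
    return DISCRETE_LEVELS[min(idx, len(DISCRETE_LEVELS) - 1)]


def energy_overhead(freq_hz):
    """Extra switching energy per cycle from rounding up (E ~ V^2)."""
    v_cont = continuous_voltage(freq_hz)
    v_disc = discrete_voltage(freq_hz)
    return (v_disc / v_cont) ** 2 - 1.0


for f_mhz in (5, 20, 50, 80):
    f = f_mhz * 1e6
    print(f"{f_mhz:>3} MHz: continuous {continuous_voltage(f):.3f} V, "
          f"discrete {discrete_voltage(f):.1f} V, "
          f"overhead {energy_overhead(f):.0%}")
```

With these made-up numbers, a workload needing 20 MHz would run at 0.8 V under the discrete scheme instead of about 0.67 V, spending roughly 40 percent more switching energy per cycle. The real figures depend on the silicon and are not given in the article.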
Another key ingredient is the chip’s hybrid multi-core architecture, a combination of an Arm Cortex-M3 MCU core and an NXP CoolFlux DSP core. The CVFS technique is used on both cores independently; that is, they can run at different voltages and frequencies to minimize the energy used.
Either (or both) cores can be used for the AI/ML workload, said Tewksbury, pointing out that workloads such as signal conditioning and feature extraction are better suited to the DSP. Workloads are allocated between the cores by software.
The final ingredient in Eta Compute’s secret sauce is optimization of neural networks for specific applications, which the company says can increase power efficiency by an order of magnitude compared to designs produced with the standard TensorFlow framework.
The ECM3532 is Eta Compute’s first production product. A forerunner, the ECM3531, was only available as engineering samples. It used the same cores, but both SRAM and Flash have been increased in the new version. The previous version applied CVFS only to the microcontroller core, whereas in the ECM3532 Eta Compute has extended the technique to both the microcontroller and the DSP cores.
Samples of the ECM3532 are available now and mass production is expected to start in Q2 2020.