With its Swarm Learning and Machine Learning Development System offerings, HPE wants to help companies realize the benefits of machine learning models faster.
Although companies see the value of machine learning for certain tasks or processes, they can be discouraged by the difficulty of implementing models. HPE wants to help them by launching two solutions: Swarm Learning and Machine Learning Development System. The first is based on swarm learning, which aims to exploit the value of data generated at the edge or across distributed sites. The second provides a hardware-based model training solution so that companies do not have to build their own ad hoc infrastructure.
The Machine Learning Development System is based on the 2021 acquisition of Determined AI. It offers an open source platform to accelerate machine learning model training and integrates it with HPE hardware. Various configurations are possible, but HPE indicates that a typical “small configuration” consists of an Apollo 6500 Gen10 server with sufficient power for training ML models, HPE ProLiant DL325 servers and Aruba CX 6300 switches. It also includes Nvidia’s Quantum InfiniBand networking platform and HPE’s specialized software: Machine Learning Development Environment and Performance Cluster.
Sample pre-packaged configuration for training machine learning models (Photo Credit: HPE)
HPC Computing for ML
According to Peter Rutten, vice president of research at IDC, the concept is essentially about bringing high-performance computing (HPC) capabilities to enterprise machine learning so that companies don’t have to design their own systems. “Now that AI is more mature, companies are really asking for this type of system,” he said.
“Having to design your own system is the biggest barrier to introducing AI into business.” For some, using cloud resources may be an option, but the data needed to train AI models is often sensitive and business-critical, which rules out the cloud for some companies. For others, the regulatory restrictions of certain industries make it absolutely impossible.
Decentralized ML in a “swarm”
With its Swarm Learning product, HPE is trying to address the sensitive nature of machine learning data. Swarm Learning’s decentralized framework uses containerization to achieve two goals. First, it allows machine learning tasks to run on edge systems without having to go back and forth to a central data center, so companies get accurate information faster than they otherwise could. Second, it allows like-minded companies to share what their AI models have learned without having to share the underlying data, which can have benefits for the entire industry. “Take the example of seven hospitals that are trying to solve problems with AI models, but since they cannot share their data, model training will be limited,” explained Rutten. The resulting models will be inaccurate, with inherent potential bias, depending on patient demographics and a host of other factors. “To solve this problem, Swarm Learning doesn’t share the data, only the model training results, and combines them into a single model that will be trained from all the data,” Rutten said.
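To make the idea of sharing training results rather than data more concrete, here is a minimal sketch in Python. It is not HPE’s Swarm Learning API, and it leaves out containerization, networking and security entirely. It assumes a deliberately tiny model (a single slope parameter) and a simple sample-weighted average as the merge step. The function names `local_fit` and `merge`, and the three sites with their numbers, are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: NOT HPE's Swarm Learning API.
# Each site trains on its own private data and shares only the learned
# parameter; the raw records never leave the site.

def local_fit(xs, ys):
    """Least-squares slope through the origin, fitted on one site's data."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

def merge(params, sizes):
    """Combine per-site parameters, weighted by each site's sample count."""
    total = sum(sizes)
    return sum(p * n for p, n in zip(params, sizes)) / total

# Three hypothetical sites (for example hospitals), each with private data.
sites = [
    ([1.0, 2.0], [2.0, 4.0]),
    ([1.0, 2.0, 3.0, 4.0], [3.0, 6.0, 9.0, 12.0]),
    ([1.0, 3.0], [4.0, 12.0]),
]

# Step 1: every site trains locally.
local_params = [local_fit(xs, ys) for xs, ys in sites]
sizes = [len(xs) for xs, _ in sites]

# Step 2: only the parameters and sample counts are exchanged and merged.
merged = merge(local_params, sizes)

print("local parameters:", local_params)
print("merged parameter:", merged)
```

Each site’s slope reflects only its own population, which is the bias Rutten describes. The merged value draws on all three sites even though none of them ever saw another’s records. A real system would repeat this exchange over many training rounds and with far larger models.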
He points out that this “swarm” learning technique is relatively new, which means its widespread adoption could take time. On the other hand, he says, HPE’s Machine Learning Development System directly targets a current pain point, and it is perhaps the more interesting announcement of the two. “It’s almost an as-a-service (aaS) offering for the enterprise data center,” he said. “This is exactly what people look for when training AI models in their business,” he added.