Meta Platforms Inc. today announced its latest big contribution to the Open Compute Project: Grand Teton, a new, next-generation platform for artificial intelligence at large scale.
Meta was famously one of the founding members of the OCP, which came into being way back in 2011. At the time it was focused mostly on creating standards for the interoperability of data center hardware such as servers and networking systems, but today its work is much more focused.
In a blog post, Meta said one of the greatest challenges it faces today is scaling up AI workloads, which have become increasingly powerful and sophisticated and require high-performance infrastructure to support them.
“How can we continue to facilitate and run the models that drive the experiences behind today’s innovative products and services?” Meta Vice President of Engineering Alexis Björlin asked. “And what will it take to enable the AI behind the innovative products and services of the future? As we move into the next computing platform, the metaverse, the need for new open innovations to power AI becomes even clearer.”
This is the logic behind Meta’s decision to contribute Grand Teton, its next-gen graphics processing unit-based hardware platform for AI, to the OCP. Announced at today’s OCP Global Summit in San Jose, California, Grand Teton is said to be the successor to Meta’s Zion-EX platform.
It comes with multiple performance enhancements, including four times the host-to-GPU bandwidth, twice the compute and data network bandwidth, and twice the power envelope. It also features an integrated chassis, whereas Zion-EX consisted of multiple independent subsystems.
Meta explained that Grand Teton is designed to better support memory-bandwidth-bound workloads at Meta, such as its open-source deep learning recommendation models. It’s also said to be optimized for compute-bound workloads such as content understanding.
The integrated chassis design, meanwhile, makes Grand Teton easier to deploy, as it can be introduced into data center fleets much faster. It also has fewer potential points of failure, enabling greater scale with more reliability, Meta said.
Grand Teton was actually just one of several new contributions Meta has made to the OCP. Other announcements included Open Rack v3, the latest edition of Meta’s Open Rack hardware, designed with increased flexibility. Open Rack v3 comes with a frame and the power infrastructure to support multiple use cases, including Grand Teton.
In addition, Meta announced a next-generation storage architecture for AI infrastructure called Grand Canyon, featuring improved hardware security and support for future upgrades of key commodities. The platform is designed to support higher-density hard disk drives without performance degradation and with improved power efficiency.
Meta explained the need for these innovations, saying that although it’s all-in on AI, the future of AI cannot be built by Meta alone. Rather, it said, AI development will only come from collaboration and the sharing of ideas and technologies such as the above, through organizations like the OCP.
“We’re eager to continue working together to build new tools and technologies to drive the future of AI,” Björlin said. “Whether it’s developing new approaches to AI today or radically rethinking hardware design and software for the future, we’re excited to see what the industry has in store next.”
Image: Meta Platforms