
Meta Begins Training Llama-3

Announced on Instagram by Mark Zuckerberg, Meta is looking to reach new heights in the world of LLMs.
Created on January 19 | Last edited on January 19
Meta, led by CEO Mark Zuckerberg, has begun developing its latest generative AI model, Llama 3, as disclosed in a recent video posted on Instagram. The move marks a significant stride in Meta's AI efforts. Zuckerberg reiterated the company's intent to release its AI models as open source where feasible, emphasizing an ongoing commitment to open AI development.

Reorganization

In a detailed Instagram video, Zuckerberg revealed organizational changes to bolster these AI efforts. The Fundamental AI Research unit, previously part of the Reality Labs team under CTO Andrew Bosworth, will now integrate with the Family of Apps unit led by Chris Cox. This reorganization brings AI research and product development closer together, aligning with Meta's goal of achieving artificial general intelligence. Yann LeCun and Joelle Pineau, leaders in AI research, will report directly to Cox following this shift.

H100s

Highlighting the scale of Meta's investment in AI, Zuckerberg shared that the company plans to end the year with over 340,000 Nvidia H100 GPUs, and nearly 600,000 H100-equivalents of compute when other chips are included. This substantial hardware buildout is a clear indicator of Meta's commitment to advancing its AI capabilities.

Performance Gains

The Nvidia H100 is the successor to the A100, offering substantially higher AI performance and efficiency. For comparison, GPT-4 was reportedly trained on roughly 25,000 A100s, a testament to what even the older architecture could handle at scale. The H100's newer architecture promises even greater headroom for future AI development, and Meta's massive H100 investment hints at preparations to launch a serious competitor to OpenAI's ChatGPT.

Redefining Scale

These developments represent a major step in Meta's AI ambitions, reflecting the company's vision for artificial intelligence as a core part of its product experiences. While a large share of Meta's H100s will likely serve inference and other AI workloads, the sheer scale of the investment also points to a push for far greater training capacity. With this many high-performance GPUs, Meta appears to be gearing up for more ambitious training runs, potentially setting new standards for scale in the field.

Tags: ML News