Announcing Llama 3.1 405B and 70B models from Meta on OCI Generative AI

September 10, 2024 | 3 minute read

Today, we’re announcing the general availability of Meta's Llama 3.1 models (70B and 405B) on the Oracle Cloud Infrastructure (OCI) Generative AI service. According to Meta, this release brings open intelligence to all through open source models: it expands the context length to 128K tokens, adds support across eight languages, and includes Llama 3.1 405B, the first frontier-level open source AI model. The result is more flexibility, more control, and state-of-the-art capabilities that rival the best closed source models.

OCI Generative AI offers the following core features:

  • Llama 3.1 405B: On-demand inferencing, dedicated hosting
  • Llama 3.1 70B: On-demand inferencing, dedicated hosting, and fine-tuning
  • Llama 3.1 8B: Available with OCI Data Science through Bring Your Own Model in AI Quick Actions

Llama 3.1 405B is available for on-demand inferencing and dedicated hosting in the Chicago region, plus dedicated hosting in the Frankfurt, London, and São Paulo regions. Llama 3.1 70B is available for on-demand inferencing, dedicated hosting, and fine-tuning in the Chicago, Frankfurt, London, and São Paulo regions.

The Llama 3.1 model family offers the following key features:

  • Model sizes: 8B, 70B, and 405B parameters
  • Context length: 128K tokens, a 16-fold increase over Llama 3
  • Multilingual support: Eight languages (English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai)

Model overview

According to Meta, Llama 3.1 405B is the world’s largest publicly available large language model (LLM), making it ideal for enterprise-level applications and research and development. It excels at general knowledge, synthetic data generation, advanced reasoning and contextual understanding, long-form text, multilingual translation, coding, math, and tool use.

Llama 3.1 70B is perfect for content creation, conversational AI, and enterprise applications. Its strengths include text summarization, classification, sentiment analysis, language modeling, dialogue systems, and coding assistance.

Llama 3.1 8B is optimized for limited computational resources. It’s best for efficient text summarization, classification, sentiment analysis, and translation in low-latency scenarios.

Performance

Meta has rigorously tested Llama 3.1 on over 150 benchmark datasets, demonstrating significant improvements across all major categories compared to Llama 3. You can find more training details and performance benchmarks in the Llama 3.1 model card and evaluations documentation.

The OCI experience so far

On July 31, we announced OCI Data Science’s support for Llama 3.1 405B through a Bring Your Own Container (BYOC) approach. OCI Data Science supports industry-leading GPUs, which makes it possible to deploy a model of this size. The specific version offered is Meta-Llama-3.1-405B-Instruct-FP8, which is quantized to reduce infrastructure requirements.
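Some back-of-the-envelope arithmetic illustrates why FP8 quantization matters at this scale. The sketch below covers model weights only; it ignores the KV cache, activations, and framework overhead, so real deployments need more memory than these figures suggest:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

N = 405e9  # Llama 3.1 405B parameter count

fp16 = weight_memory_gb(N, 2)  # 16-bit weights: 2 bytes each
fp8 = weight_memory_gb(N, 1)   # FP8 weights: 1 byte each

print(f"FP16 weights: ~{fp16:.0f} GB")  # ~810 GB
print(f"FP8 weights:  ~{fp8:.0f} GB")   # ~405 GB
```

Halving the bytes per parameter halves the weight footprint, which is what brings a 405B-parameter model within reach of fewer GPUs.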

With Llama 3.1 now available on OCI Generative AI, OCI customers can use Llama 3.1 405B and 70B as a service, with no containers or infrastructure to manage. Customers can access the models through the chat API, either as on-demand hosted models or through dedicated hosted endpoints. Fine-tuning of Llama 3.1 70B using Low-Rank Adaptation (LoRA) is also supported through our managed fine-tuning offering, with private, dedicated hosting for custom models.
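LoRA keeps the base model weights frozen and trains only a pair of small low-rank matrices per adapted weight matrix, which is what makes fine-tuning a 70B model tractable. A minimal sketch of the parameter-count savings, using hypothetical layer dimensions and rank rather than Llama 3.1's actual shapes or OCI's fine-tuning settings:

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for one frozen d_out x d_in
    weight matrix: a d_out x rank matrix B plus a rank x d_in matrix A."""
    return rank * (d_in + d_out)

# Hypothetical dimensions for illustration only.
d_in = d_out = 8192
rank = 16

full = d_in * d_out  # full fine-tuning updates every weight
lora = lora_trainable_params(d_in, d_out, rank)

print(f"full: {full:,} vs LoRA: {lora:,} "
      f"({100 * lora / full:.2f}% of the weights)")  # 0.39% of the weights
```

Because only the small A and B matrices are trained, the fine-tuned adapter is tiny relative to the base model, which keeps both training cost and custom-model hosting requirements low.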

For more information on using Llama 3.1 models in your AI projects on Oracle Cloud Infrastructure, visit the Generative AI service documentation or contact your Oracle representative.

 


Elena Albright

David Miller

David is a Sr. Principal Product Manager for the OCI Generative AI service.

