Today, we’re announcing the general availability of Meta's Llama 3.1 models (70B and 405B) on the Oracle Cloud Infrastructure (OCI) Generative AI service. According to Meta, this release brings open intelligence to all through its open source models: it expands context length to 128K, adds support across eight languages, and includes Llama 3.1 405B, the first frontier-level open source AI model. The release delivers more flexibility, control, and state-of-the-art capabilities that rival the best closed source models.
OCI Generative AI offers the following regional availability:

- Llama 3.1 405B: On-demand inferencing and dedicated hosting in the Chicago region, plus dedicated hosting in the Frankfurt, London, and São Paulo regions.
- Llama 3.1 70B: On-demand inferencing, dedicated hosting, and fine-tuning in the Chicago, Frankfurt, London, and São Paulo regions.
The Llama 3.1 model family offers the following key features:
According to Meta, Llama 3.1 405B is the world’s largest publicly available large language model (LLM) and is ideal for enterprise-level applications and research and development. It excels in general knowledge, synthetic data generation, advanced reasoning and contextual understanding, long-form text, multilingual translation, coding, math, and tool use.
Llama 3.1 70B is perfect for content creation, conversational AI, and enterprise applications. Its strengths include text summarization, classification, sentiment analysis, language modeling, dialogue systems, and coding assistance.
Llama 3.1 8B is optimized for environments with limited computational resources. It’s best suited for efficient text summarization, classification, sentiment analysis, and translation in low-latency scenarios.
Meta has rigorously tested Llama 3.1 on over 150 benchmark datasets, demonstrating significant improvements across all major categories compared to Llama 3. You can find more training details and performance benchmarks in the Llama 3.1 model card and evaluations documentation.
On July 31, we announced OCI Data Science’s support for Llama 3.1 405B through a Bring Your Own Container (BYOC) approach. OCI Data Science supports industry-leading GPUs, making it possible to deploy this large model. The specific version is Meta-Llama-3.1-405B-Instruct-FP8, which is quantized to FP8 to reduce infrastructure requirements.
With the availability of Llama 3.1 on OCI Generative AI, OCI customers can use Llama 3.1 405B and 70B as a service with no need to manage containers or infrastructure. Customers can access the models through the chat API, either as on-demand hosted models or through dedicated hosted endpoints. Fine-tuning of Llama 3.1 70B using Low-Rank Adaptation (LoRA) is also supported through our managed fine-tuning offering, with private, dedicated hosting for custom models.
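As a rough illustration, the following sketch calls a Llama 3.1 model through the chat API using the OCI Python SDK. The region endpoint, compartment OCID, and model ID shown here are assumptions for illustration; check the Generative AI service documentation and the OCI console for the exact identifiers available in your tenancy.

```python
import oci

# Authenticate with the default profile in ~/.oci/config.
config = oci.config.from_file()

# Region-specific inference endpoint (Chicago shown as an example).
endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
client = oci.generative_ai_inference.GenerativeAiInferenceClient(
    config, service_endpoint=endpoint
)

# Build a generic-format chat request for the Llama model.
content = oci.generative_ai_inference.models.TextContent(
    text="Summarize the benefits of open source LLMs in two sentences."
)
message = oci.generative_ai_inference.models.Message(role="USER", content=[content])
chat_request = oci.generative_ai_inference.models.GenericChatRequest(
    api_format=oci.generative_ai_inference.models.BaseChatRequest.API_FORMAT_GENERIC,
    messages=[message],
    max_tokens=400,
    temperature=0.5,
)

chat_details = oci.generative_ai_inference.models.ChatDetails(
    compartment_id="ocid1.compartment.oc1..<your_compartment_ocid>",  # placeholder
    serving_mode=oci.generative_ai_inference.models.OnDemandServingMode(
        model_id="meta.llama-3.1-405b-instruct"  # assumed model ID; verify in the console
    ),
    chat_request=chat_request,
)

response = client.chat(chat_details)
print(response.data.chat_response.choices[0].message.content[0].text)
```

For a dedicated hosted endpoint, the same call shape should apply, swapping OnDemandServingMode for DedicatedServingMode with your endpoint’s OCID.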
For more information on using Llama 3.1 models in your AI projects on Oracle Cloud Infrastructure, visit the Generative AI service documentation or contact your Oracle representative.
David is a Sr. Principal Product Manager for the OCI Generative AI service.