Large language model (LLM) finetuning is a way to enhance the performance of pretrained LLMs on specific tasks or domains, with the aim of achieving improved inference quality with limited resources. Finetuning is crucial for domain-specific applications where pretrained models lack the necessary context, taxonomy, or specialized knowledge. This blog post delves into different finetuning options, including unsupervised, supervised, and instruction-based methods, and discusses the appropriate use case for each. We also discuss advanced techniques for updating pretrained LLM weights, such as full finetuning, adapter-based tuning, and parameter-efficient finetuning, each with distinct advantages and limitations. These techniques enable LLMs to adapt more effectively to new tasks, balancing efficiency with performance depending on the approach chosen.
When trying to apply an LLM-based solution to a business problem, customers often ask whether to finetune the model or optimize the prompt. The answer depends on the complexity of the problem, the size of the dataset, the level of accuracy expected from the system, and the associated budget. You can solve a plethora of business problems by crafting the prompt carefully with the simplest zero-shot approach, which includes no examples. You can solve many more with few-shot or in-context learning approaches, where the prompt contains one or more examples for the LLM to learn from so that it can generate a similar response. You can view these approaches as run-time optimization, where no LLM weights are modified. Despite being simple, easy to implement, and effective, prompt-based techniques don't always work.
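To make the distinction concrete, the following Python sketch contrasts a zero-shot prompt with a few-shot prompt for a sentiment-classification task. The task, labels, and review text are illustrative assumptions rather than part of any specific product API; the point is that only the prompt text changes between the two approaches, never the model weights.

```python
# Zero-shot: the model relies entirely on its pretrained knowledge.
zero_shot_prompt = (
    "Classify the sentiment of the following review as Positive or Negative.\n"
    "Review: The checkout process was painless and delivery was fast.\n"
    "Sentiment:"
)

# Few-shot (in-context learning): the same request, preceded by worked
# examples the model can imitate. No weights are updated; only the prompt
# text changes.
few_shot_prompt = (
    "Classify the sentiment of each review as Positive or Negative.\n\n"
    "Review: The battery died after two days.\n"
    "Sentiment: Negative\n\n"
    "Review: Support resolved my issue in minutes.\n"
    "Sentiment: Positive\n\n"
    "Review: The checkout process was painless and delivery was fast.\n"
    "Sentiment:"
)

# Either prompt would be sent, unchanged, to your LLM endpoint of choice.
print(zero_shot_prompt)
print(few_shot_prompt)
```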
Despite the extensive data used to train LLMs, finetuning is necessary, particularly for domain-specific applications. While pretrained LLMs excel at capturing general language patterns and semantics from vast corpora, their effectiveness on specific tasks within specialized domains can be significantly enhanced through finetuning.
Consider the following use cases in this context:
Because this domain data is unseen during pretraining, pretrained LLMs often fall short of expectations due to their inability to comprehend the intricacies of medical data. This shortfall can lead to inaccurate summaries that have the potential to negatively impact patient care.
In both use cases, generic pretrained LLMs lack specialized domain knowledge and can't produce optimal output. Finetuning on targeted datasets within these specific domains bridges this gap, leading to significant improvements in accuracy and effectiveness.
LLM finetuning comes in several varieties, distinguished by the structure of the training dataset, including the following examples:

- Unsupervised finetuning: Continued training on unlabeled, domain-specific text so the model absorbs the vocabulary, style, and semantics of the target domain.
- Supervised finetuning: Training on labeled input-output pairs, such as documents and their summaries, so the model learns to perform a specific task.
- Instruction-based finetuning: Training on instruction-response pairs so the model learns to follow natural-language instructions across a range of tasks.
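As a rough illustration of how the training data differs across these varieties, the records below sketch what a single example might look like in each case. The field names and contents are hypothetical, not a prescribed schema.

```python
# Unsupervised finetuning: raw domain text, no labels.
unsupervised_example = {
    "text": "The patient presented with acute dyspnea and was started on ..."
}

# Supervised finetuning: an input paired with the desired output.
supervised_example = {
    "input": "Discharge note: The patient presented with acute dyspnea ...",
    "output": "Summary: Patient admitted for acute dyspnea; treated and discharged stable.",
}

# Instruction-based finetuning: an explicit instruction plus the expected response.
instruction_example = {
    "instruction": "Summarize the following discharge note for the attending physician.",
    "input": "The patient presented with acute dyspnea ...",
    "response": "Patient admitted for acute dyspnea; treated and discharged stable.",
}
```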
In the previous section, we explored various methodologies for finetuning LLMs based on the structure of the training dataset. This section dives into the techniques used to update the weights of pretrained LLMs. LLM weights are the parameters learned by the model during training; they determine how input data is processed and transformed into meaningful output, and they form the core of the model's language understanding. Adjusting these weights during finetuning is crucial because it enables the model to better capture the underlying patterns and complexities present in the task data, ultimately optimizing its performance toward the desired objectives.
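As a concrete point of reference, the following minimal sketch uses the open-source Hugging Face transformers and peft libraries (our choice for illustration; the techniques themselves are toolkit-agnostic) to contrast full finetuning, where every weight is trainable, with a parameter-efficient approach such as LoRA, where only small adapter matrices are trained while the pretrained weights stay frozen.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Load a small pretrained model (illustrative choice).
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Full finetuning: every parameter would be updated by the optimizer.
total_params = sum(p.numel() for p in model.parameters())
print(f"Full finetuning trains all {total_params:,} parameters.")

# Parameter-efficient finetuning (LoRA): freeze the base model and train
# only low-rank adapter matrices injected into the attention layers.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's attention projection layer
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # typically well under 1% of the total
```

Either variant can then be trained with a standard training loop; the choice mainly trades adaptation capacity against memory and compute.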
The traditional finetuning techniques described here don't always guarantee human-preferred outputs. Advanced techniques such as reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO) can bridge this gap. We explore these topics in depth in a future article.
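Both RLHF and DPO rely on preference data rather than a single gold answer per prompt. As a hedged illustration (the field names are hypothetical, not a required schema), a single preference record might look like this:

```python
# One preference pair: the "chosen" response is the one humans preferred.
preference_example = {
    "prompt": "Summarize this discharge note for the patient in plain language.",
    "chosen": "You were treated for shortness of breath and are recovering well; "
              "use your inhaler twice daily and follow up in two weeks.",
    "rejected": "Pt adm. w/ acute dyspnea, tx bronchodilators, d/c stable.",
}
```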
Finetuning LLMs presents a flexible and powerful way to tailor advanced AI tools to meet specific business or research needs. By using different finetuning methods—whether unsupervised, supervised, or instruction-based—organizations can significantly enhance the applicability and accuracy of LLMs in specialized domains. Techniques such as full finetuning, adapter-based tuning, and parameter-efficient tuning further refine this customization process, allowing for a targeted approach that maximizes performance while minimizing resource consumption. Ultimately, understanding and applying these techniques can transform a general-purpose LLM into a specialized tool that drives innovation and efficiency in any field.
Read part 1 of this 5-part blog series - "Navigating the frontier: Key considerations for developing a generative AI integration strategy for the enterprise"
Read part 2 of this 5-part blog series - "Comprehensive tactics for optimizing large language models for your application"
Read part 3 of this 5-part blog series - "Beginner’s Guide to Engineering Prompts for LLMs"
Look out for part 5 of this series on retrieval-augmented generation (RAG).
For more information, see the following resources:
Sandip Ghoshal is a Principal Applied Scientist with Oracle Cloud Infrastructure's Generative AI Group. As part of the Gen-AI Sciences team, Sandip implements state-of-the-art machine learning algorithms, builds prototypes, and explores conceptually new solutions for OCI products and customers. Before OCI, Sandip led the machine learning division of Oracle Content Management and was a founder of Oracle Smart Content.
Outside of machine learning, Sandip loves hiking; if you don't hear back from him right away, there's a good chance he's scaling a peak or trekking through a remote trail.
Sid Padgaonkar is a Senior Director with OCI's Strategic Customers Group. Sid is focused on generative AI product incubations, outbound product management, and GTM strategy.