Fine-tuning GPT-3 for Improved Performance on Custom Data on AWS

This is the third article in a series that will guide you through the process of creating and fine-tuning an OpenAI chatbot on an AWS EC2 instance.

In this article, we will explore the process of fine-tuning the GPT-3 model with a custom dataset on AWS. GPT-3 is a powerful language model that can be used to create chatbots capable of generating natural language responses. While the pre-trained GPT-3 model is capable of generating coherent and contextually relevant responses, fine-tuning it with a custom dataset can further enhance its language processing capabilities and tailor it to specific use cases.

Before we begin, you will need to deploy Python on AWS, sign up for an OpenAI API key and install OpenAI on AWS.

Fine-tune GPT-3

Step 1: Prepare a Custom Dataset

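The first step is to create a custom dataset in JSONL file format. Each line of the file is a JSON object containing a “prompt” and its corresponding “completion”. The prompt ends with a fixed separator (\n\n###\n\n), and the completion starts with a space and ends with a newline.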

{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text>\n"}
{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text>\n"}
{"prompt": "<prompt text> \n\n###\n\n", "completion": " <ideal generated text>\n"}
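If the questions and answers start out as plain text, a small script can produce lines in this shape. The following is a minimal sketch, not the method used to build the file in this article: the helper names make_record and write_dataset are hypothetical, and the separator and completion conventions are assumed from the training file shown in this step.

```python
import json

# Assumed conventions: the prompt ends with a fixed separator, and the
# completion starts with a space and ends with a newline.
SEPARATOR = " \n\n###\n\n"
STOP = "\n"


def make_record(question: str, answer: str) -> str:
    """Return one JSONL line for a question/answer pair."""
    record = {
        "prompt": question.strip() + SEPARATOR,
        "completion": " " + answer.strip() + STOP,
    }
    return json.dumps(record, ensure_ascii=False)


def write_dataset(pairs, path: str) -> None:
    """Write an iterable of (question, answer) pairs to a JSONL file."""
    with open(path, "w", encoding="utf-8") as handle:
        for question, answer in pairs:
            handle.write(make_record(question, answer) + "\n")


if __name__ == "__main__":
    # Placeholder pair for illustration only; not taken from the real dataset.
    print(make_record("<prompt text>", "<ideal generated text>"))
```

Letting json.dumps do the escaping avoids hand-typed \n sequences, which are easy to get wrong in a file of around 100 entries.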

In our example, the file ‘travel.jsonl’ will contain approximately 100 entries of questions and answers on travel insurance. Below is an extract of the training file.

{"prompt":"What types of travel insurance policies are available? \n\n###\n\n","completion":" There are many different types of travel insurance policies available, including single-trip policies, multi-trip policies, and annual policies. You can also choose policies that offer coverage for specific needs, such as medical emergencies, trip cancellations, or adventure activities.\n"}
{"prompt":"What does travel insurance typically cover? \n\n###\n\n","completion":" Travel insurance can cover a variety of unexpected events, such as trip cancellations, flight delays, medical emergencies, lost or stolen luggage, and emergency evacuation. The specific coverage will depend on the policy you choose.\n"}
{"prompt":"Are there any exclusions or limitations to travel insurance coverage? \n\n###\n\n","completion":" Yes, there are often exclusions or limitations to travel insurance coverage. For example, some policies may not cover pre-existing medical conditions or certain adventure activities. It's important to read the policy carefully and understand the exclusions before purchasing.\n"}
{"prompt":"How do I choose the right travel insurance policy for my needs? \n\n###\n\n","completion":" When choosing a travel insurance policy, consider your destination, activities planned, and any specific risks you may face. It's also important to compare policies and read reviews from other travelers to find a reputable provider.\n"}
{"prompt":"What is the difference between single-trip and multi-trip travel insurance policies? \n\n###\n\n","completion":" Single-trip travel insurance policies provide coverage for one specific trip, while multi-trip policies cover multiple trips within a set time period. Multi-trip policies are often more cost-effective if you travel frequently.\n"}
{"prompt":"How much does travel insurance cost, and how is the premium calculated? \n\n###\n\n","completion":" The cost of travel insurance will depend on factors such as your destination, the length of your trip, and the level of coverage you need. The premium is typically calculated as a percentage of the total trip cost.\n"}
{"prompt":"Can I purchase travel insurance after booking my trip, or do I need to buy it before? \n\n###\n\n","completion":" You can purchase travel insurance after booking your trip, but it's recommended to buy it as soon as possible to ensure coverage for trip cancellations or other unexpected events.\n"}
{"prompt":"What should I do if I need to make a claim on my travel insurance policy? \n\n###\n\n","completion":" If you need to make a claim on your travel insurance policy, contact your provider as soon as possible and provide all necessary documentation, such as receipts or medical records.\n"}
{"prompt":"How can I find a reputable travel insurance provider, and what should I look for when comparing policies? \n\n###\n\n","completion":" To find a reputable travel insurance provider, read reviews from other travelers, and compare policies from multiple providers. When comparing policies, look for coverage that meets your specific needs and be aware of any exclusions or limitations.\n"}
{"prompt":"Does travel insurance cover trip cancellations due to COVID-19? \n\n###\n\n","completion":" Some travel insurance policies may provide coverage for trip cancellations due to COVID-19, while others may not. It's important to check the policy's specific coverage and exclusions before purchasing.\n"}
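Small slips in a hand-edited file, such as a prompt that is missing part of its separator, are easy to overlook. The sketch below is a quick local check that can be run before the official tool. It is only an illustration under the formatting conventions used in this article; check_line and check_file are hypothetical helper names, and the script does not replace OpenAI's own validation.

```python
import json

SEPARATOR = "\n\n###\n\n"  # separator every prompt in this article ends with


def check_line(line: str) -> list:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg}"]
    if not isinstance(record, dict):
        return ["line is not a JSON object"]
    problems = []
    prompt = record.get("prompt")
    completion = record.get("completion")
    if not isinstance(prompt, str) or not prompt.endswith(SEPARATOR):
        problems.append("prompt does not end with the separator")
    if not isinstance(completion, str) or not completion.startswith(" "):
        problems.append("completion does not start with a space")
    if not isinstance(completion, str) or not completion.endswith("\n"):
        problems.append("completion does not end with a newline")
    return problems


def check_file(path: str) -> dict:
    """Map 1-based line numbers to their problems for a JSONL file."""
    report = {}
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if line.strip():
                problems = check_line(line)
                if problems:
                    report[number] = problems
    return report
```

Calling check_file("travel.jsonl") returns an empty dictionary when every line follows the conventions, and otherwise lists the offending line numbers.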

Step 2: Validate the Training Data

We will validate the training data using a CLI data preparation tool provided by OpenAI.

Run this command in the OpenAI CLI:

openai tools fine_tunes.prepare_data -f travel.jsonl

[Screenshot: output of the data preparation tool]

Step 3: Upload Training Data and Train Model

Next, we will upload the training data and train the model with the command:

openai --api-key 'sk-xxxx' api fine_tunes.create -t "travel.jsonl" -m davinci

Replace ‘sk-xxxx’ with your API key. The original GPT-3 base models are the only models that are available for fine-tuning. We will select the davinci model for fine-tuning.
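Typing the API key on the command line leaves it in the shell history. As a hedged alternative, the sketch below launches the same fine_tunes.create command from Python and passes the key through the environment. It assumes the OpenAI CLI picks up the OPENAI_API_KEY environment variable; the function names build_fine_tune_command and start_fine_tune are hypothetical.

```python
import os
import subprocess


def build_fine_tune_command(training_file: str, base_model: str = "davinci") -> list:
    """Build the argument list for the fine_tunes.create command used above."""
    return ["openai", "api", "fine_tunes.create", "-t", training_file, "-m", base_model]


def start_fine_tune(training_file: str, base_model: str = "davinci") -> int:
    """Run the CLI with the key taken from the environment; return its exit code."""
    # Assumption: the CLI reads the key from OPENAI_API_KEY when --api-key is omitted.
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first")
    # The child process inherits the current environment, including the key.
    completed = subprocess.run(build_fine_tune_command(training_file, base_model))
    return completed.returncode
```

With the key exported in the shell, start_fine_tune("travel.jsonl") runs the same upload and training step as the command above.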

The model options are shown on the OpenAI website.

[Screenshot: progress of the fine-tuning]

If the stream is interrupted, we can restart it with the command:

openai --api-key 'sk-xxxx' api fine_tunes.follow -i ft-xxxx

Replace ‘sk-xxxx’ with your API key and ‘ft-xxxx’ with the ID of the fine-tuning job.

[Screenshot: model fine-tuning output]

Step 4: Test Fine-Tuned Model on OpenAI CLI

We can test the fine-tuned model via the OpenAI CLI with the command:

openai --api-key 'sk-xxxx' api completions.create -m davinci:ft-xxxx-xx-xx-xx-xx-xx -p "Can I purchase travel insurance for a cruise?"

Again, replace ‘sk-xxxx’ with your API key and ‘davinci:ft-xxxx-xx-xx-xx-xx-xx’ with the name of your fine-tuned model.
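The training prompts all end with the separator \n\n###\n\n, and a fine-tuned model generally responds best when test prompts follow the same format. The sketch below shows one way to format a question before sending it and to tidy up the returned text. The helper names format_prompt and trim_completion are hypothetical, and the conventions are assumed from the training file in Step 1.

```python
SEPARATOR = " \n\n###\n\n"  # same suffix as the training prompts
STOP = "\n"                 # training completions end with a newline


def format_prompt(question: str) -> str:
    """Append the training separator to a plain question."""
    return question.strip() + SEPARATOR


def trim_completion(text: str) -> str:
    """Keep the text before the first stop sequence and drop the leading space."""
    head, _, _ = text.partition(STOP)
    return head.strip()


if __name__ == "__main__":
    # repr() makes the appended separator visible in the output.
    print(repr(format_prompt("Can I purchase travel insurance for a cruise?")))
```

The formatted string is what would be passed as the prompt, and trim_completion cuts the answer off at the point where a training completion would have ended.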

[Screenshot: result from the fine-tuned model]

Step 5: Test Fine-Tuned Model on the Playground

The steps are as follows:

  • Navigate to the OpenAI Playground.
  • In the Model dropdown menu, select the fine-tuned model to test.
  • In the right panel, set the parameters Temperature = 0, Maximum length = 150, Stop sequences = ###, Top P = 1, Frequency penalty = 0, Presence penalty = 0.
  • In the Prompt box, enter your question “Can I purchase travel insurance for a cruise?” and click the ‘Submit’ button.
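For reference, the Playground settings listed above correspond to parameters of the completions API. The sketch below only builds the request body as a dictionary and prints it; it does not call the API. The function name playground_settings is hypothetical, and the model name is the placeholder used earlier in this article.

```python
import json


def playground_settings(model: str, prompt: str) -> dict:
    """Return a completions request body matching the Playground settings."""
    return {
        "model": model,
        "prompt": prompt,
        "temperature": 0,
        "max_tokens": 150,  # "Maximum length" in the Playground
        "stop": ["###"],    # "Stop sequences" in the Playground
        "top_p": 1,
        "frequency_penalty": 0,
        "presence_penalty": 0,
    }


if __name__ == "__main__":
    body = playground_settings(
        "davinci:ft-xxxx-xx-xx-xx-xx-xx",  # placeholder model name
        "Can I purchase travel insurance for a cruise?",
    )
    print(json.dumps(body, indent=2))
```

A temperature of 0 makes the output close to deterministic, which is convenient when comparing Playground results with results from the CLI.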

[Screenshot: OpenAI Playground for Fine-tuned Model]

Final Thoughts

Do you need help with fine-tuning your own GPT-3 model?

I am happy to help. Don’t hesitate to reach out.
