LLM Trainer
Now that you have generated your LLM dataset, you can train your model. This article is a deep dive into the LLM Trainer and all of its settings. For a short training overview, see Templates: Train an LLM.
- Start by creating a new Canvas. Drag the LLM Trainer Element onto the Canvas.
- Open the LLM Trainer Element settings and make the following adjustments:
- Dataset Folder Path: Using the “Select Directory” button, choose the folder where you saved your LLM dataset during LLM Dataset Generation.
- Artifact Save Path: An optional setting that saves a backup copy of the adapter file outside of Navigator’s built-in Artifact Registry.
- Base Model Assets Path: Using the “Select Directory” button, choose the folder where you would like to save your base model.
- Evaluator API Key: Add a Groq, OpenAI, Claude, or Gemini API key to enable the Faithfulness and Relevancy benchmarks in your training metrics. You will still get BLEU and ROUGE scores without an API key.
If you need a free API key, you can generate one for Groq here.
Model Selection
Navigator supports a variety of open-source LLMs, and more are added regularly. Select the model you want to train here.
If you expect to train or run inference on consumer hardware, we recommend a 7B parameter model for most use cases. This is currently the sweet spot between model performance and size on consumer devices. All LLM features in Navigator work well with 7B parameter models.
See the full list of supported models HERE
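To get a feel for why model size matters on consumer devices, the short sketch below estimates the memory needed just to hold a model's weights at different bit widths. It is back-of-the-envelope arithmetic, not a description of how Navigator allocates memory; the function name `weight_memory_gb` is made up for this example, and real training needs additional memory beyond the weights (gradients, optimizer state, activations).

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = num_params * bits_per_param / 8  # 8 bits per byte
    return total_bytes / 1e9


# Rough weight footprint of a 7B parameter model at common bit widths.
for bits in (32, 16, 8, 4):
    print(f"7B parameters at {bits}-bit: ~{weight_memory_gb(7e9, bits):.1f} GB")
```

The same function can be used to compare other model sizes, for example `weight_memory_gb(13e9, 16)` for a 13B parameter model stored at 16 bits per parameter.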
Advanced Settings
- Dataset Utilization Rate: The percentage of the dataset to use for training.
- Batch Size: The number of dataset entries to train on at one time. Lower numbers use less memory; higher numbers train faster. We recommend starting with 4.
- Learning Rate: The step size by which the model weights (parameters) are updated during training. A lower learning rate takes smaller steps, so it usually converges more slowly but is more stable than a higher learning rate.
- Quantization Rank: Reduces the number of bits used to train each parameter. Think of this like compressing an audio file, or lowering the resolution of an image or video. Smaller Quantization Ranks use less memory but lose accuracy.
- Seed: The starting point for random number generation, typically used to make experiments repeatable.
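The settings above can be illustrated with a small, generic sketch. It is not Navigator's training loop: the helpers `plan_epoch` and `gradient_step` are hypothetical names, and the sketch assumes the utilization rate is a whole-number percentage and that weights are updated with a plain gradient-descent step. It shows how a seed makes the sampled subset repeatable, how batch size determines the number of steps, and how the learning rate scales each update.

```python
import random


def plan_epoch(num_entries, utilization_percent, batch_size, seed):
    """Pick a reproducible subset of entry indices and split it into batches."""
    rng = random.Random(seed)           # same seed -> same shuffle order
    indices = list(range(num_entries))
    rng.shuffle(indices)
    keep = num_entries * utilization_percent // 100  # entries actually used
    subset = indices[:keep]
    return [subset[i:i + batch_size] for i in range(0, keep, batch_size)]


def gradient_step(weight, gradient, learning_rate):
    """One plain gradient-descent update: move against the gradient."""
    return weight - learning_rate * gradient


batches = plan_epoch(num_entries=100, utilization_percent=50, batch_size=4, seed=42)
print(len(batches), "batches per epoch")

# A smaller learning rate moves the weight less for the same gradient.
print(gradient_step(1.0, 2.0, learning_rate=0.1))
print(gradient_step(1.0, 2.0, learning_rate=0.01))
```

Running `plan_epoch` twice with the same seed returns identical batches, which is what makes two training runs comparable; changing the seed changes which entries are sampled and in what order.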
Training Metrics
To see the training metrics for current and past runs, click the “Canvas” dropdown at the top of the menu, then select “Run History”.
From here, you can open the Summary and Report pages for each of your runs.
Hover over the tooltip at the end of each score to learn more about what it means for your model.
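For intuition about the overlap-based scores, the sketch below computes two simplified building blocks: ROUGE-1 recall (how many reference words appear in the model output) and the clipped unigram precision used inside BLEU (how many output words appear in the reference). This is an illustration only and is not how Navigator computes its metrics; the function names are made up for this example, tokenization is a naive whitespace split, and full BLEU additionally combines longer n-grams with a brevity penalty.

```python
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> int:
    """Count words shared by both texts, clipped to the smaller count per word."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    return sum((cand & ref).values())   # Counter intersection keeps min counts


def rouge1_recall(candidate: str, reference: str) -> float:
    """Share of reference words that the candidate recovered."""
    ref_len = len(reference.split())
    return unigram_overlap(candidate, reference) / ref_len if ref_len else 0.0


def unigram_precision(candidate: str, reference: str) -> float:
    """Share of candidate words that also appear in the reference."""
    cand_len = len(candidate.split())
    return unigram_overlap(candidate, reference) / cand_len if cand_len else 0.0


reference = "the cat sat on the mat"
candidate = "the cat sat"
print("ROUGE-1 recall:   ", rouge1_recall(candidate, reference))
print("Unigram precision:", unigram_precision(candidate, reference))
```

In this example the short output is fully correct word for word but covers only half of the reference, which is why precision-style and recall-style scores are usually read together.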