# Training

## Description

Use the `teklia-qwen train` command to train a new Qwen model.
| Parameter | Description | Type | Default |
|---|---|---|---|
| `--train-dataset` | Path to the train set in JSONL format. | | |
| `--val-dataset` | Path to the validation set in JSONL format. | | |
| `--model` | Path to the model to fine-tune. | | |
| `--config` | Path to the training configuration. More below. | | |
| `--resume-from-ckpt` | Path to the model to resume from. | | |
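Since both datasets must be in JSONL format (one JSON object per line), a quick sanity check before launching a training can catch malformed lines early. This is a minimal sketch; `validate_jsonl` is a hypothetical helper, not part of the CLI, and it does not check the record schema:

```python
import json
from pathlib import Path


def validate_jsonl(path):
    """Check that every non-empty line of a JSONL file parses as JSON.

    Returns the parsed records, or raises ValueError pointing at the
    first malformed line. (Hypothetical helper, not part of teklia-qwen.)
    """
    records = []
    for lineno, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # allow blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"{path}:{lineno}: invalid JSON ({exc})") from exc
    return records
```

Running it over `train.jsonl` and `val.jsonl` before a long training run is cheap insurance against a crash partway through the first epoch.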
## Training configuration

A sample configuration file is available at `qwen/train/config.yaml`, where each parameter is described.

The first block must be updated for each training:

```yaml
output_dir: "output" # Directory to save the model
num_train_epochs: 10 # Number of training epochs
per_device_train_batch_size: 4 # Batch size for training
per_device_eval_batch_size: 4 # Batch size for evaluation
report_to: "wandb" # Reporting tool for tracking metrics
max_length: null # Do not truncate input tokens (default: 1024)
run_name: "QWEN Fine-tuning" # Set custom run name for wandb
```

Other parameters should generally remain unchanged and only be modified with caution.
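When running many experiments, the per-experiment block can also be generated from a script. This is a minimal sketch; `render_config` is a hypothetical helper, not part of the tool, and it only handles the simple scalar values shown above:

```python
def render_config(**params):
    """Render the per-experiment config block as YAML-style key: value lines.

    Handles only scalars (str, int, None) as used in the sample block;
    hypothetical helper, not part of teklia-qwen.
    """
    lines = []
    for key, value in params.items():
        if value is None:
            rendered = "null"
        elif isinstance(value, str):
            rendered = f'"{value}"'
        else:
            rendered = str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines) + "\n"


text = render_config(
    output_dir="output",
    num_train_epochs=10,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    report_to="wandb",
    max_length=None,
    run_name="QWEN Fine-tuning",
)
print(text)
```

The rendered block can then be prepended to a copy of `qwen/train/config.yaml` for each experiment.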
## Requirements

Training depends on Flash Attention. Make sure to install it by running:

```shell
pip install -e .[flash]
```
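To verify the install succeeded without starting a training, one can check that the package is importable. A sketch, assuming Flash Attention imports under the usual `flash_attn` module name:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "flash_attn" is the usual import name for Flash Attention (assumption).
print(is_installed("flash_attn"))
```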
## Examples

### Train a model

- Command to use:

  ```shell
  WANDB_MODE=offline teklia-qwen train --train-dataset train.jsonl \
      --val-dataset val.jsonl \
      --model /models/QWEN/Qwen3-VL-8B-Instruct/ \
      --config experiment.yaml
  ```

- Output: All checkpoints will be saved in `config.output_dir`.

- Synchronize with Weights & Biases:

  ```shell
  wandb sync wandb/offline-run-yyyymmdd_hhmmss-runid/
  ```
### Resume training

You can also resume a training with `--resume-from-ckpt`.

- Command to use:

  ```shell
  WANDB_MODE=offline teklia-qwen train --train-dataset train.jsonl \
      --val-dataset val.jsonl \
      --model /models/QWEN/Qwen3-VL-8B-Instruct/ \
      --config experiment.yaml \
      --resume-from-ckpt output/checkpoint-5/
  ```

- Output: This will:
  - shuffle the dataset with a new data seed,
  - resume the trainer state,
  - resume the wandb run.
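When resuming, one usually wants the most recent checkpoint under `config.output_dir`. A sketch of picking it automatically; `latest_checkpoint` is a hypothetical helper that assumes the `checkpoint-N` directory naming shown in the example above:

```python
from pathlib import Path


def latest_checkpoint(output_dir):
    """Return the checkpoint-N subdirectory with the highest N, or None.

    Assumes checkpoints are saved as <output_dir>/checkpoint-<step>/,
    as in the example above. Hypothetical helper, not part of teklia-qwen.
    """
    candidates = [
        p for p in Path(output_dir).glob("checkpoint-*")
        if p.is_dir() and p.name.rsplit("-", 1)[-1].isdigit()
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: int(p.name.rsplit("-", 1)[-1]))
```

The returned path can then be passed to `--resume-from-ckpt`.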