A Step-by-Step Coding Guide to Efficiently Fine-Tune Qwen3-14B Using Unsloth AI on Google Colab with Mixed Datasets and LoRA Optimization


Fine-tuning LLMs often requires extensive resources, time, and memory, challenges that can hinder rapid experimentation and deployment. Unsloth AI streamlines this process by enabling fast, efficient fine-tuning of state-of-the-art models like Qwen3-14B with minimal GPU memory, leveraging advanced techniques such as 4-bit quantization and LoRA (Low-Rank Adaptation). In this tutorial, we walk through a practical implementation on Google Colab to fine-tune Qwen3-14B using a combination of reasoning and instruction-following datasets. By combining Unsloth’s FastLanguageModel utilities with trl.SFTTrainer, users can achieve powerful fine-tuning performance with just consumer-grade hardware.

%%capture
import os
if "COLAB_" not in "".join(os.environ.keys()):
    # Outside Colab: let pip resolve all dependencies
    !pip install unsloth
else:
    # On Colab: install pinned packages without dependency resolution
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
    !pip install --no-deps unsloth

We install all the essential libraries required for fine-tuning the Qwen3 model using Unsloth AI. The cell installs dependencies conditionally based on the environment: outside Colab it runs a plain pip install unsloth, while on Colab it installs pinned packages with --no-deps to ensure compatibility with the preinstalled environment and reduce overhead. Key components like bitsandbytes, trl, xformers, and unsloth_zoo are included to enable 4-bit quantized training and LoRA-based optimization.
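The environment check in the install cell can be isolated as a small function. This is only a sketch: running_on_colab is a hypothetical helper name, and it assumes, as the cell does, that Colab sets at least one environment variable whose name contains COLAB_. It tests each name separately, which is slightly stricter than joining all names into one string.

```python
import os


def running_on_colab(environ=None):
    """Return True if any environment variable name contains 'COLAB_'.

    Hypothetical helper mirroring the check in the install cell.
    Pass a dict to test the logic without touching the real environment.
    """
    env = os.environ if environ is None else environ
    return any("COLAB_" in name for name in env)


if __name__ == "__main__":
    print("Colab detected:", running_on_colab())
```

Passing a plain dictionary makes the branch easy to check before deciding which install path a notebook should take.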

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Qwen3-14B",
    max_seq_length = 2048,    # context length used during training
    load_in_4bit = True,      # 4-bit quantization to reduce memory
    load_in_8bit = False,
    full_finetuning = False,  # parameter-efficient fine-tuning only
)

We load the Qwen3-14B model using FastLanguageModel from the Unsloth library, which is optimized for efficient fine-tuning. The call sets a maximum sequence length of 2048 tokens and loads the model in 4-bit precision, significantly reducing memory usage. Full fine-tuning is disabled, making the setup suitable for lightweight parameter-efficient techniques like LoRA.
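To give a rough sense of why 4-bit loading matters, the weight memory can be estimated with simple arithmetic. The sketch below assumes a nominal 14 billion parameters (an assumption taken from the model name) and counts only the weights; it ignores quantization constants, layers kept in higher precision, activations, LoRA parameters, and optimizer state, so real usage will be higher.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


NUM_PARAMS = 14e9  # assumption: nominal "14B" parameter count

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # 16-bit weights
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 4-bit weights

if __name__ == "__main__":
    print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
    print(f" 4-bit weights: ~{int4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint, which is what brings a model of this size within reach of a single Colab GPU.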

model = FastLanguageModel.get_peft_model(
    model,
    r = 32,  # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 32,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = False,
    loftq_config = None,
)

We apply LoRA (Low-Rank Adaptation) to the Qwen3 model using FastLanguageModel.get_peft_model. It injects trainable adapters into specific transformer layers (the attention projections q_proj, k_proj, v_proj, and o_proj, and the MLP projections gate_proj, up_proj, and down_proj) with a rank of 32, enabling efficient fine-tuning while keeping the base model weights frozen. Using “unsloth” gradient checkpointing further optimizes memory usage, making it suitable for training large models on limited hardware.
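The saving from a rank-32 adapter can be illustrated by counting parameters. The sketch below uses a hypothetical 4096 x 4096 projection (not the actual Qwen3-14B layer sizes) and the standard LoRA formulation, in which a frozen weight is augmented by two low-rank factors and the adapter output is scaled by lora_alpha / r (or lora_alpha / sqrt(r) when rsLoRA is enabled). The function names are illustrative.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_scaling(lora_alpha, rank, use_rslora=False):
    """Scale applied to the adapter output: alpha / r, or alpha / sqrt(r) with rsLoRA."""
    return lora_alpha / (rank ** 0.5 if use_rslora else rank)


# Hypothetical layer size, chosen only for illustration.
D_IN, D_OUT, RANK = 4096, 4096, 32

dense_params = D_IN * D_OUT
adapter_params = lora_trainable_params(D_IN, D_OUT, RANK)
trainable_fraction = adapter_params / dense_params

if __name__ == "__main__":
    print(f"dense: {dense_params:,}  adapter: {adapter_params:,}")
    print(f"trainable fraction: {trainable_fraction:.4%}")
    print("scaling with r=32, alpha=32:", lora_scaling(32, 32))
```

With r = 32 and lora_alpha = 32, as in the configuration above, the scaling factor is 1, so the adapter update is added to the frozen projection without rescaling.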

from datasets import load_dataset

reasoning_dataset = load_dataset("unsloth/OpenMathReasoning-mini", split="cot")
non_reasoning_dataset = load_dataset("mlabonne/FineTome-100k", split="train")

We load two pre-curated datasets from the Hugging Face Hub using the datasets library. The reasoning_dataset contains chain-of-thought (CoT) problems from Unsloth’s OpenMathReasoning-mini, designed to enhance logical reasoning in the model. The non_reasoning_dataset pulls general instruction-following data from mlabonne’s FineTome-100k, which helps the model learn broader conversational and task-oriented skills. Together, these datasets support a well-rounded fine-tuning objective.

def generate_conversation(examples):
    # Batched mapping function: each field holds a list of values
    problems = examples["problem"]
    solutions = examples["generated_solution"]
    conversations = []
    for problem, solution in zip(problems, solutions):
        conversations.append([
            {"role": "user", "content": problem},
            {"role": "assistant", "content": solution},
        ])
    return {"conversations": conversations}

This function, generate_conversation, transforms raw question–answer pairs from the reasoning dataset into a chat-style format suitable for fine-tuning. For each problem and its corresponding generated solution, a conversation is constructed in which the user asks the question and the assistant provides the answer. The output is a dictionary whose “conversations” entry is a list of conversations, each a list of role/content dictionaries following the structure expected by chat-based language models, preparing the data for tokenization with a chat template.
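To see the shape of the output, the function can be run on a small hand-written batch. The toy problems below are made up for illustration and are not rows from the actual dataset; the function body is repeated so the snippet runs on its own.

```python
def generate_conversation(examples):
    """Turn batched problem/solution columns into chat-style conversations."""
    problems = examples["problem"]
    solutions = examples["generated_solution"]
    conversations = []
    for problem, solution in zip(problems, solutions):
        conversations.append([
            {"role": "user", "content": problem},
            {"role": "assistant", "content": solution},
        ])
    return {"conversations": conversations}


# Toy batch in the column-oriented layout that a batched map would pass in.
toy_batch = {
    "problem": ["What is 2 + 3?", "Solve x + 1 = 4."],
    "generated_solution": ["2 + 3 = 5.", "Subtract 1 from both sides: x = 3."],
}

result = generate_conversation(toy_batch)

if __name__ == "__main__":
    for conversation in result["conversations"]:
        print(conversation)
```

Each conversation is a two-turn exchange, so a batch of N problems yields N user/assistant pairs under the single "conversations" key.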

# Build the "conversations" column with generate_conversation before templating
reasoning_conversations = tokenizer.apply_chat_template(
    reasoning_dataset.map(generate_conversation, batched=True)["conversations"],
    tokenize=False,
)

from unsloth.chat_templates import standardize_sharegpt

dataset = standardize_sharegpt(non_reasoning_dataset)
non_reasoning_conversations = tokenizer.apply_chat_template(
    dataset["conversations"],
    tokenize=False,
)

import pandas as pd

chat_percentage = 0.75
# Sample as many chat rows as (1 - chat_percentage) of the reasoning row count
non_reasoning_subset = pd.Series(non_reasoning_conversations).sample(
    int(len(reasoning_conversations) * (1.0 - chat_percentage)),
    random_state=2407,
)

data = pd.concat([
    pd.Series(reasoning_conversations),
    pd.Series(non_reasoning_subset),
])
data.name = "text"

We prepare the fine-tuning dataset by converting the reasoning and instruction datasets into a consistent chat format and then combining them. The code first applies the tokenizer’s apply_chat_template to convert structured conversations into tokenizable strings. The standardize_sharegpt function normalizes the instruction dataset into a compatible structure. Then, the two sources are mixed: the code samples a number of non-reasoning (instruction) conversations equal to 25% of the reasoning set’s size and combines them with all of the reasoning data, so the final mix is roughly 80% reasoning and 20% instruction data. This blend ensures the model is exposed to logical reasoning and general instruction-following tasks, improving its versatility during training. The final combined data is stored as a single-column Pandas Series named “text”.
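The sampling arithmetic is easy to misread, so it helps to work it through on a hypothetical row count. The sketch below reproduces the expression from the cell with 1,000 reasoning rows (an illustrative number, not the real dataset size) and shows that the reasoning data ends up as 80% of the mix. A second helper, chat_rows_for_share, is a hypothetical alternative showing how many chat rows would be needed if an exact target share were wanted instead.

```python
def mix_counts(n_reasoning, chat_percentage):
    """Replicate the cell's sampling size and return (n_chat, reasoning_share)."""
    n_chat = int(n_reasoning * (1.0 - chat_percentage))
    total = n_reasoning + n_chat
    return n_chat, n_reasoning / total


def chat_rows_for_share(n_reasoning, chat_share):
    """Hypothetical alternative: chat rows needed so chat is `chat_share` of the total."""
    return int(n_reasoning * chat_share / (1.0 - chat_share))


n_chat, reasoning_share = mix_counts(1000, 0.75)
n_chat_exact = chat_rows_for_share(1000, 0.25)

if __name__ == "__main__":
    print(f"sampled chat rows: {n_chat}, reasoning share: {reasoning_share:.0%}")
    print(f"chat rows for an exact 25% share: {n_chat_exact}")
```

With the cell's formula, 1,000 reasoning rows are joined by 250 chat rows; reaching a true 75/25 split would take about 333 chat rows.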

from datasets import Dataset

combined_dataset = Dataset.from_pandas(pd.DataFrame(data))
combined_dataset = combined_dataset.shuffle(seed=3407)

from trl import SFTTrainer, SFTConfig

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=combined_dataset,
    eval_dataset=None,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=30,
        learning_rate=2e-4,
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        report_to="none",
    ),
)

We take the preprocessed conversations, wrap them into a Hugging Face Dataset (ensuring the data is in a consistent format), and shuffle the dataset with a fixed seed for reproducibility. Then, the fine-tuning trainer is initialized using trl’s SFTTrainer and SFTConfig. The trainer is set up to use the combined dataset (with the text column named “text”) and defines training hyperparameters such as batch size, gradient accumulation, number of warmup and training steps, learning rate, the 8-bit AdamW optimizer with weight decay, and a linear learning rate scheduler. This configuration is geared towards efficient fine-tuning while maintaining reproducibility; the loss is logged at every step, and report_to="none" disables reporting to external experiment trackers.
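Two quantities implied by this configuration are worth making explicit: the effective batch size and the shape of the learning rate schedule. The sketch below is an approximation written in plain Python, not the trainer's own code. It assumes a single GPU (as on a standard Colab runtime) and a schedule that ramps linearly from zero over the warmup steps and then decays linearly to zero at max_steps; the function name is illustrative.

```python
def linear_schedule_lr(step, base_lr=2e-4, warmup_steps=5, max_steps=30):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, max_steps - step)
    return base_lr * (remaining / max(1, max_steps - warmup_steps))


# Assumption: one GPU, so the per-device batch size is the whole micro-batch.
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
MAX_STEPS = 30

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS  # sequences per optimizer step
sequences_seen = effective_batch * MAX_STEPS           # sequences over the whole run

if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    print("sequences seen in 30 steps:", sequences_seen)
    for step in (0, 1, 5, 15, 30):
        print(f"step {step:>2}: lr = {linear_schedule_lr(step):.2e}")
```

Under these assumptions the run processes only a few hundred sequences, which makes max_steps=30 a quick demonstration setting rather than a full training budget.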

trainer.train() starts the fine-tuning process for the Qwen3-14B model using the SFTTrainer. It trains the model on the prepared mixed dataset of reasoning and instruction-following conversations, optimizing only the LoRA adapter parameters thanks to the underlying Unsloth setup. Training proceeds according to the configuration specified earlier (e.g., max_steps=30, per_device_train_batch_size=2, learning_rate=2e-4), and progress is printed at each logging step. This final command launches the actual model adaptation based on your custom data.

model.save_pretrained("qwen3-finetuned-colab")
tokenizer.save_pretrained("qwen3-finetuned-colab")

We save the fine-tuned model and tokenizer locally to the “qwen3-finetuned-colab” directory. By calling save_pretrained(), the adapted weights and tokenizer configuration can be reloaded later for inference or further training, either locally or after uploading to the Hugging Face Hub.

In conclusion, with the help of Unsloth AI, fine-tuning massive LLMs like Qwen3-14B becomes feasible and efficient on limited resources. This tutorial demonstrated how to load a 4-bit quantized version of the model, apply structured chat templates, mix multiple datasets for better generalization, and train using TRL’s SFTTrainer. Whether you’re building custom assistants or specialized domain models, Unsloth’s tools dramatically reduce the barrier to fine-tuning at scale. As open-source fine-tuning ecosystems evolve, Unsloth continues to lead the way in making LLM training faster, cheaper, and more practical for everyone.


Check out the Colab Notebook. All credit for this research goes to the researchers of this project.

Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
