🤖 Fine-tuning Falcon 7B/40B LLMs

Falcon is a family of open-source large language models (LLMs), available in 7-billion- and 40-billion-parameter variants, trained on one trillion tokens.

You can fine-tune Falcon on the Q Blocks cloud by running the commands below to install dependencies and launch the run:

GPU configuration:

  • We recommend a GPU with 40 GB or more of memory, such as 1x A100 40GB/80GB, 1x A6000, or 2x A100 80GB, chosen from the Data center nodes option on the Q Blocks platform when launching a GPU instance.
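
Once the instance is up, a quick way to confirm the card and its memory is nvidia-smi (assuming the instance image ships the NVIDIA driver, as GPU instances normally do):

# List the GPU model and total memory
nvidia-smi --query-gpu=name,memory.total --format=csv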

Install miniconda

# Download the latest Miniconda installer (-nc skips the download if the file already exists).
wget -nc https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Install. -b skips the interactive prompts.
bash Miniconda3-latest-Linux-x86_64.sh -b

# Activate.
eval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"

# (optional) Add activation cmd to bashrc so you don't have to run the above every time.
printf '\neval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"' >> ~/.bashrc
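
If the shell hook ran correctly, conda should now be on your PATH:

# Prints the installed conda version
conda --version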

Set up the env

Create the conda environment and install dependencies:

# Create and activate env. -y skips confirmation prompt.
conda create -n falcon-env python=3.9 -y
conda activate falcon-env

# Latest PyTorch build with CUDA 11.8
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

# Install other dependencies
pip install -U accelerate einops sentencepiece git+https://github.com/huggingface/transformers.git && \
pip install -U trl git+https://github.com/huggingface/peft.git && \
pip install scipy datasets bitsandbytes wandb
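
Before moving on, it's worth checking that this PyTorch build can actually see the GPU (run inside the activated env):

# Should print the torch version and True
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"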

Start the run

Download the fine-tuning script and run it inside the conda environment:

# Download the fine-tuning script
wget https://qbcontent.nyc3.cdn.digitaloceanspaces.com/finetuning/finetune-falcon.py

eval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"
conda activate falcon-env
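
The environment setup above installed wandb, so the script may try to log metrics to Weights & Biases; whether this step is needed depends on the script's defaults, but you can authenticate beforehand:

# Log in to Weights & Biases (paste your API key when prompted)
wandb login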

# Single GPU, Falcon 7B, 4-bit quantization
torchrun --nnodes 1 --nproc_per_node 1 \
finetune-falcon.py \
-m ybelkada/falcon-7b-sharded-bf16 \
-q 4bit

# 2x GPUs, Falcon 40B, 4-bit quantization
torchrun --nnodes 1 --nproc_per_node 2 \
finetune-falcon.py \
-m tiiuae/falcon-40b \
-q 4bit

More parameters can be specified, such as:

--dataset_name --steps --batch_size_per_device
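
As an illustration, a full single-GPU invocation combining these flags might look like the following; the dataset name and numeric values here are placeholders, not defaults taken from the script:

# Illustrative values only; adjust the dataset, steps, and batch size as needed
torchrun --nnodes 1 --nproc_per_node 1 \
finetune-falcon.py \
-m ybelkada/falcon-7b-sharded-bf16 \
-q 4bit \
--dataset_name timdettmers/openassistant-guanaco \
--steps 500 \
--batch_size_per_device 4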
