Q Blocks Documentation

Fine-tuning Falcon 7B/40B LLM


Last updated 1 year ago

Falcon is a family of open-source large language models (LLMs) available in 7-billion and 40-billion parameter sizes, each trained on at least one trillion tokens.

To fine-tune Falcon on Q Blocks cloud, set up the environment and start the run with the commands below.

GPU configuration:

  • We recommend a GPU with 40GB or more of memory, such as 1x A100 40GB/80GB, 1x A6000, or 2x A100 80GB, selected from the Data center nodes option when launching a GPU instance.
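Once the instance is up, it can help to confirm that the attached GPU matches the recommendation above. The following is a minimal sketch, not part of the Q Blocks tooling: the helper name `check_gpu_mem` and the 40000 MiB threshold (roughly 40GB, with slack for how drivers report capacity) are assumptions made for illustration.

```shell
# Hypothetical helper: prints "ok" if the given memory size (MiB) meets the
# minimum (MiB, default 40000, an assumed stand-in for "40GB or more").
check_gpu_mem() {
  mem_mib="$1"
  min_mib="${2:-40000}"
  if [ "$mem_mib" -ge "$min_mib" ]; then
    echo "ok"
  else
    echo "too small"
  fi
}

# If nvidia-smi is available, report the name and total memory of each GPU.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader,nounits 2>/dev/null |
  while IFS=, read -r name mem; do
    mem="$(echo "$mem" | tr -d ' ')"
    echo "$name: ${mem} MiB -> $(check_gpu_mem "$mem")"
  done
fi
```

On a machine without `nvidia-smi` the query is skipped and only the helper is defined.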

Install miniconda

# Download latest miniconda.
wget -nc https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Install. -b is used to skip prompt
bash Miniconda3-latest-Linux-x86_64.sh -b

# Activate.
eval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"

# (optional) Add activation cmd to bashrc so you don't have to run the above every time.
printf '\neval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"\n' >> ~/.bashrc
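The activation step above hardcodes `/home/qblocks`. As a hedged sketch, the same activation could be wrapped in a small function that checks the install path first. The function name `activate_conda` is hypothetical, and the default path assumes Miniconda was installed with `-b` into `$HOME/miniconda3`.

```shell
# Hypothetical wrapper around the activation command shown above.
# $1: path to the conda binary (assumed default: $HOME/miniconda3/bin/conda)
activate_conda() {
  conda_bin="${1:-$HOME/miniconda3/bin/conda}"
  if [ ! -x "$conda_bin" ]; then
    echo "conda not found at $conda_bin" >&2
    return 1
  fi
  # Same hook as in the instructions, with the path made configurable.
  eval "$("$conda_bin" shell.bash hook)"
}

# Usage (run manually after installation):
#   activate_conda && conda --version
```

If the function returns a non-zero status, the installer most likely did not finish or used a different prefix.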

Set up the environment

Create the conda environment and install the dependencies:

# Create and activate env. -y skips confirmation prompt.
conda create -n falcon-env python=3.9 -y
conda activate falcon-env

# Install the latest PyTorch built against CUDA 11.8. -y skips confirmation prompt.
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia -y

# Install other dependencies
pip install -U accelerate einops sentencepiece git+https://github.com/huggingface/transformers.git && \
pip install -U trl git+https://github.com/huggingface/peft.git && \
pip install scipy datasets bitsandbytes wandb
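Before starting a long run, a quick check that PyTorch can see the GPU may save time. This is a sketch under the assumption that `falcon-env` is active; the function name `check_torch_cuda` is hypothetical.

```shell
# Hypothetical check: prints the torch version, its CUDA build version and the
# number of visible GPUs. Returns 0 only if CUDA is usable.
check_torch_cuda() {
  if ! python -c "import torch" >/dev/null 2>&1; then
    echo "torch is not importable in this environment"
    return 1
  fi
  python -c "import torch; print('torch', torch.__version__, 'cuda', torch.version.cuda, 'gpus', torch.cuda.device_count())"
  python -c "import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)"
}

# Usage (inside the activated env):
#   check_torch_cuda && echo "GPU is ready"
```

A non-zero return with a printed version line usually points to a driver or CUDA mismatch rather than a missing package.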

Start the run

Download the script and run it inside the conda environment:

# Download finetuning script
wget https://qbcontent.nyc3.cdn.digitaloceanspaces.com/finetuning/finetune-falcon.py

eval "$(/home/qblocks/miniconda3/bin/conda shell.bash hook)"
conda activate falcon-env

# Single GPU, falcon 7B, 4bit quantization
torchrun --nnodes 1 --nproc_per_node 1 \
finetune-falcon.py \
-m ybelkada/falcon-7b-sharded-bf16 \
-q 4bit

# 2x GPUs, falcon 40B, 4bit quantization
torchrun --nnodes 1 --nproc_per_node 2 \
finetune-falcon.py \
-m tiiuae/falcon-40b \
-q 4bit
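In the commands above, `--nproc_per_node` should match the number of GPUs on the instance. One possible way to derive it instead of typing it by hand is sketched below; the helper name `gpu_count` is hypothetical, and it falls back to 1 when `nvidia-smi` is missing or reports nothing.

```shell
# Hypothetical helper: prints the number of GPUs listed by nvidia-smi,
# or 1 if none can be detected.
gpu_count() {
  n=0
  if command -v nvidia-smi >/dev/null 2>&1; then
    # Each detected device is listed on a line starting with "GPU <index>:".
    n="$(nvidia-smi --list-gpus 2>/dev/null | grep -c '^GPU ' || true)"
  fi
  if [ "${n:-0}" -lt 1 ]; then
    n=1
  fi
  echo "$n"
}

# Usage (run manually):
#   torchrun --nnodes 1 --nproc_per_node "$(gpu_count)" \
#     finetune-falcon.py -m tiiuae/falcon-40b -q 4bit
```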

The script also accepts additional parameters, such as:

  • --dataset_name
  • --steps
  • --batch_size_per_device
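As an illustration of how these additional parameters might be combined with the single-GPU command, the sketch below only prints the resulting command line so it can be reviewed before running. All three values are assumptions chosen for the example, not defaults taken from `finetune-falcon.py`, and the helper name `build_cmd` is hypothetical.

```shell
# Assumed example values; check the script for accepted values and defaults.
DATASET_NAME="timdettmers/openassistant-guanaco"  # assumed example dataset
STEPS=500                                         # assumed step count
BATCH_SIZE_PER_DEVICE=4                           # assumed per-GPU batch size

# Hypothetical helper: prints the full torchrun command without executing it.
build_cmd() {
  echo "torchrun --nnodes 1 --nproc_per_node 1 finetune-falcon.py" \
       "-m ybelkada/falcon-7b-sharded-bf16 -q 4bit" \
       "--dataset_name $DATASET_NAME" \
       "--steps $STEPS" \
       "--batch_size_per_device $BATCH_SIZE_PER_DEVICE"
}

build_cmd
```

After reviewing the printed line, it can be pasted into the shell inside the activated `falcon-env`.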
