Open-Source Project Trains Small LLM in 2 Hours for $3
Why this is here: The MiniMind project packages the full LLM training pipeline so that a small language model can be trained from scratch in about two hours, for roughly three dollars, on a personal GPU.
An open-source project offers code to train a 26-million-parameter language model from scratch in two hours for about three dollars.
The MiniMind project provides a complete pipeline for building language models from scratch, including code for dataset cleaning, pretraining, supervised fine-tuning (SFT), LoRA fine-tuning, direct preference optimization (DPO), and reinforcement learning.
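The LoRA stage in that pipeline is worth a quick illustration. A minimal NumPy sketch of the general LoRA idea (illustrative shapes and names, not MiniMind's actual code): the frozen pretrained weight `W` is augmented by a low-rank update `(alpha/r) * B @ A`, so only the small matrices `A` and `B` are trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16  # hypothetical layer sizes and LoRA rank

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # Base path plus low-rank path; with B = 0 this equals the frozen layer,
    # so fine-tuning starts exactly at the pretrained behavior.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(4, d_in))
assert np.allclose(lora_forward(x), x @ W.T)  # zero init => no change at start
```

The appeal is the parameter count: training `A` and `B` touches `r * (d_in + d_out)` values instead of the full `d_out * d_in` weight matrix.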
The project's smallest model is roughly 1/7,000 the size of GPT-3 (175 billion parameters). Its goal is to make LLM training accessible on ordinary personal GPUs.
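A back-of-envelope parameter count shows how a model lands in this size range. The hyperparameters below are illustrative assumptions, not MiniMind's published configuration:

```python
# Illustrative decoder-only transformer hyperparameters (assumptions, not
# MiniMind's actual config).
d_model, n_layers, vocab = 512, 8, 6400

embed = vocab * d_model                       # token embeddings (output head tied)
attn_per_layer = 4 * d_model * d_model        # Q, K, V, O projections
ffn_per_layer = 2 * d_model * (4 * d_model)   # two linear layers, 4x expansion

total = embed + n_layers * (attn_per_layer + ffn_per_layer)
print(f"{total / 1e6:.1f}M parameters")       # tens of millions
print(f"GPT-3 is ~{175e9 / total:,.0f}x larger")  # thousands of times larger
```

Embeddings and a handful of small transformer layers account for essentially all of the budget, which is why a few tens of millions of parameters fit comfortably on one consumer GPU.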
Core algorithms are reimplemented from scratch in PyTorch rather than hidden behind the abstraction layers of third-party libraries. The project also extends to vision-language multimodality with MiniMind-V.
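To give a flavor of what "from scratch" means here, the sketch below implements causal self-attention directly in NumPy, the kind of primitive such a project rewrites rather than calling a library op (a generic illustration, not MiniMind's code):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(q, k, v):
    # q, k, v: (seq_len, d_head)
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)                  # scaled dot products
    mask = np.triu(np.ones((t, t), dtype=bool), 1) # strictly-future positions
    scores = np.where(mask, -np.inf, scores)       # block attention to the future
    return softmax(scores, axis=-1) @ v            # weighted sum of values

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(5, 16)) for _ in range(3))
out = causal_attention(q, k, v)  # position 0 can only attend to itself
```

Writing the masking and softmax out by hand makes the mechanics visible in a way that a call to a fused attention kernel does not.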
The three-dollar cost estimate covers GPU server rental, and the two-hour training duration assumes an NVIDIA RTX 3090.
The project keeps its code structure simple for learning and experimentation, supports both single-GPU and multi-GPU training, and includes visualization tools, all with the aim of lowering the barrier to learning about LLMs.