Commit 48f7db60 authored by Risto Luukkonen's avatar Risto Luukkonen
Add readme

parent 36f58292
Trainer script for a GPT-2-like model with Hugging Face and DeepSpeed.
Step 1)
Clone the folders testing/venv_trainer/ and tokenizer/.
Step 2)
Create a virtual environment, upgrade pip (pip3 install --upgrade pip), and install deepspeed as Iiro instructed earlier (just search this channel with "deepspeed"). Also install transformers and datasets.
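Put together, this step might look roughly like the following (the venv directory name is an example, and the plain `pip3 install deepspeed` is a placeholder — follow Iiro's actual DeepSpeed install instructions):

```shell
# Create and activate a virtual environment (directory name is an example)
python3 -m venv venv_trainer
source venv_trainer/bin/activate

# Upgrade pip, then install the libraries the trainer needs
pip3 install --upgrade pip
pip3 install deepspeed transformers datasets  # deepspeed: see Iiro's instructions
```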
Step 3)
In trainer.py, set line 100:
tokenizer_path = your_tokenizer_dir
Step 4)
Set up your own configs in trainer.bash.
Set the paths to your own locations; you may also want to uncomment --cpus-per-task=10 to make data tokenization more efficient, etc.
Step 5)
sbatch trainer.bash
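After submitting, you can watch the job like this (the log filename follows whatever --output pattern trainer.bash sets; slurm-%j.out is an assumption):

```shell
sbatch trainer.bash        # submit the job
squeue -u $USER            # check queue state for your jobs
tail -f slurm-12345678.out # follow the log, substituting your job id
```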
Notes:
* The script isn't finished and still needs a lot of tweaks.
* Conflicts between ds_config.json and TrainingArguments cause crashes. Use "auto" values in ds_config to propagate the correct values to the DeepSpeed engine.
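As a sketch of that "auto" approach, a minimal ds_config.json could look like this — the key names follow the Hugging Face DeepSpeed integration, and the optimizer choice and ZeRO stage are just examples:

```
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": "auto",
      "betas": "auto",
      "eps": "auto",
      "weight_decay": "auto"
    }
  },
  "fp16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2
  }
}
```

With "auto", the Trainer fills these fields from TrainingArguments instead of letting the two configs disagree.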
* If you experience slow startups, you may want to use the singularity module to load torch. It may hurt performance, but imports are fast. The process is roughly:
1. module load pytorch
2. pip install deepspeed datasets transformers --user
3. make sure you've got the following on your slurm-script:
```
export TORCH_EXTENSIONS_DIR=/PATH/TO/SOME/DIR/
module load pytorch
module load gcc/9.1.0
module load cuda/11.1.0
export SINGULARITYENV_APPEND_PATH="/users/$USER/.local/bin"
export CPATH=/appl/spack/install-tree/gcc-9.1.0/python-3.6.8-ecovls/include/python3.6m:$CPATH
```
4. You may need to change the python path in the deepspeed launcher located at /users/$USER/.local/bin/deepspeed so that it uses the singularity python given by `which python`.
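Concretely, the launcher is a Python script whose shebang picks the interpreter, so the fix amounts to something like this (the sed one-liner is an illustrative sketch — inspect the file before rewriting it):

```shell
# See which interpreter the launcher currently uses
head -n 1 /users/$USER/.local/bin/deepspeed

# See which python the singularity module provides
which python

# If they differ, point the shebang at the singularity python, e.g.:
# sed -i "1s|.*|#!$(which python)|" /users/$USER/.local/bin/deepspeed
```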