Commit 48f7db60 authored by Risto Luukkonen's avatar Risto Luukkonen
Add readme

parent 36f58292
Trainer script for a GPT-2-like model with Hugging Face and DeepSpeed.
Step 1)
Clone the folders testing/venv_trainer/ and tokenizer/.
Step 2)
Create a virtual environment, upgrade pip (pip3 install --upgrade pip), and install deepspeed as Iiro instructed earlier (just search this channel with "deepspeed"). Also install transformers and datasets.
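Put together, this step might look roughly like the following (the venv directory name is an example, and the plain `pip3 install deepspeed` is a placeholder — follow Iiro's actual DeepSpeed install instructions):

```shell
# Create and activate a virtual environment (directory name is an example)
python3 -m venv venv_trainer
source venv_trainer/bin/activate

# Upgrade pip, then install the libraries the trainer needs
pip3 install --upgrade pip
pip3 install deepspeed transformers datasets  # deepspeed: see Iiro's instructions
```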
Step 3)
In trainer.py, set line 100:
tokenizer_path = your_tokenizer_dir
Step 4)
Set up your own configs in trainer.bash.
Set the paths to your own locations; you may also want to uncomment --cpus-per-task=10 to make data tokenization more efficient, etc.
Step 5)
sbatch trainer.bash
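After submitting, you can watch the job like this (the log filename follows whatever --output pattern trainer.bash sets; slurm-%j.out is an assumption):

```shell
sbatch trainer.bash        # submit the job
squeue -u $USER            # check queue state for your jobs
tail -f slurm-12345678.out # follow the log, substituting your job id
```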
Notes:
* The script isn't finished and still needs a lot of tweaks.
* Conflicts between ds_config.json and TrainingArguments cause crashes. Use "auto" values in ds_config to propagate the correct values to the DeepSpeed engine.
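As a sketch of that "auto" approach, a minimal ds_config.json could look like this — the key names follow the Hugging Face DeepSpeed integration, and the optimizer choice and ZeRO stage are just examples:

```
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": "auto",
      "betas": "auto",
      "eps": "auto",
      "weight_decay": "auto"
    }
  },
  "fp16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 2
  }
}
```

With "auto", the Trainer fills these fields from TrainingArguments instead of letting the two configs disagree.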
* If you experience slow startups, you may want to use the singularity module to load torch. It may hurt performance, but imports are fast. The process is roughly:
1. module load pytorch
2. pip install deepspeed datasets transformers --user
3. make sure you've got the following on your slurm-script:
```
export TORCH_EXTENSIONS_DIR=/PATH/TO/SOME/DIR/
module load pytorch
module load gcc/9.1.0
module load cuda/11.1.0
export SINGULARITYENV_APPEND_PATH="/users/$USER/.local/bin"
export CPATH=/appl/spack/install-tree/gcc-9.1.0/python-3.6.8-ecovls/include/python3.6m:$CPATH
```
4. You may need to change the python path in the deepspeed launcher located at /users/$USER/.local/bin/deepspeed so that it uses the singularity python given by `which python`.
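Concretely, the launcher is a Python script whose shebang picks the interpreter, so the fix amounts to something like this (the sed one-liner is an illustrative sketch — inspect the file before rewriting it):

```shell
# See which interpreter the launcher currently uses
head -n 1 /users/$USER/.local/bin/deepspeed

# See which python the singularity module provides
which python

# If they differ, point the shebang at the singularity python, e.g.:
# sed -i "1s|.*|#!$(which python)|" /users/$USER/.local/bin/deepspeed
```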