From 6dc3a1445b422f6e8f5da3d9dd2dfd00d30e02bf Mon Sep 17 00:00:00 2001
From: Tyler
Date: Wed, 2 Jul 2025 16:42:00 -0700
Subject: [PATCH] README and script messages

---
 README.md   | 4 ++++
 setup.sh    | 6 +++++-
 setup_lm.sh | 6 +++++-
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index cb90c2f..cdd4c56 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,8 @@ To create a conda environment with the necessary dependencies, run the following
 ./setup.sh
 ```
 
+Verify it worked by activating the conda environment with the command `conda activate b2txt25`.
+
 ## Python environment setup for ngram language model and OPT rescoring
 We use an ngram language model plus rescoring via the [Facebook OPT 6.7b](https://huggingface.co/facebook/opt-6.7b) LLM. A pretrained 1gram language model is included in this repository at `language_model/pretrained_language_models/openwebtext_1gram_lm_sil`. Pretrained 3gram and 5gram language models are available for download [here](https://datadryad.org/dataset/doi:10.5061/dryad.x69p8czpq) (`languageModel.tar.gz` and `languageModel_5gram.tar.gz`). Note that the 3gram model requires ~60GB of RAM, and the 5gram model requires ~300GB of RAM. Furthermore, OPT 6.7b requires a GPU with at least ~12.4 GB of VRAM to load for inference.
 
@@ -65,3 +67,5 @@ Our Kaldi-based ngram implementation requires a different version of torch than
 ```bash
 ./setup_lm.sh
 ```
+
+Verify it worked by activating the conda environment with the command `conda activate b2txt25_lm`.
diff --git a/setup.sh b/setup.sh
index d01e6ea..c8b84b6 100755
--- a/setup.sh
+++ b/setup.sh
@@ -34,4 +34,8 @@ pip install \
     transformers==4.53.0 \
     tokenizers==0.21.2 \
     accelerate==1.8.1 \
-    bitsandbytes==0.46.0
\ No newline at end of file
+    bitsandbytes==0.46.0
+
+echo
+echo "Setup complete! Verify it worked by activating the conda environment with the command 'conda activate b2txt25'."
+echo
diff --git a/setup_lm.sh b/setup_lm.sh
index 0bbdb10..17f109c 100755
--- a/setup_lm.sh
+++ b/setup_lm.sh
@@ -53,4 +53,8 @@ cd language_model/runtime/server/x86
 python setup.py install
 
 # cd back to the root directory
-cd ../../../..
\ No newline at end of file
+cd ../../../..
+
+echo
+echo "Setup complete! Verify it worked by activating the conda environment with the command 'conda activate b2txt25_lm'."
+echo