AnythingLLM Desktop
The desktop environment suppresses the usual screen output, keeps query histories, and allows for document embedding.
Installation
export root=/rds/project/rds-4o5vpvAowP0/software
mkdir -p ${root}/AnythingLLMDesktop
# patch the installer: drop its own OUTDIR setting and insert ours before the download step
curl -fsSL https://cdn.anythingllm.com/latest/installer.sh | \
sed "/OUTDIR/d; /Downloading/i\
OUTDIR=${root}/AnythingLLMDesktop
" | sh
# without the patch, the installer defaults to $HOME/AnythingLLMDesktop
ln -s /rds/project/rds-4o5vpvAowP0/software/AnythingLLMDesktop ~/AnythingLLMDesktop
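Piping a patched script straight into sh is hard to debug; a minimal sketch of inspecting the patched installer first (the temporary file name is arbitrary):
# write the patched installer to a file instead of piping it straight to sh
curl -fsSL https://cdn.anythingllm.com/latest/installer.sh | \
sed "/OUTDIR/d; /Downloading/i\
OUTDIR=${root}/AnythingLLMDesktop
" > /tmp/anythingllm-installer.sh
grep OUTDIR /tmp/anythingllm-installer.sh   # confirm the install directory was redirected
sh /tmp/anythingllm-installer.sh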
Our batch script is start.sh, used instead of the start script provided:
export STORAGE_DIR=/rds/project/rds-4o5vpvAowP0/software/AnythingLLMDesktop
export desktop=anythingllm-desktop
cd ${STORAGE_DIR}
$desktop/$desktop --no-sandbox > /dev/null 2>&1   # suppress console output
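On the cluster, the script can be wrapped in a job submission; a minimal sketch, assuming SLURM and that start.sh resides in ${STORAGE_DIR} (the job name and resource limits are placeholders):
#!/bin/bash
#SBATCH --job-name=anythingllm
#SBATCH --time=02:00:00
#SBATCH --cpus-per-task=4

# run the launch script above; its console output is already discarded
bash /rds/project/rds-4o5vpvAowP0/software/AnythingLLMDesktop/start.sh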
Ollama
First, we allocate a directory which can hold large files,
export OLLAMA_MODELS=/rds/user/$USER/hpc-work/ollama
ln -sf ${OLLAMA_MODELS} $HOME/.ollama
then we load the module and make the models visible,
module load ceuadmin/ollama
ollama serve &
ollama list
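With the server running, a model can be pulled into the redirected store and queried from the console; a small sketch (the model name is only an example):
ollama pull llama3.2                  # lands in ${OLLAMA_MODELS}, not $HOME
ollama run llama3.2 "Say hello."      # one-shot prompt from the console
ollama list                           # the new model should now be listed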
Handling of the GGUF format is also possible, as shown here:
huggingface-cli download TheBloke/MistralLite-7B-GGUF mistrallite.Q4_K_M.gguf --local-dir ./models
echo FROM ./models/mistrallite.Q4_K_M.gguf > Modelfile
ollama create mistrallite -f Modelfile
ollama list
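The imported model then behaves like any other entry, e.g. (the prompt is arbitrary):
ollama run mistrallite "Summarise the GGUF format in one sentence."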
llama.cpp
Since ollama is extremely slow from the console, llama.cpp is considered in two ways:
- Without AnythingLLM Desktop, e.g.,
  llama-run deepseek-r1
  The speed is very impressive.
- Multithreaded, under AnythingLLM Desktop, e.g.,
  llama-server -m deepseek-r1 --port 8080 -t 8
  This is furnished by selecting Local AI and the right endpoint, e.g., http://localhost:8080/v1; see the curl sketch below.
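The endpoint AnythingLLM Desktop talks to can also be checked directly with curl; a minimal sketch of the OpenAI-compatible chat route (llama-server largely ignores the model field, but the API requires it):
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-r1",
        "messages": [{"role": "user", "content": "Hello"}]
      }'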