This repository enables you to use GraphRAG with any model from Hugging Face, as well as your own models.
To set up this repository and utilize GraphRAG with local models from Ollama, follow these steps:
Use Python 3.10 to avoid compatibility issues. Create and activate a new conda environment:
conda create -n graphrag-ollama-local python=3.10
conda activate graphrag-ollama-local
Install Ollama from the official website, or with the following commands:
curl -fsSL https://ollama.com/install.sh | sh # Ollama for Linux
pip install ollama # Python client library (the server itself comes from the installer above)
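On Linux the install script usually sets Ollama up as a background service; if the server is not already running, you can start it and verify the install manually:

```bash
ollama --version   # confirm the CLI is installed
ollama serve       # start the server manually if it is not already running
```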
Select from available models such as Mistral, Gemma2, or Qwen2 for the LLM, and any embedding model provided by Ollama:
ollama pull llama3 # LLM model
ollama pull nomic-embed-text # Embedding model
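You can confirm that both models downloaded successfully:

```bash
ollama list   # should show llama3 and nomic-embed-text
```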
Clone this repository to your local machine:
git clone https://github.com/TheAiSingularity/graphrag-local-ollama.git
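Then move into the repository before installing (assuming the default clone directory name):

```bash
cd graphrag-local-ollama/
```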
This step is crucial: install the package in editable mode from the repository root with the following command:
pip install -e .
Create a directory where the experiment data and results will be stored:
mkdir -p ./ragtest/input
Copy the sample data from the `input/` folder to `./ragtest/input`. You can also add your own `.txt` files here:
cp input/* ./ragtest/input
Initialize the `./ragtest` directory to generate the required files:
python -m graphrag.index --init --root ./ragtest
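Initialization generates the configuration scaffolding; you should now see files such as `.env`, `settings.yaml`, and a `prompts/` directory (exact contents depend on the GraphRAG version):

```bash
ls ./ragtest
```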
Copy the pre-configured `settings.yaml` file to the `./ragtest` directory:
cp settings.yaml ./ragtest
You can modify this file to experiment with different models. For LLM models, use options like `llama3`, `mistral`, or `phi3`. For embedding models, choose options like `mxbai-embed-large` or `nomic-embed-text`. The full list of models provided by Ollama, all of which can be deployed locally, is available on their website. To add your own custom model, refer to the guide Testing a New Model below.
Default API base URLs:
- LLM: `http://localhost:11434/v1`
- Embeddings: `http://localhost:11434/api`
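For reference, these model names and API bases go in the `llm` and `embeddings` sections of `settings.yaml`. A minimal sketch, assuming the standard GraphRAG settings layout (exact keys can vary across GraphRAG versions):

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}   # unused by Ollama, but the key must be present
  type: openai_chat
  model: llama3                  # any LLM pulled with `ollama pull`
  api_base: http://localhost:11434/v1

embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic-embed-text      # any Ollama embedding model
    api_base: http://localhost:11434/api
```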
Run the indexing process to generate a graph:
python -m graphrag.index --root ./ragtest
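Once indexing completes, the generated graph artifacts (parquet files) land under the project's output directory; the exact subdirectory layout depends on the GraphRAG version:

```bash
ls ./ragtest/output
```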
Execute a query using the global method:
python -m graphrag.query --root ./ragtest --method global "What are some common applications and challenges of using MRI in medical imaging?"
To test a new model, follow these steps:
- Change the model name (`custom_model` in the commands below) to the model you want to test.
- Install the required dependencies:
pip install -r requirements.txt
- Convert the Hugging Face model to GGUF format; the conversion script's options are listed in its help (an end-to-end sketch follows this list):
python convert_hf_to_gguf.py -h
- Use the following command to convert the custom model:
python3 llama.cpp/convert_hf_to_gguf.py custom_model --outfile custom_model.gguf --outtype q8_0
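For context, `convert_hf_to_gguf.py` ships with llama.cpp, and the `custom_model` argument above is a local directory containing the downloaded Hugging Face weights. A hedged end-to-end sketch (the model ID and directory names here are placeholders):

```bash
# Get llama.cpp, which provides the conversion script and its dependencies
git clone https://github.com/ggerganov/llama.cpp
pip install -r llama.cpp/requirements.txt

# Download a Hugging Face model into ./custom_model
# (placeholder model ID; requires: pip install "huggingface_hub[cli]")
huggingface-cli download microsoft/Phi-3-mini-4k-instruct --local-dir custom_model

# Convert to 8-bit quantized GGUF
python3 llama.cpp/convert_hf_to_gguf.py custom_model --outfile custom_model.gguf --outtype q8_0
```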
Ollama supports importing GGUF models via a `Modelfile`:
- Create a file named `Modelfile`, with a `FROM` instruction pointing to the local file path of the model you want to import:
FROM ./custom_model.gguf
- Create the model in Ollama:
ollama create custom_model -f Modelfile
- Run the model:
ollama run custom_model
You can also use `ollama list` to check that the model was added correctly.
- Update `settings.yaml` to include `custom_model` (see the sketch after this list) and copy it to the `./ragtest` directory:
cp settings.yaml ./ragtest
- Run the indexing process:
python -m graphrag.index --root ./ragtest
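The `settings.yaml` update mentioned above is essentially a one-line model swap; a sketch (surrounding keys assumed to match your existing file):

```yaml
llm:
  model: custom_model   # must match the name given to `ollama create`
```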
Note: Some custom models will crash GraphRAG if they do not follow the OpenAI prompt style (adjust this in `settings.yaml`), use an incompatible embedding setup (also configured in `settings.yaml`), or output malformed JSON. Make the appropriate changes so GraphRAG can work with your own model.
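As a quick sanity check before indexing, you can call Ollama's OpenAI-compatible chat endpoint directly and eyeball whether the model produces well-formed output (a sketch; `custom_model` must match the name you created):

```bash
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "custom_model",
    "messages": [{"role": "user", "content": "Return only the JSON object {\"status\": \"ok\"}."}]
  }'
```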
Inspired by: graph-rag-ollama