【English | 中文 | 日本語 | 한국어 | Filipino | Français | Slovenčina | Português | Español | Türkçe | हिंदी | বাংলা | Tiếng Việt | Русский | العربية | فارسی | Italiano】
【📝 Paper | 🌐 Website | 💻 Software | 📰 Citation】
- Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas. Agent Laboratory consists of specialized agents driven by large language models to support you through the entire research workflow—from conducting literature reviews and formulating plans to executing experiments and writing comprehensive reports.
- This system is not designed to replace your creativity but to complement it, enabling you to focus on ideation and critical thinking while automating repetitive and time-intensive tasks like coding and documentation. By accommodating varying levels of computational resources and human involvement, Agent Laboratory aims to accelerate scientific discovery and optimize your research productivity.
- Agent Laboratory consists of three primary phases that systematically guide the research process: (1) Literature Review, (2) Experimentation, and (3) Report Writing. During each phase, specialized agents driven by LLMs collaborate to accomplish distinct objectives, integrating external tools like arXiv, Hugging Face, Python, and LaTeX to optimize outcomes. This structured workflow begins with the independent collection and analysis of relevant research papers, progresses through collaborative planning and data preparation, and results in automated experimentation and comprehensive report generation. Details on specific agent roles and their contributions across these phases are discussed in the paper.
- OpenAI: o1, o1-preview, o1-mini, gpt-4o
- DeepSeek: deepseek-chat (deepseek-v3)
To select a specific llm set the flag --llm-backend="llm_model"
for example --llm-backend="gpt-4o"
or --llm-backend="deepseek-chat"
. Please feel free to add a PR supporting new models according to your need!
- We recommend using python 3.12
- Clone the GitHub Repository: Begin by cloning the repository using the command:
git clone git@github.com:SamuelSchmidgall/AgentLaboratory.git
- Set up and Activate Python Environment
python -m venv venv_agent_lab
- Now activate this environment:
source venv_agent_lab/bin/activate
- Install required libraries
pip install -r requirements.txt
- Install pdflatex [OPTIONAL]
sudo apt install pdflatex
- This enables latex source to be compiled by the agents.
- [IMPORTANT] If this step cannot be run due to not having sudo access, pdf compiling can be turned off via running Agent Laboratory via setting the
--compile-latex
flag to false:--compile-latex "false"
- Now run Agent Laboratory!
python ai_lab_repo.py --api-key "API_KEY_HERE" --llm-backend "o1-mini" --research-topic "YOUR RESEARCH IDEA"
or, if you don't have pdflatex installed
python ai_lab_repo.py --api-key "API_KEY_HERE" --llm-backend "o1-mini" --research-topic "YOUR RESEARCH IDEA" --compile-latex "false"
To run Agent Laboratory in copilot mode, simply set the copilot-mode flag to "true"
python ai_lab_repo.py --api-key "API_KEY_HERE" --llm-backend "o1-mini" --research-topic "YOUR RESEARCH IDEA" --copilot-mode "true"
Writing extensive notes is important for helping your agent understand what you're looking to accomplish in your project, as well as any style preferences. Notes can include any experiments you want the agents to perform, providing API keys, certain plots or figures you want included, or anything you want the agent to know when performing research.
This is also your opportunity to let the agent know what compute resources it has access to, e.g. GPUs (how many, what type of GPU, how many GBs), CPUs (how many cores, what type of CPUs), storage limitations, and hardware specs.
In order to add notes, you must modify the task_notes_LLM structure inside of ai_lab_repo.py
. Provided below is an example set of notes used for some of our experiments.
task_notes_LLM = [
{"phases": ["plan formulation"],
"note": f"You should come up with a plan for TWO experiments."},
{"phases": ["plan formulation", "data preparation", "running experiments"],
"note": "Please use gpt-4o-mini for your experiments."},
{"phases": ["running experiments"],
"note": f'Use the following code to inference gpt-4o-mini: \nfrom openai import OpenAI\nos.environ["OPENAI_API_KEY"] = "{api_key}"\nclient = OpenAI()\ncompletion = client.chat.completions.create(\nmodel="gpt-4o-mini-2024-07-18", messages=messages)\nanswer = completion.choices[0].message.content\n'},
{"phases": ["running experiments"],
"note": f"You have access to only gpt-4o-mini using the OpenAI API, please use the following key {api_key} but do not use too many inferences. Do not use openai.ChatCompletion.create or any openai==0.28 commands. Instead use the provided inference code."},
{"phases": ["running experiments"],
"note": "I would recommend using a small dataset (approximately only 100 data points) to run experiments in order to save time. Do not use much more than this unless you have to or are running the final tests."},
{"phases": ["data preparation", "running experiments"],
"note": "You are running on a MacBook laptop. You can use 'mps' with PyTorch"},
{"phases": ["data preparation", "running experiments"],
"note": "Generate figures with very colorful and artistic design."},
]
When conducting research, the choice of model can significantly impact the quality of results. More powerful models tend to have higher accuracy, better reasoning capabilities, and better report generation. If computational resources allow, prioritize the use of advanced models such as o1-(mini/preview) or similar state-of-the-art large language models.
However, it’s important to balance performance and cost-effectiveness. While powerful models may yield better results, they are often more expensive and time-consuming to run. Consider using them selectively—for instance, for key experiments or final analyses—while relying on smaller, more efficient models for iterative tasks or initial prototyping.
When resources are limited, optimize by fine-tuning smaller models on your specific dataset or combining pre-trained models with task-specific prompts to achieve the desired balance between performance and computational efficiency.
If you lose progress, internet connection, or if a subtask fails, you can always load from a previous state. All of your progress is saved by default in the state_saves
variable, which stores each individual checkpoint. Just pass the following arguments when running ai_lab_repo.py
python ai_lab_repo.py --api-key "API_KEY_HERE" --research-topic "YOUR RESEARCH IDEA" --llm-backend "o1-mini" --load-existing True --load-existing-path "state_saves/LOAD_PATH"
If you are running Agent Laboratory in a language other than English, no problem, just make sure to provide a language flag to the agents to perform research in your preferred language. Note that we have not extensively studied running Agent Laboratory in other languages, so be sure to report any problems you encounter.
For example, if you are running in Chinese:
python ai_lab_repo.py --api-key "API_KEY_HERE" --research-topic "YOUR RESEARCH IDEA (in your language)" --llm-backend "o1-mini" --language "中文"
There is a lot of room to improve this codebase, so if you end up making changes and want to help the community, please feel free to share the changes you've made! We hope this tool helps you!
Source Code Licensing: Our project's source code is licensed under the MIT License. This license permits the use, modification, and distribution of the code, subject to certain conditions outlined in the MIT License.
If you would like to get in touch, feel free to reach out to sschmi46@jhu.edu
@misc{schmidgall2025agentlaboratoryusingllm,
title={Agent Laboratory: Using LLM Agents as Research Assistants},
author={Samuel Schmidgall and Yusheng Su and Ze Wang and Ximeng Sun and Jialian Wu and Xiaodong Yu and Jiang Liu and Zicheng Liu and Emad Barsoum},
year={2025},
eprint={2501.04227},
archivePrefix={arXiv},
primaryClass={cs.HC},
url={https://arxiv.org/abs/2501.04227},
}