A healthcare chatbot that analyzes symptoms, provides information about diseases, and suggests medicines for treatment.
This chatbot is built to analyze symptoms, retrieve information about diseases (signs and symptoms, treatments, precautions, and other medical care needed), and look up information about medicines. The idea is to give users a faster conversation experience: instead of spending time typing long messages, they select an option and provide only the relevant information.
Note that this project does not use dedicated healthcare APIs or datasets such as MIMIC-III. As a result, it may not be the most accurate or efficient way to design a chatbot, especially in a field like healthcare, where accurate predictions play a crucial role. I designed it as a starting point for exploring how to build chatbots from scratch using traditional machine learning algorithms, natural language processing, and other Python libraries.
Project Overview
The chatbot built here provides three options for the user to choose from:
- Analyze Symptoms: The user can provide the symptoms of their health condition, and the input is then used in the model to predict the disease they might be suffering from. The dataset used for this model includes conversations between doctors and patients along with the disease label as part of the diagnosis. Natural Language Processing (NLP) was used to preprocess the data, and essential keywords were extracted from the conversations. The snapshot below shows the word cloud with the most commonly occurring words in the conversations of the dataset.
The model uses a voting ensemble, with the aim of increasing overall accuracy. It combines four classifiers: Logistic Regression, Support Vector Machine, Multinomial Naive Bayes, and Random Forest. When an input is passed to the model, each classifier predicts a disease independently, and the disease that receives the majority of the votes becomes the final prediction. The trained model achieved an accuracy of 98.3%.
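The majority-vote idea described above can be sketched with scikit-learn's `VotingClassifier`. This is only an illustration, not the project's actual training code: the toy sentences and labels are made up, and the TF-IDF features and all hyperparameters are assumptions.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Made-up examples standing in for the preprocessed doctor-patient conversations.
texts = [
    "runny nose sneezing sore throat",
    "sneezing cough mild fever runny nose",
    "itchy skin red rash swelling",
    "rash hives itchy eyes swelling",
    "burning chest pain after meals sour taste",
    "heartburn sour taste chest discomfort after eating",
]
labels = ["cold", "cold", "allergy", "allergy", "acid reflux", "acid reflux"]

# Hard voting: each classifier casts one vote and the most common label wins.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(kernel="linear")),
        ("nb", MultinomialNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="hard",
)

# Assumption: text is turned into TF-IDF features before classification.
model = make_pipeline(TfidfVectorizer(), ensemble)
model.fit(texts, labels)

prediction = model.predict(["I keep sneezing and have a runny nose"])[0]
print(prediction)
```

Hard voting is used here because it matches the "majority of the votes" description; soft voting (averaging predicted probabilities) would be an alternative, but it requires every estimator to expose probabilities.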
- Access Information about Diseases: This section allows users to access information about any disease. It uses the wikipedia package, a Python library that makes it easy to access and parse data from Wikipedia. The package provides a convenient way to search for articles, get summaries, view specific sections of a page, and fetch other page details programmatically. When the user enters a search query (the name of a disease or a symptom), the matching search results are displayed. After the user selects a result, the sections available on that Wikipedia page are shown, so the user can reach the information they need with a single click.
More information about the Python module: Wikipedia package on PyPI
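The search, select, and read-a-section flow described above might look roughly like the sketch below. It is not the project's code: the helper names (`search_titles`, `page_overview`, `read_section`) are hypothetical, and the optional `wiki` parameter is an assumption added only so the helpers can be exercised without network access. The calls themselves (`search`, `page`, `summary`, `sections`, `section`) come from the wikipedia package.

```python
def _load_wikipedia():
    # Imported lazily so the helpers can be defined without the package installed.
    import wikipedia  # third-party: pip install wikipedia
    return wikipedia


def search_titles(query, limit=5, wiki=None):
    """Return up to `limit` article titles matching the user's query."""
    if wiki is None:
        wiki = _load_wikipedia()
    return wiki.search(query, results=limit)


def page_overview(title, wiki=None):
    """Return the title, a two-sentence summary, and the section names of a page."""
    if wiki is None:
        wiki = _load_wikipedia()
    # auto_suggest=False keeps the library from silently swapping in another title.
    page = wiki.page(title, auto_suggest=False)
    return {
        "title": page.title,
        "summary": wiki.summary(title, sentences=2, auto_suggest=False),
        "sections": list(page.sections),
    }


def read_section(title, section, wiki=None):
    """Return the plain text of one section, or None if the page lacks it."""
    if wiki is None:
        wiki = _load_wikipedia()
    return wiki.page(title, auto_suggest=False).section(section)
```

In the chatbot, the titles from `search_titles("malaria")` would be shown as options, and the section names from `page_overview(...)` as buttons. Ambiguous titles raise `wikipedia.exceptions.DisambiguationError`, and the section list can come back empty for some pages, so a real app would need to handle both cases.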
- Medicinal Information: This section provides users with medicinal data. They enter the name of a disease and are shown the best-matching medicines available in the market or pharmacies, along with their composition and side effects. The dataset, 11000 Medicine Details, is available on Kaggle and contains 11,000 medicine entries sourced from the online pharmacy 1mg. To match queries against these entries, a TF-IDF vectorizer is used together with a Nearest Neighbors model (metric: cosine similarity). The Nearest Neighbors model is based on the idea that similar instances lie close to each other in the feature space. It is a non-parametric method, meaning it makes no assumptions about the underlying data distribution. When used for classification, it is often referred to as K-Nearest Neighbors (K-NN).
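A minimal sketch of the TF-IDF plus Nearest Neighbors lookup described above, using scikit-learn. The medicine rows, the `uses` field, and the `recommend` helper are invented for illustration and do not reflect the actual columns of the Kaggle dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Made-up rows; the real dataset's column names may differ.
medicines = [
    {"name": "Medicine A", "uses": "treatment of bacterial infections"},
    {"name": "Medicine B", "uses": "relief of fever and mild pain"},
    {"name": "Medicine C", "uses": "treatment of high blood pressure hypertension"},
    {"name": "Medicine D", "uses": "treatment of type 2 diabetes mellitus"},
]

# Turn each description into a TF-IDF vector.
vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([m["uses"] for m in medicines])

# Index the vectors; with metric="cosine", distance = 1 - cosine similarity.
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(matrix)


def recommend(query, k=2):
    """Return the k closest medicines as (name, cosine similarity) pairs."""
    distances, rows = index.kneighbors(vectorizer.transform([query]), n_neighbors=k)
    return [
        (medicines[i]["name"], float(1.0 - d))
        for d, i in zip(distances[0], rows[0])
    ]


print(recommend("diabetes"))
```

Because the model only ranks stored entries by similarity to the query, nothing is trained in the usual sense: fitting simply stores the vectors so they can be searched.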
The GUI of the chatbot was built using Streamlit, which provides chat elements that make it convenient to build chatbots. These chat elements are designed to be used in conjunction with each other, but they can also be used separately.
You can refer to this link to read the documentation of Streamlit: Streamlit
The screenshots below demonstrate the chatbot in action: