This project, part of an educational program, aims to de-congest national highways by analyzing road traffic data from various toll plazas. Using Kafka, we stream data like vehicle_id, vehicle_type, toll_plaza_id, and timestamp as vehicles pass through toll plazas. This repository contains scripts and instructions for setting up a data pipeline to collect this streaming data and load it into a database for further analysis.
- Apache Kafka
- Zookeeper
- Python 3.x
- Relevant Python libraries: (list any specific libraries used in the scripts)
-
Zookeeper Setup:
- Start the Zookeeper server, a prerequisite for Kafka.
-
Kafka Setup:
- Start the Kafka server after initializing the Zookeeper server.
- Create a Kafka topic named
toll
for streaming the toll data.
-
Toll Traffic Generator (
toll_traffic_generator.py
):- Simulates traffic data at toll plazas.
- Streams data to the Kafka topic.
-
Streaming Data Reader (
streaming_data_reader.py
):- Reads streaming data from Kafka.
- Processes and loads data into a database.
-
Running the Toll Traffic Generator:
- Run
python3 toll_traffic_generator.py
. - Ensure Kafka is consuming data from the
toll
topic.
- Run
-
Running the Streaming Data Reader:
- Run
python3 streaming_data_reader.py
. - Verify data loading into the database.
- Run
This project is part of an educational program. Contributions adhering to project standards are welcome.
MIT License
Copyright (c) 2024 Farhad Vadiee