Scrape Master is a web scraping project designed to scrape data from a sample e-commerce website. This documentation provides an overview of the project structure, usage instructions, and details on the components included.
The sample website serves as a test bed for the scraping script, which retrieves product information such as name, price, details, and reviews. The site is hosted on GitHub Pages at https://markcarcillar.github.io/scrape_master/. The scraped data can be saved as JSON, XML, or CSV.
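As a rough illustration of what retrieving product information can look like, here is a minimal sketch that uses only Python's standard library. The HTML snippet, the class names (`product`, `name`, `price`, `details`), and the `ProductParser` class are all hypothetical; the real `index.html` and `main.py` may be structured quite differently.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real index.html may use different tags and classes.
SAMPLE_HTML = """
<div class="product">
  <h2 class="name">Blue Mug</h2>
  <span class="price">$12.50</span>
  <p class="details">Ceramic, 350 ml</p>
</div>
<div class="product">
  <h2 class="name">Red Notebook</h2>
  <span class="price">$4.00</span>
  <p class="details">A5, 80 pages</p>
</div>
"""

# Class names treated as product fields (an assumption for this sketch).
FIELDS = {"name", "price", "details"}


class ProductParser(HTMLParser):
    """Collects one dict per <div class="product"> block."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._current = None  # product dict currently being filled in
        self._field = None    # field whose text is being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "product" in classes:
            self._current = {}
            self.products.append(self._current)
        elif self._current is not None:
            for cls in classes:
                if cls in FIELDS:
                    self._field = cls

    def handle_data(self, data):
        text = data.strip()
        if self._current is not None and self._field and text:
            self._current[self._field] = text

    def handle_endtag(self, tag):
        self._field = None
        if tag == "div":
            self._current = None


parser = ProductParser()
parser.feed(SAMPLE_HTML)
products = parser.products
print(products)
```

The sketch assumes that field elements contain plain text only and that product blocks are not nested; a real page would likely need a more forgiving parser.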
The project's file structure is organized as follows:
- index.html: Sample website used for scraping.
- img/: The folder containing images.
- main.py: The main script for web scraping.
- output/: The folder where the script's results are saved in JSON, XML, or CSV format.
- .gitignore: Configuration file to ignore files when using Git.
- requirements.txt: A list of Python module requirements for the project.
To get started with Scrape Master, follow these steps:
- Clone the repository to your local machine:
    git clone https://github.com/markcarcillar/scrape_master.git
- Install the required Python modules:
    pip install -r requirements.txt
The primary functionality of Scrape Master is to scrape data from the sample e-commerce website. To use it, follow these steps:
- Ensure you have completed the "Getting Started" section.
- Run the scraping script main.py:
    python main.py
- The script will prompt you to choose the output format (JSON, XML, CSV).
- The scraped data will be saved in the output/ folder.
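The prompt step could be handled along these lines. This is only a sketch: `OUTPUT_FILES`, `resolve_output_path`, and `ask_output_path` are hypothetical names, and it assumes the chosen format maps to one of the `products.*` files placed inside `output/`. The actual prompt text and logic in `main.py` may differ.

```python
from pathlib import Path

# Assumed mapping from format choice to file name (file names as listed in this README).
OUTPUT_FILES = {
    "json": "products.json",
    "xml": "products.xml",
    "csv": "products.csv",
}


def resolve_output_path(choice, output_dir="output"):
    """Turn a user's format choice into the path of the file to write."""
    key = choice.strip().lower()
    if key not in OUTPUT_FILES:
        raise ValueError(f"Unsupported format: {choice!r}")
    return Path(output_dir) / OUTPUT_FILES[key]


def ask_output_path(ask=input):
    """Keep prompting until a supported format is entered."""
    while True:
        try:
            return resolve_output_path(ask("Output format (JSON, XML, CSV): "))
        except ValueError as err:
            print(err)
```

Passing the prompt function as a parameter (`ask=input`) keeps the sketch easy to exercise without a terminal, for example `ask_output_path(lambda _: "csv")`.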
The script provides options to save the scraped data in different formats:
- JSON: Use the products.json file.
- XML: Use the products.xml file.
- CSV: Use the products.csv file.
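To make the three formats concrete, the following sketch writes the same records as JSON, CSV, and XML with Python's standard library. The sample records, the field names, the XML element names (`products`, `product`), and the `save_*` helpers are assumptions for illustration, not necessarily what `main.py` produces. The files go to a temporary directory here so that nothing in a real `output/` folder is overwritten.

```python
import csv
import json
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Made-up sample records; the real script's field names may differ.
products = [
    {"name": "Blue Mug", "price": "12.50", "details": "Ceramic, 350 ml"},
    {"name": "Red Notebook", "price": "4.00", "details": "A5, 80 pages"},
]


def save_json(records, path):
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, indent=2, ensure_ascii=False)


def save_csv(records, path):
    # Assumes at least one record; its keys become the header row.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(records[0]))
        writer.writeheader()
        writer.writerows(records)


def save_xml(records, path):
    # One <product> element per record, one child element per field.
    root = ET.Element("products")
    for record in records:
        item = ET.SubElement(root, "product")
        for key, value in record.items():
            ET.SubElement(item, key).text = str(value)
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)


# Temporary stand-in for the output/ folder.
out_dir = Path(tempfile.mkdtemp())
save_json(products, out_dir / "products.json")
save_csv(products, out_dir / "products.csv")
save_xml(products, out_dir / "products.xml")
```

Note that the XML helper uses each field name as an element name, so it only works when the keys are valid XML names (no spaces, not starting with a digit).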
Contributions to Scrape Master are welcome! If you have ideas for improvements or bug fixes, please submit issues and pull requests on the GitHub repository.
This project is licensed under the MIT License.
By using Scrape Master, you can easily scrape data from a sample e-commerce website and save it in various formats for further analysis or integration into other projects. Feel free to explore the code and customize it to your specific needs.
Happy scraping!