A FastAPI-based service that processes CSV files containing person health data. The API validates and transforms the uploaded data, then returns the processed result as a streaming response.
The service is a pipeline for processing health-related CSV data. It follows a modular architecture with error handling, logging, and per-environment configuration. It is containerized and uses Poetry for dependency management.
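As a rough illustration of the streaming approach, the sketch below emits processed CSV output one row at a time instead of building the whole file in memory. The names `stream_csv_rows` and `transform_row` are hypothetical and not taken from this repository, and the transform is an identity placeholder because the actual transformation rules are not documented here.

```python
import csv
import io
from typing import Callable, Iterable, Iterator


def transform_row(row: dict) -> dict:
    """Hypothetical placeholder: the real pipeline's transformation is not shown here."""
    return row


def _drain(buffer: io.StringIO) -> str:
    """Return the buffered text and reset the buffer for the next chunk."""
    chunk = buffer.getvalue()
    buffer.seek(0)
    buffer.truncate(0)
    return chunk


def stream_csv_rows(
    lines: Iterable[str],
    transform: Callable[[dict], dict] = transform_row,
) -> Iterator[str]:
    """Yield the processed CSV as text chunks: the header, then one chunk per row.

    In this sketch, input without any data rows yields nothing.
    """
    reader = csv.DictReader(lines)
    buffer = io.StringIO()
    writer = None
    for row in reader:
        out = transform(row)
        if writer is None:
            # The output header is taken from the first transformed row.
            writer = csv.DictWriter(buffer, fieldnames=list(out.keys()), lineterminator="\n")
            writer.writeheader()
            yield _drain(buffer)
        writer.writerow(out)
        yield _drain(buffer)
```

In a FastAPI handler, a generator like this could be passed to `StreamingResponse(..., media_type="text/csv")`; whether this project does exactly that is an assumption.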
- CSV file processing with built-in validation
- Streaming response for efficient data handling
- Containerized deployment
- Python 3.12 or higher
- Poetry
- Docker (optional)
- Clone the repository:
git clone https://github.com/johnpapwinter/person-health-data-pipeline.git
cd person-health-data-pipeline
- Install dependencies using Poetry:
poetry install
- Run the application:
poetry run uvicorn main:app --reload
The API will be available at http://localhost:8000
- Build the Docker image:
docker build -t person-health-pipeline .
- Run the container:
docker run -p 8000:8000 person-health-pipeline
After starting the server, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
POST /api/v1/pipeline/process
Processes a CSV file containing person health data.
Request
- Content-Type: multipart/form-data
- Body: CSV file with required columns:
  - GivenName
  - Gender
  - City
  - Age
  - Kilograms
  - Centimeters
Response
- Success: 200 OK
  - Content-Type: text/csv
  - Body: Processed CSV file
- Error: 400 Bad Request
  - Invalid file format or data
- Error: 500 Internal Server Error
  - Processing errors
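A minimal client sketch for the endpoint described above, using only the Python standard library. The form field name `file` and the helper names `build_multipart` and `process_csv` are assumptions for illustration; check the Swagger UI for the field name the API actually expects.

```python
import os
import urllib.request
import uuid

API_URL = "http://localhost:8000/api/v1/pipeline/process"


def build_multipart(field_name: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: text/csv\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def process_csv(path: str, field_name: str = "file", url: str = API_URL) -> str:
    """POST the CSV at `path` to the pipeline and return the processed CSV as text.

    `field_name` is an assumption; the API may expect a different form field.
    """
    with open(path, "rb") as fh:
        body, content_type = build_multipart(field_name, os.path.basename(path), fh.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses (e.g. 400 or 500).
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the server running, calling `process_csv("people.csv")` on a local file (the file name is only an example) would return the processed CSV as a string.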
The input CSV file must contain the following columns, shown here as a header line followed by a sample row:
GivenName,Gender,City,Age,Kilograms,Centimeters
Jennifer,female,Somerville,20,80.5,172
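The column requirement above could be checked along these lines. This is a sketch, not the service's actual validation: the function name `validate_health_csv` is hypothetical, and treating Age, Kilograms, and Centimeters as numeric is an assumption based on the sample row.

```python
import csv
import io

# Column names copied from the header line above.
REQUIRED_COLUMNS = ["GivenName", "Gender", "City", "Age", "Kilograms", "Centimeters"]
# Assumption: these columns hold numbers, as in the sample row.
NUMERIC_COLUMNS = ["Age", "Kilograms", "Centimeters"]


def validate_health_csv(text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the CSV passed."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for col in NUMERIC_COLUMNS:
            try:
                float(row[col])
            except (TypeError, ValueError):
                # TypeError covers short rows, where the value is None.
                problems.append(f"line {line_no}: {col} is not numeric: {row[col]!r}")
    return problems
```

An API handler could map a non-empty result to the 400 Bad Request response; that mapping is likewise an assumption about how the service behaves.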