The goal of this project is to kick off an exploratory data analysis (EDA) of U.S. public transportation. The analysis is not meant to draw conclusions; it is meant to prompt exploratory thinking about how public transportation could be improved. By examining a range of state-level variables, I look for correlations that could guide deeper research and problem-solving later.
For this project I gathered data from several sources, including API calls, CSV files, and other databases. I used Python with Pandas for data cleaning and manipulation, Matplotlib and Seaborn for visualization, and NumPy and SciPy for statistical analysis. The focus was on identifying correlations between public transportation use and variables such as population density, vehicle ownership, and education level. I did not draw definitive conclusions, but the work sets the stage for more in-depth analysis and decision-making related to public transportation.
- Excel
- Python
- Pandas
- Matplotlib
- API
- NumPy
- Seaborn
- SciPy
- Jupyter Notebook
- How does the population density of a state correlate with the percentage of people who use public transportation? Is there a higher percentage of public transportation users in more densely populated states?
- Does the availability of registered vehicles in a state have a negative correlation with public transportation ridership? In other words, do states with more registered vehicles have lower percentages of public transportation users?
- Is there a positive correlation between the percentage of individuals with access to cars and the percentage of people using public transportation? Do states with higher car access also have higher public transportation ridership due to the complementarity between the two modes of transportation?
- How does the level of education in a state (measured by the percentage of individuals with at least a bachelor's degree) correlate with public transportation ridership? Is there a stronger preference for public transportation in states with higher education levels?
- Does the unemployment rate in a state have any significant correlation with the percentage of people using public transportation? Are individuals more likely to rely on public transportation during periods of higher unemployment?
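
Questions of this kind can be checked with a correlation coefficient and its p-value. The following is a minimal sketch, not the project's actual notebook code: it assumes a per-state table with hypothetical column names (`population_density`, `pct_public_transit`) and uses made-up placeholder values. It computes Pearson's r with `scipy.stats.pearsonr`.

```python
import pandas as pd
from scipy.stats import pearsonr

# Made-up placeholder values for illustration only; NOT real state data.
# Column names are hypothetical and may differ from the project's files.
toy = pd.DataFrame(
    {
        "state": ["A", "B", "C", "D", "E"],
        "population_density": [10.0, 50.0, 100.0, 400.0, 1000.0],
        "pct_public_transit": [0.5, 1.0, 2.0, 5.0, 9.0],
    }
)


def correlate(df, x_col, y_col):
    """Return Pearson's r and its two-sided p-value for two columns."""
    # Drop rows where either value is missing before computing r.
    pair = df[[x_col, y_col]].dropna()
    r, p_value = pearsonr(pair[x_col], pair[y_col])
    return float(r), float(p_value)


r, p_value = correlate(toy, "population_density", "pct_public_transit")
print(f"r = {r:.3f}, p = {p_value:.4f}")
```

Pearson's r only measures linear association. If a relationship looks monotonic but curved, a rank-based measure such as `scipy.stats.spearmanr` may be a better fit, and neither measure says anything about causation.
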
Variable | File | Source |
---|---|---|
Uses Public Transportation | pub.csv | Census.gov |
Population Estimate | API Call | Census.gov |
Population Density | API Call | Census.gov |
Cars per Household | carmod.csv | National Equity Atlas |
Registered Vehicles | carmod.csv | National Equity Atlas |
No Access to Cars | carmod.csv | National Equity Atlas |
Gross Domestic Product | gdpmod.csv | 2023 World Population Review |
High School Education | edu1.csv | 2023 World Population Review |
Bachelor's Education | edu1.csv | 2023 World Population Review |
Unemployment Rate | unemmod.csv | Iowa Community Indicators Program |
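
Because the variables come from separate files, they have to be joined on a shared key before any correlation can be computed. The following is a minimal sketch under stated assumptions: that each file has a `state` column (hypothetical, as are the other column names) and one row per state. In-memory strings with made-up values stand in for `pub.csv` and `carmod.csv`.

```python
import io

import pandas as pd

# In-memory stand-ins for pub.csv and carmod.csv.
# Contents and column names are made up for illustration only.
pub_csv = io.StringIO("state,pct_public_transit\nA,1.2\nB,4.5\nC,0.8\n")
car_csv = io.StringIO("state,cars_per_household\nA,2.1\nB,1.4\nD,1.9\n")

pub = pd.read_csv(pub_csv)
cars = pd.read_csv(car_csv)

# An outer join with indicator=True keeps every state and records which
# file(s) each row came from; validate= raises if a state is duplicated.
merged = pub.merge(
    cars, on="state", how="outer", indicator=True, validate="one_to_one"
)

# States present in only one file (often a naming mismatch worth checking).
unmatched = merged[merged["_merge"] != "both"]

# Rows with data from both files, ready for analysis.
combined = merged[merged["_merge"] == "both"].drop(columns="_merge")

print(unmatched[["state", "_merge"]])
print(combined)
```

Inspecting `unmatched` before dropping rows helps catch states that are spelled or abbreviated differently across sources and would otherwise disappear silently in an inner join.
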