UBER DATA ANALYSIS TO PREDICT THE PRICE.
I'm going to use Jupyter Notebook to run this project, and it will be coded in Python. The Python libraries and tools used are listed below:
• NumPy - For computation on and processing of single-dimensional and multidimensional arrays.
• Pandas - For data cleaning, manipulation and analysis.
• Seaborn & Plotly - For data visualization.
• Matplotlib - A low-level graph plotting library that serves as a visualization utility.
• itertools - A standard-library module that provides various functions that work on iterators to build more complex iterators.
• gc - Provides an interface to Python's automatic garbage collector, which is part of its memory management mechanism.
• os - Provides functions for interacting with the operating system.
• sys - Provides variables and functions used to work with different parts of the Python runtime environment.
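• %matplotlib inline - A Jupyter (IPython) magic command rather than a library. It sets the Matplotlib backend to the inline backend, so plots are displayed directly in the notebook.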
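Before running the notebook, it can help to confirm that everything listed above is installed. The sketch below is only one way to do that. `REQUIRED` and `missing_modules` are hypothetical names of my own, and the entries are import names, which are not always the same as the package names used when installing.

```python
import importlib.util

# Import names for the libraries listed above (assumed; adjust to your setup).
REQUIRED = ["numpy", "pandas", "seaborn", "plotly", "matplotlib",
            "itertools", "gc", "os", "sys"]

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Install before running the notebook:", ", ".join(missing))
else:
    print("All listed libraries are available.")
```

Once nothing is reported missing, the first notebook cell can import the libraries as usual. %matplotlib inline goes on its own line in a notebook cell, since it is an IPython magic and is not valid in a plain Python script.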
In my project I have used the linear regression algorithm to train the model.
Linear regression is one of the best-known and most easily understood algorithms in statistics and machine learning. It models the relationship between two variables by fitting a linear equation, i.e., a straight line, to the observed data. One variable is the explanatory (independent) variable, for example your income, and the other is the dependent variable, for example your expenses. From a machine learning point of view, it is one of the simplest models you can try out on your data. If the data follow a straight-line trend, linear regression gives quick and reasonably accurate results. If they do not, a straight line will fit poorly.
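To make the idea concrete, here is a minimal sketch of fitting a straight line by ordinary least squares, assuming that is the fitting method used (it is the usual choice). The income and expense figures are made up purely for illustration and are not taken from the Uber data.

```python
import numpy as np

# Hypothetical monthly figures, invented for illustration only.
income = np.array([2000.0, 2500.0, 3000.0, 3500.0, 4000.0])    # explanatory variable
expenses = np.array([1550.0, 1750.0, 2150.0, 2350.0, 2700.0])  # dependent variable

# Deviations of each variable from its mean.
dx = income - income.mean()
dy = expenses - expenses.mean()

# Least-squares estimates for the line: expenses = slope * income + intercept.
slope = np.sum(dx * dy) / np.sum(dx ** 2)
intercept = expenses.mean() - slope * income.mean()

print(f"expenses ~ {slope:.2f} * income + {intercept:.2f}")
```

The slope says how much the dependent variable changes, on average, for each unit change in the explanatory variable. The intercept is the value of the line where the explanatory variable is zero.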
In outline, the procedure is:
1. Plot the dependent variable on the Y-axis against the independent variable on the X-axis.
2. Draw a straight line through the points and measure the correlation.
3. Keep adjusting the slope and position of the line until it fits the data as closely as possible.
4. Extend the fitted line to predict new values on the Y-axis.
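The four steps above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the project's actual code. The distance and price values are invented, and `plot_fit` is a hypothetical helper. Instead of adjusting the line by hand (step 3), the sketch lets `np.polyfit` find the least-squares line directly.

```python
import numpy as np

# Invented example data: ride distance (km) and price; not from the Uber dataset.
distance = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
price = np.array([5.1, 7.0, 9.2, 10.9, 13.1, 15.0])

def plot_fit(x, y, slope, intercept):
    """Steps 1 and 2: scatter the data and draw the straight line over it."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it
    plt.scatter(x, y, label="observed")
    plt.plot(x, slope * x + intercept, label="fitted line")
    plt.xlabel("distance (km)")
    plt.ylabel("price")
    plt.legend()
    plt.show()

# Step 2: Pearson correlation between the two variables (between -1 and 1).
r = np.corrcoef(distance, price)[0, 1]

# Step 3: least-squares line; polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(distance, price, 1)

# Step 4: extend the line to a distance outside the observed range.
new_distance = 8.0
predicted = slope * new_distance + intercept

print(f"correlation r = {r:.3f}")
print(f"price ~ {slope:.2f} * distance + {intercept:.2f}")
print(f"predicted price at {new_distance} km: {predicted:.2f}")
```

In the notebook, calling plot_fit(distance, price, slope, intercept) draws the points and the fitted line. Predictions far outside the observed range should be treated with caution, since nothing guarantees the straight-line trend continues there.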