Sentiment analysis is a technique for determining whether a piece of text is positive, negative, or neutral.
This project aims to classify tweets as either good tweets or racist/sexist tweets. Automated prediction is easier than having people spend their time identifying which category each tweet belongs to. The training data consists of tweets that are already classified as good or bad, labeled 0 and 1 respectively.
Thus, the model tries to predict the labels of the tweets in the data set.
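As a minimal sketch of this setup, the snippet below reads labeled tweets from CSV text and groups them by class. The column names `id`, `label`, and `tweet`, the sample rows, and the helper names are assumptions for illustration; the actual training file may be laid out differently.

```python
import csv
import io

# Hypothetical sample in the assumed layout: id, label, tweet.
# Label 0 marks a good tweet, label 1 a racist/sexist tweet.
SAMPLE_CSV = '''id,label,tweet
1,0,what a lovely morning
2,1,placeholder for an offensive tweet
3,0,"coffee, then work"
'''


def load_labeled_tweets(csv_text):
    """Return a list of (tweet, label) pairs parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["tweet"], int(row["label"])) for row in reader]


def split_by_label(pairs):
    """Group tweets by label: {0: [good tweets], 1: [racist/sexist tweets]}."""
    groups = {0: [], 1: []}
    for tweet, label in pairs:
        groups[label].append(tweet)
    return groups


pairs = load_labeled_tweets(SAMPLE_CSV)
groups = split_by_label(pairs)
print(len(groups[0]), "good,", len(groups[1]), "racist/sexist")
```

Checking the class counts this way is a quick sanity test before training, since a heavily imbalanced training set can make raw accuracy misleading.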
- The notebook file is Twitter Sentiment Analysis.ipynb.
- The output file is submit.csv.
Public leaderboard: 617; private leaderboard: 617
My test score is 0.6713043478
The word-embedding model built with TensorFlow and Keras gives the highest accuracy, 58.6%, when classifying tweets as good or bad.
Thus, this model can be used to predict which tweets are good and which are racist/sexist.
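A word-embedding model like the one described above takes each tweet as a fixed-length sequence of integer word IDs. The sketch below shows that encoding step in plain Python. The tokenizer, the reserved IDs 0 (padding) and 1 (unknown word), and all function names are illustrative assumptions, not the notebook's code. In Keras, sequences of this shape would typically be fed to an `Embedding` layer.

```python
import re
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # assumed reserved IDs: padding and out-of-vocabulary


def tokenize(text):
    """Lowercase the text and keep word-like tokens, including #tags and @handles."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def build_vocab(tweets, max_words=10000):
    """Map the most frequent tokens to integer IDs starting at 2."""
    counts = Counter(tok for tweet in tweets for tok in tokenize(tweet))
    most_common = counts.most_common(max_words - 2)
    return {word: i + 2 for i, (word, _) in enumerate(most_common)}


def encode(text, vocab, max_len=8):
    """Convert a tweet to exactly max_len IDs, truncating or right-padding."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


tweets = ["Good morning everyone", "good vibes only"]
vocab = build_vocab(tweets)
encoded = encode("Good night everyone", vocab, max_len=5)
print(encoded)  # "night" is not in the vocabulary, so it maps to UNK_ID
```

Reserving separate IDs for padding and unknown words lets the model distinguish "no word here" from "a word it has not seen".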
Better feature extraction and more data could be considered to increase the model's classification accuracy.
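As one hypothetical example of alternative feature extraction, the sketch below computes TF-IDF weights in plain Python. TF-IDF is named here as an assumption about what could be tried; it is not something the notebook is known to use, and the function name and sample tokens are illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenized documents.

    docs is a list of token lists; the result is one {token: weight}
    dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            tok: (count / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return weights


docs = [["good", "day"], ["bad", "day"]]
weights = tf_idf(docs)
print(weights[0])
```

Tokens that appear in every document, such as "day" here, receive a weight of zero, so the features emphasize the words that distinguish one tweet from another.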