Task 7: Challenge
It's time to test your skills!
The challenge is to perform sentiment analysis on tweets from the US Airline Sentiment dataset. This dataset helps gauge people's sentiments about each of the major U.S. airlines.
This data comes from Crowdflower's Data for Everyone library and consists of tweets from February 2015 in which travelers expressed their feelings about every major U.S. airline. Each tweet has been classified as positive, negative, or neutral.
Instructions
Import the dataset from Airline-Sentiment-2-w-AA.
Here are some samples from the dataset:
Split the dataset into a training set and a testing set in an 80:20 ratio.
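If you would rather prepare the split outside Driverless AI, the 80:20 ratio can be sketched in plain Python. This is only an illustration, not the tool's own splitter: the helper name `split_rows`, the fixed seed, and the stand-in rows are all assumptions made for the example.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test) lists.

    `split_rows` is a hypothetical helper for illustration only.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)                   # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in rows (made up); the real dataset would be loaded from its CSV file.
rows = [{"text": f"tweet {i}", "airline_sentiment": "neutral"} for i in range(10)]
train, test = split_rows(rows)
print(len(train), len(test))
```

Shuffling before cutting avoids a split that simply follows the file order, and fixing the seed keeps the same rows in each set across runs.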
Run an experiment where the target column is airline_sentiment, using only the default transformers. You can exclude every column from the dataset except the 'text' column.
Run another instance of the same experiment, but this time include the TensorFlow models and the built-in transformers.
Next, repeat the experiment with a custom recipe from Driverless AI recipes.
Using Logloss as the scorer, observe the following outcomes:
- Which experiment out of the three gives the minimum Logloss value and why?
- How does variable importance change as you change the selected transformers?
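To interpret the Logloss values when comparing the three experiments, it can help to see how the metric behaves on a toy case. The sketch below uses the standard multiclass log loss definition (the mean negative log of the probability assigned to the true class); Driverless AI's internal implementation may differ in details such as clipping. The function name `multiclass_logloss` and the probabilities are made up for illustration.

```python
import math


def multiclass_logloss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    `predicted_probs` is a list of dicts mapping class name -> probability.
    Probabilities are clipped to [eps, 1 - eps] so log(0) never occurs.
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(true_labels)


# Illustrative values only; these are not results from the dataset.
labels = ["negative", "positive", "neutral"]
confident = [
    {"negative": 0.8, "neutral": 0.1, "positive": 0.1},
    {"negative": 0.1, "neutral": 0.1, "positive": 0.8},
    {"negative": 0.1, "neutral": 0.8, "positive": 0.1},
]
uncertain = [{"negative": 1 / 3, "neutral": 1 / 3, "positive": 1 / 3}] * 3

confident_score = multiclass_logloss(labels, confident)  # equals -ln(0.8)
uncertain_score = multiclass_logloss(labels, uncertain)  # equals ln(3)
print(confident_score, uncertain_score)
```

A lower value means the model puts more probability on the correct class, so the experiment with the minimum Logloss is the one whose predicted probabilities are best calibrated toward the true sentiment, not merely the one with the most correct top predictions.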