Demo datasets
Overview
In H2O Label Genie, you can use demo datasets to explore supported annotation tasks.
Access demo datasets
To access a demo dataset in H2O Label Genie, follow these steps:
- In the H2O Label Genie navigation menu, click Datasets.
- In the datasets table, select one of the demo datasets.
Note
- After selecting a demo dataset, click New annotation task to annotate the dataset.
- To learn how to annotate your dataset, see Tutorials.
Demo datasets in H2O Label Genie
Amazon reviews demo
- Dataset name: amazon-reviews-demo
- Description: The dataset contains user reviews (in text format) and ratings (from 0 to 5) of Amazon products.
- Dataset columns: stars, comment
- Problem type: Text classification, text regression, text-entity recognition
- License: CC0 1.0 Universal (CC0 1.0)
Car or coffee demo
- Dataset name: car-or-coffee-demo
- Description: The dataset contains images of cars and coffee.
- Problem type: Image classification, object detection
- License: Pexels license
Twitter demo
- Dataset name: twitter-demo
- Description: The dataset contains tweets that can be used to analyze sentiment and to recognize emotion in tweet text.
- Dataset columns: text, sentiment
- Problem type: Text classification, text-entity recognition
- License: Attribution 4.0 International (CC BY 4.0)
Text readability demo
- Dataset name: text-readability-demo
- Description: The dataset contains text excerpts and is part of the CLEAR Corpus.
- Dataset columns: id, excerpt
- Problem type: Text regression, text-entity recognition
- License: MIT license
CNN Daily Mail sample
- Dataset name: cnn-dailymail-sample
- Description: The dataset contains human-generated abstract summaries from news stories published on the CNN and Daily Mail websites.
- Dataset columns: id, text, summary
- Problem type: Text summarization, text classification
- License: MIT license
Plant pathology demo
- Dataset name: plant-pathology-demo
- Description: This dataset contains images of healthy and diseased apple leaves for plant pathology recognition.
- Problem type: Image classification, image regression, object detection
- License: Attribution 4.0 International (CC BY 4.0)
ESC10 audio demo
- Dataset name: esc10-audio-demo
- Description: This dataset contains 5-second-long recordings of environmental sounds organized into ten classes (with 40 examples per class). Clips in this dataset have been manually extracted from public field recordings gathered by the Freesound.org project.
- Problem type: Audio classification
- License: Attribution 3.0 Unported (CC BY 3.0)
Amnist demo
- Dataset name: amnist-demo
- Description: The dataset contains 600 audio samples of spoken digits (0-9) from sixty different speakers.
- Problem type: Audio regression
- License: MIT license
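As a quick reference, the problem types listed for each demo dataset can be collected into a small lookup table. The following is a minimal sketch in Python, not part of H2O Label Genie: the names DEMO_DATASETS and datasets_for are hypothetical and introduced only for illustration, and the table simply transcribes the dataset names and problem types from the list above.

```python
# Hypothetical lookup table: demo dataset name -> problem types,
# transcribed from the dataset list on this page (not an H2O Label Genie API).
DEMO_DATASETS = {
    "amazon-reviews-demo": ["text classification", "text regression", "text-entity recognition"],
    "car-or-coffee-demo": ["image classification", "object detection"],
    "twitter-demo": ["text classification", "text-entity recognition"],
    "text-readability-demo": ["text regression", "text-entity recognition"],
    "cnn-dailymail-sample": ["text summarization", "text classification"],
    "plant-pathology-demo": ["image classification", "image regression", "object detection"],
    "esc10-audio-demo": ["audio classification"],
    "amnist-demo": ["audio regression"],
}


def datasets_for(problem_type: str) -> list[str]:
    """Return the demo dataset names that list the given problem type."""
    wanted = problem_type.strip().lower()
    return [name for name, types in DEMO_DATASETS.items() if wanted in types]


if __name__ == "__main__":
    print(datasets_for("Text classification"))
    # ['amazon-reviews-demo', 'twitter-demo', 'cnn-dailymail-sample']
```

A lookup like this can help you pick a demo dataset for the annotation task you want to try, for example datasets_for("object detection") returns the two image datasets that list that problem type.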
Feedback
- Send feedback about H2O Label Genie to cloud-feedback@h2o.ai