This is our repository of egocentric activity datasets.
This page includes GTEA, GTEA Gaze and GTEA Gaze+ datasets.
We are now working on expanding the GTEA Gaze+ dataset. Stay tuned!
Alireza Fathi, Xiaofeng Ren, James M. Rehg.
Learning to Recognize Objects in Egocentric Activities, CVPR 2011.
Yin Li, Zhefan Ye, James M. Rehg.
Delving into Egocentric Actions, CVPR 2015.
To record the sequences, we stocked a table with various kinds of food, dishes, and snacks. We asked each subject to wear the Tobii glasses and calibrated the gaze tracker. We then asked the subject to take a seat and prepare whatever food they felt like having. The beginning and ending times of the actions are annotated. Each action consists of a verb and a set of nouns, for example, pouring milk into a cup. In our experiments we extract images from the videos at 15 frames per second, and action annotations are given in frame numbers. Sequences 1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, and 22 are used for training; sequences 2, 3, 5, and 20 are used for testing.
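Since the annotations are given in frame numbers and frames are extracted at 15 fps, converting an annotation to seconds is a single division. Below is a minimal Python sketch; the annotation line format (`<verb><noun1,noun2> (start-end)`) and the example line are assumptions for illustration, not the dataset's documented format.

```python
# Minimal sketch: parse a frame-based action annotation and apply the
# train/test split. The line format below is an assumption; check the
# annotation files shipped with the dataset.
import re

FPS = 15  # frames are extracted from video at 15 frames per second

TRAIN_SEQS = {1, 6, 7, 8, 10, 12, 13, 14, 16, 17, 18, 21, 22}
TEST_SEQS = {2, 3, 5, 20}

def parse_annotation(line):
    """Parse one action into (verb, nouns, start_sec, end_sec)."""
    m = re.match(r"<(\w+)><([\w,\s]+)>\s*\((\d+)-(\d+)\)", line)
    if m is None:
        return None
    verb, nouns, start, end = m.groups()
    # Convert frame numbers to seconds using the 15 fps extraction rate.
    return verb, [n.strip() for n in nouns.split(",")], int(start) / FPS, int(end) / FPS

def split(seq_id):
    """Return which split a sequence belongs to."""
    if seq_id in TRAIN_SEQS:
        return "train"
    if seq_id in TEST_SEQS:
        return "test"
    return "unused"

print(parse_annotation("<pour><milk,cup> (835-908)"))  # hypothetical example line
print(split(20))                                       # -> 'test'
```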
Alireza Fathi, Yin Li, James M. Rehg.
Learning to Recognize Daily Actions using Gaze, ECCV 2012.
We collected this dataset at Georgia Tech's AwareHome. It consists of seven meal-preparation activities performed by 26 subjects. Subjects perform the activities by following the provided cooking recipes.
The activities are: American Breakfast, Pizza, Snack, Greek Salad, Pasta Salad, Turkey Sandwich, and Cheese Burger. The SMI glasses record an HD video of the subjects' activities at 24 frames per second; they also record the subjects' gaze at 30 fps.
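Because the video (24 fps) and the gaze stream (30 fps) run at different rates, each video frame must be paired with the gaze sample nearest to it in time. Here is a minimal nearest-neighbor alignment sketch, assuming the gaze data is available as one sample per index in capture order (an assumption about storage, not the SMI export format):

```python
# Minimal sketch: align a 30 fps gaze stream to 24 fps video frames by
# nearest-neighbor matching on timestamps. The in-memory layout of the gaze
# samples (one (x, y) point per index, in capture order) is an assumption.

VIDEO_FPS = 24.0  # HD video rate of the SMI glasses
GAZE_FPS = 30.0   # gaze sampling rate

def gaze_index_for_frame(frame_idx):
    """Return the index of the gaze sample closest in time to a video frame."""
    t = frame_idx / VIDEO_FPS   # timestamp of the video frame (seconds)
    return round(t * GAZE_FPS)  # nearest gaze sample at 30 fps

def align(gaze_samples, num_frames):
    """Pair each video frame with its nearest gaze sample (or None if missing)."""
    aligned = []
    for f in range(num_frames):
        g = gaze_index_for_frame(f)
        aligned.append(gaze_samples[g] if g < len(gaze_samples) else None)
    return aligned

# Example: frame 24 (t = 1.0 s) maps to gaze sample 30.
print(gaze_index_for_frame(24))  # -> 30
```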
For each activity, we used ELAN to annotate its actions. An activity is a full meal-preparation task, such as making pizza; an action is a short temporal segment within it, such as putting sauce on the pizza crust, dicing the green peppers, or washing the mushrooms.
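ELAN saves such annotations as .eaf files, an XML format in which millisecond time slots are declared once in a TIME_ORDER block and each time-aligned annotation references two of them. The sketch below reads the action segments out of one file; the file name is hypothetical.

```python
# Minimal sketch: read action segments from an ELAN .eaf file using only the
# standard library. Assumes every referenced TIME_SLOT carries a TIME_VALUE
# (in milliseconds), which holds for fully time-aligned tiers.
import xml.etree.ElementTree as ET

def read_eaf(path):
    """Yield (label, start_ms, end_ms) for every time-aligned annotation."""
    root = ET.parse(path).getroot()
    # Map each time-slot id to its millisecond value.
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.iterfind("TIME_ORDER/TIME_SLOT")}
    for ann in root.iterfind("TIER/ANNOTATION/ALIGNABLE_ANNOTATION"):
        label = ann.findtext("ANNOTATION_VALUE", default="")
        yield label, slots[ann.get("TIME_SLOT_REF1")], slots[ann.get("TIME_SLOT_REF2")]

for label, start, end in read_eaf("Pizza_S1.eaf"):  # hypothetical file name
    print(f"{label}: {start/1000:.2f}s - {end/1000:.2f}s")
```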