UNBIASED LOOK AT DATASET BIAS
Antonio Torralba Alexei A. Efros
MIT CMU
Presented by:
Vivek Dubey
Harika Sabbella
NAMING DATASETS
1) Caltech-101
2) UIUC
3) MSRC
4) Tiny Images
5) ImageNet
6) PASCAL VOC
7) LabelMe
8) SUN-09
9) 15 Scenes
10) Corel
11) Caltech-256
12) COIL-100
ARE CURRENT DATASETS BIASED?
• Randomly sampled 1000 images from the training images
of each of the 12 datasets and trained a 12-way linear SVM
classifier
• Tested the classifier on 300 random images from the test
images of each of the 12 datasets
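The sampling step of this experiment can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the 1000 and 300 counts apply per dataset, and the `train_pool` / `test_pool` dictionaries, the `sample_split` helper, and the toy image ids are all hypothetical. Feature extraction and the linear SVM itself are left out.

```python
import random

TRAIN_PER_DATASET = 1000  # count quoted on the slide (assumed to be per dataset)
TEST_PER_DATASET = 300    # count quoted on the slide (assumed to be per dataset)


def sample_split(train_pool, test_pool, seed=0):
    """Return (train, test) lists of (image_id, dataset_name) pairs.

    train_pool / test_pool map a dataset name to a list of image ids.
    These containers are hypothetical; the slides do not describe a format.
    The dataset name is the class label the classifier has to predict.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    train, test = [], []
    for name in sorted(train_pool):
        train += [(img, name) for img in rng.sample(train_pool[name], TRAIN_PER_DATASET)]
        test += [(img, name) for img in rng.sample(test_pool[name], TEST_PER_DATASET)]
    return train, test


# Toy pools with invented ids, only three of the 12 datasets for brevity.
names = ["Caltech-101", "UIUC", "MSRC"]
train_pool = {n: [f"{n}/train_{i}" for i in range(1500)] for n in names}
test_pool = {n: [f"{n}/test_{i}" for i in range(400)] for n in names}

train, test = sample_split(train_pool, test_pool)
print(len(train), len(test))
```

Sampling without replacement from separate train and test pools keeps the two sets disjoint, so the classifier is never scored on an image it was trained on.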
RESULTS OF “NAME THAT DATASET” CLASSIFIER
CONFUSION MATRIX
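As a reminder of how such a matrix is built, here is a minimal sketch in plain Python. The labels below are invented toy values, not results from the paper, and the helper names `confusion_matrix` and `row_normalize` are illustrative. Rows are the true dataset, columns the predicted dataset.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    index = {c: i for i, c in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        counts[index[t]][index[p]] += 1
    return counts


def row_normalize(counts):
    """Turn counts into per-row rates, so each non-empty row sums to 1."""
    rates = []
    for row in counts:
        total = sum(row)
        rates.append([c / total if total else 0.0 for c in row])
    return rates


# Invented toy labels for three datasets (not data from the paper).
classes = ["Caltech-101", "ImageNet", "PASCAL VOC"]
true_labels = ["Caltech-101", "Caltech-101", "ImageNet",
               "ImageNet", "PASCAL VOC", "PASCAL VOC"]
predicted = ["Caltech-101", "Caltech-101", "ImageNet",
             "PASCAL VOC", "PASCAL VOC", "ImageNet"]

cm = confusion_matrix(true_labels, predicted, classes)
rates = row_normalize(cm)
# Overall accuracy is the diagonal mass divided by the number of test items.
accuracy = sum(cm[i][i] for i in range(len(classes))) / len(true_labels)
```

A strong diagonal means the datasets are easy to tell apart. Off-diagonal mass shows which datasets the classifier confuses with each other.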
TRAINING THE “CARS” CLASSIFIER
• Applied the classifier to
object crops of cars
from five datasets
• The classifier could still
tell the datasets apart
(61% accuracy, against
20% chance for five classes)
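To put the 61% figure in context, the chance level for a balanced five-way problem can be worked out directly. This is a small arithmetic sketch that assumes an equal number of test crops per dataset; `chance_accuracy` is an illustrative helper name.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


cars_chance = chance_accuracy(5)    # five datasets in the car experiment
cars_reported = 0.61                # accuracy quoted on the slide
lift = cars_reported / cars_chance  # how many times better than guessing

print(f"chance = {cars_chance:.0%}, reported = {cars_reported:.0%}, lift = {lift:.2f}x")
```

Under that assumption the classifier is about three times better than guessing, even though every crop shows the same object category.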
OUTLINE
Step 1) Understanding how bias sneaks into our datasets
Step 2) Raise awareness in the visual recognition community