Commit 10569a9
add data preparation
1 parent bc1dfce commit 10569a9

66 files changed, +296630 -0 lines changed

DataPreparation/FiveFolds/original_txt_files/fold_0_data.txt  +4,497
DataPreparation/FiveFolds/original_txt_files/fold_1_data.txt  +3,746
DataPreparation/FiveFolds/original_txt_files/fold_2_data.txt  +3,954
DataPreparation/FiveFolds/original_txt_files/fold_3_data.txt  +3,460
DataPreparation/FiveFolds/original_txt_files/fold_4_data.txt  +3,835

DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_0/agegender_test.txt  +3,879
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_0/agegender_train.txt  +11,136
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_0/agegender_train_subset.txt  +1,249
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_0/agegender_val.txt  +1,242
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_1/agegender_test.txt  +3,005
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_1/agegender_train.txt  +11,905
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_1/agegender_train_subset.txt  +1,342
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_1/agegender_val.txt  +1,348
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_2/agegender_test.txt  +3,121
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_2/agegender_train.txt  +11,814
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_2/agegender_train_subset.txt  +1,306
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_2/agegender_val.txt  +1,323
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_3/agegender_test.txt  +2,866
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_3/agegender_train.txt  +12,056
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_3/agegender_train_subset.txt  +1,352
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_3/agegender_val.txt  +1,335
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_4/agegender_test.txt  +3,387
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_4/agegender_train.txt  +11,593
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_4/agegender_train_subset.txt  +1,285
DataPreparation/FiveFolds/train_val_test_per_fold_agegender/test_fold_is_4/agegender_val.txt  +1,277

DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/age_test.txt  +4,316
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/age_train.txt  +11,823
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/age_train_subset.txt  +1,295
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/age_val.txt  +1,284
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/gender_test.txt  +4,007
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/gender_train.txt  +12,256
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/gender_train_subset.txt  +1,354
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_0/gender_val.txt  +1,339
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/age_test.txt  +3,101
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/age_train.txt  +12,894
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/age_train_subset.txt  +1,429
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/age_val.txt  +1,428
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/gender_test.txt  +3,624
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/gender_train.txt  +12,574
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/gender_train_subset.txt  +1,402
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_1/gender_val.txt  +1,404
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/age_test.txt  +3,339
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/age_train.txt  +12,655
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/age_train_subset.txt  +1,416
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/age_val.txt  +1,429
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/gender_test.txt  +3,191
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/gender_train.txt  +12,986
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/gender_train_subset.txt  +1,455
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_2/gender_val.txt  +1,425
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/age_test.txt  +2,975
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/age_train.txt  +12,991
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/age_train_subset.txt  +1,454
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/age_val.txt  +1,457
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/gender_test.txt  +3,318
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/gender_train.txt  +12,854
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/gender_train_subset.txt  +1,417
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_3/gender_val.txt  +1,430
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/age_test.txt  +3,693
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/age_train.txt  +12,371
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/age_train_subset.txt  +1,378
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/age_val.txt  +1,359
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/gender_test.txt  +3,463
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/gender_train.txt  +12,717
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/gender_train_subset.txt  +1,404
DataPreparation/FiveFolds/train_val_txt_files_per_fold/test_fold_is_4/gender_val.txt  +1,422

Large diffs are not rendered by default.

datapreparation.py

+188
@@ -0,0 +1,188 @@
# Copyright 2015, Gil Levi and Tal Hassner
#
# The SOFTWARE provided in this page is provided "as is", without any guarantee made as to its suitability or fitness for any particular use.
# It may contain bugs, so use of this tool is at your own risk. We take no responsibility for any damage of any sort that may unintentionally
# be caused through its use.
#
# The purpose of this repository is to assist readers in reproducing our results on age and gender classification for facial images as
# described in the following work:
#
# Gil Levi and Tal Hassner, Age and Gender Classification Using Convolutional Neural Networks, IEEE Workshop on Analysis and Modeling of
# Faces and Gestures (AMFG), at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, June 2015
#
# Project page: http://www.openu.ac.il/home/hassner/projects/cnn_agegender/
# ==============================================================================
# MIT License
#
# Modifications copyright (c) 2018 Image & Vision Computing Lab, Institute of Information Science, Academia Sinica
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# ==============================================================================
import os
import random
import sys
import argparse

age_list=['(0, 2)','(4, 6)','(8, 12)','(15, 20)','(25, 32)','(38, 43)','(48, 53)','(60, 100)']
gender_list=['m','f']

def main(args):

    # create output dir
    if not os.path.exists(args.outfilesdir):
        os.mkdir(args.outfilesdir)

    for cur_test_fold_ind in range(5):

        # make output dirs
        cur_fold_out_foldername='test_fold_is_{0}'.format(cur_test_fold_ind)
        cur_fold_out_foldername=os.path.join(args.outfilesdir,cur_fold_out_foldername)
        if not os.path.exists(cur_fold_out_foldername):
            os.mkdir(cur_fold_out_foldername)

        # read raw data set
        cur_test_fold_filename = 'fold_{0}_data.txt'.format(cur_test_fold_ind)
        cur_test_fold_filename = os.path.join(args.rawfoldsdir, cur_test_fold_filename)
        with open(cur_test_fold_filename) as f:
            def_lines=f.readlines()

        # drop the header row
        def_lines.pop(0)
        # for test files
        full_test_list = []
        for def_line in def_lines:

            def_dic={}
            subject_dir = def_line.split('\t')[0]
            image_subject = def_line.split('\t')[2]
            image_name='landmark_aligned_face.{0}.{1}'.format(image_subject,def_line.split('\t')[1])

            image_age = def_line.split('\t')[3]
            # NOTE: '(25 32)' has no comma, so it does not match any entry in
            # age_list; rows remapped here are skipped by the filter below.
            if image_age=='(25 23)':
                image_age='(25 32)'

            image_gender = def_line.split('\t')[4]

            def_dic['subject_dir'] = subject_dir
            def_dic['image_name'] = image_name
            def_dic['image_subject']= image_subject
            def_dic['image_age'] = image_age
            def_dic['image_gender'] = image_gender

            full_test_list.append(def_dic)

        images_num = len(full_test_list)
        # shuffle; sampling from a range (not a set) also works on Python 3.11+
        indices=random.sample(range(images_num), images_num)

        agegender_test_txt_filename=os.path.join(cur_fold_out_foldername, 'agegender_test.txt')
        if os.path.exists(agegender_test_txt_filename):
            os.remove(agegender_test_txt_filename)

        agegender_test_txt_file = open(agegender_test_txt_filename,'w+')
        for ind in indices:
            subject_dir = full_test_list[ind]['subject_dir']
            image_name = full_test_list[ind]['image_name']
            image_age = full_test_list[ind]['image_age']
            image_gender = full_test_list[ind]['image_gender']
            image_subject= full_test_list[ind]['image_subject']

            # keep only rows whose age and gender labels are both known
            if image_age in age_list and image_gender in gender_list:
                image_age_index=age_list.index(image_age)
                image_gender_index=gender_list.index(image_gender)
                s='{0}/{1} {2} {3}\n'.format(subject_dir,image_name,image_age_index,image_gender_index)
                agegender_test_txt_file.write(s)

        agegender_test_txt_file.close()

        # for train, val files
        full_train_list = []
        train_folds_indices=list(set(range(5)) - set([cur_test_fold_ind]))
        for train_fold_ind in train_folds_indices:
            # read raw data
            cur_train_fold_filename='fold_{0}_data.txt'.format(train_fold_ind)
            cur_train_fold_filename=os.path.join(args.rawfoldsdir,cur_train_fold_filename)
            with open(cur_train_fold_filename) as f:
                def_lines = f.readlines()

            # drop the header row
            def_lines.pop(0)
            for def_line in def_lines:

                def_dic={}
                subject_dir =def_line.split('\t')[0]
                image_subject=def_line.split('\t')[2]
                image_name='landmark_aligned_face.{0}.{1}'.format(image_subject,def_line.split('\t')[1])

                image_age=def_line.split('\t')[3]
                # NOTE: '(25 32)' has no comma, so it does not match any entry in
                # age_list; rows remapped here are skipped by the filter below.
                if image_age == '(25 23)':
                    image_age='(25 32)'

                image_gender=def_line.split('\t')[4]

                def_dic['subject_dir'] =subject_dir
                def_dic['image_name'] =image_name
                def_dic['image_subject']=image_subject
                def_dic['image_age'] =image_age
                def_dic['image_gender'] =image_gender

                full_train_list.append(def_dic)

        images_num=len(full_train_list)
        # shuffle; sampling from a range (not a set) also works on Python 3.11+
        indices=random.sample(range(images_num), images_num)

        # integer division (//) keeps the slice bounds ints on Python 3
        # NOTE: the element at position images_num // 10 falls in neither
        # val nor train, because the train slices start one past it.
        val_indices=indices[:images_num//10]
        train_indices=indices[(images_num//10) + 1:]
        train_subset_indices=indices[(images_num//10) + 1: 2* (images_num//10)]

        cases=['val','train','train_subset']
        for case,indices in zip(cases,[val_indices,train_indices,train_subset_indices]):

            agegender_txt_filename=os.path.join(cur_fold_out_foldername,'agegender_{0}.txt'.format(case))
            if os.path.exists(agegender_txt_filename):
                os.remove(agegender_txt_filename)

            agegender_txt_file=open(agegender_txt_filename, 'w+')
            for ind in indices:
                subject_dir=full_train_list[ind]['subject_dir']
                image_name=full_train_list[ind]['image_name']
                image_age=full_train_list[ind]['image_age']
                image_gender=full_train_list[ind]['image_gender']
                image_subject=full_train_list[ind]['image_subject']

                # keep only rows whose age and gender labels are both known
                if image_age in age_list and image_gender in gender_list:
                    image_age_index=age_list.index(image_age)
                    image_gender_index=gender_list.index(image_gender)
                    s='{0}/{1} {2} {3}\n'.format(subject_dir,image_name,image_age_index,image_gender_index)
                    agegender_txt_file.write(s)

            agegender_txt_file.close()


def parse_arguments(argv):
    parser = argparse.ArgumentParser()

    parser.add_argument('--inputdir', type=str, default='./adiencedb/aligned',
                        help='directory of adience dataset')
    parser.add_argument('--rawfoldsdir', type=str, default='./DataPreparation/FiveFolds/original_txt_files',
                        help='directory of raw folds')
    parser.add_argument('--outfilesdir', type=str, default='./DataPreparation/FiveFolds/train_val_test_per_fold_agegender',
                        help='directory in which to store the output files, separate from the raw data')

    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
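
For reference, a minimal sketch (not part of this commit) of the two per-fold steps the script above performs: turning one tab-separated fold row into a label line, and carving the shuffled indices into val / train / train_subset. The helper names `row_to_label_line` and `split_indices` and the sample row are hypothetical; the column order, file-name pattern, and slice bounds are meant to mirror datapreparation.py.

```python
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)',
            '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
GENDER_LIST = ['m', 'f']


def row_to_label_line(row):
    """Hypothetical helper: map one fold row to 'dir/image age_idx gender_idx'.

    Returns None when the age or gender label is not in the known lists,
    which is the same filter the script applies before writing a line.
    """
    # The trailing newline is stripped here only because the made-up sample
    # row below has exactly five columns.
    fields = row.rstrip('\n').split('\t')
    subject_dir, image_file, image_subject, age, gender = fields[:5]
    if age not in AGE_LIST or gender not in GENDER_LIST:
        return None
    image_name = 'landmark_aligned_face.{0}.{1}'.format(image_subject, image_file)
    return '{0}/{1} {2} {3}'.format(
        subject_dir, image_name, AGE_LIST.index(age), GENDER_LIST.index(gender))


def split_indices(indices):
    """Hypothetical helper: apply the same slice bounds as the script."""
    k = len(indices) // 10
    val = indices[:k]
    train = indices[k + 1:]              # indices[k] lands in no split
    train_subset = indices[k + 1:2 * k]  # a short slice of the train part
    return val, train, train_subset


# Made-up row for illustration only (not taken from the fold files).
sample = 'user_a\timg_001.jpg\t7\t(25, 32)\tf\n'
print(row_to_label_line(sample))

val, train, subset = split_indices(list(range(30)))
print(len(val), len(train), len(subset))
```

With 30 indices the split sizes come out as 3, 26 and 2, which makes the skipped element and the short train_subset slice easy to see.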