Prior: P(L)
Features:
Parents' occupation (O): {usual, pretentious, great_pret}
Child's nursery (N): {proper, less_proper, improper, critical, very_crit}
Family form (F): {complete, completed, incomplete, foster}
Number of children (C): {1, 2, 3, more}
Housing (H): {convenient, less_conv, critical}
Finance (I): {convenient, inconv}
Social (S): {non-prob, slightly_prob, problematic}
Health (A): {recommended, priority, not_recom}
Make 3 functions:
getPriorCount
getFeatureCPT
getPredictions
getPriorCount() * Count the labels *
- Iterate through training file
- Count occurrences of each label (recommend, not_recom)
- Calculate prior probabilities: (label count) / (total sample count)
- Return the counts as priorCountsList
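A minimal sketch of the getPriorCount steps above. The input format is an assumption: each row is a list whose last element is the label, and the toy rows are made up for illustration. The notes name the return value priorCountsList; here it is a Counter, and the priors are returned alongside it for convenience.

```python
from collections import Counter

def getPriorCount(rows):
    """Count each label and derive the prior P(L) = label count / total.

    Assumption: row[-1] holds the label of each training row.
    """
    priorCountsList = Counter(row[-1] for row in rows)
    total = sum(priorCountsList.values())
    priors = {label: count / total for label, count in priorCountsList.items()}
    return priorCountsList, priors

# Toy rows (hypothetical): two feature values followed by the label.
rows = [
    ["usual", "proper", "recommend"],
    ["usual", "critical", "not_recom"],
    ["pretentious", "proper", "recommend"],
    ["great_pret", "critical", "not_recom"],
]
counts, priors = getPriorCount(rows)
print(counts["recommend"], priors["recommend"])
```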
getFeatureCPT() * Create CPT *
- For each feature
- Iterate through training data
- For each label, count the occurrences of each feature value given the label
- Calculate conditional probabilities as that count divided by the total count of the label
- Store conditional probabilities in feature_cpt
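One possible shape for getFeatureCPT, sketched under assumptions: rows are lists with the label last, the n_features parameter is hypothetical, and feature_cpt is a list indexed by feature position that maps label to value to probability. No smoothing is applied, so a value never seen with a label simply has no entry.

```python
from collections import Counter, defaultdict

def getFeatureCPT(rows, n_features):
    """Build P(feature value | label) for every feature column.

    Assumption: row[i] is the value of feature i and row[-1] is the label.
    Returns feature_cpt[i][label][value] = conditional probability.
    """
    label_counts = Counter(row[-1] for row in rows)
    feature_cpt = []
    for i in range(n_features):
        # value_counts[label][value] = rows with this label and this value
        value_counts = defaultdict(Counter)
        for row in rows:
            value_counts[row[-1]][row[i]] += 1
        feature_cpt.append({
            label: {value: n / label_counts[label] for value, n in seen.items()}
            for label, seen in value_counts.items()
        })
    return feature_cpt

# Toy rows (hypothetical): two feature values followed by the label.
rows = [
    ["usual", "proper", "recommend"],
    ["usual", "critical", "not_recom"],
    ["pretentious", "proper", "recommend"],
    ["great_pret", "critical", "not_recom"],
]
feature_cpt = getFeatureCPT(rows, 2)
print(feature_cpt[0]["recommend"])
```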
getPredictions() * Make predictions *
- Iterate through val_data
- For each sample
- Calculate the probability of each label using the Formulation
- Predict the label with maximum probability
- Add predicted label to predictions list
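The prediction loop might look like the sketch below. It assumes each validation sample holds only feature values (no label column), that priors maps label to P(L), and that feature_cpt is a list of label-to-value-to-probability tables; the hand-written tables are illustrative, not real training output. A feature value never seen with a label is scored as 0.0, which is a simplification with no smoothing.

```python
def getPredictions(val_data, priors, feature_cpt):
    """Pick, for each sample, the label with the largest product
    P(L) * P(feature_0 | L) * P(feature_1 | L) * ...
    """
    predictions = []
    for sample in val_data:
        best_label, best_score = None, -1.0
        for label, prior in priors.items():
            score = prior
            for i, value in enumerate(sample):
                # Unseen (label, value) pairs contribute probability 0.0.
                score *= feature_cpt[i].get(label, {}).get(value, 0.0)
            if score > best_score:
                best_label, best_score = label, score
        predictions.append(best_label)
    return predictions

# Hypothetical tables for two features and two labels.
priors = {"recommend": 0.5, "not_recom": 0.5}
feature_cpt = [
    {"recommend": {"usual": 0.5, "pretentious": 0.5},
     "not_recom": {"usual": 0.5, "great_pret": 0.5}},
    {"recommend": {"proper": 1.0},
     "not_recom": {"critical": 1.0}},
]
val_data = [["usual", "proper"], ["usual", "critical"]]
predictions = getPredictions(val_data, priors, feature_cpt)
print(predictions)
```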
Formulation:
argmax_L P(L|O,N,F,C,H,I,S,A) =
argmax_L P(O|L) P(N|L) P(F|L) P(C|L) P(H|L) P(I|L) P(S|L) P(A|L) P(L)
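For reference, the step from the posterior to the product can be spelled out as follows. This is the standard naive Bayes argument, written with an added symbol \hat{L} for the predicted label: Bayes' rule is applied, the denominator is dropped because it does not depend on L, and the features are assumed conditionally independent given L. The posterior and the product are therefore proportional rather than equal, but they are maximized by the same label.

```latex
\begin{align*}
\hat{L} &= \arg\max_{L} P(L \mid O,N,F,C,H,I,S,A) \\
        &= \arg\max_{L} \frac{P(O,N,F,C,H,I,S,A \mid L)\,P(L)}{P(O,N,F,C,H,I,S,A)} \\
        &= \arg\max_{L} P(O,N,F,C,H,I,S,A \mid L)\,P(L) \\
        &= \arg\max_{L} P(O \mid L)\,P(N \mid L)\,P(F \mid L)\,P(C \mid L)\,
           P(H \mid L)\,P(I \mid L)\,P(S \mid L)\,P(A \mid L)\,P(L)
\end{align*}
```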