Update fixres

patrick-llgc · patrick-llgc · commit 2f8fbf637b24 · 2020-02-28T06:56:57.000-08:00
diff --git a/paper_notes/fixres.md b/paper_notes/fixres.md
@@ -5,19 +5,18 @@ _February 2020_
 tl;dr: Conventional imageNet classification has a train/test resolution discrepancy (domain shift).
 
 #### Overall impression
-Scale invariance/equivariance is not guaranteed in CNN (only shift invariance). The same model with different test time input will yield very different statistics. 
+Scale invariance/equivariance is not guaranteed in CNN (only shift invariance). The same model with different test time input will yield very different statistics. The distribution of activations changes at test time, so the values fall outside the range that the final cls layers were trained for.
 
 In ImageNet training, conventional way is to use 10-time crop (center, four corners, and their mirrors) and test time is always central crop. This leads to a discrepancy of the statistics in training/test.
 
-A similar work is MultiGrain, where the p-pooling is adjusted to match the train/test-time stats.
+Simple solution: **finetune last layer** with test time scale and resolution, as the final stage of training.
 
 
 #### Key ideas
-- The distribution of activation changes at test time, the values are not in the range that the final cls layers were trained for.
-- Simple solution: **finetune last layer** with test time scale and resolution.
-- Larger test crops yields better results.
 
 #### Technical details
+- Larger test crops yield better results.
+- A similar work is MultiGrain, where the p-pooling is adjusted to match the train/test-time stats.
 - GeM (generalized mean pooling) p-pooling: a generalization of average pooling and max pooling
 	- cf LSE pooling in [From Image-level to Pixel-level Labeling with Convolutional Networks](https://arxiv.org/abs/1411.6228) CVPR 2015
 	- Image/instance retrieval requires adjusting p-pooling for better accuracy