For Q1 (how to get the mask region as input): maybe it just takes the frame-wise difference with the GT and binarizes the result to get the mask?
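
To make the guess concrete, here is a rough sketch of what frame-wise differencing plus binarization could look like. This is only my assumption about the procedure, not something the authors have confirmed. The function name `mask_from_difference`, the `(T, H, W, C)` array layout, pixel values in [0, 1], and the `threshold` value are all hypothetical choices for illustration.

```python
import numpy as np

def mask_from_difference(frames, gt_frames, threshold=0.05):
    """Hypothetical helper: binary mask of pixels where a frame differs from its GT.

    Assumes both inputs have shape (T, H, W, C) with values scaled to [0, 1].
    """
    frames = np.asarray(frames, dtype=np.float32)
    gt_frames = np.asarray(gt_frames, dtype=np.float32)
    # Per-pixel absolute difference, computed independently for each frame.
    diff = np.abs(frames - gt_frames)
    # Collapse the channel axis so a change in any channel marks the pixel.
    diff = diff.max(axis=-1)
    # Binarize: 1 where the difference exceeds the (assumed) threshold, else 0.
    return (diff > threshold).astype(np.uint8)
```

The result has shape `(T, H, W)` with one 0/1 mask per frame. In practice the threshold would need tuning, and a real pipeline might also dilate or smooth the mask, but that goes beyond what the question is asking.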