Hi, thank you for the nice repository! I have a question regarding the training of the VisualGenome models. The following is written in the paper:
"The object detection model is a ResNet50-FPN detector trained on Visual Genome"
I have two questions:
- Where can I find the pretrained weights of this model? Or are we expected to train it ourselves?
- Why is the ResNet backbone different for VisualGenome and OpenImages?
Thanks for the help!