Grounding by Separation (GBS): visual image grounding by image separation. This repository is the implementation of the paper "Detector-Free Weakly Supervised Grounding by Separation" (https://arxiv.org/abs/2104.09829).
- Linux machine
- At least one NVIDIA GPU
- At least CUDA 10.2
- Anaconda (Installation instructions: https://docs.anaconda.com/anaconda/install/)
Clone the repository:
git clone TBD
Enter the directory:
cd GroundingBySeparation
Create and activate the conda environment:
conda deactivate # deactivate any active environments
conda create -n gbs python=3.6 # create the conda environment
conda activate gbs # activate the environment
conda install -c conda-forge libjpeg-turbo
conda install pytorch==1.5.1 torchvision==0.6.1 cudatoolkit=10.2 -c pytorch
pip install -r requirements.txt # install pip requirements
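As a quick optional check (our suggestion, not part of the original setup steps), you can verify that PyTorch was installed with working CUDA support:
python -c "import torch; print(torch.__version__, torch.cuda.is_available())" # expects 1.5.1 and True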
We follow the data preparation of Akbari et al. 2019: https://github.com/hassanhub/MultiGrounding
Please prepare the data in the same format as described in the above repository.
After data preparation, either place the data in <GBS_ROOT_DIR>/data or set the path in the environment variable LMDB_DATA_ROOT.
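For example (a minimal sketch; /path/to/prepared_lmdb_data is a placeholder for wherever you stored the prepared data):
ln -s /path/to/prepared_lmdb_data <GBS_ROOT_DIR>/data # option 1: link the prepared data into the repository
export LMDB_DATA_ROOT=/path/to/prepared_lmdb_data # option 2: point the environment variable at it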
To train a network run the following command from the project root directory:
python3 train.py --experiment_name my_first_run
The parameter --experiment_name sets the sub-directory in which all model checkpoints and TensorBoard logs are saved.
This name needs to be unique for each experiment. If the same name is reused, training will automatically continue from the last saved checkpoint (unless defined otherwise in the parameters).
A list of all the training parameters and their descriptions can be seen with:
python3 train.py --help
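For instance, a typical workflow might look like the following (assuming checkpoints and logs are written under ./Outputs/<experiment_name>, as in the evaluation example below; the tensorboard call is optional):
python3 train.py --experiment_name my_first_run # first run: trains from scratch
python3 train.py --experiment_name my_first_run # same name again: resumes from the last saved checkpoint
tensorboard --logdir ./Outputs/my_first_run # monitor training via the TensorBoard logs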
To evaluate the test results for the experiment my_first_run, run the following command:
python3 evaluate_benchmark.py --experiment_name evaluate_my_first_run --training_experiment_path ./Outputs/my_first_run
A list of all the evaluation parameters and their descriptions can be seen with:
python3 evaluate_benchmark.py --help
Our code fully supports Distributed Data Parallel (DDP) and can work on multiple GPUs and multiple nodes.
To run the code with more than one GPU (or node), add the following prefix to the train.py or evaluate_benchmark.py calls:
python3 -m torch.distributed.launch --nproc_per_node=<NUM_GPUS_PER_NODE> --nnodes=<TOTAL_NUMBER_OF_NODES> --node_rank=<CURRENT_NODE_RANK> --master_addr=<HOSTNAME_OF_MASTER_NODE> --master_port=<SOME_PORT> train.py --args
For example, to run on a single node with 8 GPUs, use the following:
python3 -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=54321 train.py --args
To run on two nodes (with hostnames node1 and node2) with four GPUs each, run the following:
On node1:
python3 -m torch.distributed.launch --nproc_per_node=4 --nnodes=2 --node_rank=0 --master_addr=node1 --master_port=54321 train.py --args
On node2:
python3 -m torch.distributed.launch --nproc_per_node=4 --nnodes=2 --node_rank=1 --master_addr=node1 --master_port=54321 train.py --args
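The same prefix applies to evaluation; for example, a distributed evaluation on a single node with 4 GPUs would look like this (a sketch combining the launch prefix with the evaluation command above; adjust the GPU count and port to your setup):
python3 -m torch.distributed.launch --nproc_per_node=4 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=54321 evaluate_benchmark.py --experiment_name evaluate_my_first_run --training_experiment_path ./Outputs/my_first_run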
When using this code, please cite:
@inproceedings{arbelle2021GBS,
title={Detector-Free Weakly Supervised Grounding by Separation},
author={Arbelle, Assaf and Doveh, Sivan and Alfassy, Amit and Shtok, Joseph and Lev, Guy and Schwartz, Eli and Kuehne, Hilde and Levi, Hila Barak and Sattigeri, Prasanna and Panda, Rameswar and others},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={1801--1812},
year={2021}
}