This is a fork of the BCQA (Benchmarking Complex QA) repository.
The scripts for making the OpenAI API calls and for data processing and analysis are in the data_analysis folder.
BCQA is a benchmark for a wide range of complex QA tasks. It also aims to provide an easy-to-use framework for evaluating retrieval and reasoning approaches for answering complex multi-hop questions.
- Create a conda environment, activate it, and install the package from the repository root:
conda create -n bcqa python=3.10
conda activate bcqa
pip install -e .
- To use a GPU, install the CUDA 11.8 build of PyTorch:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
- The data paths are absolute paths from the author's machine, so you need to change them to match your own setup.
The evaluation scripts for retrieval and LLMs are in the evaluation folder.
For instance, to run DPR retrieval for WikiMultihopQA, run:
python3 evaluation/wikimultihop/run_dpr_inference.py
Before running the above script, make sure you have configured the correct paths for the data and corpus files in evaluation/config.ini.
Example:
wikimultihopqa = /home/bcqa/BCQA/2wikimultihopQA
wikimultihopqa-corpus = /home/bcqa/BCQA/wiki_musique_corpus.json
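A wrong path in evaluation/config.ini only shows up once a script runs, so it can help to check the file first. The following is a minimal sketch and not part of the repository: the function name find_missing_paths and the [paths] section header are assumptions made for illustration, so adjust them to whatever the real config file uses.

```python
import configparser
from pathlib import Path


def find_missing_paths(config_text: str) -> list[str]:
    """Return the config keys whose values point to paths that do not exist.

    Args:
        config_text (str): Contents of an INI file such as evaluation/config.ini.

    Returns:
        list[str]: Keys in "section.key" form whose path is missing.
    """
    # Disable interpolation so a literal "%" in a path is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    missing: list[str] = []
    for section in parser.sections():
        for key, value in parser.items(section):
            if not Path(value).exists():
                missing.append(f"{section}.{key}")
    return missing


# The [paths] header is hypothetical; the two entries are the example above.
sample = """
[paths]
wikimultihopqa = /home/bcqa/BCQA/2wikimultihopQA
wikimultihopqa-corpus = /home/bcqa/BCQA/wiki_musique_corpus.json
"""
print(find_missing_paths(sample))
```

To check the real file, pass Path("evaluation/config.ini").read_text() instead of the sample string. An empty list means every configured path exists on your machine.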
- Install black (pip install black or conda install black).
- In your IDE: Enable formatting on save.
- Install isort (pip install isort or conda install isort).
- In your IDE: Enable import sorting on save.
In VS Code, you can do this using the following config:
{
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.organizeImports": true
  }
}
Use type hints for everything! No exceptions.
Write a docstring for every function (except the main function). We use the Google format. In VS Code, you can use autoDocstring.
def sum(a: float, b: float) -> float:
    """Compute the sum of a and b.

    Args:
        a (float): First number.
        b (float): Second number.

    Returns:
        float: The sum of a and b.
    """
    return a + b
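The two rules above (type hints everywhere, a docstring on every function) can also be checked programmatically. The following is a minimal sketch using the standard inspect module. The helper is_fully_documented is hypothetical and not part of the repository, and it only checks that a docstring exists, not that it follows the Google format.

```python
import inspect
from typing import Callable


def is_fully_documented(func: Callable[..., object]) -> bool:
    """Check that a function has a docstring and complete type hints.

    Args:
        func (Callable[..., object]): Function to inspect.

    Returns:
        bool: True if every parameter and the return value are annotated
            and the function has a docstring.
    """
    signature = inspect.signature(func)
    params_annotated = all(
        param.annotation is not inspect.Parameter.empty
        for param in signature.parameters.values()
    )
    return_annotated = signature.return_annotation is not inspect.Signature.empty
    return params_annotated and return_annotated and bool(inspect.getdoc(func))


def add(a: float, b: float) -> float:
    """Compute the sum of a and b."""
    return a + b


def untyped(a, b):
    return a + b


print(is_fully_documented(add))      # True
print(is_fully_documented(untyped))  # False
```

A static type checker or linter will catch more cases than this runtime check, which is only meant to illustrate what the guideline asks for.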