This repository contains the code for our paper *Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval*, accepted at EMNLP 2022. In this work, we evaluate performance on business FAQ retrieval both with standard FAQ retrieval techniques based on query-Question (q-Q) similarity and with few-shot intent detection techniques. We propose a novel approach to scaling multi-tenant FAQ applications in real-world settings: contrastive fine-tuning of the last layer of sentence bi-encoders combined with tenant-specific weight switching.
## Setup

- Create a virtual environment and install the requirements:

  ```bash
  pip install -r requirements.txt
  ```

- Download the data from here -> FAQ Data

- Extract it under `src/data`:

  ```
  data
  ├── dialoglue
  │   ├── banking
  │   ├── clinc
  │   ..
  └── hint3
      └── v1
          ├── test
          └── train
  ```

- Download other relevant models as needed (FastText/GloVe etc.)
## Usage

1. Change the parameters in `config.yaml` under the `src/config` folder and run the scripts. Check the config file parameters below for more details.

2. Evaluate base embeddings/models:

   ```bash
   python -m src.evaluate
   ```

   For example, to evaluate the DialoGLUE BANKING77 dataset with default BERT embeddings, these are the configuration changes required in `config.yaml`:

   ```yaml
   DATASETS:
     DATASET_SOURCE: "dialoglue"
     DATASET_NAME: "banking"
     OOS_CLASS_NAME: "NO_NODES_DETECTED"
     DATA_SUBSET: "train_5"
     N_LABELS: 77

   EVALUATION:
     EVALUATION_METHOD: "BERT_EMBEDDINGS"
     MODEL_NAME: "bert-base-uncased"
     TOKENIZER_NAME: "bert-base-uncased"
   ```

   Then run:

   ```bash
   python -m src.evaluate
   ```
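   Under the hood, `BERT_EMBEDDINGS` evaluation amounts to embedding the FAQ questions and each incoming query, then ranking by cosine similarity. A minimal sketch of that q-Q scoring, assuming mean pooling over BERT's last hidden state (an illustration, not the repo's exact `src.evaluate` code):

   ```python
   # Minimal q-Q similarity sketch (illustrative; not the repo's exact src.evaluate code)
   import torch
   from transformers import AutoModel, AutoTokenizer

   tok = AutoTokenizer.from_pretrained("bert-base-uncased")
   enc = AutoModel.from_pretrained("bert-base-uncased")

   def embed(texts):
       # Mean-pool the last hidden state over non-padding tokens
       batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
       with torch.no_grad():
           hidden = enc(**batch).last_hidden_state
       mask = batch["attention_mask"].unsqueeze(-1)
       return (hidden * mask).sum(1) / mask.sum(1)

   faq = ["How do I reset my card PIN?", "Why was my transfer declined?"]
   Q = torch.nn.functional.normalize(embed(faq), dim=-1)
   q = torch.nn.functional.normalize(embed(["my card pin is not working"]), dim=-1)
   scores, idx = (q @ Q.T).topk(k=2)  # cosine similarity, top-k FAQ questions
   print([faq[int(i)] for i in idx[0]], scores[0])
   ```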
3. Fine-tune the bi-encoders/cross-encoders with question pairs/triplets:

   - Create question pair training data if needed:

     ```bash
     python -m src.utils.question_pairs
     ```

   - Train a bi-encoder / cross-encoder with the question pairs:

     ```bash
     python -m src.train
     ```

   - Change the model to the trained model folder and then run:

     ```bash
     python -m src.evaluate
     ```

   For example, to fine-tune the Sentence BERT model on the DialoGLUE BANKING77 dataset, these are the configuration changes required in `config.yaml`:

   ```yaml
   DATASETS:
     DATASET_SOURCE: "dialoglue"
     DATASET_NAME: "banking"
     OOS_CLASS_NAME: "NO_NODES_DETECTED"
     DATA_SUBSET: "train_5"
     N_LABELS: 77

   TRAINING:
     MODEL_TYPE: "BI_ENCODER"
     MODEL_NAME: "sentence-transformers/all-mpnet-base-v2"
     TOKENIZER_NAME: "sentence-transformers/all-mpnet-base-v2"
     LAYERS_TO_UNFREEZE: [11]
     NUM_ITERATIONS: 10000
     SCHEDULER: "WarmupLinear"
     VALIDATION_SPLIT: 0.2
   ```
   - Then run:

     ```bash
     python -m src.train
     ```

   - Update the evaluation parameters in `config.yaml`:

     ```yaml
     EVALUATION:
       EVALUATION_METHOD: "BERT_EMBEDDINGS"
       MODEL_NAME: "<model_folder>"
       TOKENIZER_NAME: "sentence-transformers/all-mpnet-base-v2"
     ```

   - Then run:

     ```bash
     python -m src.evaluate
     ```
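   The `LAYERS_TO_UNFREEZE: [11]` setting corresponds to the paper's contrastive fine-tuning of only the last encoder layer. A minimal sketch of how that looks with `sentence-transformers` (toy pair data; `src.train` drives this from `config.yaml`):

   ```python
   # Sketch: contrastive fine-tuning of only the last encoder layer (illustrative)
   from sentence_transformers import InputExample, SentenceTransformer, losses
   from torch.utils.data import DataLoader

   model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

   # Freeze everything, then unfreeze encoder layer 11 only (LAYERS_TO_UNFREEZE: [11])
   for p in model.parameters():
       p.requires_grad = False
   for p in model._first_module().auto_model.encoder.layer[11].parameters():
       p.requires_grad = True

   # Question pairs: label 1 = same intent, 0 = different intent (toy examples)
   pairs = [
       InputExample(texts=["reset my pin", "change card pin"], label=1),
       InputExample(texts=["reset my pin", "close my account"], label=0),
   ]
   loader = DataLoader(pairs, shuffle=True, batch_size=2)
   model.fit(
       train_objectives=[(loader, losses.ContrastiveLoss(model))],
       epochs=1,
       scheduler="WarmupLinear",
   )
   ```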
4. Pretraining, followed by fine-tuning:

   - Under the `data` folder, create a folder called `pretrain`:

     ```bash
     mkdir pretrain
     ```

   - Set the required parameters in the `config.yaml` file.

   - Generate offline triplets for pre-training:

     ```bash
     python -m src.utils.gen_pretraining_data
     ```

   - Pretrain the bi-encoder with the offline triplets:

     ```bash
     python -m src.pretrain
     ```

   - Fine-tune the pre-trained model: follow the same steps as listed in 3.
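   For reference, pretraining on the generated triplets is plain `sentence-transformers` triplet training. A minimal sketch (toy triplet; `src.pretrain` reads the offline files and the `PRETRAINING` config instead):

   ```python
   # Sketch: pretraining a bi-encoder on offline (anchor, positive, negative) triplets
   from sentence_transformers import InputExample, SentenceTransformer, losses
   from torch.utils.data import DataLoader

   model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
   triplets = [
       # anchor, positive (same intent), negative (different intent) -- toy example
       InputExample(texts=["track my order", "where is my parcel", "cancel my order"]),
   ]
   loader = DataLoader(triplets, shuffle=True, batch_size=1)
   model.fit(
       train_objectives=[(loader, losses.TripletLoss(model))],
       epochs=1,
       output_path="models/pretrained_bi_encoder",  # assumed output location
   )
   ```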
5. Run inference with trained models with client weight switching:

   - Train the model for 2 datasets (tenants) separately and store the last-layer weights:

     ```bash
     python -m src.train
     ```

   - The tenant-specific weights will be stored under `MODEL_DIR/clients` as specified under `INFERENCE` in `config.yaml`.

   - Change the parameters in `config.yaml` under `INFERENCE`.

   - Specify the tenant names under `CLIENT_NAMES`, along with `INFERENCE_FILE_PATH`, `MODEL_NAME` and `LAYERS_TO_LOAD` (the last layer of the model), and then run:

     ```bash
     python -m src.predict
     ```

   - A sample `inference.txt` is present under the data folder. It contains mixed utterances from 2 sample tenants.
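   Conceptually, weight switching keeps one shared base model in memory and swaps in only each tenant's fine-tuned last-layer weights before encoding. A minimal sketch of the idea (the file layout and helper are assumptions, not the exact `src.predict` implementation):

   ```python
   # Sketch of tenant weight switching: one shared base model, per-tenant last-layer weights
   import torch
   from sentence_transformers import SentenceTransformer

   base = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
   LAYER = 5  # must match LAYERS_TO_LOAD / how the tenant models were trained

   def switch_tenant(tenant: str):
       # Hypothetical per-tenant file; the repo stores weights under MODEL_DIR/clients
       state = torch.load(f"models/clients/{tenant}/layer_{LAYER}.pt")
       base._first_module().auto_model.encoder.layer[LAYER].load_state_dict(state)

   for tenant, utterance in [("curekart", "track my order"), ("powerplay11", "contest entry fee")]:
       switch_tenant(tenant)
       emb = base.encode(utterance)  # then match against the tenant's stored train embeddings
   ```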
## Config file parameters

### DATASETS

| Parameter | Usage |
|---|---|
| DATASET_SOURCE | Specify dataset for running evaluation / training |
| DATASET_NAME | Specify the dataset name of the corresponding DATASET_SOURCE that should be used for evaluation/training |
| OOS_CLASS_NAME | Set "NO_NODES_DETECTED" for HINT3 datasets / "oos" for CLINC150 |
| DATA_SUBSET | Training dataset to use. "train" / "subset_train" for HINT3 datasets, "train_5"/"train_10" for Dialoglue |
| N_LABELS | Number of labels present in the training data. Used by the data loaders |
### TRAINING

| Parameter | Usage |
|---|---|
| SUB_SAMPLE_QQ | Set True if subsampling is needed for Question Pair generation else False |
| SAMPLE_SIZE_PER_DATASET | Required sample size if sub-sampling |
| DATA_VAL_SPLIT | Used to create a validation dataset after question pair generation eg. 0.2. Only required if using Triplets for validation |
| GENERATE_TRIPLETS | Set True if triplet generation is required for finetuning |
| HARD_SAMPLE | Set to True if needed. Used with Sub-sampling |
| MODEL_TYPE | Set one of these "BI_ENCODER"/"BERT_CLASSIFIER"/"SBERT_CROSS_ENCODER" |
| NUM_ITERATIONS | Number of training iterations |
| TRAIN_OUTPUT_DIR | Output directory for storing the models. Default is "./models/" |
| MODEL_NAME | Specify the model like "sentence-transformers/all-MiniLM-L6-v2"/"bert-base-uncased" /"cross-encoder/stsb-distilroberta-base" |
| TOKENIZER_NAME | Specify the corresponding tokenizer name eg. "sentence-transformers/all-MiniLM-L6-v2" |
| LAYERS_TO_UNFREEZE | Specify the model layers to unfreeze, e.g. [5] or [11] |
| LOSS_METRIC | Specify a loss metric for Sentence Bert Bi-Encoder models - "ContrastiveLoss" / "BatchHardTripletLoss" |
| LEARNING_RATE | Default is 2e-5 |
| SCHEDULER | Specify scheduler - "WarmupLinear" for SentenceBert / "linear" for Bert |
| VALIDATION_SPLIT | Validation split for evaluating models during training if a separate validation dataset doesn't exist |
### EVALUATION

| Parameter | Usage |
|---|---|
| EVALUATION_METHOD | "BERT_EMBEDDINGS" / "BERT_CLASSIFIER" / "SBERT_CROSS_ENCODER" / "BM25" / "GLOVE" / "FASTTEXT" / "TFIDF_WORD_EMBEDDINGS" / "TFIDF_CHAR_EMBEDDINGS" / "CV_EMBEDDINGS" |
| MODEL_NAME | Specify a model name which is implemented via Huggingface - "bert-base-uncased" / "models/convbert" (DialoGLUE convbert model) / "sentence-transformers/all-MiniLM-L6-v2" |
| TOKENIZER_NAME | Corresponding tokenizer for the model |
| BATCH_SIZE | Batch size for data loaders for evaluation |
| FASTTEXT_MODEL_PATH | Location of the fasttext model "models/fasttext_ecom_model_2.bin" |
| GLOVE_MODEL_PATH | Location of glove model "models/glove.6B/glove.6B.300d.txt" |
| CHECK_SUCCESS_RATE | Set True / False for checking during evaluation |
| CHECK_PRECISION | Set True / False for checking during evaluation |
| CHECK_MAP | Set True / False for checking during evaluation |
| CHECK_NDCG | Set True / False for checking during evaluation |
| CHECK_MRR | Set True / False for checking during evaluation |
| CHECK_F1_MACRO | Set True / False for checking during evaluation |
| CHECK_F1_MICRO | Set True / False for checking during evaluation |
| CHECK_F1_WEIGHTED | Set True / False for checking during evaluation |
| CHECK_OOS_ACCURACY | Set True / False for checking during evaluation |
| OOS_THRESHOLD | Thresholds at which OOS accuracy should be checked, e.g. [0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9] |
| K_VAL | List of k values for top-k evaluation of all metrics; use values less than 10, e.g. [1,2,3,5] |
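As a reference for what the `CHECK_*` flags measure, here is a small sketch of success rate and MRR at k over ranked predictions (my own minimal implementation, not the repo's evaluator):

```python
# Sketch: success-rate@k and MRR@k over ranked label lists (illustrative)
def success_rate_at_k(ranked_labels, gold, k):
    hits = [g in ranked[:k] for ranked, g in zip(ranked_labels, gold)]
    return sum(hits) / len(hits)

def mrr_at_k(ranked_labels, gold, k):
    recip = []
    for ranked, g in zip(ranked_labels, gold):
        rank = next((r + 1 for r, lab in enumerate(ranked[:k]) if lab == g), None)
        recip.append(1.0 / rank if rank else 0.0)
    return sum(recip) / len(recip)

ranked = [["pin_reset", "card_block"], ["card_block", "pin_reset"]]
print(success_rate_at_k(ranked, ["pin_reset", "pin_reset"], k=1))  # 0.5
print(mrr_at_k(ranked, ["pin_reset", "pin_reset"], k=2))           # (1 + 1/2) / 2 = 0.75
```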
### PRETRAINING

| Parameter | Usage |
|---|---|
| SAMPLE_SIZE_PER_DATASET | Sample size to be taken from each dataset (Curekart, SOFMattress, PowerPlay11, BANKING77, CLINC150 and HWU64) |
| VAL_SPLIT | Validation split for pretraining. Eg. 0.1 |
| GENERATE_TRIPLETS | Set True/False depending on pretraining requirements |
| GENERATE_PAIRS | Set True/False depending on pretraining requirements |
| HARD_SAMPLE | Set True if hard sampling is needed |
| PRETRAIN_DATA_PATH | Path where intermediate data for pretraining should be saved |
| STEPS_PER_EPOCH | Number of steps per epoch |
| NUM_TRAIN_EPOCHS | Number of train epochs |
| TRAIN_OUTPUT_DIR | Output directory for the trained model and checkpoints |
| MODEL_NAME | "sentence-transformers/all-MiniLM-L6-v2" / "sentence-transformers/all-mpnet-base-v2" |
| LOSS_METRIC | Loss metric required: "ContrastiveLoss" / "TripletLoss" |
| BATCH_SIZE | Batch size for pretraining. Default is 32 |
| LEARNING_RATE | Default is 2e-5 |
| SCHEDULER | 'WarmupLinear' |
### INFERENCE

The parameters here are used to test the final inference with tenant weight switching.

| Parameter | Usage |
|---|---|
| TENANT_NAMES | Set of tenant names for which the inference engine is set up, e.g. ["curekart", "powerplay11"] |
| INFERENCE_FILE_PATH | Path to the file containing the inference data, e.g. "data/inference.txt" |
| MODEL_DIR | Path to the model directory where tenant weights should be stored. Default is "models" |
| MODEL_NAME | Base model to be loaded, e.g. "sentence-transformers/all-MiniLM-L6-v2" |
| EMBEDDING_FILE_NAME | File name of the train-set embeddings, stored in numpy format, e.g. "embeddings.npy" |
| TEXTS_FILE_NAME | File name of the train-set texts, stored in numpy format, e.g. "texts.npy" |
| LABELS_FILE_NAME | File name of the train-set labels, stored in numpy format, e.g. "labels.npy" |
| LAYERS_TO_LOAD | The model layer to swap for each tenant. This should match how the model was trained, e.g. [5] |
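Putting the table together, an illustrative `INFERENCE` block could look like this (values are examples drawn from the table above, not verified defaults):

```yaml
INFERENCE:
  TENANT_NAMES: ["curekart", "powerplay11"]
  INFERENCE_FILE_PATH: "data/inference.txt"
  MODEL_DIR: "models"
  MODEL_NAME: "sentence-transformers/all-MiniLM-L6-v2"
  EMBEDDING_FILE_NAME: "embeddings.npy"
  TEXTS_FILE_NAME: "texts.npy"
  LABELS_FILE_NAME: "labels.npy"
  LAYERS_TO_LOAD: [5]
```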
## Evaluation

- For evaluation with `BM25` / `GLOVE` / `FASTTEXT` / `TFIDF_WORD_EMBEDDINGS` / `TFIDF_CHAR_EMBEDDINGS` / `CV_EMBEDDINGS`, set `EVALUATION_METHOD` to the corresponding value. For GloVe, FastText, etc., the model path (`GLOVE_MODEL_PATH`, `FASTTEXT_MODEL_PATH`) should be specified correctly.
- For evaluation with BERT / Sentence BERT embeddings, set `EVALUATION_METHOD` to `"BERT_EMBEDDINGS"`. Set `MODEL_NAME` and `TOKENIZER_NAME` to any BERT model that can be loaded with Huggingface, e.g. `"bert-base-uncased"` / `"sentence-transformers/all-MiniLM-L6-v2"`.
- For evaluation using BERT in a classifier approach, set `EVALUATION_METHOD` to `"BERT_CLASSIFIER"`. Set `MODEL_NAME` & `TOKENIZER_NAME` to any of the Huggingface BERT classifier models.
- For evaluation using SBERT cross-encoders, set `EVALUATION_METHOD` to `"SBERT_CROSS_ENCODER"`. Set `MODEL_NAME` & `TOKENIZER_NAME` to an SBERT cross-encoder model like `"cross-encoder/stsb-distilroberta-base"`.
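For the lexical baselines the ranking itself is straightforward; for example, a BM25 q-Q sketch using the `rank_bm25` package (the package choice is an assumption for illustration; `src.evaluate` wires the configured method to the datasets):

```python
# Sketch: BM25 query-Question retrieval (illustrative)
from rank_bm25 import BM25Okapi

faq = ["how do i reset my card pin", "why was my transfer declined"]
bm25 = BM25Okapi([q.split() for q in faq])
scores = bm25.get_scores("my card pin is not working".split())
best = max(range(len(scores)), key=scores.__getitem__)
print(faq[best], scores[best])
```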
## Fine-tuning

- For fine-tuning bi-encoder SBERT models, under `TRAINING`, set `MODEL_TYPE` as `"BI_ENCODER"` and set `MODEL_NAME` & `TOKENIZER_NAME` to a Sentence BERT model like `"sentence-transformers/all-MiniLM-L6-v2"`. Set `LAYERS_TO_UNFREEZE`, e.g. `[5]`, which will depend on the model chosen.
- For fine-tuning cross-encoder SBERT models, under `TRAINING`, set `MODEL_TYPE` as `"SBERT_CROSS_ENCODER"` and set `MODEL_NAME` & `TOKENIZER_NAME` to a Sentence BERT cross-encoder model like `"cross-encoder/stsb-distilroberta-base"`. Set `LAYERS_TO_UNFREEZE` to the last layer, which will depend on the model chosen.
- For fine-tuning BERT-based models as a classifier, under `TRAINING`, set `MODEL_TYPE` as `"BERT_CLASSIFIER"` and set `MODEL_NAME` & `TOKENIZER_NAME` to any of the Huggingface BERT classifier models. Set `LAYERS_TO_UNFREEZE` to the last layer, which will depend on the model chosen.
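A minimal sketch of the cross-encoder path (toy pairs; `src.train` handles the real question-pair data):

```python
# Sketch: fine-tuning and scoring an SBERT cross-encoder on question pairs (illustrative)
from sentence_transformers import InputExample
from sentence_transformers.cross_encoder import CrossEncoder
from torch.utils.data import DataLoader

ce = CrossEncoder("cross-encoder/stsb-distilroberta-base", num_labels=1)
pairs = [
    InputExample(texts=["reset my pin", "change card pin"], label=1.0),
    InputExample(texts=["reset my pin", "close my account"], label=0.0),
]
ce.fit(train_dataloader=DataLoader(pairs, shuffle=True, batch_size=2), epochs=1)
print(ce.predict([["reset my pin", "forgot card pin"]]))  # one relevance score per pair
```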
## Pre-training

- For pre-training, set `MODEL_NAME` to a Sentence BERT bi-encoder model and set the `LOSS_METRIC` (`"ContrastiveLoss"` / `"TripletLoss"`).
## Training data

- For question pair / triplet generation, under `TRAINING`, set `SUB_SAMPLE_QQ: True` if sampling needs to be done. In that case, set `SAMPLE_SIZE_PER_DATASET` and the `HARD_SAMPLE` flag as well, and set the `MODEL_NAME` that will be used for generating the hard samples. If training with triplets, set `GENERATE_TRIPLETS: True` to enable triplet generation.
- For pretraining data, create the folder `data/pretrain`. Set `GENERATE_TRIPLETS: True` for triplets and `GENERATE_PAIRS: True` for question pairs. Set `SAMPLE_SIZE_PER_DATASET` and the `HARD_SAMPLE` flag as well, and set the `MODEL_NAME` that will be used for generating the hard samples. A sketch of what this generation produces is shown below.
## Coming soon