This GitHub repository contains the script used to evaluate extractive question answering models on LegalQAEval, a legal question answering benchmark.
Currently, the best-performing models on the benchmark are Isaacus' Kanon Answer Extractor and Kanon Answer Extractor Mini, as shown below.
To account both for a model's ability to determine when a question has an answer extractable from a given text and for its ability to extract the correct answer, this code evaluates models as if they were binary classifiers: the ground truth label for an example is positive only if the example has an answer, and a model's prediction is treated as positive only if it correctly extracted an answer, "correct" here meaning that the extracted answer has a Levenshtein similarity greater than 0.4 to at least one of the ground truth answers. Matthews' correlation coefficient, widely regarded as the gold standard for evaluating the balanced predictive power of classifiers, is then used to determine each model's overall performance.
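The sketch below is a rough illustration of this scheme, not the actual evaluation script. The example record format, the `extract_answer` callable, the use of the `Levenshtein` and scikit-learn packages, and the decision to count extractions from unanswerable examples as false positives are all assumptions made for the sketch.

```python
from typing import Callable, Optional

import Levenshtein  # pip install Levenshtein
from sklearn.metrics import matthews_corrcoef

# Levenshtein similarity above which an extracted answer counts as correct.
SIMILARITY_THRESHOLD = 0.4


def is_correct(extracted: str, ground_truths: list[str]) -> bool:
    """An extraction is correct if it is similar enough to any ground truth answer."""
    return any(
        Levenshtein.ratio(extracted, truth) > SIMILARITY_THRESHOLD
        for truth in ground_truths
    )


def evaluate(
    examples: list[dict],
    extract_answer: Callable[[str, str], Optional[str]],
) -> float:
    """Score a model as a binary classifier over the benchmark and return its MCC.

    Each example is assumed to be a dict with a "text", a "question", and a
    (possibly empty) list of ground truth "answers"; `extract_answer` is the
    model under evaluation and returns an extracted span or None.
    """
    y_true, y_pred = [], []
    for example in examples:
        answers = example["answers"]
        extracted = extract_answer(example["text"], example["question"])
        answerable = bool(answers)
        # Ground label: positive only if the question is answerable from the text.
        y_true.append(answerable)
        if answerable:
            # Prediction is positive only if a correct answer was extracted.
            y_pred.append(extracted is not None and is_correct(extracted, answers))
        else:
            # Assumption: any extraction from an unanswerable example is
            # counted as a (false) positive prediction.
            y_pred.append(extracted is not None)
    return matthews_corrcoef(y_true, y_pred)
```

Under this framing, correctly declining to answer unanswerable questions adds true negatives, while incorrect spans on answerable examples count as false negatives, so both abilities feed into the single MCC score.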
The full benchmark results for the most popular legal and general-purpose information extraction models on LegalQAEval may be found here.