A Python implementation of the paper *CLUE: Concept-Level Uncertainty Estimation for Large Language Models*.
CLUE derives explainable, concept-level uncertainty estimates for black-box LLM generations using natural language inference (NLI).
In addition to the paper's contribution of measuring concept-level uncertainty for LLMs, I introduce a new way of using CLUE and concept-level uncertainty: measuring context usability. Given the input, the retrieved contexts, and the generated output, an evaluator measures the uncertainty of every context chunk with respect to the generated output and assigns each chunk an uncertainty score. The basic idea is that a chunk's uncertainty score is likely to be inversely proportional to its contribution to the generation, and hence indicates its usability. In simple words: high uncertainty ≈ not useful (see the sketch below).
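To make the idea concrete, here is a minimal sketch of the scoring loop. This is illustrative only, not the repository's actual API; `entailment_prob` is an assumed helper that returns P(entailment) from an NLI model such as bart-large-mnli:

```python
# Sketch of the Context Usability idea (illustrative, not the repo's exact API).
# entailment_prob(premise, hypothesis) is an assumed helper returning
# P(entailment) from an NLI model.

def context_usability_scores(contexts, generated_output, entailment_prob):
    """Score each retrieved context chunk against the generated output.

    A chunk that strongly entails the output gets low uncertainty
    (high usability); a chunk unrelated to the output gets high
    uncertainty (low usability).
    """
    scores = []
    for chunk in contexts:
        p_entail = entailment_prob(premise=chunk, hypothesis=generated_output)
        uncertainty = 1.0 - p_entail  # high uncertainty ≈ non-useful context
        scores.append({"context": chunk, "uncertainty": uncertainty})
    return scores
```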
In summary, I have two evaluators:
- Vanilla CLUE - measures concept-level uncertainty (sketched below)
- Context Usability - measures context usefulness (i.e., the contribution of each context chunk to the generation)
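As a rough sketch of what Vanilla CLUE computes: sample several generations, extract concepts from them, and score each concept by how consistently the samples entail it. The helper names below (`sample_outputs`, `extract_concepts`, `entailment_prob`) are assumptions for illustration, not the repository's actual functions:

```python
# Illustrative sketch of concept-level uncertainty (not the repo's exact code).
# sample_outputs(), extract_concepts(), and entailment_prob() are assumed
# helpers wrapping the LLM API and the NLI model respectively.

def concept_uncertainties(prompt, sample_outputs, extract_concepts, entailment_prob):
    outputs = sample_outputs(prompt)      # several sampled generations
    concepts = extract_concepts(outputs)  # concepts mentioned in them
    results = {}
    for concept in concepts:
        # A concept entailed by most samples is "certain"; one that only
        # a few samples support carries high uncertainty.
        support = [entailment_prob(premise=o, hypothesis=concept) for o in outputs]
        results[concept] = 1.0 - sum(support) / len(support)
    return results
```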
Models used:

- To generate output sequences and concepts: Groq
- To generate entailment scores: `facebook/bart-large-mnli` (see the example below)
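For reference, one self-contained way to get an entailment probability from `facebook/bart-large-mnli` with Hugging Face `transformers` (a plausible backing for the `entailment_prob` helper in the sketches above; the repository's wiring may differ):

```python
# Entailment scoring with facebook/bart-large-mnli via Hugging Face transformers.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

def entailment_prob(premise: str, hypothesis: str) -> float:
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # Label order for this model: contradiction, neutral, entailment
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    return probs[2].item()  # P(entailment)
```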
Set your Groq API key in a `.env` file at the project root:

```
GROQ_API_KEY=
```
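If you keep the key in a `.env` file, a common pattern is to load it with `python-dotenv` before creating the Groq client. This is a sketch, and the model name is an assumption; the repository may wire this differently:

```python
# Load GROQ_API_KEY from .env and create a Groq client (illustrative).
# Requires: pip install python-dotenv groq
import os

from dotenv import load_dotenv
from groq import Groq

load_dotenv()  # reads GROQ_API_KEY from a local .env file
client = Groq(api_key=os.environ["GROQ_API_KEY"])

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model name; any Groq-hosted model works
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```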
Bash/shell:

```bash
git clone https://github.com/nikilpatel94/CLUE-python.git
cd CLUE-python
conda create --name CLUE-python python=3.12.0
conda activate CLUE-python
pip install -r requirements.txt
export GROQ_API_KEY=your_groq_key
python ./src/example.py
```

Windows PowerShell:
```powershell
git clone https://github.com/nikilpatel94/CLUE-python.git
cd CLUE-python
conda create --name CLUE-python python=3.12.0
conda activate CLUE-python
pip install -r requirements.txt
$env:GROQ_API_KEY="your_groq_key"
python .\src\example.py
```

Known limitations:

- Evaluations from the paper are not included in this implementation
- Multilingual inputs can yield unreliable entailment scores, since the NLI model used (`bart-large-mnli`) is trained on English data
- Dataset evaluations are pending
TODO:

- Fix pooling of concepts
- Model-agnostic LLM usage: support for OpenAI, Ollama, and other LLM providers (see the interface sketch below)
- Streamline library and model imports
- Add Context Usability
- Dataset validation
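For the model-agnostic roadmap item, one possible shape is a small provider protocol that the evaluators call instead of a hard-coded client. Everything below is hypothetical, sketched for discussion:

```python
# Hypothetical provider-agnostic interface for the model-agnostic TODO item.
from typing import Protocol

class LLMBackend(Protocol):
    def generate(self, prompt: str, n: int = 1) -> list[str]:
        """Return n sampled generations for the prompt."""
        ...

class GroqBackend:
    """Example adapter wrapping a Groq client behind the protocol."""

    def __init__(self, client, model: str):
        self.client, self.model = client, model

    def generate(self, prompt: str, n: int = 1) -> list[str]:
        outputs = []
        for _ in range(n):
            resp = self.client.chat.completions.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
            )
            outputs.append(resp.choices[0].message.content)
        return outputs
```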
