Enforcing a conditional logic in responses #1457

Open
reuning opened this issue Feb 27, 2025 · 6 comments

reuning commented Feb 27, 2025

I've been working on using this to categorize Twitter posts into one of seven categories. One issue I've had so far is that models seem overly willing to find meaning in posts that don't have much content. I thought about solving this with a chain-of-thought style approach, where the prompt asks the model to first decide whether the post contains enough information and then to code it only if it does (otherwise it should return "OTHER").

The problem I'm having is that there doesn't seem to be an obvious way to enforce that requirement, so I get posts that are identified as not having enough information and are then assigned a non-OTHER code anyway.

Below is what I'm using. I'm afraid I might be missing something obvious.

from enum import Enum

from pydantic import BaseModel, Field

class Information(str, Enum):
    Yes = "Yes"
    No = "No"

class Code(str, Enum):
    MACROECONOMICS = "MACROECONOMICS"
    CIVIL_RIGHTS = "CIVIL RIGHTS"
    HEALTH = "HEALTH"
    IMMIGRATION = "IMMIGRATION"
    DEFENSE = "DEFENSE"
    CAMPAIGNING = "CAMPAIGNING"
    OTHER = "OTHER"

class Coding(BaseModel):
    information: Information = Field(..., description="Enough information to categorize the post?")
    conclusion: Code = Field(..., description="Final coding of post")

json_schema = Coding.model_json_schema()

instructions = f"""
You are a world class AI model who answers questions in JSON using the following schema: \n<schema>\n{json_schema}\n</schema>
You will use this to classify social media posts from Members of the United States Congress into one of seven categories:

 - MACROECONOMICS includes posts about economic trends, economic policies, unemployment, taxation, or the budget.
 - CIVIL RIGHTS includes posts about racial or gender discrimination, voting rights, the First Amendment, or privacy.
 - HEALTH includes posts about health care policies, regulation of insurance companies, mental health, or disease prevention.
 - IMMIGRATION includes posts about immigrants, refugees, undocumented individuals, or deportation.
 - DEFENSE includes posts about the military, alliances, veteran affairs, or the national guard. It does not include posts about the police.
 - CAMPAIGNING includes posts urging people to vote, finding a polling place, donating to a campaign, or sharing an endorsement. It does not include posts about voting rights in general.
 - OTHER includes anything that does not fit into the above categories or where there is not enough information to categorize it.

Before you categorize the post you should first decide if there is enough information in the post to categorize it. If there is not enough information then code the post as "OTHER". 

If the post is about multiple topics, then pick the category that is the most important to the post.

"""

I'll get output like:

Coding(information=<Information.No: 'No'>, conclusion=<Code.CAMPAIGNING: 'CAMPAIGNING'>)
@rlouf
Member

rlouf commented Feb 27, 2025

I don't think Pydantic exposes the kind of conditional logic that JSON Schema supports. However, we can try a different approach and unroll the conditional dependency using the regex DSL.

from outlines.types.dsl import Alternatives, String

information_query = "Enough information to categorize the post? "
conclusion = "Final coding of post "
choice = Alternatives([String("macroeconomics"), String("civil rights")])

template = information_query + ( ("No. " + conclusion + "OTHER") | ("Yes. " + conclusion + choice))

I apologize for the syntax, it is still a bit rough around the edges. Opened #1459 to correct this.
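To make the "unrolling" concrete, here is a rough sketch of a plain regular expression that accepts the same kind of strings as the template above, checked with Python's standard `re` module. It is only an illustration of the idea, not the pattern Outlines builds internally, and the names `no_branch`, `yes_branch`, and `is_allowed` are made up for this example.

```python
import re

# Fixed text fragments, mirroring the template above.
information_query = "Enough information to categorize the post? "
conclusion = "Final coding of post "
choices = ["macroeconomics", "civil rights"]

# Branch 1: a "No." answer can only be followed by OTHER.
no_branch = re.escape("No. " + conclusion + "OTHER")

# Branch 2: a "Yes." answer must be followed by one of the real categories.
yes_branch = (
    re.escape("Yes. " + conclusion)
    + "(?:" + "|".join(re.escape(choice) for choice in choices) + ")"
)

# The conditional is "unrolled" into two complete alternatives.
pattern = re.compile(
    re.escape(information_query) + "(?:" + no_branch + "|" + yes_branch + ")"
)


def is_allowed(text: str) -> bool:
    """Return True if the whole string satisfies the unrolled conditional."""
    return pattern.fullmatch(text) is not None
```

Because each branch spells out the full continuation, a string that says "No." and then names a real category simply has no path through the pattern.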

@reuning
Author

reuning commented Feb 27, 2025

This looks really helpful. Two, perhaps stupid, questions if you have a moment:

  1. Would I then use the regex generator to implement this?
  2. One thing I've struggled with in the documentation is what to add to the prompt. I'd guess that in this case I'd want to tell the model that the response should be formatted like the sentence above?

@rlouf
Member

rlouf commented Feb 27, 2025

  1. Yes, sorry for not specifying this.
  2. Ideally you would add one or two examples to the prompt.
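As a sketch of what "one or two examples in the prompt" could look like, the helper below appends a couple of worked examples in the same sentence format as the template from earlier in the thread. The example posts, the `Post:` / `Answer:` labels, and the `build_prompt` name are all assumptions made for illustration; they do not come from the thread or from Outlines.

```python
# Hypothetical example posts, written for this sketch only (not real data).
EXAMPLES = [
    (
        "Good morning, everyone!",
        "Enough information to categorize the post? No. "
        "Final coding of post OTHER",
    ),
    (
        "We need to lower taxes on working families.",
        "Enough information to categorize the post? Yes. "
        "Final coding of post macroeconomics",
    ),
]


def build_prompt(instructions: str, post: str) -> str:
    """Combine the instructions, the worked examples, and the post to classify."""
    parts = [instructions.strip(), ""]
    for example_post, example_answer in EXAMPLES:
        parts.append(f"Post: {example_post}")
        parts.append(f"Answer: {example_answer}")
        parts.append("")
    # Leave the final answer open so the constrained generator completes it.
    parts.append(f"Post: {post}")
    parts.append("Answer: ")
    return "\n".join(parts)
```

The examples show the model both branches (a "No." answer ending in OTHER and a "Yes." answer ending in a category), so the constrained format is not a surprise at generation time.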

@cpfiffer
Contributor

My two cents here -- Remi's right that Pydantic doesn't support conditionals, so there's not really a great way to enforce the non-response you're looking for.

One option is to try Optional[Code], which might help the model understand that it can output nothing at all rather than defaulting to OTHER.

from typing import Optional

class Coding(BaseModel):
    information: Information = Field(..., description="Enough information to categorize the post?")
    conclusion: Optional[Code] = Field(..., description="Final coding of post")

Your model may also be somewhat confused because the question asks whether or not there is enough information to assign a code, and then even if it says no it must choose a code which can include OTHER. In some sense null and OTHER are different things -- OTHER means "this does not fall under another code" while null means "I can't determine a code, as there is not sufficient information".

Sadly, this does not actually enforce the constraint on the output, but it might help.

If this were my project, I would ignore any codes where the Information field is No, though this can be mildly token-inefficient. Fortunately the codes are quite short so you won't be burning too much compute.
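A minimal sketch of that post-processing step, assuming the two fields have already been read out as plain strings (for example via `.value` on the enum members). The function name `resolve_code` is hypothetical, and mapping a missing code to "OTHER" is a choice made here to match the original request; returning `None` instead would preserve the null/OTHER distinction described above.

```python
from typing import Optional


def resolve_code(information: str, conclusion: Optional[str]) -> str:
    """Collapse the two generated fields into one final label.

    A "No" on the information field overrides whatever code the model
    produced, and a missing (null) code is also treated as OTHER.
    """
    if information != "Yes" or conclusion is None:
        return "OTHER"
    return conclusion


# The problematic output from the first comment: information="No" but a
# non-OTHER code was generated anyway.
print(resolve_code("No", "CAMPAIGNING"))
print(resolve_code("Yes", "HEALTH"))
```

This does not stop the model from generating the inconsistent pair; it only guarantees that the inconsistency never reaches the final dataset.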

@cpfiffer
Contributor

Since this issue is about conditional logic, I figured I'd stick this in here for discussion. It doesn't seem like Outlines supports conditionals in JSON schemas (which may already be a known issue):

from outlines import models, generate
from transformers import AutoTokenizer

model_name = "HuggingFaceTB/SmolLM2-135M"
model = models.transformers(
    model_name,
    device="auto",
)
tk = AutoTokenizer.from_pretrained(model_name)

schema = """
{
  "type": "object",

  "properties": {
    "name": { "type": "string" },
    "credit_card": { "type": "number" },
    "billing_address": { "type": "string" }
  },

  "required": ["name"],

  "dependentRequired": {
    "credit_card": ["billing_address"]
  }
}
"""

generator = generate.json(model, schema)

prompts = [
    "Return a json of a person",
    "Return a json of a person with a credit card",
    "Return a json of a person without a credit card",
    "Return a json of a person with a billing address",
    "Return a json of a person without a billing address",
]

def template(thing):
    return tk.apply_chat_template(
        [
            {"role": "user", "content": thing},
        ],
        tokenize=False,
        add_generation_prompt=True,
    )

results = generator([template(prompt) for prompt in prompts])

for prompt, result in zip(prompts, results):
    print(prompt)
    print(result)
    print()

which yields

Return a json of a person
{'name': 'Paul', 'credit_card': 1, 'billing_address': '01 Main Shooting Road, Victoria, Florida 32411'}

Return a json of a person with a credit card
{'name': 'album', 'credit_card': 25000}

Return a json of a person without a credit card
{'name': 'John', 'billing_address': 'United States'}

Return a json of a person with a billing address
{'name': 'ladina', 'billing_address': '250013 123 Main Street'}

Return a json of a person without a billing address
{'name': 'Jordan', 'billing_address': '123 West Street, Harrisburg, Pennsylvania, USA'}

Under this schema, any output that includes a credit card must also include a billing address. The second output has credit_card without billing_address, so it seems dependentRequired is not currently implemented.
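Until the constraint is enforced during generation, it can at least be checked afterwards. Below is a small standard-library sketch that applies the `dependentRequired` rule to the five outputs shown above; `missing_dependents` is a hypothetical helper written for this example, not part of Outlines or of any JSON Schema library.

```python
import json

schema = json.loads("""
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "credit_card": { "type": "number" },
    "billing_address": { "type": "string" }
  },
  "required": ["name"],
  "dependentRequired": {
    "credit_card": ["billing_address"]
  }
}
""")


def missing_dependents(instance: dict, schema: dict) -> list:
    """List the properties that dependentRequired demands but the instance lacks."""
    missing = []
    for trigger, required in schema.get("dependentRequired", {}).items():
        # The rule only applies when the triggering property is present.
        if trigger in instance:
            missing.extend(name for name in required if name not in instance)
    return missing


# The five generated outputs reported above.
outputs = [
    {"name": "Paul", "credit_card": 1,
     "billing_address": "01 Main Shooting Road, Victoria, Florida 32411"},
    {"name": "album", "credit_card": 25000},
    {"name": "John", "billing_address": "United States"},
    {"name": "ladina", "billing_address": "250013 123 Main Street"},
    {"name": "Jordan",
     "billing_address": "123 West Street, Harrisburg, Pennsylvania, USA"},
]

for output in outputs:
    print(output["name"], missing_dependents(output, schema))
```

Only the second output (`album`) is flagged, since it is the only one that has `credit_card` without `billing_address`.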

@rlouf
Member

rlouf commented Mar 4, 2025

With the latest release you should be able to write:

from outlines.types import either

information_query = "Enough information to categorize the post? "
conclusion = "Final coding of post "
choices = either("macroeconomics", "civil rights")

outline = information_query + either(
    ("No. " + conclusion + "OTHER"),
    ("Yes. " + conclusion + choices)
)
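Since an outline like this produces a plain sentence rather than JSON, the result still has to be parsed back into the two fields. The sketch below assumes the generator returns the raw string exactly in the format above; `ANSWER` and `parse_answer` are names invented for this example.

```python
import re
from typing import Tuple

# Mirrors the fixed text of the outline; named groups capture the two decisions.
ANSWER = re.compile(
    r"Enough information to categorize the post\? (?P<information>Yes|No)\. "
    r"Final coding of post (?P<code>.+)"
)


def parse_answer(text: str) -> Tuple[str, str]:
    """Split a generated sentence into (information, code)."""
    match = ANSWER.fullmatch(text)
    if match is None:
        raise ValueError(f"Unexpected answer format: {text!r}")
    return match.group("information"), match.group("code")


information, code = parse_answer(
    "Enough information to categorize the post? Yes. "
    "Final coding of post civil rights"
)
```

The parser deliberately does not re-check that "No" is paired with OTHER; with constrained generation that pairing should already hold, and the `ValueError` only guards against strings that did not come from the outline at all.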


3 participants