Use additional context in LLM prompt, update OpenAI dep by flodolo · Pull Request #4040 · mozilla/pontoon

flodolo · 2026-03-26T05:33:11Z

Update OpenAI package to latest version (2.29.0)
Improve prompt formulation
Pass string ID, comment, terminology matches when available
Move OpenAI GPT version to settings
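
To illustrate the idea behind these changes, here is a minimal sketch, not the code from this PR. The function `build_refine_prompt`, the `OPENAI_MODEL` environment variable, its default value, and the `STRING ID`, `COMMENT`, and `TERMINOLOGY` section labels are all hypothetical. Only the two base section labels are taken from the existing prompt. It shows optional context being appended only when it is available, and the model name being read from configuration rather than hard-coded.

```python
import os

# Hypothetical setting: the model name comes from configuration instead of
# being hard-coded. The variable name and the default are assumptions.
OPENAI_MODEL = os.environ.get("OPENAI_MODEL", "example-model")


def build_refine_prompt(english_text, translated_text, string_id=None,
                        comment=None, terms=None):
    """Assemble a user prompt, adding optional context only when available."""
    parts = [
        f"ENGLISH SOURCE:\n{english_text}",
        f"MACHINE TRANSLATION TO REFINE:\n{translated_text}",
    ]
    if string_id:
        parts.append(f"STRING ID:\n{string_id}")
    if comment:
        parts.append(f"COMMENT:\n{comment}")
    if terms:
        # terms maps a source term to its translation in the target locale.
        lines = "\n".join(f"{src} -> {tgt}" for src, tgt in terms.items())
        parts.append(f"TERMINOLOGY:\n{lines}")
    return "\n\n".join(parts)
```

Calling `build_refine_prompt("Save", "Salva")` yields only the two base sections, so callers without extra context keep the previous prompt shape.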

translate/src/modules/machinery/getEntityStringId.ts

flodolo · 2026-03-26T05:33:43Z

pontoon/machinery/openai_service.py

-            f"ENGLISH SOURCE:\n{english_text}\n\n"
-            f"MACHINE TRANSLATION TO REFINE:\n{translated_text}"
-        )
+        # TODO: remove before merge.


This is here to help with testing and needs to be removed before merge.

flodolo · 2026-03-26T05:34:41Z

pontoon/machinery/management/commands/refine_translation.py


 class Command(BaseCommand):
-    help = "Refines machine translations using OpenAI's GPT-4 API with specified characteristics"
+    help = "Refines machine translations using OpenAI's GPT API with specified characteristics"


At some point we're going to move away from GPT-4, and I'm not sure there is a benefit in having references to GPT-4 in the code.


mathjazz

Nice work!

Why are we passing all this data from frontend to backend, instead of just passing entity ID and then retrieving all the data from the DB?

pontoon/machinery/openai_service.py

flodolo · 2026-04-02T05:39:24Z

Why are we passing all this data from frontend to backend, instead of just passing entity ID and then retrieving all the data from the DB?

The front-end already has all the data. Are we OK with extra queries to do this on the backend? I believe we'd need:

A couple to get comments (1 for entity+section+comment, 1 for pinned comment).
Scan terminology for matches.
Get term translations (1 query per matched term).

Is this off?
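
As a rough illustration of the "scan terminology for matches" step, the sketch below works on plain Python data. It is an assumption-laden example: `find_term_matches` and the in-memory `glossary` dict are hypothetical stand-ins for Pontoon's terminology data, and no database queries are shown.

```python
import re


def find_term_matches(source_text, glossary):
    """Return glossary entries whose term appears as a whole word in the source.

    glossary is a hypothetical dict mapping a source term to its translation.
    """
    matches = {}
    for term, translation in glossary.items():
        # Word boundaries keep "tab" from matching inside "table".
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, source_text, flags=re.IGNORECASE):
            matches[term] = translation
    return matches
```

With a glossary of `{"bookmark": "segnalibro", "tab": "scheda"}`, the text "Bookmark this page" matches only `bookmark`.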

mathjazz · 2026-04-02T12:52:17Z

The front-end already has all the data. Are we OK with extra queries to do this on the backend?

The frontend already having the data is not a strong reason to make the client the source of truth.

Also:

Server-owned data shouldn’t come from the client.
Duplication of logic: if the frontend needs to know how to assemble all the context for GPT, that logic is now split across client and server.
This is a GET endpoint, and it is sending potentially large text blobs plus JSON payloads in query params. That is awkward for URL length limits, logging exposure, and caching behavior. This endpoint probably wants POST even aside from the DB question.
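
The URL-length concern can be made concrete with a small sketch. The path `/example/` and the payload fields are hypothetical. The point is only that a GET request encodes every field into the URL, while a POST request keeps the URL fixed and carries the data in the body.

```python
import json
from urllib.parse import urlencode


def as_get_url(path, payload):
    """Encode every field into the query string, as a GET request would."""
    flat = {
        key: value if isinstance(value, str) else json.dumps(value)
        for key, value in payload.items()
    }
    return f"{path}?{urlencode(flat)}"


def as_post_request(path, payload):
    """Keep the URL fixed and carry the data in a JSON body."""
    return path, json.dumps(payload).encode("utf-8")
```

With a 5,000-character source string, the GET URL is longer than 5,000 characters and would typically end up in access logs, while the POST URL stays `/example/`.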

flodolo · 2026-04-02T12:55:39Z

Makes sense. I'll look into moving this to the back-end (next week at this point).

mathjazz

Nice job! Deployed to DEV. Works fine!

Left one more note.

mathjazz · 2026-04-07T17:41:35Z

pontoon/machinery/openai_service.py

+        translated_text,
+        characteristic,
+        locale,
+        entity_id=None,


Why not entity_key?

No reason, in my head it's always ID 🤷🏼

I'll update and remove the debug print.

flodolo commented Mar 26, 2026


flodolo force-pushed the improve_llm_prompt branch 2 times, most recently from 5a56d9c to ada5db9, on March 26, 2026 13:07

Use additional context in LLM prompt, update OpenAI dep

aa2327a


flodolo force-pushed the improve_llm_prompt branch from ada5db9 to aa2327a on March 26, 2026 13:14

mathjazz reviewed Apr 1, 2026


string -> entity in var names

58fa4fe

flodolo added 2 commits April 2, 2026 16:16

Switch endpoint to POST

f85973b

Move logic to backend

cc5f4fc

flodolo force-pushed the improve_llm_prompt branch from fab834f to cc5f4fc on April 2, 2026 14:39

Update package to 2.30.0

d1a725c

flodolo requested a review from mathjazz April 7, 2026 06:37

mathjazz approved these changes Apr 7, 2026


Rename entity_id to entity_key, drop debug print

425e9cf

flodolo merged commit 8b2b7a4 into mozilla:main Apr 7, 2026
8 checks passed

flodolo deleted the improve_llm_prompt branch April 10, 2026 08:43

flodolo merged 6 commits into mozilla:main from flodolo:improve_llm_prompt
