Commit 1c15f87

Merge pull request #70 from DIDIM-ai/fix/#69-prompt-improvement
fix: improve GPT system prompt (#69)
2 parents 64438fd + e136c1f commit 1c15f87

1 file changed: +130 -11 lines changed

src/main/java/com/likelion/ai_teacher_a/domain/logsolve/service/LogSolveService.java

Lines changed: 130 additions & 11 deletions
@@ -85,7 +85,54 @@ public Map<String, Object> executeMath(Long logSolveId, int grade) {
 
     private String buildPromptByGrade(int grade) {
         return String.format("""
-            Read the following math problem image accurately using OCR, and according to the ‘Our Kid Math Explanation Helper’ app’s parent explanation guide, output only a pure JSON object conforming to the JSON schema below. The math explanation and instructional method should be at a %dth grade elementary school level, including very detailed explanations in 4–6 steps. Please respond only in Korean.
+            Read the following math problem image accurately using OCR, and according to the ‘Our Kid Math Explanation Helper’ app’s parent explanation guide, output only a pure JSON object conforming to the JSON schema below. The math explanation and instructional method should be at a %dth grade elementary school level, including very detailed explanations in **2 to 10 steps** steps.
+            🟨 Important Instructions:
+
+            - First, determine the **type of problem**:
+            - If the image contains **only mathematical expressions** (e.g., 30 + 5 × 9 ÷ 3 - 10), treat it as a **calculation problem** and compute the correct numeric answers.
+            - If the image contains **pictures, objects, or figures** (e.g., chairs, people, arrows, items), treat it as a **visual reasoning problem** and deduce the answer based on the visible content.
+
+            - ❗ When counting people or objects in the image:
+            - **Count exactly what is shown in the image.** Do NOT guess, assume, or infer based on context.
+            - ⚠️ Do NOT skip partially visible people. All visible individuals must be counted, even if cropped or obscured.
+            - Count only what is clearly visible — not implied or referenced.
+            - Do not assume anyone is walking unless clearly depicted.
+
+            ✅ Especially when counting **children sitting on chairs**, count **all visible individuals precisely**.
+            ✅ For example, if 4 children are sitting on chairs, your answer **must be `"4"`**, not `"3"` or an estimate.
+            ✅ Never guess or round. This count must be exact.
+            ✅ ⚠️ Incorrectly counting seated people will result in the entire problem being scored as **zero**.
+
+            - ❗ When gender is involved:
+            - Accurately distinguish **boys and girls** based on visual indicators such as:
+            - Text labels (e.g., "남", "여")
+            - Hairstyles, uniforms, or clothing
+            - Other clearly visible clues
+            - Never assume gender based on seating or placement.
+            - When comparing genders, count both groups carefully and calculate the difference using subtraction.
+
+            - For **calculation problems** with multiple sub-questions (e.g., (1), (2), (3)...), solve each one **individually and carefully**.
+            - Use the correct order of operations (PEMDAS): Parentheses → Multiplication/Division → Addition/Subtraction.
+
+            - Be careful with **mathematical symbols**:
+            - `'÷'` means division.
+            - `'×'` means multiplication.
+            - `'x'` may represent a variable or label — do NOT interpret as multiplication unless clearly shown.
+            - `'–'` (long dash) is NOT a minus sign.
+            - ⚠️ Watch for OCR mistakes such as `÷` misread as `-`, `×` as `x`, `1` as `l`, etc.
+
+            - The `"answer"` field must contain only the final numeric results, **in order and comma-separated** (e.g., `"4, 2, 6, 2, 2"`). \s
+            ❌ Do NOT include explanations, units, or comments in this field.
+
+            - All reasoning and explanation must go into `"explanation_steps"` — never include explanations in the `"answer"` field.
+
+            - Summarize the image as one whole problem using `"problem_title"` and `"problem_text"`.
+
+            - Combine the core ideas of all sub-questions into `"core_concept"` and `"parent_explanation"`.
+
+            - ⚠️ Output exactly one valid **JSON object**, and respond **only in Korean**.
+
+
 
             ```json
             {
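
Review note on the `%dth grade` placeholder in this hunk: with `String.format`, grades 1 to 3 render as "1th", "2th", and "3th". A minimal sketch of one possible follow-up is shown here, assuming a hypothetical `OrdinalSketch` class with `ordinal` and `gradeLine` helpers that are not part of this commit; the sentence in `gradeLine` is a shortened stand-in for the real prompt.

```java
final class OrdinalSketch {

    /** Returns the English ordinal for a positive integer, e.g. 1 becomes "1st". */
    static String ordinal(int n) {
        int lastTwo = n % 100;
        // 11, 12, and 13 are irregular: they always take "th".
        if (lastTwo >= 11 && lastTwo <= 13) {
            return n + "th";
        }
        switch (n % 10) {
            case 1:  return n + "st";
            case 2:  return n + "nd";
            case 3:  return n + "rd";
            default: return n + "th";
        }
    }

    /** Hypothetical, shortened stand-in for the grade sentence of the prompt. */
    static String gradeLine(int grade) {
        // %s instead of %d, because the suffix is already part of the argument.
        return String.format(
            "The explanation should be at a %s grade elementary school level.",
            ordinal(grade));
    }

    public static void main(String[] args) {
        for (int grade = 1; grade <= 6; grade++) {
            System.out.println(gradeLine(grade));
        }
    }
}
```

Since the prompt asks for a Korean response anyway, the model may well tolerate "3th"; the helper only makes the English instruction itself grammatical.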
@@ -304,15 +351,87 @@ private LogSolve getLogSolveById(Long id) {
 
     private Map<String, Object> buildPayload(String prompt, String imageUrl, int maxTokens) {
         return Map.of(
-            "model", "gpt-4o",
-            "messages", List.of(Map.of(
-                "role", "user",
-                "content", List.of(
-                    Map.of("type", "text", "text", prompt),
-                    Map.of("type", "image_url", "image_url", Map.of("url", imageUrl))
-                )
-            )),
-            "max_tokens", maxTokens
+            "model", "gpt-4.1",
+            "messages", List.of(
+                Map.of(
+                    "role", "system",
+                    "content", """
+                        You are a professional Korean elementary school math tutor working for the "Our Kid Math Explanation Helper" app.
+
+                        Your role is to:
+                        - Interpret the math problem image using OCR.
+                        - If multiple **top-level problems** are present (e.g., 1번, 2번), solve **only the first top-level problem**.
+                        - Within that first problem, if sub-questions like (1), (2) are present, solve **all sub-questions**.
+                        - First, determine the **type of problem** (e.g., fill-in-the-blank, calculation, comparison, pattern, unit conversion).
+                        - Generate the full explanation in **Korean**, using **pure JSON format** only.
+                        - Use **polite and respectful Korean (존댓말)**.
+                        - Use correct particles after numbers (e.g., say “3을 나누다” not “3를 나누다”).
+                        - At the end of every sentence in `"explanation_steps.description"`, add a line break (`\\n`) for clarity.
+
+                        ### `"parent_explanation"` Guidelines:
+                        - Talk to **the parent**, not the child.
+                        - Use 존댓말 (e.g., “설명해 주세요”, “도와주세요”).
+                        - Include:
+                        - What the problem is about
+                        - What math concept it covers
+                        - How to guide the child to start solving it
+                        - The logical order of explanation
+                        - (If possible) An example of how to explain it conversationally
+
+                        ### `"explanation_steps"` Guidelines:
+                        - Friendly, polite tone like a teacher advising a parent.
+                        - Each step must build upon the previous.
+                        - Include mathematical expression **before and after** transformation in each step.
+                        - Use line breaks after each sentence (`\\n`).
+                        - Use Korean expressions naturally:
+                        - e.g., “나누기의 반대는 곱하기예요.\\n그래서 3 ÷ 1/3은 3 × 3이 됩니다.\\n”
+
+                        ### Special Handling by Problem Type:
+
+                        **Fill-in-the-blank problems**:
+                        - List all correct values for blanks in `"answer"` in order.
+                        - Each explanation step should show how a blank is filled.
+
+                        **Calculation problems**:
+                        - Clearly show step-by-step how expressions change.
+                        - At each step, show both the expression **before** and **after**.
+                        - Explain the reasoning conversationally.
+
+                        **Pattern or logic problems**:
+                        - Identify the rule clearly.
+                        - Explain how the rule applies and leads to the correct answer.
+
+                        **Unit conversions**:
+                        - Explain the conversion step-by-step, including units.
+
+                        **Comparison problems**:
+                        - Explain how to convert all values to the same form.
+                        - Clearly compare them to reach the conclusion.
+                        - For multiple sub-questions (e.g., (1) to (8) or more), include one `"explanation_steps"` entry per sub-question.
+                        - In each step, explain the correct order of operations clearly.
+                        - You may include intermediate calculations for clarity, such as:
+                        - `"54 - 7 × 4 + 16 → 54 - 28 + 16 → 26 + 16 → 42"`
+                        - `"48 ÷ 4 × 2 → 12 × 2 → 24"`
+                        - Division (`÷`) and multiplication (`×`) must be handled before addition or subtraction.
+                        - Write each explanation concisely, showing both transformation and logic.
+
+
+                        **Important**:
+                        - Do NOT speak to the child.
+                        - Do NOT output anything outside the JSON.
+                        - All content must be fully written in **Korean**.
+
+
+
+                        """
+                ), Map.of(
+                    "role", "user",
+                    "content", List.of(
+                        Map.of("type", "text", "text", prompt),
+                        Map.of("type", "image_url", "image_url", Map.of("url", imageUrl))
+                    )
+                )),
+            "max_completion_tokens", maxTokens
         );
     }
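
Review note on this hunk: the request now carries two messages, a fixed `system` message with the tutoring rules and a per-request `user` message with the prompt text and image URL, and the token limit key changes from `max_tokens` to `max_completion_tokens`. The shape can be sketched on its own as shown here, assuming a hypothetical `PayloadSketch` class and a `systemPrompt` parameter (the commit inlines the system prompt as a text block instead of passing it in).

```java
import java.util.List;
import java.util.Map;

final class PayloadSketch {

    /** Builds a two-message chat payload using the same keys as the diff. */
    static Map<String, Object> buildPayload(
            String systemPrompt, String userPrompt, String imageUrl, int maxTokens) {
        // Fixed instructions shared by every request.
        Map<String, Object> systemMessage = Map.of(
            "role", "system",
            "content", systemPrompt);

        // Per-request content: the grade-specific prompt plus the image to read.
        Map<String, Object> userMessage = Map.of(
            "role", "user",
            "content", List.of(
                Map.of("type", "text", "text", userPrompt),
                Map.of("type", "image_url", "image_url", Map.of("url", imageUrl))));

        return Map.of(
            "model", "gpt-4.1",
            "messages", List.of(systemMessage, userMessage),
            "max_completion_tokens", maxTokens);
    }

    public static void main(String[] args) {
        // Placeholder values, for illustration only.
        Map<String, Object> payload = buildPayload(
            "system rules", "grade prompt", "https://example.com/problem.png", 1000);
        System.out.println(payload.get("messages"));
    }
}
```

Naming the two messages as local variables keeps the nesting shallow. `Map.of` and `List.of` return immutable collections, so the payload cannot be modified after it is built.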

@@ -342,7 +461,7 @@ private String sendGptRequest(Map<String, Object> payload) throws IOException {
         };
 
         try {
-            return executor.submit(task).get(35, TimeUnit.SECONDS);
+            return executor.submit(task).get(50, TimeUnit.SECONDS);
         } catch (TimeoutException e) {
             throw new RuntimeException("GPT Vision 처리 시간 초과");
         } catch (Exception e) {
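
Review note on this hunk: the wait on the submitted task grows from 35 to 50 seconds, and a timeout is rethrown as a `RuntimeException` whose Korean message means "GPT Vision processing timed out". The diff does not show whether the still-running task is cancelled afterwards. A minimal sketch of the same pattern with explicit cancellation is shown here, assuming a hypothetical `TimeoutSketch.callWithTimeout` helper that is not part of this commit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutSketch {

    /** Runs the task on the executor and waits at most the given time for its result. */
    static String callWithTimeout(
            ExecutorService executor, Callable<String> task, long timeout, TimeUnit unit) {
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupt the worker so it does not keep running after the caller gave up.
            future.cancel(true);
            throw new RuntimeException("GPT Vision request timed out", e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag for callers further up the stack.
            Thread.currentThread().interrupt();
            throw new RuntimeException("Interrupted while waiting for GPT Vision", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("GPT Vision request failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println(callWithTimeout(executor, () -> "ok", 1, TimeUnit.SECONDS));
            try {
                // Simulated slow call; sleeps longer than the allowed wait.
                callWithTimeout(executor, () -> {
                    Thread.sleep(5_000);
                    return "late";
                }, 100, TimeUnit.MILLISECONDS);
            } catch (RuntimeException e) {
                System.out.println(e.getMessage());
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Passing the original exception as the cause keeps the stack trace available to whoever logs the failure, which the message-only `RuntimeException` in the diff would drop.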
