Add Google Vision-based OCR and text translation #222

rll2641 · 2025-09-27T08:40:43Z

📝 Changes

Add OCR based on the Google Vision API
Translate Chinese, English, and Japanese text into Korean
Add a new endpoint (blogs/ocr/extract)
Change the Dockerfile ENTRYPOINT to the virtual environment path
Fetch files from S3
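
The S3 and Google Vision items above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names `fetch_image_bytes`, `detect_text`, and `ocr_from_s3` are hypothetical, and it assumes the `boto3` and `google-cloud-vision` packages are installed with credentials already configured in the environment.

```python
"""Sketch: read an image from S3 into memory and run Google Vision OCR on it."""


def fetch_image_bytes(s3_client, bucket: str, key: str) -> bytes:
    """Download an S3 object into memory; nothing is written to local disk."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read()


def detect_text(vision_client, image_bytes: bytes) -> str:
    """Return the full text that Google Vision detects in the image."""
    # Imported lazily so the S3 helper works without the Vision SDK installed.
    from google.cloud import vision

    image = vision.Image(content=image_bytes)
    response = vision_client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(f"Vision API error: {response.error.message}")
    return response.full_text_annotation.text


def ocr_from_s3(bucket: str, key: str) -> str:
    """Wire the two helpers together with real clients (hypothetical helper)."""
    import boto3
    from google.cloud import vision

    s3_client = boto3.client("s3")
    vision_client = vision.ImageAnnotatorClient()
    return detect_text(vision_client, fetch_image_bytes(s3_client, bucket, key))
```

Passing the clients in as arguments keeps the helpers easy to test with stand-ins, and reading the object body directly avoids temporary files inside the container.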

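Notes on the OCR models tried

tesseract - poor accuracy. feature/ocr
paddle - requires downgrading numpy -> possible dependency conflicts
easyocr - dependencies are too large (14 GB), so it would need a separate server -> the most accurate of the options
google api - more accurate than tesseract, but because it is an API it adds network calls and is a paid service (1,000 requests per month are free)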

🔗 Related Issues

Closes #(issue number)
Related to #(issue number)

💬 Additional Requests

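✅ Checklist

Code quality

Follows the commit convention (feat/fix/docs/refactor, etc.)
Unnecessary code/comments removed

Testing

Verified working in the local environment
Confirmed no impact on existing features

Deployment readiness

Added/changed environment variables documented
Checked whether a DB migration is needed
No special notes for deployment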

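1. Extract images from S3 (JSON excluded, nothing saved locally)
2. Convert Chinese, Japanese, and English text to Korean with the google-vision API
3. Preprocess line breaks and unnecessary text
4. Add the endpoint (blogs/ocr/extract)
5. Change the Dockerfile comment and ENTRYPOINT path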
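
As a rough illustration of the preprocessing and translation steps, the sketch below normalizes whitespace, drops empty or symbol-only lines, and then hands the result to a translation client. Everything here is an assumption rather than the PR's actual code: the names `clean_ocr_text` and `translate_to_korean` are hypothetical, the noise rule is only one plausible choice, the clean-then-translate order is a guess, and the translator is assumed to behave like the `google-cloud-translate` v2 `Client`, whose `translate()` returns a dict containing `translatedText`.

```python
import re

# Hypothetical noise rule: a line made only of punctuation/symbols is dropped.
_NOISE_LINE = re.compile(r"^[\W_]+$")


def clean_ocr_text(raw: str) -> str:
    """Collapse whitespace and remove empty or symbol-only lines."""
    cleaned = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line or _NOISE_LINE.match(line):
            continue
        cleaned.append(line)
    return "\n".join(cleaned)


def translate_to_korean(text: str, translate_client) -> str:
    """Translate cleaned text to Korean using an injected client.

    Assumes a client shaped like google.cloud.translate_v2.Client.
    """
    if not text:
        return ""
    result = translate_client.translate(text, target_language="ko")
    return result["translatedText"]
```

With a real client this might be called as `translate_to_korean(clean_ocr_text(raw), translate_v2.Client())`; cleaning first keeps stray symbols out of the translation request.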

rll2641 added 7 commits September 27, 2025 01:05

feature : google-api ocr

bef5719

feature : easyocr

abb0631

test : for external injection testing

d44956a

test : for external injection testing 2

76ab508

chore : add __init__ file

ffb5238

Merge branch 'develop' into feature/paddleocr

9833416

rll2641 self-assigned this Sep 27, 2025

rll2641 added the enhancement New feature or request label Sep 27, 2025

rll2641 added 2 commits September 27, 2025 17:41

fix : correct the S3 folder path

553d280

fix : add a mount for specifying the API key path

337910e

thkim7 marked this pull request as ready for review September 27, 2025 09:30

thkim7 merged commit 78d6163 into develop Sep 27, 2025
7 checks passed

thkim7 deleted the feature/paddleocr branch September 27, 2025 09:33

3 participants
