Merged
48 commits
d2c9336
Develop logic for using the workflow default_config (#190)
kakusiA Sep 23, 2025
b09558f
Write test code for the check-session and permissions APIs (#200)
bwnfo3 Sep 25, 2025
c993a53
Merge pull request #203 from Kernel180-BE12/main
can019 Sep 25, 2025
0fd54d6
feat: Integrate S3 with RDB
thkim7 Sep 25, 2025
9076e8a
chore: poetry run black . & spotlessApply
thkim7 Sep 25, 2025
e779b1b
Merge pull request #205 from Kernel180-BE12/feature/s3-rds
thkim7 Sep 25, 2025
ac8c784
refactor: Change the crawling service from sequential to asynchronous crawling
thkim7 Sep 25, 2025
426b4cf
refactor: Fix task_run_id mismatch between the RDB and selection task execution
thkim7 Sep 25, 2025
f9845f3
chore: poetry run black .
thkim7 Sep 25, 2025
d561ab3
chore: spotlessApply
thkim7 Sep 25, 2025
51a257c
Merge pull request #208 from Kernel180-BE12/feature/refactor_service
thkim7 Sep 25, 2025
f3210d3
Write test code for manual Workflow execution and Retry logic (#209)
jihukimme Sep 25, 2025
773b61f
Workflow creation API (#211)
bwnfo3 Sep 25, 2025
f8b7026
Timezone migration to Instant (UTC) (#210)
can019 Sep 25, 2025
6296dcf
feat: ScheduleCreateDto
bwnfo3 Sep 26, 2025
821a478
feat: Add schedule-creation methods to ScheduleMapper
bwnfo3 Sep 26, 2025
9523385
feat: Add schedule-related fields to WorkflowCreateDto
bwnfo3 Sep 26, 2025
72c9a10
feat: Add schedule registration to the CreateWorkflow method
bwnfo3 Sep 26, 2025
c7fc11c
feat: Add schedule-registration test code to WorkflowCreateFlowE2eTest
bwnfo3 Sep 26, 2025
140698f
chore: spotlessApply
bwnfo3 Sep 26, 2025
9e4aa31
Fix Jackson timezone serialization not working (#212)
can019 Sep 26, 2025
bef5719
feature : google-api ocr
rll2641 Sep 26, 2025
abb0631
feature : easyocr
rll2641 Sep 26, 2025
d44956a
test : for external injection testing
rll2641 Sep 26, 2025
76ab508
test : for external injection testing 2
rll2641 Sep 26, 2025
a905f9d
Implement ExecutionLog API and improve traceId consistency (#215)
can019 Sep 27, 2025
61150f9
Write User-related API tests and API documentation (#217)
can019 Sep 27, 2025
cfc6397
Speed up CI (Java) through Gradle caching (#218)
can019 Sep 27, 2025
bb23534
chore: Update build.gradle to version 0.1.0-SNAPSHOT
can019 Sep 27, 2025
b4a62ff
chore: Split out the document artifact step
can019 Sep 27, 2025
369e616
fix: Add working directory to the document-java step
can019 Sep 27, 2025
8b524d7
feat: Add OCR-processed text to blog content generation (draft) - schema
thkim7 Sep 27, 2025
11f3d49
feat: Add OCR-processed text to blog content generation (draft) - utils
thkim7 Sep 27, 2025
4f9bdc7
feat: Add OCR-processed text to blog content generation (draft) - service
thkim7 Sep 27, 2025
6544dc8
chore: poetry run black .
thkim7 Sep 27, 2025
0810416
feat : Write google-vision-based OCR
rll2641 Sep 27, 2025
ffb5238
chore : Add __init__ files
rll2641 Sep 27, 2025
9833416
Merge branch 'develop' into feature/paddleocr
rll2641 Sep 27, 2025
553d280
fix : Correct the S3 folder path
rll2641 Sep 27, 2025
337910e
fix : Add a mount for specifying the API key path
rll2641 Sep 27, 2025
4d8f85e
Merge pull request #216 from Kernel180-BE12/feature/schedule-crud
thkim7 Sep 27, 2025
78d6163
Merge pull request #222 from Kernel180-BE12/feature/paddleocr
thkim7 Sep 27, 2025
23067f1
Merge pull request #221 from Kernel180-BE12/feature/ocr-rag
thkim7 Sep 27, 2025
13a8096
Upgrade Gradle action to v4 (#220)
can019 Sep 27, 2025
86d21a5
Merge branch 'main' into develop
can019 Sep 27, 2025
0a9fc5e
chore: fix lint
can019 Sep 27, 2025
e92bf72
chore: Fix javadoc merged
can019 Sep 27, 2025
346e4f4
Separate Spring configuration from the docker image and add Log4j2 automatic monitoring (#228)
can019 Sep 27, 2025
35 changes: 33 additions & 2 deletions .github/workflows/ci-java.yml
@@ -15,21 +15,47 @@ on:
- ".github/workflows/ci-java.yml"

permissions:
contents: read
contents: write # for generating the dependency graph
packages: write
security-events: write
checks: write
pull-requests: write
pages: write # GitHub Pages deployment
id-token: write # GitHub Pages deployment
actions: write

jobs:
dependency-submission:
if: github.event_name != 'pull_request'
runs-on: ubuntu-latest
steps:
- name: Checkout sources
uses: actions/checkout@v4

- name: Setup Java
uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: '21'

- name: Generate and submit dependency graph
uses: gradle/actions/dependency-submission@v4
with:
build-root-directory: apps/user-service

spotless-check:
if: github.event.pull_request.draft == false
name: Lint Check
runs-on: ubuntu-latest

steps:
- name: Debug cache settings
run: |
echo "Event name: ${{ github.event_name }}"
echo "Event type: ${{ github.event.action }}"
echo "Cache read-only condition: ${{ github.event_name == 'pull_request' }}"
echo "GitHub ref: ${{ github.ref }}"

- name: Checkout repository
uses: actions/checkout@v4

@@ -44,6 +70,11 @@ jobs:
uses: gradle/actions/setup-gradle@v3
with:
cache-read-only: ${{ github.event_name == 'pull_request' }}
gradle-home-cache-cleanup: false
gradle-home-cache-includes: |
caches
notifications
wrapper

- name: Grant execute permission for Gradle wrapper
run: chmod +x ./gradlew
@@ -73,7 +104,7 @@ jobs:
distribution: 'temurin'

- name: Setup Gradle
uses: gradle/actions/setup-gradle@v3
uses: gradle/actions/setup-gradle@v4
with:
cache-read-only: ${{ github.event_name == 'pull_request' }}
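The job conditions and cache mode in this workflow can be modelled in a few lines of Python to see what runs for a given trigger. This is only an illustrative sketch: `Event`, `summarize`, and the helper functions are hypothetical names that do not exist in the repository, and the model assumes that a non-pull-request event is treated like a non-draft one by the `draft == false` check.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for the `github` context used in ci-java.yml."""

    name: str  # e.g. "push" or "pull_request"
    draft: bool = False  # assumption: non-PR events count as non-draft


def runs_dependency_submission(event: Event) -> bool:
    # Mirrors: if: github.event_name != 'pull_request'
    return event.name != "pull_request"


def runs_lint_check(event: Event) -> bool:
    # Mirrors: if: github.event.pull_request.draft == false
    return not event.draft


def cache_read_only(event: Event) -> bool:
    # Mirrors: cache-read-only: ${{ github.event_name == 'pull_request' }}
    return event.name == "pull_request"


def summarize(event: Event) -> Dict[str, bool]:
    """Collect the three decisions for one event into a single mapping."""
    return {
        "dependency_submission": runs_dependency_submission(event),
        "lint_check": runs_lint_check(event),
        "cache_read_only": cache_read_only(event),
    }
```

Under these assumptions, pull requests only read the Gradle cache and skip dependency submission, while other events may write the cache and submit the dependency graph.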

22 changes: 21 additions & 1 deletion .github/workflows/deploy-java.yml
@@ -79,7 +79,7 @@ jobs:
source: "docker/production/promtail-config.yml"
target: "~/app"

- name: Copy promtail-config to EC2
- name: Copy agent-config to EC2
uses: appleboy/[email protected]
with:
host: ${{ secrets.SERVER_HOST }}
@@ -89,6 +89,26 @@ jobs:
target: "~/app"
overwrite: true

- name: Copy application-production.yml to EC2
uses: appleboy/[email protected]
with:
host: ${{ secrets.SERVER_HOST }}
username: ubuntu
key: ${{ secrets.SERVER_SSH_KEY }}
source: "apps/user-service/src/main/resources/application-production.yml"
target: "~/app/docker/production/config/application-production.yml"
overwrite: true

- name: Copy log4j2-production.yml to EC2
uses: appleboy/[email protected]
with:
host: ${{ secrets.SERVER_HOST }}
username: ubuntu
key: ${{ secrets.SERVER_SSH_KEY }}
source: "apps/user-service/src/main/resources/log4j2-production.yml"
target: "~/app/docker/production/config/log4j2-production.yml"
overwrite: true

- name: Deploy on EC2
uses: appleboy/[email protected]
with:
21 changes: 9 additions & 12 deletions apps/pre-processing-service/Dockerfile
@@ -2,13 +2,11 @@
FROM python:3.11-slim AS builder
WORKDIR /app

# Required OS packages (existing + packages added for Chrome installation)
# Install required OS packages
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
wget \
unzip \
gnupg \
ca-certificates \
build-essential \
&& rm -rf /var/lib/apt/lists/*

# Install Poetry
@@ -20,16 +18,15 @@ RUN poetry self add "poetry-plugin-export>=1.7.0"
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Resolve dependencies → export to requirements → install with pip (= always installed into /opt/venv)
# poetry → export to requirements → install with pip
COPY pyproject.toml poetry.lock ./
RUN poetry export --without dev -f requirements.txt -o requirements.txt \
&& pip install --no-cache-dir -r requirements.txt

# ---- runtime ----
FROM python:3.11-slim AS final
WORKDIR /app

# Install packages needed to install Chrome and ChromeDriver
# Install packages needed to install Chrome and ChromeDriver (to be removed - mount approach)
RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
unzip \
@@ -38,22 +35,22 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*

# Install Chrome (blog method - download the .deb file directly)
# Install Chrome (to be removed - mount approach)
RUN wget -q https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb \
&& apt-get update \
&& apt-get install -y ./google-chrome-stable_current_amd64.deb \
&& rm ./google-chrome-stable_current_amd64.deb \
&& rm -rf /var/lib/apt/lists/*

# Install MeCab & dictionary (dependency for morphological analysis)
# Install MeCab & dictionary (to be removed - mount approach)
RUN apt-get update && apt-get install -y --no-install-recommends \
mecab \
libmecab-dev \
wget \
build-essential \
&& rm -rf /var/lib/apt/lists/*

# Manually install the Korean dictionary
# Manually install the Korean dictionary (to be removed - mount approach)
RUN cd /tmp && \
wget https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-2.1.1-20180720.tar.gz && \
tar -zxf mecab-ko-dic-2.1.1-20180720.tar.gz && \
@@ -70,5 +67,5 @@ ENV PATH="/opt/venv/bin:$PATH"
# App source
COPY . .

# (Recommended alternative) To run via a process manager without importing uvicorn in the code:
ENTRYPOINT ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "app.main:app", "-b", "0.0.0.0:8000", "--timeout", "120"]
# Run the FastAPI app with gunicorn - timeout set to 240 seconds
ENTRYPOINT ["gunicorn", "-k", "uvicorn.workers.UvicornWorker", "app.main:app", "-b", "0.0.0.0:8000", "--timeout", "240"]
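The same gunicorn options could also be expressed as a Python config file rather than CLI flags. This is a sketch only: the PR keeps the flags in `ENTRYPOINT`, and a `gunicorn.conf.py` file is a hypothetical alternative that is not part of this change. The setting names (`wsgi_app`, `worker_class`, `bind`, `timeout`) are gunicorn's own.

```python
# gunicorn.conf.py (hypothetical file; this PR passes the same values as CLI flags)

# Application to serve, equivalent to the positional "app.main:app" argument.
wsgi_app = "app.main:app"

# Equivalent to: -k uvicorn.workers.UvicornWorker
worker_class = "uvicorn.workers.UvicornWorker"

# Equivalent to: -b 0.0.0.0:8000
bind = "0.0.0.0:8000"

# Equivalent to: --timeout 240 (seconds a silent worker may run before being restarted)
timeout = 240
```

With such a file in the working directory, the entrypoint would reduce to running `gunicorn` with no further arguments, which keeps later timeout changes out of the Dockerfile.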
17 changes: 17 additions & 0 deletions apps/pre-processing-service/app/api/endpoints/blog.py
@@ -10,10 +10,27 @@
from app.utils.response import Response
from app.service.blog.blog_create_service import BlogContentService
from app.service.blog.blog_publish_service import BlogPublishService
from app.service.ocr.S3OCRProcessor import S3OCRProcessor

router = APIRouter()


@router.post(
    "/ocr/extract",
    response_model=ResponseImageTextExtract,
    summary="Extract and translate text from S3 images",
)
async def ocr_extract(request: RequestImageTextExtract):
    """
    Extract and translate text from S3 images
    """
    processor = S3OCRProcessor(request.keyword)

    result = processor.process_images()

    return Response.ok(result)
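The handler above is declared `async` but calls `process_images()` directly. If that method performs blocking S3 or OCR calls (an assumption, since the implementation of `S3OCRProcessor` is not shown in this diff), one option is to push it onto a worker thread. The sketch below uses a hypothetical `FakeOCRProcessor` in place of the real class and is not the PR's code.

```python
import asyncio
import time
from typing import Dict, List


class FakeOCRProcessor:
    """Hypothetical stand-in for S3OCRProcessor; it only simulates blocking work."""

    def __init__(self, keyword: str) -> None:
        self.keyword = keyword

    def process_images(self) -> Dict[str, str]:
        time.sleep(0.05)  # simulates blocking S3 / OCR calls
        return {"keyword": self.keyword, "status": "done"}


async def ocr_extract_sketch(keyword: str) -> Dict[str, str]:
    processor = FakeOCRProcessor(keyword)
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(processor.process_images)


async def main() -> List[Dict[str, str]]:
    # Two requests are handled concurrently instead of one after the other.
    return await asyncio.gather(ocr_extract_sketch("a"), ocr_extract_sketch("b"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

`asyncio.to_thread` requires Python 3.9 or later, which the `python:3.11-slim` base image in this PR satisfies.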


@router.post(
    "/rag/create",
    response_model=ResponseBlogCreate,
3 changes: 3 additions & 0 deletions apps/pre-processing-service/app/core/config.py
@@ -106,6 +106,9 @@ class BaseSettingsConfig(BaseSettings):
    # Fields for testing/additional use
    OPENAI_API_KEY: Optional[str] = None  # << added this part

    # OCR translator settings
    google_application_credentials: Optional[str] = None

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
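Google client libraries conventionally locate a service-account key through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. How this PR passes the new setting to the OCR code is not visible in the diff, so the following is only a sketch of one possible hand-off; `SettingsSketch` and `export_google_credentials` are hypothetical names.

```python
from dataclasses import dataclass
from typing import MutableMapping, Optional


@dataclass
class SettingsSketch:
    """Hypothetical stand-in for BaseSettingsConfig with only the new field."""

    google_application_credentials: Optional[str] = None


def export_google_credentials(
    settings: SettingsSketch, environ: MutableMapping[str, str]
) -> bool:
    """Copy the configured key path into the environment mapping, if one is set.

    Returns True when a path was configured, False otherwise.
    An already-present value in `environ` is left untouched.
    """
    path = settings.google_application_credentials
    if not path:
        return False
    environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", path)
    return True
```

In real use `environ` would be `os.environ`; taking it as a parameter keeps the helper easy to test without touching the process environment.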

44 changes: 39 additions & 5 deletions apps/pre-processing-service/app/model/schemas.py
@@ -186,6 +186,13 @@ class S3ImageInfo(BaseModel):
        ..., title="Original URL", description="Original URL of the crawled image"
    )
    s3_url: str = Field(..., title="S3 URL", description="URL accessible from S3")
    # Newly added: file size info (for image selection)
    file_size_kb: Optional[float] = Field(
        None, title="File size (KB)", description="Image file size"
    )
    file_name: Optional[str] = Field(
        None, title="File name", description="File name stored in S3"
    )
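The comment says `file_size_kb` exists for image selection, but the selection rule itself is not part of this hunk. As a rough sketch under that caveat, the snippet below picks the largest images above a size threshold. `ImageInfoSketch`, `pick_largest`, and the threshold values are all hypothetical and stand in for whatever rule the service applies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageInfoSketch:
    """Hypothetical, trimmed-down mirror of S3ImageInfo."""

    s3_url: str
    file_size_kb: Optional[float] = None
    file_name: Optional[str] = None


def pick_largest(
    images: List[ImageInfoSketch], limit: int = 2, min_kb: float = 0.0
) -> List[ImageInfoSketch]:
    """Return up to `limit` images of at least `min_kb`, largest first.

    Images without a known size are skipped, since the field is optional.
    """
    sized = [
        img
        for img in images
        if img.file_size_kb is not None and img.file_size_kb >= min_kb
    ]
    return sorted(sized, key=lambda img: img.file_size_kb, reverse=True)[:limit]
```

Skipping entries with `file_size_kb=None` is a design choice of this sketch: both new fields default to `None`, so records written before this change would otherwise need a fallback size.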


# S3 upload result per product
@@ -274,14 +281,18 @@ class RequestBlogCreate(RequestBase):
    keyword: Optional[str] = Field(
        None, title="Keyword", description="Keyword for content generation"
    )
    translation_language: Optional[str] = Field(
        None,
        title="Translated language",
        description="Language translated from Chinese into Korean from the image",
    )
    product_info: Optional[Dict] = Field(
        None, title="Product info", description="Product info to include in the blog content"
    )
    content_type: Optional[str] = Field(
        None, title="Content type", description="Type of content to generate"
    )
    target_length: Optional[int] = Field(
        None, title="Target character count", description="Target length of the content to generate"
    uploaded_images: Optional[List[Dict]] = Field(
        None,
        title="Uploaded images",
        description="List of images uploaded to S3 (including size info)",
    )


@@ -301,6 +312,29 @@ class ResponseBlogCreate(ResponseBase[BlogCreateData]):
    pass


# ================== Image text extraction and translation ==================
class RequestImageTextExtract(RequestBase):
    keyword: Optional[str] = Field(
        ..., title="Keyword", description="Keyword for text extraction"
    )


class ImageTextExtract(BaseModel):
    keyword: Optional[str] = Field(
        ..., title="Keyword", description="Keyword for text extraction"
    )
    extraction_language: str = Field(
        ..., title="Extracted text", description="Text extracted from the image"
    )
    translation_language: str = Field(
        ..., title="Translated text", description="Translation of the extracted text"
    )


class ResponseImageTextExtract(ResponseBase[ImageTextExtract]):
    pass


# ============== Blog publishing ==============

