Commit f5d92bd

Trying to get new CI to work
1 parent 1db1b34 commit f5d92bd

6 files changed: +35 -15 lines changed

.github/workflows/main.yml

+1 -1
@@ -112,7 +112,7 @@ jobs:
     needs: [checks]
     env:
       BEAKER_TOKEN: ${{ secrets.BEAKER_TOKEN }}
-      BEAKER_IMAGE: chrisw/olmocr-gpu-ci
+      BEAKER_IMAGE: jakep/olmocr-gpu-ci
       BEAKER_BUDGET: ai2/oe-data
       BEAKER_WORKSPACE: ai2/olmocr
     steps:

scripts/beaker/gpu-ci-script.sh

+1 -1
@@ -9,7 +9,7 @@ git clone https://github.com/allenai/olmocr.git olmocr \
     .[gpu] \
     pytest \
     --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/ \
-    && bash tests/gnarly_pdfs/test_gnarly_pdfs.sh
+    && bash scripts/run_integration_test.sh

scripts/run_integration_test.sh

+6
@@ -0,0 +1,6 @@
+#/usr/bin/bash
+
+set -ex
+
+python -m olmocr.pipeline ./localworkspace --pdfs tests/gnarly_pdfs/ambiguous.pdf tests/gnarly_pdfs/edgar.pdf tests/gnarly_pdfs/dolma-page-1.pdf \
+    && pytest tests/test_integration.py
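
The script above chains two steps with `&&`, so the tests only run if the pipeline exits cleanly. A rough Python equivalent of that control flow might look like the sketch below. `run_chain` and `INTEGRATION_STEPS` are hypothetical names that do not appear in this commit, and `sys.executable` stands in for the script's plain `python`.

```python
import subprocess
import sys
from typing import Sequence


def run_chain(commands: Sequence[Sequence[str]]) -> int:
    """Run commands in order and stop at the first non-zero exit code, like `a && b`."""
    for cmd in commands:
        # Echo each command before running it, similar to `set -x`.
        print("+", " ".join(cmd), flush=True)
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


# The two steps from scripts/run_integration_test.sh, expressed as argument lists.
INTEGRATION_STEPS = [
    [
        sys.executable, "-m", "olmocr.pipeline", "./localworkspace",
        "--pdfs",
        "tests/gnarly_pdfs/ambiguous.pdf",
        "tests/gnarly_pdfs/edgar.pdf",
        "tests/gnarly_pdfs/dolma-page-1.pdf",
    ],
    ["pytest", "tests/test_integration.py"],
]
```

Calling `sys.exit(run_chain(INTEGRATION_STEPS))` from the repository root would then mirror the shell script's exit status.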

tests/gnarly_pdfs/test_gnarly_pdfs.py

-7
This file was deleted.

tests/gnarly_pdfs/test_gnarly_pdfs.sh

-6
This file was deleted.

tests/test_integration.py

+27
@@ -0,0 +1,27 @@
+import glob
+import json
+import os
+import unittest
+
+import pytest
+
+
+@pytest.mark.nonci
+class TestPipelineIntegration(unittest.TestCase):
+    def setUp(self):
+        self.data = []
+
+        for file in glob.glob(os.path.join("localworkspace", "results", "*.jsonl")):
+            with open(file, "r") as jf:
+                for line in jf:
+                    if len(line.strip()) > 0:
+                        self.data.append(json.loads(line))
+
+    def test_edgar(self) -> None:
+        self.assertTrue(any("King of England" in line["text"] for line in self.data))
+
+    def test_ambig(self) -> None:
+        self.assertTrue(any("Apples and Bananas" in line["text"] for line in self.data))
+
+    def test_dolma(self) -> None:
+        self.assertTrue(any("We extensively document Dolma" in line["text"] for line in self.data))
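
The loading and matching logic in `setUp` and the three tests could be factored into standalone helpers, which makes it easy to try against a scratch workspace. The sketch below is one possible shape, not part of this commit: `load_results` and `any_text_contains` are hypothetical names, and unlike the test it treats a missing or null `"text"` field as empty instead of raising `KeyError`.

```python
import glob
import json
import os
from typing import Any, Dict, List


def load_results(workspace: str) -> List[Dict[str, Any]]:
    """Collect every JSON record from <workspace>/results/*.jsonl, skipping blank lines."""
    records: List[Dict[str, Any]] = []
    pattern = os.path.join(workspace, "results", "*.jsonl")
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as jf:
            for line in jf:
                if line.strip():
                    records.append(json.loads(line))
    return records


def any_text_contains(records: List[Dict[str, Any]], needle: str) -> bool:
    """Return True if any record's "text" field contains the given substring."""
    # Assumption: records without a usable "text" field count as empty text.
    return any(needle in (record.get("text") or "") for record in records)
```

With these, a check such as `any_text_contains(load_results("localworkspace"), "King of England")` corresponds to `test_edgar`. Note that an empty or missing `results` directory yields an empty list, so every check returns `False` in that case.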
