Fix tests org.commoncrawl.* being ignored/skipped#44

Draft
lfoppiano wants to merge 5 commits into cc from bugfix/test-not-running

@lfoppiano

This PR fixes the problem described in #43

In short, all tests were run in a single forked process, and two tests in particular caused trouble: TestCommonCrawlDataDumper.dump() calls System.exit(), which kills the whole forked process, and TestMimeUtil had a similar problem.

For now I've separated the org.commoncrawl.* tests from the org.apache.* tests. The issue may still occur, though, and prevent other tests in the same group from being executed. Ideally we should avoid the System.exit() call (maybe with static mocking?) or run the tests in individual forks.
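As one possible alternative to static mocking, the exit call could be routed through an injectable handler that tests replace. The sketch below is only an illustration: `DumperSketch`, its constructors, and `dump(String)` are hypothetical names, not the actual Nutch or Common Crawl code.

```java
import java.util.function.IntConsumer;

/** Hypothetical stand-in for a tool class; not the actual Nutch code. */
class DumperSketch {
    // Exit hook: defaults to System.exit, but tests can swap it out.
    private final IntConsumer exitHandler;

    DumperSketch() {
        this(System::exit);
    }

    DumperSketch(IntConsumer exitHandler) {
        this.exitHandler = exitHandler;
    }

    /** Simulates a dump that bails out when the output directory is missing. */
    void dump(String outputDir) {
        if (outputDir == null || outputDir.isEmpty()) {
            exitHandler.accept(1);
            return; // reached only when the handler does not terminate the JVM
        }
        System.out.println("Dumping to " + outputDir);
    }

    public static void main(String[] args) {
        // A test-style handler records the exit code instead of killing the JVM.
        int[] recorded = {-1};
        DumperSketch dumper = new DumperSketch(code -> recorded[0] = code);
        dumper.dump("");
        System.out.println("Recorded exit code: " + recorded[0]);
    }
}
```

With a recording handler like this, a test can assert on the exit code while the forked test JVM keeps running the remaining tests.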

@lfoppiano lfoppiano changed the title Make tests org.commoncrawl.* be executed Fix tests org.commoncrawl.* being ignored/skipped Feb 25, 2026
@github-actions

github-actions bot commented Feb 25, 2026

Test Results

115 files  115 suites   3m 57s ⏱️
297 tests 291 ✅ 6 💤 0 ❌
291 runs  285 ✅ 6 💤 0 ❌

Results for commit 3bcac79.

♻️ This comment has been updated with latest results.

@sebastian-nagel

Hi, @lfoppiano.

> called at some point System.exit()

Thanks! That's a left-over of NUTCH-2852. Unclear why it hits our fork but not upstream Nutch. It should be fixed upstream. I hope this avoids any fix in the Common Crawl fork.

There has also been a lot of recent refactoring and improvement work around unit tests and workflow reports. I'd avoid any changes until these are fully merged into the cc branch. See NUTCH-3143, NUTCH-3126, NUTCH-3125, NUTCH-3042.

A few more are blocked because we need to upgrade our Hadoop cluster to 3.4.2 and JDK 17: NUTCH-3085 (https://issues.apache.org/jira/browse/NUTCH-3085) and, most importantly, the upgrade to JUnit 6 (NUTCH-3145).

@lfoppiano

OK, I'll leave it for now. The last commit fixes all the problems by running each test in a separate fork and by removing the System.exit() call from a non-main() method.

@lfoppiano

The same fix was committed in the upstream Nutch: apache@f8577a0

@sebastian-nagel

> The same fix was committed in the upstream Nutch: apache/nutch@f8577a0

Ok, that was NUTCH-3143.

In addition, the root cause should be addressed: there should be no System.exit() calls except in main() methods.
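A minimal sketch of that rule, assuming a tool whose work happens in a `run` method that returns a status code. `ExitOnlyInMain` and its usage message are hypothetical and only illustrate the pattern:

```java
/** Hypothetical tool; names are illustrative, not taken from Nutch. */
class ExitOnlyInMain {
    /** Does the work and reports the outcome as a return code; never exits the JVM. */
    static int run(String[] args) {
        if (args.length < 1) {
            System.err.println("Usage: ExitOnlyInMain <input>");
            return 1;
        }
        System.out.println("Processing " + args[0]);
        return 0;
    }

    public static void main(String[] args) {
        // The only place allowed to terminate the JVM.
        System.exit(run(args));
    }
}
```

Unit tests can then call `run(...)` directly and assert on the returned code, so a failing code path cannot take down the JVM that runs the tests.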

@lfoppiano

Yes, this is addressed at apache#903 👍
