
migration from 8.11.4 to 10.1.3 memory shortage on github action #907

Open
erwanriou opened this issue Jan 10, 2025 · 14 comments

Comments

erwanriou commented Jan 10, 2025

Versions

  • NodeJS: 22
  • mongodb-memory-server-*: 10.1.3
  • mongodb(the binary version): 6.0.5
  • mongoose: 8.8.3
  • system: github action (I believe it's ubuntu 24 now)

package: mongodb-memory-server

What is the Problem?

I updated mongodb-memory-server to the latest version in order to pass a test that makes use of a feature added in MongoDB 5.2, and the test passed locally (although much, much slower). Since deploying it on CI, though, the GitHub Action times out due to memory, even though I don't have that many tests in it.

Code Example

I have a total of 78 test suites (in some other backends I have more than 200 without any issues).
I believe a screenshot is better than more explanations:
[screenshot]

So I believe there might be some sort of memory leak underneath that generates the issue. In the meantime I will do

await MongoMemoryServer.create({ binary: { version: "6.0.5" } })
erwanriou added the bug label Jan 10, 2025
@hasezoey
Member

In the meantime I will do

Alternatively, you could also try to use mongodb-memory-server version 9.x, see Default MongoDB Versions for MMS.


I don't know of anything in MMS that could leak that much, and I am also not aware of any change in mongodb that would lead to a massive increase. Do you clean your database between test suites?
An alternative could be to use storageEngine: 'wiredTiger' with a different location than the default /tmp.

Is the log / repository public? If so, it would be great to investigate there.
Regardless of whether it is public or not, could you add a command in your test environment that runs du -xh /tmp and also ps aux --sort=pmem once the ENOSPC error happens, and post the result? (it needs to be in jest so it can capture the state when the error happens, and not after the processes get killed)

@erwanriou
Author

@hasezoey I could try version 9 too, indeed (they might have a higher default binary, that's true)

Yes, I do have hooks between tests to clean and clear:

// DROP TEST DATABASE
beforeEach(async () => await dbhandler.clearDatabase())

The create and clear helpers are quite standard:

const mongoose = require("mongoose")
const { MongoMemoryServer } = require("mongodb-memory-server")

let mongo

module.exports.connect = async () => {
  const systemBinary = process.env.MONGOMS_SYSTEM_BINARY || undefined
  const version = process.env.MONGOMS_VERSION || "6.0.19"

  mongo = await MongoMemoryServer.create({ binary: { version, systemBinary } })
  const mongoUri = mongo.getUri() // getUri() is synchronous in recent versions
  mongoose.set("strictQuery", true)
  await mongoose.connect(mongoUri)
}

module.exports.clearDatabase = async () => {
  const collections = await mongoose.connection.db.collections()
  for (let collection of collections) {
    await collection.deleteMany({})
  }
}

I don't make use of views in this backend, so it's a simple one, but I do have a few aggregates, lookups and such. Sadly it's not a public repository :(
I will do what you requested on the CI/CD and send the logs here. (The tests work locally, they are just very slow to finish on version 10.)
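For what it's worth, the usual counterpart to these helpers is a teardown that closes the connection and stops the server; if it never runs, every suite leaves a mongod process and its /tmp data directory behind. A minimal sketch (closeDatabase is a made-up name; conn and server stand in for mongoose.connection and the MongoMemoryServer instance from connect()):

```javascript
// Sketch of a matching teardown for the connect/clearDatabase helpers above.
// `conn` plays the role of mongoose.connection and `server` the MongoMemoryServer
// instance; both parameter names are illustrative.
async function closeDatabase(conn, server) {
  await conn.dropDatabase() // drop the test database
  await conn.close()        // close the mongoose connection
  await server.stop()       // stop the in-memory mongod and release its data dir
}

module.exports = { closeDatabase }
```

Calling this from an afterAll hook ensures the binary and its storage are released even when a suite fails.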

@hasezoey
Member

just are very slow to finish on the version 10

To confirm, are you using jest with global-setup? Or do you start a mongodb-memory-server instance for each test-suite?

@erwanriou
Author

erwanriou commented Jan 11, 2025

Yes, I do have a global setup for jest that is used by all the tests afterwards.
The config is simple:

"jest": {
  "testEnvironment": "node",
  "setupFilesAfterEnv": [
    "./src/test/setup_test.js"
  ]
}

The main differences between local and CI/CD are the commands I launch, which are respectively:

{
  "test": "jest --watchAll --runInBand --verbose",
  "test:ci": "jest --forceExit --detectOpenHandles --maxWorkers=10"
}
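(Worth noting as context, though this is my reading rather than anything confirmed in the thread: with setupFilesAfterEnv, each test file runs the setup and can start its own MongoMemoryServer, so --maxWorkers=10 allows up to ten mongod processes alive at once on the CI runner, while --runInBand only ever has one at a time. Lowering the worker count is one cheap thing to try on a memory-constrained runner; the value 2 below is only an example:)

```json
{
  "test:ci": "jest --forceExit --detectOpenHandles --maxWorkers=2"
}
```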

To give you an idea of what a test would look like:

const request = require("supertest")
const app = require("../../../../app")
// (requires for PreprofileCreatedList, NatsWrapper and PreProfile omitted here)

const eventBuilder = async () => {
  const listener = new PreprofileCreatedList(NatsWrapper.client())
  const data = global.preprofileGenerator({})
  const msg = { ack: jest.fn() }
  return { listener, data, msg }
}

it("returns a 200 when fetching list of preprofiles as admin", async () => {
  // GENERATE AND APPROVE 3 ARCHITECT PREPROFILES
  for (let i = 0; i < 3; i++) {
    const { listener, data, msg } = await eventBuilder()
    await listener.onMessage(data, msg)
    const preprofile = await PreProfile.findById(data._id)
    expect(preprofile._id.toString()).toEqual(data._id)
  }

  // FETCH ARCHITECT WITH ADMIN API
  const adminCookie = await global.adminRegister()
  const page = 0
  const limit = 100

  const { body } = await request(app)
    .get(`/api/architect/admin/example-endpoint/list?page=${page}&limit=${limit}`)
    .set({ Host: "www.example.com" })
    .set("Cookie", adminCookie)
    .expect(200)

  expect(body[0].total).toEqual([{ count: 3 }])
  expect(body[0].preprofiles).toHaveLength(3)
})

@Tim0theus

Tim0theus commented Jan 12, 2025

EDIT: It seems that Jest increased heap usage with each test, which led to this ENOMEM. So my problem below is not directly with MMS, but with Jest instead.

Hey, I'm not sure if this is related, but since updating Node in my Dockerfile from 20 to 22.13 and then also updating mongodb-memory-server to 10.1.3, I see a new ENOMEM error when testing, which didn't happen before.
I already tried to increase the memory, but that didn't help so far. As the heap is not fully using the available memory in my case, it most likely doesn't have to do with not enough memory anyway.
I also tried WiredTiger instead and it didn't help.

I read that it might have to do with virtual memory and that increasing vm.max_map_count might help. However, since I'm using an AWS build environment and Fargate, I think that's not possible there. (Still researching.)

However, I'll try changing the binary and see if that helps. I didn't try that so far.

Here is the error:

Starting the MongoMemoryServer Instance failed, enable debug log for more information. Error:
--
3117 | Error: spawn ENOMEM
3118 | at ChildProcess.spawn (node:internal/child_process:420:11)
3119 | at spawn (node:child_process:753:9)
3120 | at MongoInstance._launchMongod (/backend/node_modules/mongodb-memory-server-core/src/util/MongoInstance.ts:506:31)
3121 | at MongoInstance.start (/backend/node_modules/mongodb-memory-server-core/src/util/MongoInstance.ts:394:31)
3122 | at async Function.create (/backend/node_modules/mongodb-memory-server-core/src/util/MongoInstance.ts:294:5)
3123 | at async MongoMemoryServer._startUpInstance (/backend/node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:530:22)
3124 | at async MongoMemoryServer.start (/backend/node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:350:5)
3125 | at async Function.create (/backend/node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:317:5) {
3126 | errno: -12,
3127 | code: 'ENOMEM',
3128 | syscall: 'spawn'
3129 | }
3130 |  
3131 | at node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:359:17
3132 | at async MongoMemoryServer.start (node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:350:5)
3133 | at async Function.create (node_modules/mongodb-memory-server-core/src/MongoMemoryServer.ts:317:5)

@hasezoey
Member

hasezoey commented Jan 12, 2025

Starting the MongoMemoryServer Instance failed, enable debug log for more information. Error:
3117 | Error: spawn ENOMEM

Well, this is weird. This means that the mongodb binary couldn't even start because there was no memory left; to my knowledge this issue (#907) is about running out of memory while a binary is already running (though I don't know yet whether the issue is mongodb, MMS, or something else here).

I don't think I can help in your case; you will need to look into why it is already out of memory at that point. (If it is actually MMS, please open a new issue.)

@orimdominic

orimdominic commented Feb 4, 2025

It is obviously an issue with v10.
I used v9.1.6 and it was fine.
Check https://github.com/orimdominic/nodejs-test-runner-mongoose/blob/06-pass-tests/

When I used v10, resources were created in /tmp and were not deleted after the tests completed (see screenshot), and it caused my computer's memory to fill up.

[screenshot]

@hasezoey
Member

hasezoey commented Feb 4, 2025

When I used v10, resources were created in /tmp and were not deleted after tests were complete (see screenshot) and it caused my computer memory to be full.

From what I recall and can tell from comparing (between old/9.x and v10.1.3), nothing has changed regarding the cleanup logic; only the Explicit Resource Management feature was added, which you need to opt into explicitly, otherwise it behaves the same as 9.x.

Thanks for providing repro code, and I can reproduce it, weirdly enough. Though note that even without modifications, your provided code fails.

Fail logs of repro code
$ node --test --import ./test.setup.mjs
(node:11139) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
▶ TaskService.create
  ✔ creates a task and returns it (23.514855ms)
✔ TaskService.create (65.394208ms)
▶ TaskService.getAllForUser
  ✔ returns all the tasks for a user (13.911151ms)
✔ TaskService.getAllForUser (14.167921ms)
▶ TaskService.getOneWithUser
  ✔ throws an error if task is not found (3.927656ms)
  ✔ returns a task with the user details (10.449805ms)
✔ TaskService.getOneWithUser (14.713043ms)
(node:11140) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
▶ UserService.create
  ✔ creates and returns a user (25.261628ms)
  ✔ throws an error when email exists already (6.768191ms)
✔ UserService.create (72.224372ms)
▶ UserService.getById
  ✖ throws an error if user is not found (3.618324ms)
    AssertionError [ERR_ASSERTION]: Missing expected rejection.
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
        at async Test.run (node:internal/test_runner/test:935:9)
        at async Promise.all (index 0)
        at async Suite.run (node:internal/test_runner/test:1320:7)
        at async Test.processPendingSubtests (node:internal/test_runner/test:633:7) {
      generatedMessage: false,
      code: 'ERR_ASSERTION',
      actual: undefined,
      expected: undefined,
      operator: 'rejects'
    }

  ✖ returns a user with their total tasks (7.296001ms)
    TypeError [Error]: email.toLowerCase is not a function
        at Module.create (file:///home/hasezoey/Downloads/test/nodejs-test-runner-mongoose/user/service.mjs:16:61)
        at TestContext.<anonymous> (file:///home/hasezoey/Downloads/test/nodejs-test-runner-mongoose/user/__tests__/service.test.mjs:50:21)
        at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
        at async Test.run (node:internal/test_runner/test:935:9)
        at async Suite.processPendingSubtests (node:internal/test_runner/test:633:7)

✖ UserService.getById (11.493237ms)
ℹ tests 8
ℹ suites 5
ℹ pass 6
ℹ fail 2
ℹ cancelled 0
ℹ skipped 0
ℹ todo 0
ℹ duration_ms 406.52901

✖ failing tests:

test at user/__tests__/service.test.mjs:34:5
✖ throws an error if user is not found (3.618324ms)
  AssertionError [ERR_ASSERTION]: Missing expected rejection.
      at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
      at async Test.run (node:internal/test_runner/test:935:9)
      at async Promise.all (index 0)
      at async Suite.run (node:internal/test_runner/test:1320:7)
      at async Test.processPendingSubtests (node:internal/test_runner/test:633:7) {
    generatedMessage: false,
    code: 'ERR_ASSERTION',
    actual: undefined,
    expected: undefined,
    operator: 'rejects'
  }

test at user/__tests__/service.test.mjs:38:5
✖ returns a user with their total tasks (7.296001ms)
  TypeError [Error]: email.toLowerCase is not a function
      at Module.create (file:///home/hasezoey/Downloads/test/nodejs-test-runner-mongoose/user/service.mjs:16:61)
      at TestContext.<anonymous> (file:///home/hasezoey/Downloads/test/nodejs-test-runner-mongoose/user/__tests__/service.test.mjs:50:21)
      at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
      at async Test.run (node:internal/test_runner/test:935:9)
      at async Suite.processPendingSubtests (node:internal/test_runner/test:633:7)
error Command failed with exit code 1.

I will further investigate this, but from my first findings, somehow the paths in the logs don't match up with the ones that still exist afterwards, and the ones that exist afterwards are also never mentioned in the logs. Very weird.

@orimdominic

orimdominic commented Feb 4, 2025

Hello @hasezoey, yes. The two tests that fail were supposed to fail.
The code to pass those tests has not been written yet.
You can skip them.

Can you point to the part of the documentation where one can explicitly enable the Explicit Resource Management feature?

What do you mean by "the paths in the logs don't match up with the ones that still exist after, and the ones that exist after are also never mentioned in the logs"?

@hasezoey
Member

hasezoey commented Feb 4, 2025

What do you mean by the paths in the logs don't match up with the ones that still exist after, and the ones that exist after are also never mentioned in the logs.?

In the logs (when running with debug logs), it for example only says it creates mongo-mem-AAAA and mongo-mem-BBBB, and also says it deletes them (as confirmed via inotifywatch), but then mongo-mem-CCCC still exists afterwards. (assuming a clean /tmp)


I have further debugged this; it seems like the problem is the killer script: it somehow gets the test.before executed, but never the test.after, i.e. the killer script launches its own mongodb server, which never gets stopped and cleaned up (the mongo-mem-CCCC in the previous example).
Weirdly, nothing regarding this has changed for quite a while (at least from 9.5.0 to 10.1.3).
Even though there is no logically visible change, I still did a git bisect (and manual testing, as I could not believe it), which says the first problematic commit is 946e78e.
Which is very weird, as this only changes the default mongodb version from 7.0.11 to 7.0.14 and changes no other code (literally only string changes).
I have tested various mongodb versions, which showed varying behavior, but then I tried to await a timeout in after to see if they all just behaved the same, only differently fast. Which was the case.

TL;DR: somehow the test's after also gets executed in the killer script, which causes more servers to be created, which causes more killer scripts to be created (and the loop continues), mostly until the main process exits.

I have not seen this behavior outside of node:test (or at least not this bad), but I don't know how I could circumvent it, as fork seemingly implies --import YOURMAINSCRIPT, which is (to my understanding) the main cause.
I guess I will have to try to find a new way to launch the killer script.

This is definitely a problem, though I don't know why yet; but unless @erwanriou can confirm this behavior in the jest case, it is a different problem.

@orimdominic

orimdominic commented Feb 4, 2025


Thank you for this. It's insightful.
Is this something worth talking to the folks at Node.js about?

@hasezoey
Member

hasezoey commented Feb 4, 2025

Re #907 (comment)

I have created #912 to track this.

Is this something worth talking to the folks at Node.js about?

I don't think this is necessary; see nodejs/node#52930 and nodejs/node#52939.
TL;DR: doc: specify that preloaded modules affect subprocesses

@hasezoey
Member

hasezoey commented Feb 4, 2025

Can you point to the part of the documentation where one can explicitly enable the Explicit Resource Management feature?

Sorry, forgot about this; see the FAQ: Does this package support Explicit Resource Management? and the ECMAScript Explicit Resource Management Proposal.
For a TL;DR, read the TypeScript 5.2 release notes on "using Declarations and Explicit Resource Management".

mongodb-memory-server example:

async function main() {
  // `await using` requires an async context, and create() returns a Promise,
  // so it must itself be awaited before being bound
  await using dbServer = await MongoMemoryServer.create();
  // do your stuff
  // automatically disposed at the end of the variable's scope, calling ".stop()" (unless disabled via options)
}

@erwanriou
Author

Sorry guys, I completely forgot to check my logs. My issue was indeed when launching jest, not related to node:test.
I just updated the pipeline temporarily to check that:

      - uses: actions/checkout@v3

      - name: Check initial disk space
        run: df -h

      - name: Clear temporary files
        run: sudo rm -rf /tmp/* || true

      - name: Install dependencies
        run: |-
          echo -e "//npm.pkg.github.com/:_authToken=${{secrets.CI_TOKEN}} \n@archsplace:registry=https://npm.pkg.github.com/archsplace \nregistry=https://registry.npmjs.org \ntimeout=60000 \nupdate-notifier=false" > .npmrc && npm i

      - name: Preload MongoDB binary
        run: |
          mkdir -p /tmp/mongodb-binaries
          curl -o /tmp/mongodb-binaries/mongodb-linux-x86_64-ubuntu2204-6.0.19.tgz https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-ubuntu2204-6.0.19.tgz
          tar -xvzf /tmp/mongodb-binaries/mongodb-linux-x86_64-ubuntu2204-6.0.19.tgz -C /tmp/mongodb-binaries
          echo "MongoDB binary downloaded and extracted."

      - name: Run tests
        env:
          MONGOMS_DOWNLOAD_URL: https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-ubuntu2204-6.0.19.tgz
          MONGOMS_SYSTEM_BINARY: /tmp/mongodb-binaries/mongodb-linux-x86_64-ubuntu2204-6.0.19/bin/mongod
        run: |
          npm run test:ci || { 
            echo "Test failure encountered (possibly ENOSPC). Capturing /tmp disk usage:"; 
            du -xh /tmp; 
            echo "Process list (sorted by memory):"; 
            ps aux --sort=pmem; 
            exit 1; 
          }

And as crazy as it seems, I tried to reproduce the bug but now all the tests keep passing with the latest version. It's true that I am enforcing a specific mongodb version, but still, I don't know why it would be working now. I could remove the logic enforcing a specific mongodb binary and check, maybe...
