Does not work in swarm mode #238

Closed
willvincent opened this issue Jun 24, 2019 · 16 comments

@willvincent

Have followed the instructions many times, starting from scratch each time. No luck whatsoever getting swarm mode configured and usable.

Docker version 18.09.6, build 481bc77

Best I can figure, this is being caused by the swarm plugin...? Works fine in "normal" mode, but swarm is completely broken.

The log shows unhandled promise rejections and a reference to a network being ambiguous.

[exoframe-server] › ℹ  info                         Initializing docker services...
[exoframe-server] › ℹ  info                         Exoframe network exoframe does not exists, creating...
[exoframe-server] › ℹ  info                         Exoframe network exoframe-swarm does not exists, creating...
(node:1) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
[exoframe-server] › ℹ  info                         Traefik instance started..
[exoframe-server] › ℹ  info                         Init finished via exclusive plugin: exoframe-plugin-swarm
[exoframe-server] › ℹ  info                         Server running at: 8080
(node:1) UnhandledPromiseRejectionWarning: Error: (HTTP code 400) unexpected - network  is ambiguous (4 matches found)
    at /snapshot/exoframe-server/bin/server-core.js:62119:17
    at getCause (/snapshot/exoframe-server/bin/server-core.js:62149:7)
    at Modem.module.exports.Modem.buildPayload (/snapshot/exoframe-server/bin/server-core.js:62118:5)
    at IncomingMessage.<anonymous> (/snapshot/exoframe-server/bin/server-core.js:62094:14)
    at IncomingMessage.emit (events.js:187:15)
    at endReadableNT (_stream_readable.js:1081:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)
(node:1) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:1) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

Is there a known good version? Specific versions of docker/traefik/etc that I should be using? The docs haven't been any help getting past this, but I'd have expected the latest release to be stable, or others to have reported problems.

@yamalight
Contributor

Latest docker/traefik versions should work fine. I haven't seen that error before 🤔
Seems like there's some issue with networks. Can you try running docker system prune and restarting exoframe-server to see if that helps?

@willvincent
Author

Server config:

debug: false
letsencrypt: false
letsencryptEmail: [email protected]
compress: true
baseDomain: false
cors: false
updateChannel: stable
traefikImage: 'traefik:latest'
traefikName: exoframe-traefik
traefikArgs: []
exoframeNetwork: exoframe
publicKeysPath: /root/.ssh
swarm: true
plugins:
  install: ['exoframe-plugin-swarm']
  swarm:
    enabled: true

Have tried with debug set to true as well, but that doesn't yield any more useful logging.

@willvincent
Author

docker system prune results in the deletion of the exoframe network. exoframe-swarm remains, but upon restart the exoframe network is recreated and the same error persists.

@yamalight
Contributor

What does your docker network ls output look like?

@willvincent
Author

NETWORK ID          NAME                DRIVER              SCOPE
5944c80a3e70        bridge              bridge              local
cab1aa285270        docker_gwbridge     bridge              local
j1sawq879o43        exoframe-swarm      overlay             swarm
b24010230e16        host                host                local
96g47bfvdzia        ingress             overlay             swarm
765a677587d5        none                null                local

@willvincent
Author

Had changed the config to point at the exoframe-swarm network, to no avail. Changed it back, and now I have:

root@exo01:~# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
5944c80a3e70        bridge              bridge              local
cab1aa285270        docker_gwbridge     bridge              local
130fb003ab69        exoframe            bridge              local
j1sawq879o43        exoframe-swarm      overlay             swarm
b24010230e16        host                host                local
96g47bfvdzia        ingress             overlay             swarm
765a677587d5        none                null                local

After restart, same error in the log.
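
One way to double-check a listing like the one above for name collisions is sketched here. This is a minimal Python sketch and not part of exoframe: `parse_network_ls` and `duplicate_names` are hypothetical helper names, and the code assumes the default four-column table printed by `docker network ls`.

```python
from collections import Counter


def parse_network_ls(text):
    """Parse the default table output of `docker network ls` into dicts."""
    rows = []
    for line in text.strip().splitlines():
        parts = line.split()
        # The header splits into five tokens ("NETWORK ID NAME DRIVER SCOPE"),
        # so keeping only four-token lines drops it along with any blank lines.
        if len(parts) != 4:
            continue
        net_id, name, driver, scope = parts
        rows.append({"id": net_id, "name": name, "driver": driver, "scope": scope})
    return rows


def duplicate_names(rows):
    """Return the network names that appear more than once, sorted."""
    counts = Counter(row["name"] for row in rows)
    return sorted(name for name, n in counts.items() if n > 1)


# Listing copied from the comment above.
LISTING = """
NETWORK ID          NAME                DRIVER              SCOPE
5944c80a3e70        bridge              bridge              local
cab1aa285270        docker_gwbridge     bridge              local
130fb003ab69        exoframe            bridge              local
j1sawq879o43        exoframe-swarm      overlay             swarm
b24010230e16        host                host                local
96g47bfvdzia        ingress             overlay             swarm
765a677587d5        none                null                local
"""

rows = parse_network_ls(LISTING)
print(duplicate_names(rows))  # prints []
```

For this listing the check finds no duplicated names, so on its own it does not explain the "ambiguous" error.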

@willvincent
Author

Also, after restarting exoframe-server I end up with two exoframe-server containers.

@willvincent
Author

After disabling docker swarm mode and redeploying in normal mode, it fires right up instantly, no issues. This is obviously a swarm issue.

@yamalight
Contributor

Networks list seems to look OK 🤔
The exoframe-swarm plugin doesn't use the network set for exoframe-server; it has its own config param for that, e.g.:

plugins:
  install: ['exoframe-plugin-swarm']
  swarm:
    enabled: true
    network: 'my-custom-net'

This does look like an issue on the docker end, to be honest.
You could try running the debug version of exoframe-server, but I doubt you'll get more info than that.
Is it possible for you to wipe your swarm cluster and recreate it from scratch? I suspect there might've been some issue on the docker end during its creation that led to this 🤔

@willvincent
Author

  swarm:
    enabled: true
    network: 'my-custom-net'

The network part of that doesn't seem to be in the documentation, might be the whole problem.

Is it possible for you to wipe your swarm cluster and recreate it from scratch? I suspect there might've been some issue on the docker end during its creation that led to this

Have done, many times.

@yamalight
Contributor

@willvincent it's not documented right now indeed. There's a ticket for that, but I haven't gotten around to actually doing it.

W.r.t. docker cluster - have you also wiped the master node?

@willvincent
Author

That was it. The network definition is apparently required for the swarm plugin, and the docs don't reflect that. :(

@yamalight
Contributor

@willvincent it shouldn't be required as it uses exoframe-swarm network by default (that it created successfully according to your logs) 🤔
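
The fallback described here can be illustrated with a minimal Python sketch. It is not the plugin's actual code: `resolve_swarm_network` and `DEFAULT_SWARM_NETWORK` are hypothetical names, and the config is assumed to be the server config already loaded into a dictionary. The log line in this thread leaves the network name blank, so one possibility (not confirmed anywhere in the thread) is that an empty name reached Docker; the sketch therefore treats a blank value the same as a missing one.

```python
# Default taken from the maintainer's comment: the plugin is said to use
# the exoframe-swarm network when none is configured.
DEFAULT_SWARM_NETWORK = "exoframe-swarm"


def resolve_swarm_network(config):
    """Return plugins.swarm.network if set and non-blank, else the default."""
    swarm = (config.get("plugins") or {}).get("swarm") or {}
    name = swarm.get("network")
    # Treat None, non-strings, and whitespace-only strings as "not set".
    if isinstance(name, str) and name.strip():
        return name.strip()
    return DEFAULT_SWARM_NETWORK


# Config from earlier in the thread, without an explicit network.
without_network = {
    "plugins": {"install": ["exoframe-plugin-swarm"], "swarm": {"enabled": True}}
}
# Same config with the explicit network param that resolved the issue.
with_network = {
    "plugins": {
        "install": ["exoframe-plugin-swarm"],
        "swarm": {"enabled": True, "network": "my-custom-net"},
    }
}

print(resolve_swarm_network(without_network))  # prints exoframe-swarm
print(resolve_swarm_network(with_network))     # prints my-custom-net
```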

@willvincent
Author

Yea, but when it wasn't defined it errored... so, must be necessary, or there's a bug somewhere causing that ambiguous network issue.

Anyway, seems to be working, wish it hadn't taken the better part of the past 20ish hours to get here...

@yamalight
Contributor

The network prop is absolutely not required, and CI tests pass just fine for the plugin right now.
So, this definitely looks like a bug, but I'm not entirely sure where it comes from.
Would be more than happy to accept a PR that fixes this if you can track it down!

@yamalight
Contributor

I've been trying to repro this on 2 different machines over the past week with no success (clean swarm setups via docker-machine cli). I'll be closing this since I couldn't repro it at all.
Feel free to re-open if you do encounter it again and can provide steps to reproduce it (preferably using docker-machine w/ new fresh swarm setup).
