
[Simulated Omicron] Fails to start up when running pieces individually due to "Failed to lookup mgd address" when running locally #4421

Open
@karencfv


When I run the cargo run --bin omicron-dev -- run-all command, simulated omicron fails to start up with the following output:

coatlicue@pop-os:~/src/omicron$ cargo run --bin omicron-dev -- run-all
Finished dev [unoptimized + debuginfo] target(s) in 0.32s
Running target/debug/omicron-dev run-all
omicron-dev: setting up all services ...
log file: /tmp/omicron-dev-omicron-dev.1091696.0.log
note: configured to log to "/tmp/omicron-dev-omicron-dev.1091696.0.log"
DB URL: postgresql://root@[::1]:34507/omicron?sslmode=disable
DB address: [::1]:34507
log file: /tmp/omicron-dev-omicron-dev.1091696.1.log
note: configured to log to "/tmp/omicron-dev-omicron-dev.1091696.1.log"
thread 'main' panicked at 'called Result::unwrap() on an Err value: failed to discover dendrite port from files in /tmp/.tmpcEaWcn

Caused by:
0: time out while discovering dendrite port number
1: deadline has elapsed', /home/coatlicue/src/omicron/nexus/test-utils/src/lib.rs:423:72
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Aborted (core dumped)

UPDATE: The above is fixed by @david-crespo's suggestion to run the prereqs script again, but I am still seeing the behaviour below.

When I try to run the pieces separately, the cargo run --bin=nexus -- nexus/examples/config.toml command returns:

Nov 03 00:15:59.680 INFO SEC running, sec_id: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, component: SEC, component: nexus, component: ServerContext, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, file: /home/coatlicue/.cargo/registry/src/index.crates.io-6f17d22bba15001f/steno-0.4.0/src/sec.rs:811
Nov 03 00:16:14.726 WARN Failed to lookup mgd address: Cannot lookup mgd addresses: request timed out, component: nexus, component: ServerContext, name: e6bff1ff-24fb-49dc-a54e-c6a350cd4d6c, file: nexus/src/app/mod.rs:307

and cargo run --bin=sled-agent-sim -- $(uuidgen) [::1]:12345 [::1]:12221 --rss-nexus-external-addr 127.0.0.1:12220 --rss-external-dns-internal-addr [::1]:5353 --rss-internal-dns-dns-addr [::1]:3535 returns:

Nov 03 00:17:44.965 WARN failed to contact nexus, will retry in 17.393268136s, error: Communication Error: error sending request for url (http://[::1]:12221/sled-agents/1f913b0a-962d-4d28-a706-9c436d3691ad): error trying to connect: tcp connect error: Connection refused (os error 111), file: sled-agent/src/sim/server.rs:123
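For anyone trying to reproduce this: the "Connection refused" error above means nothing was accepting TCP connections on [::1]:12221 when sled-agent-sim tried to register. Below is a minimal sketch for checking whether the Nexus ports from the command lines in this report are listening before starting the sled agent. The helper name `port_is_open` and the labels in `targets` are hypothetical and not part of omicron; the addresses are copied from the commands above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Addresses taken from the sled-agent-sim command in this report.
    # The labels are descriptive guesses, not official omicron names.
    targets = {
        "nexus address passed to sled-agent-sim": ("::1", 12221),
        "--rss-nexus-external-addr": ("127.0.0.1", 12220),
    }
    for label, (host, port) in targets.items():
        state = "listening" if port_is_open(host, port) else "not reachable"
        print(f"{label} ({host}, {port}): {state}")
```

If the first target reports "not reachable" while the nexus process is still running, that would suggest nexus has not finished starting (for example, because it is still retrying the mgd lookup), though this sketch cannot confirm the cause.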

I am unsure if the documentation needs to be updated or why this is failing locally and not in CI 🤷‍♀️


Labels: bug (Something that isn't working.)
