First of all, thank you for sharing your work on this plugin; it has already saved me a lot of time.
I have started working on k3s compatibility for this device plugin, and I have gotten to the point where the plugin discovers available TPUs and registers them with the kubelet.
$ kubectl describe nodes
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource               Requests   Limits
  --------               --------   ------
  cpu                    100m (2%)  0 (0%)
  memory                 70Mi (7%)  170Mi (18%)
  ephemeral-storage      0 (0%)     0 (0%)
  kkohtaka.org/edgetpu   1          1
I0327 17:02:58.014104 1 plugin.go:98] Started gRPC service on plugin socket
I0327 17:02:58.014183 1 plugin.go:101] Started monitoring devices
I0327 17:02:58.014211 1 plugin.go:49] gRPC server started.
I0327 17:02:58.015416 1 plugin.go:118] Opened connection to kubelet socket
I0327 17:02:58.015486 1 plugin.go:121] Registering dpServer: &{[] 0xd58100 0xd58140}
I0327 17:02:58.017211 1 plugin.go:134] Registered device plugin
I0327 17:02:58.019309 1 server.go:56] Start watching devices
I0327 17:02:58.019419 1 server.go:66] Update a device list
I0327 17:02:58.019455 1 server.go:126] Starting Edge TPU device monitor
I0327 17:03:03.184672 1 server.go:155] Edge TPU became active.
I0327 17:03:03.185096 1 server.go:66] Update a device list
I0327 17:04:03.390627 1 server.go:79] Container TPU request: &ContainerAllocateRequest{DevicesIDs:[42],}
I0327 17:04:03.390765 1 server.go:80] Allocating devices... Device IDs: [42]
Side note: it looks like the device ID is hardcoded to 42, so only one TPU is currently allowed per node. Do you plan to support multiple devices?
I am able to schedule a pod that requests a TPU, but the container fails to start with the following error:
Events:
  Type     Reason     Age               From                   Message
  ----     ------     ---               ----                   -------
  Normal   Scheduled  86s               default-scheduler      Successfully assigned default/edgetpu-demo-54f5l to raspberrypi
  Normal   Pulling    2s (x2 over 84s)  kubelet, raspberrypi   pulling image "quay.io/kkohtaka/edgetpu-demo:arm32"
  Warning  Failed     2s                kubelet, raspberrypi   Error: failed to generate container "41e1245b846a2a815f54ca40741d10fc071f91b34eadf22309c52f949ea1d4ce" spec: failed to set devices mapping [&Device{ContainerPath:/dev/bus/usb,HostPath:/dev/bus/usb,Permissions:rw,}]: not a device node
  Normal   Pulled     1s (x2 over 2s)   kubelet, raspberrypi   Successfully pulled image "quay.io/kkohtaka/edgetpu-demo:arm32"
  Warning  Failed     1s                kubelet, raspberrypi   Error: failed to generate container "d72f48a0b04a560b1c7e81ec20d686b6ca3710f0a02fd38cecb1b937b52e8d05" spec: failed to set devices mapping [&Device{ContainerPath:/dev/bus/usb,HostPath:/dev/bus/usb,Permissions:rw,}]: not a device node
It looks like I need to specify the path to the actual device node rather than the /dev/bus/usb directory, but I'm not sure what that path would be. Could you point me in the right direction? I'm using a Raspberry Pi B+ with a Coral USB Accelerator for testing.
If you're open to it, I'd be happy to submit a PR for k3s support once I get this working.