Description
I was trying to export RFDETRSegPreview like the following:
model = RFDETRSegPreview(pretrain_weights=output_dir + "/checkpoint_best_ema.pth", device="cpu")
export_image_shape = (560, 560)
model.export(
    output_dir=output_dir,
    verbose=True,
    shape=export_image_shape,
)
I get the following error:
File ~/projects/rf-detr/rfdetr/main.py:551, in Model.export(self, output_dir, infer_dir, simplify, backbone_only, opset_version, verbose, force, shape, batch_size, **kwargs)
549 print(f"PyTorch inference output shape: {features.shape}")
550 elif self.args.segmentation_head:
--> 551 outputs = model(input_tensors)
552 dets = outputs['pred_boxes']
...
--> 187 assert x.shape[2] % block_size == 0 and x.shape[3] % block_size == 0, f"Backbone requires input shape to be divisible by {block_size}, but got {x.shape}"
188 x = self.encoder(x)
189 return list(x[0])
AssertionError: Backbone requires input shape to be divisible by 24, but got torch.Size([1, 3, 560, 560])
If I instead set export_image_shape=(432, 432), which is what export uses anyway when shape=None is passed, I get a different error:
File ~/projects/rf-detr/rfdetr/main.py:539, in Model.export(self, output_dir, infer_dir, simplify, backbone_only, opset_version, verbose, force, shape, batch_size, **kwargs)
537 else:
538 if shape[0] % 14 != 0 or shape[1] % 14 != 0:
--> 539 raise ValueError("Shape must be divisible by 14")
541 input_tensors = make_infer_image(infer_dir, shape, batch_size, device).to(device)
542 input_names = ['input']
My goal is to export the model at a higher resolution. I have trained and run prediction with the segmentation model, but at an image resolution of 432 the output masks are too grainy for some of our robotics applications.
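The two checks in the tracebacks appear to conflict: 560 is divisible by 14 but not by 24, while 432 is divisible by 24 but not by 14. A minimal sketch, assuming those two checks are the only constraints on the export shape, lists side lengths that would pass both, i.e. multiples of lcm(14, 24) = 168. The constant and function names here are hypothetical and not part of rfdetr, and whether export actually succeeds or produces good masks at these sizes is not verified.

```python
from math import lcm  # requires Python 3.9+

# Divisors copied from the two error messages; treating them as the only
# constraints on the export shape is an assumption.
EXPORT_CHECK_DIVISOR = 14  # ValueError raised in main.py
BACKBONE_BLOCK_SIZE = 24   # AssertionError raised in dinov2.py


def compatible_sides(min_side: int, max_side: int) -> list[int]:
    """Return side lengths in [min_side, max_side] divisible by both divisors."""
    step = lcm(EXPORT_CHECK_DIVISOR, BACKBONE_BLOCK_SIZE)
    first = -(-min_side // step) * step  # round min_side up to a multiple of step
    return list(range(first, max_side + 1, step))


if __name__ == "__main__":
    print(compatible_sides(432, 1000))
```

Under this assumption neither 432 nor 560 qualifies, which matches the two errors; the smallest candidate above 432 is 504.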