When inferring with TextSpotInferencer, the InstanceData in pred_instances is not compatible with textspotting_visualizer #1943

Open
2 tasks done
xiaomaofeng opened this issue Jun 30, 2023 · 0 comments
Labels
bug Something isn't working

Prerequisite

Task

I have modified the scripts/configs, or I'm working on my own tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

sys.platform: linux
Python: 3.10.7 (main, Nov 24 2022, 19:45:47) [GCC 12.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: GeForce RTX 3090
GPU 2,3: GeForce RTX 3080 Ti
CUDA_HOME: /usr/local/cuda-11.8
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: x86_64-linux-gnu-gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0
PyTorch: 2.0.0+cu117
PyTorch compiling details: PyTorch built with:

GCC 9.3
C++ Version: 201703
Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.7.3 (Git Hash 6dbeffbae1f23cbbeae17adb7b5b13f1f37c080e)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX2
CUDA Runtime 11.7
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
CuDNN 8.5
Magma 2.6.1
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.0.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision: 0.15.1+cu117
OpenCV: 4.7.0
MMEngine: 0.7.3
MMOCR: 1.0.0+964172a

Reproduces the problem - code sample

def add_datasample(self,
                   name: str,
                   image: np.ndarray,
                   data_sample: Optional['TextDetDataSample'] = None,
                   draw_gt: bool = True,
                   draw_pred: bool = True,
                   show: bool = False,
                   wait_time: int = 0,
                   pred_score_thr: float = 0.5,
                   out_file: Optional[str] = None,
                   step: int = 0) -> None:
    """Draw datasample and save to all backends.

    - If GT and prediction are plotted at the same time, they are
    displayed in a stitched image where the left image is the
    ground truth and the right image is the prediction.
    - If ``show`` is True, all storage backends are ignored, and
    the images will be displayed in a local window.
    - If ``out_file`` is specified, the drawn image will be
    saved to ``out_file``. This is usually used when the display
    is not available.

    Args:
        name (str): The image identifier.
        image (np.ndarray): The image to draw.
        data_sample (:obj:`TextSpottingDataSample`, optional):
            TextDetDataSample which contains gt and prediction. Defaults
                to None.
        draw_gt (bool): Whether to draw GT TextDetDataSample.
            Defaults to True.
        draw_pred (bool): Whether to draw Predicted TextDetDataSample.
            Defaults to True.
        show (bool): Whether to display the drawn image. Default to False.
        wait_time (float): The interval of show (s). Defaults to 0.
        out_file (str): Path to output file. Defaults to None.
        pred_score_thr (float): The threshold to visualize the bboxes
            and masks. Defaults to 0.5.
        step (int): Global step value to record. Defaults to 0.
    """
    cat_images = []

    if data_sample is not None:
        if draw_gt and 'gt_instances' in data_sample:
            gt_bboxes = data_sample.gt_instances.get('bboxes', None)
            gt_texts = data_sample.gt_instances.texts
            gt_polygons = data_sample.gt_instances.get('polygons', None)
            gt_img_data = self._draw_instances(image, gt_bboxes,
                                               gt_polygons, gt_texts)
            cat_images.append(gt_img_data)

        if draw_pred and 'pred_instances' in data_sample:
            pred_instances = data_sample.pred_instances
            # pred_instances = pred_instances[
            #     pred_instances.scores > pred_score_thr].cpu().numpy()
            keep = [
                i for i, x in enumerate(pred_instances)
                if any(score > 0.3 for score in x.text_scores)
            ]
            pred_instances = pred_instances[keep].cpu().numpy()
            pred_bboxes = pred_instances.get('bboxes', None)
            pred_texts = pred_instances.texts
            pred_polygons = pred_instances.get('polygons', None)
            if pred_bboxes is None and pred_polygons is not None:
                pred_bboxes = [poly2bbox(poly) for poly in pred_polygons]
                pred_bboxes = np.array(pred_bboxes)
            if pred_bboxes is not None:
                pred_img_data = self._draw_instances(image, pred_bboxes,
                                                     pred_polygons, pred_texts)
                cat_images.append(pred_img_data)

    cat_images = self._cat_image(cat_images, axis=0)
    if cat_images is None:
        cat_images = image

    if show:
        self.show(cat_images, win_name=name, wait_time=wait_time)
    else:
        self.add_image(name, cat_images, step)

    if out_file is not None:
        mmcv.imwrite(cat_images[..., ::-1], out_file)

    self.set_image(cat_images)
    return self.get_image()

Reproduces the problem - command or script

from mmocr.apis import TextSpotInferencer

# Load models into memory
inferencer = TextSpotInferencer(
    model='projects/SPTS/config/spts/spts_resnet50_8xb8-200e_icdar2015.py',
    weights='model/best_generic_hmean.pth')
inferencer('/root/icdar2015/textdet_imgs/test/image_7000.jpg',
           save_vis=True, return_vis=True)

Reproduces the problem - error message

[Screenshot of the pred_instances data structure]
Please look at the data structure of pred_instances: it contains no bboxes or polygons data, and the type of the scores is 'list', not 'tensor', so I had to change the code to adapt it to this data structure.
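
The adaptation described above can be sketched without MMOCR installed. This is only an illustration under stated assumptions: plain dicts stand in for the entries of `pred_instances`, `text_scores` is assumed to be a per-instance list of floats as in the modified code above, and `filter_by_text_score` is a hypothetical helper name, not part of MMOCR.

```python
from typing import Dict, List, Sequence


def filter_by_text_score(instances: Sequence[Dict],
                         score_thr: float = 0.3) -> List[int]:
    """Return indices of instances with at least one score above score_thr.

    Hypothetical helper: the scores are plain Python lists, so the
    tensor-style mask ``scores > thr`` cannot be used directly.
    """
    keep = []
    for idx, inst in enumerate(instances):
        # A missing or empty score list means the instance is dropped.
        scores = inst.get('text_scores') or []
        if any(score > score_thr for score in scores):
            keep.append(idx)
    return keep


# Made-up predictions, only to exercise the helper.
preds = [
    {'texts': 'EXIT', 'text_scores': [0.9, 0.8, 0.95, 0.7]},
    {'texts': 'x', 'text_scores': [0.1]},
    {'texts': '', 'text_scores': []},
]
kept = filter_by_text_score(preds)
print([preds[i]['texts'] for i in kept])  # ['EXIT']
```

The returned index list can then be used to select the surviving instances, which is what the list comprehension in the modified `add_datasample` does.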

Additional information

1. The data structure should be consistent with the one produced when inferring with the det and recog models.
2. The visualizer should be able to render the inference result image.
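
One way to make the first request checkable is a small guard that reports which fields the visualizer reads are missing from a prediction. This is a minimal sketch under assumptions: the field names are the ones accessed in the `add_datasample` code in this issue, a plain dict stands in for `InstanceData`, the sample keys are made up and may differ from real SPTS output, and `missing_fields` is a hypothetical helper, not an MMOCR API.

```python
from typing import Iterable, List, Mapping

# Field names accessed by the add_datasample code shown in this issue.
EXPECTED_FIELDS = ('bboxes', 'polygons', 'scores', 'texts')


def missing_fields(pred_instances: Mapping,
                   expected: Iterable[str] = EXPECTED_FIELDS) -> List[str]:
    """List the expected field names that are absent from pred_instances."""
    return [name for name in expected if name not in pred_instances]


# Hypothetical spotting-style output with texts and per-text scores only.
spts_like = {'texts': ['EXIT'], 'text_scores': [[0.9, 0.8, 0.95, 0.7]]}
print(missing_fields(spts_like))  # ['bboxes', 'polygons', 'scores']
```

A visualizer could call such a check before drawing and either fall back to a text-only rendering or raise a clear error instead of failing on a missing attribute.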
