Direction Estimation Techniques based on OBB detection

1 - Technical University of Cluj-Napoca, Memorandumului 28, 400114 Romania {Szilard.Molnar,Levente.Tamas}@aut.utcluj.ro 2 - Institute of Informatics, University of Szeged, Hungary, {amstadt,kato}@inf.u-szeged.hu

Object-Based Camera Pose Estimation from a Single Object Detection and Gravity Vector

Recent results on pose estimation from ellipsoid-ellipse correspondences, which can be readily obtained from an object detector, allow a direct computation of the camera pose from object-level correspondences. Unfortunately, standard bounding boxes (either horizontal or minimal enclosing boxes) are symmetric, which introduces an inherent ambiguity in the correspondence, yielding multiple or even infinitely many solutions. Furthermore, the current state of the art requires at least two such correspondences to provide sufficient constraints for camera rotation. Our contributions make object-based pose estimation efficient in practice. First, a novel object detection method is proposed, called Directional Object Bounding Box (DOBB), which detects the object’s own direction together with its minimal enclosing box (OBB), yet independently of it. This not only breaks the symmetry of OBBs, but also provides the additional geometric information needed by our pose estimation method. Second, a novel, robust object-based camera pose estimation pipeline is proposed: when the vertical direction and the object orientation w.r.t. that axis are known, a minimal solution is obtained from a single object and used for outlier filtering, followed by a closed-form least-squares solution over multiple inlier objects to compute the camera pose. Comparative tests confirm the state-of-the-art performance of the proposed DOBB-based pose estimation method on the standard KITTI360 and 7-Scenes datasets.
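To illustrate why a known vertical direction helps, the following is a minimal sketch and not the authors' solver. It relies only on the standard fact that a measured gravity vector fixes two of the three rotational degrees of freedom, leaving a single rotation about the vertical axis. The function names, the world frame convention (z axis up, gravity along -z), and the `yaw` parameter are assumptions made for this illustration.

```python
import numpy as np

def skew(v):
    """Cross-product matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def align(a, b):
    """Smallest rotation R such that R @ a is parallel to b (inputs are normalised)."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    c = float(a @ b)
    if c < -1.0 + 1e-9:
        # Antiparallel case: rotate by pi about any axis orthogonal to a.
        axis = np.cross(a, [1.0, 0.0, 0.0])
        if np.linalg.norm(axis) < 1e-6:
            axis = np.cross(a, [0.0, 1.0, 0.0])
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    k = skew(np.cross(a, b))
    # Rodrigues formula written in terms of the cross product a x b.
    return np.eye(3) + k + k @ k / (1.0 + c)

def rot_z(yaw):
    """Rotation by yaw radians about the world vertical (z) axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def camera_to_world(gravity_cam, yaw):
    """Hypothetical helper: rotation fixed by gravity up to one angle about the vertical."""
    world_down = np.array([0.0, 0.0, -1.0])  # assumed world convention
    return rot_z(yaw) @ align(gravity_cam, world_down)
```

Whatever value `yaw` takes, `camera_to_world(g, yaw) @ g` still points along the world vertical, so a single extra angle, such as an object's direction about that axis, is enough to pin down the remaining rotational freedom in this simplified setting.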

The code will be made available after the article is published. Until then, it can be requested from the authors.


Directional Object Detection in Aerial Images

Object detection in remote sensing consists of localizing and recognizing objects of interest on the Earth’s surface. In such aerial images, objects are seen from above and from a relatively large distance, which makes them essentially "planar", i.e., we get a standard view of them. Classical object detectors must deal with a wide range of distinct side views, generally oriented upward due to gravity, whereas objects in aerial images appear with an arbitrary orientation on the ground plane. This naturally leads to the question: can we detect the direction of an object along with its location and category? So-called oriented bounding boxes (OBBs) have become standard in remote sensing because of crowded scenes, especially for rotated and elongated objects (e.g., cars parked in a parking lot). An OBB localizes an object by a minimal enclosing box; "oriented" thus refers to the rotation angle, with respect to the image axes, that makes a standard bounding box enclose the detected object tightly. However, this bounding box orientation does not correspond to the orientation of the detected object itself, e.g., the front of an airplane or a car, or the stem of a leaf, which provides valuable information for aerial image analysis. Herein, we propose a novel method, the Directed Object Detector (DOD), which detects the object’s own direction together with its minimal enclosing OBB. It is integrated into an existing object detector to create a single end-to-end neural network. Experimental validation confirms state-of-the-art performance on both close-range and far-range remote sensing images in man-made and natural environments. Furthermore, we demonstrate the advantage of DOD over the OBB approach in an image rectification application, which is generally not solvable with OBBs alone.
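The difference between a box orientation and an object direction can be made concrete with a small sketch. It is purely illustrative and is not the DOD network: it shows that a rotated rectangle is unchanged when its angle is shifted by 180 degrees, whereas a heading vector flips. The box parametrisation (centre, width, height, angle in radians) and all function names are assumptions chosen for this example.

```python
import numpy as np

def obb_corners(cx, cy, w, h, theta):
    """Corners of a w-by-h box centred at (cx, cy), rotated by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ rot.T + np.array([cx, cy])

def heading(theta):
    """Unit vector pointing along the object's own direction."""
    return np.array([np.cos(theta), np.sin(theta)])

def same_box(a, b, tol=1e-9):
    """True if two corner arrays describe the same set of points."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return bool(np.all(d.min(axis=1) < tol) and np.all(d.min(axis=0) < tol))

theta = 0.3
box_a = obb_corners(10.0, 5.0, 4.0, 2.0, theta)
box_b = obb_corners(10.0, 5.0, 4.0, 2.0, theta + np.pi)

# The two boxes coincide, but the headings are opposite.
boxes_match = same_box(box_a, box_b)
heading_dot = float(heading(theta) @ heading(theta + np.pi))
```

Here `boxes_match` is true while `heading_dot` is -1 up to rounding, which is the ambiguity that a direction estimate resolves and a box angle alone cannot.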

For more information, read the MIGARS2025 paper. The code will be made available after the article is published. Until then, it can be requested from the authors.


BibTeX

@InProceedings{molnar2025isvc_dobb_objectbasedcamerapose,
      author    = {Molnar, Szilard and Amstadt, Zita and Tamas, Levente and Kato, Zoltan},
      booktitle = {{ISVC 2025 20th International Symposium on Visual Computing}},
      title     = {{Object-Based Camera Pose Estimation from a Single Object Detection and Gravity Vector}},
      year      = {2025},
      note      = {Presented at the conference; proceedings not yet published},
    }
@InProceedings{molnar2025migars_DOD_directionalobjectdetection,
      author    = {Molnar, Szilard and Tamas, Levente and Kato, Zoltan},
      booktitle = {{2025 International Conference on Machine Intelligence for GeoAnalytics and Remote Sensing (MIGARS)}},
      title     = {{Directional Object Detection in Aerial Images}},
      year      = {2025},
      pages     = {1--4},
      doi       = {10.1109/MIGARS67156.2025.11231783},
    }