About Me

Who Am I?

Hi, I'm Szilárd. I am a Ph.D. student at the Technical University of Cluj-Napoca, Romania. My domain is 2D and 3D computer vision with deep learning and robotics. My research team is the ROCON group. I have experience working with various programming languages (Python, C, C++, C#, Matlab, Kotlin, Java, JavaScript, Bash) and software tools (Nvidia Omniverse, Blender, ROS, PyTorch, TensorFlow, Docker).

Work

Projects and Publications

VinEye

VinEye is a project that aims to monitor vineyards and detect vine diseases using autonomous drones and ground vehicles.

VinEye Project page

Directional Bounding Boxes for Oriented Object Detection

The DOBB method estimates an oriented bounding box and a dedicated direction for each object.

DOBB Project page

Segmentation Methods Evaluation on Grapevine Leaf Diseases

The focus of this work is to identify the most suitable methods for vine leaf segmentation and disease detection on a custom dataset containing leaves both from the laboratory environment and cropped from images taken in the field. We tested five promising methods: Otsu's thresholding, Mask R-CNN, MobileNet, SegNet, and Feature Pyramid Network variants.

FedCSIS - 2023 on RG
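As a rough illustration of the simplest of the methods compared above, Otsu's thresholding picks the gray level that maximizes the between-class variance of the image histogram. The sketch below is a generic textbook version in plain Python, not the code used in the paper; the function name `otsu_threshold` and the flat list-of-pixels input are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    `pixels` is a flat list of integer gray levels in [0, levels).
    """
    # Build the gray-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels with value <= t
    sum_bg = 0    # sum of gray levels of those pixels
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no pixels in the lower class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # all pixels are in the lower class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical toy "image": dark background pixels and bright leaf pixels.
toy_pixels = [10, 12, 11, 200, 210, 205]
toy_t = otsu_threshold(toy_pixels)
toy_mask = [p > toy_t for p in toy_pixels]
```

On real images the same idea is usually applied to a single channel (for example a grayscale or color-index image), and the resulting mask serves as a training-free baseline next to the learned segmentation models.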

Feature Pyramid Network based Proximal Vine Canopy Segmentation

In this work, we present a Feature Pyramid Network-based grape canopy segmentation method, which has great potential to create a segmentation mask containing only the leaves and fruits of interest. We conducted our tests in different vineyards and obtained above state-of-the-art segmentation results on public and custom datasets.

IFAC2023 - GitHub page

Representation Learning for Point Clouds with Variational Autoencoders

In this work, we present a novel method that operates on depth images and, with the use of geometric images, is able to learn the representation of discrete 3D points based on variational autoencoders (VAE). Traditional VAE solutions failed to capture sharply compressed 3D data; however, with the constrained variational framework and its additional hyperparameters, we managed to learn the representation of 3D data successfully. To do this, we applied Bayesian optimization on the hyperparameter space of the VAE.

ECCVW/ACVR2022 - GitHub page
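To give a feel for what a constrained variational objective with an extra hyperparameter can look like, the sketch below weights the KL term of a standard VAE loss by a factor `beta`. This is a minimal illustration under assumptions: the `beta` weighting, the squared-error reconstruction term, and the function names are chosen for this example and are not taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, diag(exp(log_var))) and N(0, I)."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def constrained_vae_loss(x, x_hat, mu, log_var, beta=4.0):
    """Reconstruction error plus a beta-weighted KL penalty.

    x, x_hat : flat lists with the input and its reconstruction
    mu, log_var : latent mean and log-variance produced by the encoder
    beta : assumed hyperparameter; beta = 1 gives the standard VAE loss
    """
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + beta * kl_to_standard_normal(mu, log_var)


# Hypothetical values, only to show the call signature.
example_loss = constrained_vae_loss(
    x=[0.0], x_hat=[1.0], mu=[1.0], log_var=[0.0], beta=2.0
)
```

A weight such as `beta` is the kind of hyperparameter a Bayesian optimizer could search over, together with values like the latent dimension or the learning rate.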

Embedded GPU based autonomous robot use cases

The focus of this paper is on the implementation and validation of navigation tasks for various autonomous mobile robot platforms using embedded GPUs. These platforms include ground vehicles, aerial vehicles, and mobile manipulators.

MED2022 - IEEE page

ToFNest: Efficient normal estimation for time-of-flight depth cameras

In this work, we propose ToFNest, an efficient normal estimation method for depth images acquired by Time-of-Flight (ToF) cameras, based on feature pyramid networks (FPN). We perform the normal estimation starting from the 2D depth images, projecting the measured data into 3D space and computing the loss function on the point cloud normals.

ToFNest ICCVW/ACVR2021 - GitHub page
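The geometric step mentioned above, projecting depth pixels into 3D and reasoning about surface normals, can be sketched with a classical finite-difference baseline. This is not the ToFNest network: it only assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and takes the normal as the cross product of two neighboring 3D differences.

```python
import math


def back_project(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with the given depth to a 3D point (pinhole model)."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def normal_at(depth_img, u, v, fx, fy, cx, cy):
    """Estimate the unit normal at pixel (u, v) from its right and lower neighbors.

    depth_img is a list of rows (depth_img[v][u]), with depths in meters.
    The normal is oriented toward the camera (negative z component).
    """
    p = back_project(u, v, depth_img[v][u], fx, fy, cx, cy)
    p_right = back_project(u + 1, v, depth_img[v][u + 1], fx, fy, cx, cy)
    p_down = back_project(u, v + 1, depth_img[v + 1][u], fx, fy, cx, cy)

    a = [p_right[i] - p[i] for i in range(3)]
    b = [p_down[i] - p[i] for i in range(3)]

    # Cross product a x b is perpendicular to the local surface patch.
    n = [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]

    # Flip so that the normal points back toward the camera.
    if n[2] > 0:
        n = [-c for c in n]
    return tuple(n)


# Hypothetical 3x3 depth image of a wall facing the camera, 2 m away.
flat_wall = [[2.0, 2.0, 2.0] for _ in range(3)]
wall_normal = normal_at(flat_wall, 1, 1, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
```

Finite-difference normals like these are sensitive to depth noise, which is one reason to prefer a learned estimator on real ToF data.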

Feature Pyramid Network Based Efficient Normal Estimation and Filtering for Time-of-Flight Depth Cameras

In this paper, an efficient normal estimation and filtering method for depth images acquired by Time-of-Flight (ToF) cameras is proposed. The method is based on a common feature pyramid network (FPN) architecture. The normal estimation method is called ToFNest, and the filtering method is called ToFClean.

Sensors2021 - RG page

Education

Education

Ph.D. – Computer vision

Focus subject: Monitoring vineyards using unmanned aerial vehicles and detecting diseases using deep learning, robotics, and computer vision.

Master’s degree - Advanced Control Engineering in Manufacturing

Dissertation: 3D computer vision based on deep learning and Time-of-Flight cameras

Focus subjects: Automation, Systems Engineering, Embedded Systems

My Projects and Publications

Thesis: Comparing 2D and 3D object detection in navigation

Experience

Work Experience

Technical University of Cluj-Napoca

Position: Research Assistant

Subjects: Monitoring vineyards and detecting diseases using robotics, deep learning, and computer vision

Tools: ROS, Matlab, PyTorch, Kotlin

Technical University of Cluj-Napoca

Position: Research Engineer

Subjects: 3D computer vision based on time-of-flight cameras and deep learning

Tools: ROS, Matlab, PyTorch

Marelli, Cluj-Napoca

Position: Analyst - Internship

Project: Smart headlights for automotive applications