IMPROVE 2022 Abstracts


Area 1 - Fundamentals

Full Papers
Paper Nr: 11
Title:

Detecting Tourette’s Syndrome in Anatomical Regions of the Brain through MRI Analysis and Naive Bayes Classifier

Authors:

Murilo Costa De Barros, Kaue N. Duarte, Wang-Tso Lee, Chia-Jui Hsu and Marco A. Garcia De Carvalho

Abstract: Tourette Syndrome (TS) is an inherited condition characterized by involuntary vocal and motor movements (tics). There is currently no cure, only psychological treatments to inhibit it, with medication required in rare cases. Diagnosing Tourette’s in childhood enables a range of possible treatments that would decrease the intensity of TS and, in some cases, even stop it. In most cases, the TS diagnosis considers only clinical assessment. Analyzing the brain and its anatomical regions via imaging data can provide relevant information to assist doctors. This work proposes an approach to identify the anatomical region of the brain most affected by TS. The approach consists of three major steps: (i) the brain is segmented into its anatomical regions; (ii) texture patterns are extracted via the Gray-level Co-occurrence Matrix for each region; finally, (iii) each brain region is evaluated using a Naive Bayes classifier, determining the presence or absence of TS. We use MRI images from 68 subjects around nine years old, equally divided between those with and without TS. The regions from the limbic system were relevant in the diagnosis: the right-side accumbens reached 68% accuracy; posterior and central parts of the corpus callosum ranked in the top four positions. Combining the top five most predictive regions led our approach to reach 78% accuracy. The results were significant in detecting the most affected regions in TS and providing a reliable approach to classify the brain regions accordingly.
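As an illustration of step (ii), the following is a minimal sketch of a gray-level co-occurrence matrix and two texture features derived from it. The abstract does not state the pixel offsets, the number of gray levels, or which features were used, so the offset, the `levels` parameter, and the choice of contrast and energy are assumptions made only for this example.

```python
def glcm(image, dx=1, dy=0, levels=4):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy).

    `image` is a 2D list of integer gray levels in range(levels).
    The offset and level count are illustrative assumptions.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                # Count how often level i is followed by level j at the offset.
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Weights co-occurrences by the squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared matrix entries; high for uniform textures."""
    return sum(v * v for row in p for v in row)
```

In a pipeline like the one described, such features would be computed per segmented region and passed to the classifier; that wiring is not shown here.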

Paper Nr: 12
Title:

An Image Quality Assessment Method based on Sparse Neighbor Significance

Authors:

Selcuk I. Aydi

Abstract: In this paper, the image quality assessment problem is tackled from a sparse coding perspective, and a new automated image quality assessment algorithm is presented. Specifically, the input image is first divided into non-overlapping blocks and sparse coding is used to reconstruct a central sub-block using the neighboring sub-blocks as dictionaries. The resulting 2D sparse vectors from each neighboring sub-block are devised as significance maps that are then used in similarity measures between the reference and distorted images. The proposed method is compared against various recently introduced shallow and deep methods across four datasets and multiple distortion types. The experimental results show that it possesses a strong correlation with the Human Visual System and outperforms its counterparts.

Paper Nr: 24
Title:

Assessment of Dose Reduction Strategies in Wavelength-selective Neutron Tomography

Authors:

Victoria H. DiStefano, Jacob M. LaManna, David L. Jacobson, Paul A. Kienzle, Daniel S. Hussey and Peter Bajcsy

Abstract: The goal of this study is to determine variable relationships and a computational workflow that yield the highest quality of three-dimensional reconstructions in neutron imaging applications with a reduced number of projection angles. Neutrons interact with matter primarily through the strong nuclear force, providing unique image contrast modes. Accessing many of these contrast modes requires defining the energy of the neutron beam, resulting in long exposure times for a single two-dimensional projection image. Collecting on the order of 100 tomograms at different neutron wavelengths within a reasonable time frame (less than 1 week) suggests the use of dose reduction tomography reconstruction algorithms. We identified and evaluated the main factors affecting the quality of the 3D tomographic reconstruction in the computational image workflow: the projection number, the reconstruction method, and the post-processing method. This study reports several relationships between 3D reconstruction quality metrics and acquisition time. Based on the established relationships, a seeded simultaneous iterative reconstruction technique (SIRT) yielded improved image quality and more accurate estimates of the reconstructed attenuation values compared to a SIRT without a priori information or a trained neural network based on a mixed scale dense network.

Paper Nr: 36
Title:

BBBD: Bounding Box Based Detector for Occlusion Detection and Order Recovery

Authors:

Kaziwa Saleh and Zoltán Vámossy

Abstract: Occlusion handling is one of the challenges of object detection, segmentation, and scene understanding, because objects appear differently when they are occluded to varying degrees, at varying angles, and in varying locations. Therefore, determining the existence of occlusion between objects and their order in a scene is a fundamental requirement for semantic understanding. Existing works mostly use deep learning based models to retrieve the order of the instances in an image or for occlusion detection. This requires labelled occluded data and is time-consuming. In this paper, we propose a simpler and faster method that can perform both operations without any training and only requires the modal segmentation masks. For occlusion detection, instead of scanning the two objects entirely, we only focus on the intersected area between their bounding boxes. Similarly, we use the segmentation mask inside the same area to recover the depth-ordering. When tested on the COCOA dataset, our method achieves +8% and +5% more accuracy than the baselines in order recovery and occlusion detection respectively.
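The idea of restricting the analysis to the intersection of two bounding boxes can be sketched as follows. The abstract does not give the decision rule applied inside that area, so the rule used here (the object whose modal mask covers more of the intersection is guessed to be in front) is a hypothetical stand-in, not the authors' criterion. Boxes are `(x1, y1, x2, y2)` with exclusive upper bounds, and masks are 2D lists of 0/1; both conventions are assumptions of this sketch.

```python
def box_intersection(a, b):
    """Return the overlapping rectangle of two boxes, or None if disjoint."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def mask_area_in_box(mask, box):
    """Count mask pixels that fall inside the given rectangle."""
    x1, y1, x2, y2 = box
    return sum(mask[y][x] for y in range(y1, y2) for x in range(x1, x2))


def guess_order(mask_a, box_a, mask_b, box_b):
    """Hypothetical ordering rule evaluated only inside the box intersection."""
    inter = box_intersection(box_a, box_b)
    if inter is None:
        return "no overlap"
    area_a = mask_area_in_box(mask_a, inter)
    area_b = mask_area_in_box(mask_b, inter)
    if area_a == area_b:
        return "undecided"
    return "a in front" if area_a > area_b else "b in front"
```

Because only the intersected rectangle is scanned, the cost depends on the overlap size rather than on the full extent of either object.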

Short Papers
Paper Nr: 5
Title:

Student Engagement from Video using Unsupervised Domain Adaptation

Authors:

Chinchu Thomas, Seethamraju Purvaj and Dinesh B. Jayagopi

Abstract: Student engagement is the key to successful learning. Measuring student engagement is of utmost importance in the current global scenario where learning happens over online platforms. Automatic analysis of student engagement, in offline and online social interactions, is largely carried out using supervised machine learning techniques. Recent advances in deep learning have improved performance, albeit at the cost of collecting a large volume of labeled data, which can be tedious and expensive. Unsupervised domain adaptation using deep learning is an emerging and promising direction in machine learning when labeled data is scarce or absent. Motivated by this, we pose our research question: “Can deep unsupervised domain adaptation techniques be used to infer student engagement in classroom videos with unlabeled data?” In our work, two such classic techniques, i.e., Joint Adaptation Network and adversarial domain adaptation using Wasserstein distance, were explored for this task, posed as a binary classification problem, along with different base models such as ResNet and I3D. The unsupervised domain adaptation results show significant improvement over the unsupervised baseline methods.

Paper Nr: 10
Title:

ANNs Dream of Augmented Sheep: An Artificial Dreaming Algorithm

Authors:

Gustavo Assunção, Miguel Castelo-Branco and Paulo Menezes

Abstract: Sleep is a fundamental daily process of several species, during which the brain cycles through critical stages for both resting and learning. A phenomenon known as dreaming may occur during that cycle, whose purpose and functioning have yet to be agreed upon by the research community. Despite the controversy, some have hypothesized dreaming to be an overfitting prevention mechanism, which enables the brain to corrupt its small amount of statistically similar observations and experiences. This leads to better cognition through non-rigid consolidation of knowledge and memory without requiring external generalization. Although this may occur in numerous ways depending on the basis theory, some appear more adequate for homologous methodology in machine learning. Overfitting is a recurrent problem of artificial neural network (ANN) training, caused by data homogeneity/reduced size and which is often resolved by manual alteration of data. In this paper we propose an artificial dreaming algorithm, following the mentioned hypothesis, for tackling overfitting in ANNs using autonomous data augmentation and interpretation based on a network’s current state of knowledge.

Paper Nr: 15
Title:

Real-Time 3D Object Detection and Recognition using a Smartphone

Authors:

Jin Chen and Zhigang Zhu

Abstract: Real-time detection of 3D obstacles and recognition of humans and other objects is essential for blind or low-vision people to travel not only safely and independently but also confidently and interactively, especially in a cluttered indoor environment. Most existing 3D obstacle detection techniques that are widely applied in robotic applications and outdoor environments often require high-end devices to ensure real-time performance. There is a strong need to develop a low-cost and highly efficient technique for 3D obstacle detection and object recognition in indoor environments. This paper proposes an integrated 3D obstacle detection system implemented on a smartphone, by utilizing deep-learning-based pre-trained 2D object detectors and ARKit-based point cloud data acquisition to predict and track the 3D positions of multiple objects (obstacles, humans, and other objects), and then provide alerts to users in real time. The system consists of four modules: 3D obstacle detection, 3D object tracking, 3D object matching, and information filtering. Preliminary tests in a small house setting indicated that this application could reliably detect large obstacles and their 3D positions and sizes in the real world and small obstacles’ positions, without any expensive devices besides an iPhone.

Paper Nr: 34
Title:

Relighting Backlight and Spotlight Images using the von Kries Model

Authors:

Michela Lecca

Abstract: Improving the quality of backlight and spotlight images is a challenging task. Indeed, these pictures include both very bright and very dark regions with unreadable content and details. Restoring the visibility in these regions has to be performed without over-enhancing the bright regions, thus without generating unpleasant artifacts. To this end, some algorithms segment the image into bright and dark regions and re-work them separately with different enhancing functions. Other algorithms process the input image at multiple scales or with different enhancement techniques. All these methods merge the results together, paying attention to the edge areas. The present work proposes a novel approach, called REK, which implements a relighting technique based on the von Kries model. REK linearly increases the channel intensities of the input image, obtaining a new brighter image, which is then summed with the input one using weights computed from the input image that take high values on the dark regions and low values on the bright ones. In this way, REK improves the quality of backlight and spotlight pictures without the need for segmentation and multiple analysis, while granting satisfactory performance at a computational complexity proportional to the number of image pixels.
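The blending described above can be sketched per pixel as follows. The abstract does not specify how the channel gains or the weights are computed, so the fixed `gains` and the weight `1 - mean intensity` are assumptions chosen only to show the structure: a von Kries style per-channel scaling followed by a weighted sum that favors the brightened image in dark regions.

```python
def relight_pixel(rgb, gains=(2.0, 2.0, 2.0)):
    """Blend a pixel with its brightened version; values are in [0, 1].

    `gains` and the weight formula are illustrative assumptions.
    """
    # Von Kries style relighting: each channel is scaled independently.
    brighter = [min(1.0, g * v) for g, v in zip(gains, rgb)]
    # Weight is high where the input is dark and low where it is bright.
    w = 1.0 - sum(rgb) / 3.0
    return [w * b + (1.0 - w) * v for b, v in zip(brighter, rgb)]


def relight(image, gains=(2.0, 2.0, 2.0)):
    """Apply the blend to every pixel; cost is linear in the pixel count."""
    return [[relight_pixel(px, gains) for px in row] for row in image]
```

Each pixel is visited once and no segmentation or multi-scale pass is involved, which matches the linear complexity mentioned in the abstract.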

Paper Nr: 13
Title:

Simulation-to-Reality Domain Adaptation for Offline 3D Object Annotation on Pointclouds with Correlation Alignment

Authors:

Weishuang Zhang, B. R. Kiran, Thomas Gauthier, Yanis Mazouz and Theo Steger

Abstract: Annotating objects with 3D bounding boxes in LiDAR pointclouds is a costly human driven process in an autonomous driving perception system. In this paper, we present a method to semi-automatically annotate real-world pointclouds collected by deployment vehicles using simulated data. We train a 3D object detector model on labeled simulated data from CARLA jointly with real world pointclouds from our target vehicle. The supervised object detection loss is augmented with a CORAL loss term to reduce the distance between labeled simulated and unlabeled real pointcloud feature representations. The goal here is to learn representations that are invariant to simulated (labeled) and real-world (unlabeled) target domains. We also provide an updated survey on domain adaptation methods for pointclouds.
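For readers unfamiliar with correlation alignment, the following is a minimal sketch of a CORAL loss term in its commonly published form: the squared Frobenius distance between source and target feature covariances, scaled by 1/(4d²). How the term is weighted against the detection loss, and which feature layer it is applied to, are not stated in the abstract and are not modeled here. Feature batches are plain lists of equal-length vectors.

```python
def covariance(features):
    """Unbiased covariance matrix of a batch of feature vectors."""
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1/(4 d^2)."""
    d = len(source[0])
    cs, ct = covariance(source), covariance(target)
    sq = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return sq / (4 * d * d)
```

The term needs no target labels, which is why it can be computed on unlabeled real pointcloud features alongside a supervised loss on simulated data.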

Area 2 - Methods and Techniques

Short Papers
Paper Nr: 25
Title:

Applying Genetic Algorithm and Image Quality Assessment for Reproducible Processing of Low-light Images

Authors:

Olivier Parisot and Thomas Tamisier

Abstract: Reproducible image preprocessing is fundamental in computer vision, whether to fairly compare processing algorithms or to prepare new image corpora. In this paper, we propose an approach based on a genetic algorithm combined with Image Quality Assessment methods to obtain a reproducible sequence of transformations for improving low-light images. Preliminary tests have been performed on state-of-the-art benchmarks.

Area 3 - Imaging

Short Papers
Paper Nr: 14
Title:

Data Fusion of PRISMA Satellite Imagery for Asbestos-containing Materials: An Application on Balangero’s Mine Site (Italy)

Authors:

Giuseppe Bonifazi, Giuseppe Capobianco, Riccardo Gasbarrone, Silvia Serranti, Sergio Bellagamba and Daniele Taddei

Abstract: In the last few decades, the procedure for identifying, classifying and mapping asbestos-containing materials (ACMs) and contaminated areas has been considered one of the most important aspects of remediation. This task, carried out by skilled workers, can be very long and difficult to perform, and it can also increase the risk of inhalation of asbestos fibers. The identification and characterization of areas contaminated by asbestos using remote sensing techniques represent a valid alternative to census methods, traditionally based on visual inspection of surfaces and in situ sampling to be analyzed later in the laboratory. The aim of this work was to explore the possibilities of using machine learning techniques to identify possible asbestos-contaminated areas and ACMs by using PRISMA satellite imagery in areas where chrysotile was once extracted, processed and used in asbestos-containing products (ACPs). The study area is located at the Balangero asbestos mine site. In more detail, Principal Component Analysis (PCA) was performed on a Visible, Near-InfraRed and Short-Wave InfraRed (VNIR-SWIR) PRISMA image to reduce data dimensionality and was used as an exploratory analysis tool. The Classification And Regression Trees (CART) technique was finally utilized to test a classification of six predetermined classes on the panchromatic image.

Area 4 - Machine Learning

Full Papers
Paper Nr: 4
Title:

Real-time Arabic Sign Language Recognition based on YOLOv5

Authors:

Sabrina Aiouez, Anis Hamitouche, Mohamed S. Belmadoui, Khadidja Belattar and Feryel Souami

Abstract: Sign language is the most common communication mode of the deaf and mute community. However, hearing people do not generally know this language. Automatic sign language recognition is therefore required to facilitate and better understand interactions with such people. However, one of the main challenges in this field is real-time sign recognition. For this reason, deep learning-based object detection models can be used to improve the recognition performance (in terms of time and accuracy). In this paper, we present a real-time system that allows the detection and recognition of hand postures intended for the Arabic sign language alphabet. To do so, we constructed a dataset of 28 Arabic signs containing around 15,000 images acquired with different sizes of hands, lighting conditions, backgrounds and with/without accessories. We then trained and tested different variants of YOLOv5 on the constructed dataset. The experiments conducted on our ArSL real-time recognition system show that the adapted YOLOv5 is more effective than the Faster R-CNN detector.

Paper Nr: 18
Title:

Transfer Learning Gaussian Anomaly Detection by Fine-tuning Representations

Authors:

Oliver Rippel, Arnav Chavan, Chucai Lei and Dorit Merhof

Abstract: Current state-of-the-art anomaly detection (AD) methods exploit the powerful representations yielded by large-scale ImageNet training. However, catastrophic forgetting prevents the successful fine-tuning of pretrained representations on new datasets in the semi-supervised setting, and representations are therefore commonly fixed. In our work, we propose a new method to overcome catastrophic forgetting and thus successfully fine-tune pre-trained representations for AD in the transfer learning setting. Specifically, we induce a multivariate Gaussian distribution for the normal class based on the linkage between generative and discriminative modeling, and use the Mahalanobis distance of normal images to the estimated distribution as training objective. We additionally propose to use augmentations commonly employed for vicinal risk minimization in a validation scheme to detect onset of catastrophic forgetting. Extensive evaluations on the public MVTec dataset reveal that a new state of the art is achieved by our method in the AD task while simultaneously achieving anomaly segmentation performance comparable to prior state of the art. Further, ablation studies demonstrate the importance of the induced Gaussian distribution as well as the robustness of the proposed fine-tuning scheme with respect to the choice of augmentations.
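For reference, the Mahalanobis distance mentioned above has the standard form below. The symbols are notation introduced here, not taken from the paper: x is assumed to be the feature vector of an image, and μ and Σ the mean and covariance of the Gaussian estimated for the normal class.

```latex
D_M(x) = \sqrt{(x - \mu)^{\top} \, \Sigma^{-1} \, (x - \mu)}
```

A small distance means the image's features lie close to the estimated normal-class distribution; large distances indicate potential anomalies.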

Paper Nr: 37
Title:

Image-based Lesion Classification using Deep Neural Networks

Authors:

Ákos Hermann and Zoltán Vámossy

Abstract: This research explores the topic of moles in cancer using a machine learning approach, with the aim of designing and implementing a system that can determine whether a mole shows a melanoma-like abnormality based on 2D input photographs, and thus whether further examination by a specialist is required. The target system is built around a general-purpose convolutional network, GoogleNet InceptionV3, which has been retrained for the task using a transfer learning technique. In addition to the system, an automated pre-processing phase has been defined to reduce and eliminate anomalies and noise in each sample by means of image processing operations. In conclusion, the system provided 156 correct diagnoses in 180 test cases, indicating a test accuracy of 86.67%, making it an effective melanoma diagnostic tool.

Short Papers
Paper Nr: 3
Title:

Advanced Assisted Car Driving in Low-light Scenarios

Authors:

Francesco Rundo, Roberto Leotta, Angelo Messina and Sebastiano Battiato

Abstract: The robust identification, tracking and monitoring of moving objects in the driving scenario is an extremely critical task for safe driving in the latest generation of cars. This becomes even more difficult in poor-light driving scenarios such as driving at night or in rough weather conditions. Since the detected objects could represent a significant collision risk, the aim of the proposed pipeline is to address real-time detection and tracking of salient objects in low-light driving. By using a combined time-transient non-linear deep architecture with a convolutional network embedding a self-attention mechanism, the authors are able to perform a real-time assessment of the low-light driving scenario frames. The downstream deep backbone learns such features from the driving frames, thus improved in terms of light exposure, in order to identify and segment salient objects. The implemented algorithm is being ported to a hybrid architecture consisting of an embedded system with an SPC5x Chorus device and an automotive-grade system based on the STA1295 MCU core. The collected experimental results confirmed the effectiveness of the proposed approach.

Paper Nr: 27
Title:

Using Keypoint Matching and Interactive Self Attention Network to Verify Retail POSMs

Authors:

Harshita Seth, Sonaal Kant and Muktabh M. Srivastava

Abstract: Point of Sale Materials (POSM) are the merchandising and decoration items that are used by companies to communicate product information and offers in retail stores. POSMs are part of companies’ retail marketing strategy and are often applied as stylized window displays around retail shelves. In this work, we apply computer vision techniques to the task of verifying POSMs in supermarkets by telling whether all desired components of a window display are present in a shelf image. We use Convolutional Neural Network based unsupervised keypoint matching as a baseline to verify POSM components and propose a supervised Neural Network based method that improves on the accuracy of the baseline by a large margin. We also show that the supervised pipeline is not restricted to the POSM material it is trained on and can generalize. We train and evaluate our model on a private dataset composed of retail shelf images.

Paper Nr: 2
Title:

Objects Motion Detection in Domain-adapted Assisted Driving

Authors:

Francesco Rundo, Roberto Leotta and Sebastiano Battiato

Abstract: Modern Advanced Driver Assistance Systems (ADAS) have contributed to reducing road accidents due to the driver’s inexperience or unexpected scenarios. ADAS technologies allow the intelligent monitoring of the driving scenario. Recently, estimation of the visual saliency, i.e., the part of the visual scene to which the driver pays high visual attention, has received significant research interest. This work makes further contributions to video saliency investigation for automotive applications. The difficulty of collecting robust labeled data, as well as the many features of driving scenarios, requires the use of domain adaptation methods. A new approach to Gradient-Reversal domain adaptation in deep architectures is proposed. In more detail, the proposed pipeline enables an intelligent identification and segmentation of the motion salient objects in different driving scenarios and domains. The test results confirmed the effectiveness of the overall proposed pipeline.

Paper Nr: 23
Title:

Classification of EEG Motor Imagery Tasks Utilizing 2D Temporal Patterns with Deep Learning

Authors:

Anup Ghimire and Kazim Sekeroglu

Abstract: This study aims to explore the decoding of human brain activities using EEG signals for Brain Computer Interfaces by utilizing a multi-view spatiotemporal hierarchical deep learning method. In this study, we explored the transformation of 1D temporal EEG signals into 2D spatiotemporal EEG image sequences, as well as the use of these 2D spatiotemporal EEG image sequences in the proposed multi-view hierarchical deep learning scheme for recognition. For this work, the PhysioNet EEG Motor Movement/Imagery Dataset is used. The proposed model utilizes Conv2D layers in a hierarchical structure, where a decision is made at each level individually by using the decisions from the previous level. This method is used to learn the spatiotemporal patterns in the data. The proposed model achieved competitive performance compared to the current state-of-the-art EEG Motor Imagery classification models in the binary classification paradigm. For the binary Imagined Left Fist versus Imagined Right Fist classification, we were able to achieve 82.79% average validation accuracy. This level of validation accuracy on multiple test datasets proves the robustness of the proposed model. At the same time, the models clearly show an improvement due to the use of the multi-layer and multi-perspective approach.

Paper Nr: 35
Title:

Improved Assessment of Offshore Helideck Marking Standards’ Compliance using Optimized Machine Learning Principles in the U.S. Gulf of Mexico

Authors:

Mitchell Bosman, Kazim Sekeroglu and Ghassan Alkadi

Abstract: There is an unknown number of offshore helidecks in the U.S. Gulf of Mexico that comply with a specific marking standard. This is a direct result of the lack of enforced national regulations. The purpose of this research is to improve the assessment of offshore helideck marking standards’ compliance using optimized machine learning principles. Using two different phases and employing the transfer learning approach, an optimized machine learning algorithm is generated to classify offshore helidecks from photographs into CAP 437, HSAC RP 161 or None. Results show that this model can identify the marking standards being used with an accuracy of 95.7 percent, demonstrating that the machine learning principles used can improve the assessment of offshore helideck marking standards’ compliance.

Area 5 - Applications

Full Papers
Paper Nr: 22
Title:

Reversible Fragile Medical Image Watermarking Scheme Resistant to Malicious Tampering Attacks

Authors:

Victor Fedoseev and Anna Denisova

Abstract: This paper aims to eliminate a significant drawback of existing schemes for protecting medical images from tampering using fragile watermarking: vulnerability to “malicious tampering attacks”. In such attacks, an intruder, while tampering with image content, keeps unchanged an inconspicuous additional component that contains a fragile watermark. In watermarking schemes based on least significant bit (LSB) embedding or quantization index modulation (QIM), such a component is the remainder of dividing pixel values by some number corresponding to embedding parameters. In this paper, we present a QIM-based fragile watermarking method that is resistant to malicious tampering due to variation in quantization steps. This fact is justified theoretically and confirmed experimentally. For use in real systems for processing and analyzing medical images, a reversible watermarking scheme based on this method is proposed. The reversibility property is achieved by the division of an image into a region of interest (ROI) and a region of noninterest (RONI) and dual watermarking.
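The weakness described above can be illustrated with a minimal sketch of textbook scalar QIM using a single fixed step. This is the baseline that the attack targets, not the proposed variable-step method, whose details the abstract does not give. The function names and the integer step are assumptions of this example, and clipping to the valid pixel range is not handled.

```python
def qim_embed(pixel, bit, step):
    """Move `pixel` to the nearest value congruent to bit*step mod 2*step."""
    period = 2 * step
    base = ((pixel - bit * step + step) // period) * period
    return base + bit * step


def qim_extract(pixel, step):
    """Recover the bit from the parity of the nearest multiple of `step`."""
    return ((pixel + step // 2) // step) % 2


def tamper_preserving_watermark(pixel, k, step):
    """Attack sketch: shifting by a multiple of 2*step keeps the remainder."""
    return pixel + k * 2 * step
```

Because the embedded bit depends only on the remainder modulo `2 * step`, an attacker who knows or guesses the step can change a pixel by any multiple of it without disturbing the watermark, which is the motivation for varying the quantization steps.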

Short Papers
Paper Nr: 1
Title:

Removing Automatically the Ambiguity in Wind Direction Retrieved from SAR Images

Authors:

Maria Conceição da Proença

Abstract: The evaluation of the wind resource in large areas to study the viability of wind farms is ideally carried out using synthetic aperture radar (SAR) images, in which the direction of the wind can be mapped from its effects on the water surface. Methods in use usually assume a fixed direction from a measurement for the whole image or interpolate the direction of wind fields from numerical weather models, which can be non-coincident in time with the SAR snapshot and of much lower spatial resolution. The problem remains in the directional ambiguity of 180 degrees. This work presents three indexes to identify and validate initial “anchor vectors” that could be used as an aid in the complex process of removing this ambiguity, using wind shadows in the water near the coastline. These indexes consider several hypotheses to provide for local variability such as physiographic accidents, the eccentricity of the shadows and the effect of bay-shaped areas, all quantified through image processing methods. Comparing the results with the reference wind field provided by ESA for the time of acquisition of the ENVISAT-ASAR image used, we could conclude that this is a promising line of work.

Paper Nr: 6
Title:

Two-step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real

Authors:

Yan Y. Aaren, George Bebis and Mircea Nicolescu

Abstract: The COVID-19 spread raised urgent requirements for masked face recognition and detection tasks. However, the current masked face datasets are insufficient. To alleviate the limitation of data, we proposed a two-step data augmentation that combines rule-based mask warping with unpaired image-to-image translation. Our qualitative evaluations showed that our method achieved noticeable improvements compared to the rule-based warping alone and complemented results from other state-of-the-art GAN-based generation methods, such as IAMGAN. The non-mask change loss and the noise input we used to improve training showed effectiveness. We also provided an analysis of potential future directions based on observations of our experiments.

Paper Nr: 17
Title:

Aircraft Type Recognition in Remote Sensing Images using Mean Interval Kernel

Authors:

Jaya Sharma, Rajeshreddy Datla, Yenduri Sravani, Vishnu Chalavadi and Krishna M. C.

Abstract: The representation of structural characteristics and their fine variations is crucial for the recognition of different types of aircraft in remote sensing images. Aircraft type classification across remote sensing images from different sensors, with differing spectral and spatial resolutions of objects in an image, involves variable length spatial pattern identification. In our proposed approach, we explore dynamic kernels to deal with variable length spatial patterns of aircraft in remote sensing images. A Gaussian mixture model (GMM), namely, the structure model (SM), is trained over aircraft scenes to implicitly learn the local structures using the spatial scale-invariant feature transform (SIFT) features. The statistics of the SM are used to design a dynamic kernel, namely, the mean interval kernel (MIK), to deal with the spatial changes globally in the identical scene and preserve the similarities in local spatial structures. The efficacy of the proposed method is demonstrated on the multi-type aircraft remote sensing images (MTARSI) benchmark dataset (20 distinct kinds of aircraft) using MIK. Also, we compare the performance of the proposed approach with other dynamic kernels, such as the supervector kernel (SVK) and the intermediate matching kernel (IMK).

Paper Nr: 19
Title:

An Improved YOLOv5 for Real-time Mini-UAV Detection in No Fly Zones

Authors:

Tijeni Delleji and Zied Chtourou

Abstract: In the past few years, the manufacturing technology of mini-UAVs has undergone major developments. Therefore, early warning optical drone detection, as an important part of intelligent surveillance, is becoming a global research hotspot. In this article, the authors provide a prospective study to prevent any potential hazards that mini-UAVs may cause, especially those that can carry payloads. We regard the problem of detecting and locating mini-UAVs in different environments as the problem of detecting tiny and very small objects in an aerial image. However, the accuracy and speed of existing detection algorithms do not meet the requirements of real-time detection. To solve this problem, we developed a mini-UAV detection model based on YOLOv5. The main contributions of this research are as follows: (1) a mini-UAV dataset of aerial pictures was prepared using a Dahua multi-sensor camera; (2) tiny and very small object detection layers are added to improve the model’s ability to detect mini-UAVs. The experimental results show that the overall performance of the improved YOLOv5 is better than the original. Therefore, the proposed mini-UAV detection technology can be deployed in a monitoring center in order to protect a No Fly Zone or a restricted area.

Paper Nr: 30
Title:

UAV Path Planning based on Road Extraction

Authors:

Chang Liu and Tamás Szirányi

Abstract: With the development of science and technology, UAVs are increasingly being used and serving humans, especially in the wilderness environment, due to their portability and the ease with which they can reach places that are beyond human reach. In this paper, we present a technique for drones to help humans intelligently plan routes in a field environment. Our approach is firstly based on road extraction techniques in the field of image segmentation, using the state-of-the-art D-LinkNet to extract roads from images captured by UAVs in real time. Secondly, the extracted road information is analyzed, the set of main roads and that of secondary roads are distinguished according to the width and the real-time road conditions on the ground, and different weights are assigned to them. Finally, the A star algorithm is used to calculate a route plan with weights based on the human-defined starting and ending points to obtain the optimal route. Simulations on publicly available datasets show that the method works well, providing optimal intelligent routes in real time for people in the field.
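The final planning step can be sketched as A* search over a grid of road cells with per-cell weights. The grid representation, the 4-connected neighborhood, and the specific weights for main and secondary roads are assumptions of this example; the abstract does not state how the extracted road map is encoded. A cell value of `None` marks non-road pixels, and the cost of a route is the sum of the weights of the cells entered.

```python
import heapq


def a_star(cost, start, goal):
    """Return (path, total_cost) on a weighted grid, or (None, inf)."""
    rows, cols = len(cost), len(cost[0])
    min_cost = min(c for row in cost for c in row if c is not None)

    def h(cell):
        # Manhattan distance times the cheapest weight never overestimates.
        return min_cost * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    best = {start: 0}
    parent = {}
    frontier = [(h(start), start)]
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return path[::-1], best[goal]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                g = best[cell] + cost[nr][nc]
                if g < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = g
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(frontier, (g + h((nr, nc)), (nr, nc)))
    return None, float("inf")
```

With a low weight on main roads and a high weight on secondary roads, the search prefers a longer detour along main roads when the shortcut through a secondary road costs more in total.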

Paper Nr: 31
Title:

Detection, Tracking, and Speed Estimation of Vehicles: A Homography-based Approach

Authors:

Kaleb Blankenship and Sotirios Diamantas

Abstract: In this research we present a parsimonious yet effective method to detect, track, and estimate the speed of multiple vehicles using a single camera. This research aims to determine the efficacy of homography-based speed estimations derived from details extracted from objects of interest. At first, a neural network trained to detect vehicles outputs bounding boxes. The output of the neural network serves as an input to a multi-object tracking algorithm which tracks the detected vehicles while, at the same time, their speed is estimated through a homography-based approach. This algorithm makes no assumptions about the camera, the distance to the objects, or the direction of motion of vehicles with respect to the camera. This method proves to be accurate and efficient with minimal assumptions. In particular, only the mean dimensions of a passenger vehicle are assumed to be known and, using the homography matrix derived from the corners of a vehicle, the speed of any vehicle in the frame irrespective of its motion direction and regardless of its size is able to be estimated. In addition, only a single point from each tracked vehicle is needed to infer its speed, avoiding repeatedly computing the homography matrix for each and every vehicle, thus reducing the time and computational complexity of the algorithm. We have tested our algorithm on a series of known datasets, the results from which validate the approach.
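As a rough illustration of the speed-estimation step, the sketch below maps a single tracked image point through a 3x3 homography onto a metric ground plane and converts the displacement between two frames into a speed. The homography `h` is assumed to be given here; the abstract says it is derived from the corners of a vehicle with known mean dimensions, and that derivation is not reproduced. The function names and the frame-rate parameter are hypothetical.

```python
import math


def apply_homography(h, point):
    """Project an image point (pixels) to plane coordinates (meters)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def speed_kmh(h, p_prev, p_curr, fps, frames_apart=1):
    """Speed of one tracked point between two frames, in km/h."""
    x0, y0 = apply_homography(h, p_prev)
    x1, y1 = apply_homography(h, p_curr)
    meters = math.hypot(x1 - x0, y1 - y0)
    seconds = frames_apart / fps
    return meters / seconds * 3.6
```

Once the matrix is known, each tracked vehicle needs only one point per frame, which is consistent with the abstract's remark about avoiding a repeated homography computation for every vehicle.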

Paper Nr: 33
Title:

Source Attribution of Modern Multi-camera Smartphones

Authors:

Manoranjan Mohanty

Abstract: The PRNU (Photo Response Non-Uniformity)-based source camera attribution is a useful method for verifying if a camera has taken an image (e.g., a crime image). Although this method has matured for images taken by single-camera smartphones, its usability is yet unknown for multi-camera smartphones. A multi-camera smartphone, such as the iPhone XS or Huawei P20 Pro, combines output from a number of rear cameras to provide high-quality images. In this paper, we study the effectiveness of the PRNU-based method for a multi-camera smartphone using two simple approaches: (i) multi-fingerprint verification, and (ii) mixed fingerprint verification. In the verification process, the first approach uses the fingerprint from each camera, whereas the second approach uses a mixed fingerprint that is obtained by averaging the fingerprints from all cameras. The experimental results show that the proposed approaches are useful for some camera models. For some other camera models, however, a more sophisticated method is required.
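The two verification approaches can be contrasted with a minimal sketch. Fingerprints and the noise residual of the query image are represented as flat lists of equal length, and normalized cross-correlation with a fixed threshold is used as the similarity test; both the representation and the similarity measure are assumptions of this example, since the abstract does not name the detection statistic used.

```python
import math


def mixed_fingerprint(fingerprints):
    """Element-wise average of the per-camera fingerprints."""
    n = len(fingerprints)
    return [sum(vals) / n for vals in zip(*fingerprints)]


def ncc(a, b):
    """Normalized cross-correlation of two equal-length signals."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def multi_fingerprint_match(residual, fingerprints, threshold):
    """Approach (i): accept if any single camera fingerprint matches."""
    return any(ncc(residual, f) >= threshold for f in fingerprints)


def mixed_fingerprint_match(residual, fingerprints, threshold):
    """Approach (ii): accept if the averaged fingerprint matches."""
    return ncc(residual, mixed_fingerprint(fingerprints)) >= threshold
```

Averaging dilutes each camera's contribution, so a residual dominated by one camera correlates less strongly with the mixed fingerprint than with that camera's own fingerprint; the threshold would need to be chosen with that in mind.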