Hardik Jain

Dr.-Ing. Hardik Jain

Assistant Professor in the CSIS Department at BITS Pilani, Rajasthan

PhD in 3D Computer Vision from Technische Universität Berlin, Germany

Doctoral Thesis

Deep Learning Mesh Parameterization of 3D Shapes: For 3D Reconstruction, Shape Generation, Noise Filtering, and Mobile Rendering

Technische Universität Berlin, Germany (Dec 2022)

Thesis Presentation

Deep learning has made remarkable progress in extracting meaningful information from 2D visual sensory data. The regular grid representation of 2D image data makes it convenient to apply efficient convolutional kernels. By comparison, 3D data, and specifically the widely used surface meshes, are less utilized in deep learning applications. Along with the added dimension, this is attributed to the irregular representation of the mesh structure. This thesis is motivated by the performance of 2D convolutional kernels on regular grids and tries to employ them for 3D surface meshes. The inherently irregular connectivity of a surface mesh can be regularized by borrowing the concept of mesh parameterization from the computer graphics domain. The regularized surface meshes are used to train deep convolutional neural networks for three tasks: 3D reconstruction from a single image, generative modeling with an icosahedral mesh convolutional network, and icosahedral mesh denoising. The efficient networks for 3D surface meshes proposed in this thesis use 2D convolutional kernels, which makes inference possible on low-compute devices. To demonstrate this capability, an Android application for the single-image 3D reconstruction network is designed and validated.

Research Publications

IMD-Net: a Deep Learning-based Icosahedral Mesh Denoising Network

IEEE Open Access Journal, April 2022

Jan Botsch, Hardik Jain, Olaf Hellwich

Paper Code

In this work, we propose a novel denoising technique, the icosahedral mesh denoising network (IMD-Net), for closed genus-0 meshes. A preprocessing step, exploiting the homeomorphism between a genus-0 mesh and the sphere, remeshes an irregular mesh using the regular mesh structure of a frequency-subdivided icosahedron. Enabled by gauge-equivariant convolutional layers arranged in a residual U-net, IMD-Net denoises the remeshed surface in a way that is invariant to global mesh transformations as well as to local feature constellations and orientations, and does so with the computational complexity of a traditional conv2D kernel. The network is equipped with a carefully crafted loss function that leverages differences between the positional, normal and curvature fields of the target and noisy meshes in a numerically stable fashion. For the first time, two large shape datasets commonly used in related fields, ABC and ShapeNetCore, are introduced to evaluate mesh denoising. IMD-Net's competitiveness with existing state-of-the-art techniques is established using both metric evaluations and visual inspection of denoised models.
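As a small illustration of the regular structure that a frequency-subdivided icosahedron provides, the sketch below computes its vertex, edge and face counts from the standard closed-form expressions and checks Euler's formula for a genus-0 surface. It is not taken from the IMD-Net code; the function names `icosahedron_counts` and `euler_characteristic` are hypothetical and chosen only for this example.

```python
from typing import Tuple


def icosahedron_counts(frequency: int) -> Tuple[int, int, int]:
    """Return (vertices, edges, faces) of a frequency-subdivided icosahedron.

    Each of the 20 triangular faces is split into frequency**2 smaller
    triangles, which gives the closed-form counts used below.
    """
    if frequency < 1:
        raise ValueError("frequency must be a positive integer")
    f2 = frequency * frequency
    vertices = 10 * f2 + 2
    edges = 30 * f2
    faces = 20 * f2
    return vertices, edges, faces


def euler_characteristic(vertices: int, edges: int, faces: int) -> int:
    """V - E + F, which equals 2 for any closed genus-0 mesh."""
    return vertices - edges + faces


if __name__ == "__main__":
    for freq in (1, 2, 4, 8):
        v, e, f = icosahedron_counts(freq)
        print(freq, v, e, f, euler_characteristic(v, e, f))
```

Because the counts depend only on the subdivision frequency, every remeshed shape shares the same connectivity, which is what makes fixed-size convolutional kernels applicable.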

GenIcoNet: Generative Icosahedral Mesh Convolutional Network

International Conference on 3D Vision (3DV), London, UK, November 2021

Hardik Jain, Olaf Hellwich

Paper Supplementary Code Dataset Oral Presentation

In the past few decades, the computer vision domain has achieved outstanding success in learning 3D shapes for classification, segmentation and image-based reconstruction. However, deep networks are less explored for the generative task of obtaining new 3D shapes from a learned representation. This problem becomes more prominent for 3D shapes represented as surface meshes, mainly because the mesh structure lacks regularity, an essential property for training deep generative networks. In this work, we remedy this problem by proposing a generative icosahedral mesh convolutional network (GenIcoNet) that learns the data distribution of surface meshes. Our end-to-end trainable network learns semantic representations using 2D convolutional filters on the regularized icosahedral meshes. Our experiments demonstrate that both GenIcoNet architectures outperform existing networks trained on intermediate surface mesh representations.

Learning to Reconstruct Symmetric Shapes using Planar Parameterization of 3D Surface

IEEE International Conference on Computer Vision Workshops (ICCVW), Korea, November 2019

Hardik Jain, Manuel Wöllhaf, Olaf Hellwich

Paper Supplementary Poster Code

In this work, we reconstruct a 3D shape from images by using a parameterized representation of the shape. We perform iterative parameterization of the surface to obtain a planar representation. This representation is encoded with surface information to generate 2D geometry images, which can be conveniently learned using traditional deep neural networks without additional overhead. Our experiments demonstrate that the proposed network learns detailed features and is able to reconstruct geometrically accurate shapes from a single image.

Improving 3D Face Geometry by Adapting Reconstruction From Stereo Image Pair to Generic Morphable Model

International Conference on Information Fusion (FUSION), Germany, July 2016

Hardik Jain, Olaf Hellwich, R S Anand

Stereo reconstruction from image pairs is a standard method for 3D acquisition of human faces. Depending on the available imagery and the accuracy requirements, the resulting 3D reconstructions may have deficits. In this work, we remedy such deficits by combining the 3D stereo reconstruction with a generic Morphable Model. Prior shape information can be obtained with existing methods, which use landmarks to fit a morphable model to a single image, resulting in a second 3D reconstruction. This alternative is combined with the stereo reconstruction, so that information from the single-image reconstruction is preferred whenever the stereo reconstruction shows atypical deviations from the expected 3D features of a human face.

Master Thesis

Using Morphable Face Model to Improve Stereo Reconstruction and Visualising the Model on a Smartphone

IIT Roorkee, India (May 2016)

Thesis Presentation

Stereo reconstruction from an image pair is a standard method for 3D acquisition of human faces. However, stereo surface reconstructions often contain many holes and carry limited texture information. In this work, we remedy such deficits by combining the 3D stereo reconstruction with a generic Morphable Face Model. A major part of the thesis is devoted to improving the stereo face reconstruction pipeline by preferring information from the single-image reconstruction whenever the stereo reconstruction shows atypical deviations. Of the pair of stereo images, one is used for the single-image reconstruction, while the pair together gives the stereo model. The two reconstructions are then combined into a deformed face model, and this fusion yields a geometrically more accurate face reconstruction. Finally, the resulting deformed face model is presented visually on a smartphone using cardboard, which addresses the modern trend of low-cost devices for virtual 3D visualization.

Publications from Accomplished Projects

Calibration and Registration Method for Tomography-Based Laser-Guided Surgical Interventions using a 4-DOF Navigation Robot

IEEE International Symposium on Computer-Based Medical Systems (CBMS), June 2023

Samuel Müller, Olaf Hellwich, Daniel Szymanski, Hardik Jain, Timo Krüger

Atlas Medical Technologies, Berlin, Germany

A method for co-registering pre-surgically acquired tomographic data with a patient at the time of surgical intervention is introduced. The system using this method consists of a registration element fixed to the skin of the patient at scan and intervention time, a camera, and a laser pointing device movable on a bow rail encircling the body of the patient. The camera and laser bow are firmly mounted and calibrated to each other. Pre-operative planning is done using the tomographic data. After the patient is moved out of the gantry, the laser ray is steered to indicate the position and orientation of a linear instrument (e.g. a needle) on the patient’s skin. It is shown that the achievable accuracy of the method presented in this paper is sufficient for periradicular therapy.

Laser-Guided CT Intervention using Flexible Laser Bow

DGBMT Annual Conference on Biomedical Engineering (BMT), October 2021

Hardik Jain, Olaf Hellwich, Daniel Szymanski, Andreas Rose, Timo Krüger

Atlas Medical Technologies, Berlin, Germany

Paper Poster

Laser bow systems are used for intervention guidance by steering laser rays. Such systems have found applications in biopsy, nasal surgery, drainage and pain therapy. In this paper, we introduce an image-based laser-guided CT intervention. Instead of relying on a traditional laser bow rigidly mounted to the CT scanner, our approach performs registration using a patient tracker visible in both the image and the CT data to find the orientation of the laser bow with respect to the CT data. The image is acquired by a camera rigidly connected to the laser bow. The proposed approach allows the laser bow system to be moved without compromising accuracy. Using a camera also provides a real-time view of the intervention and can support future applications.

Power Efficiency Enhancements of a Multi-Bit Accelerator for Memory Prohibitive Deep Neural Networks

IEEE Open Journal of Circuits and Systems (OJCAS), January 2021

Suhas Shivaprakash, Hardik Jain, Olaf Hellwich, Friedel Gerfers

Chair of Mixed Signal Circuit Design, Technische Universität Berlin, Germany

Journal Paper Conference Paper Oral Presentation

State-of-the-art deep neural network (DNN) models are both memory prohibitive and computationally intensive, with millions of connections. Deploying these models in an embedded mobile application is limited by available resources, owing to their large power consumption and significant bandwidth requirements. In this paper, we propose a power-efficient multi-bit neural network accelerator that truncates the partial sum (PSum) results from the previous layer before feeding them into the next layer.
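The idea of truncating partial sums can be sketched in a few lines. The example below assumes integer (fixed-point) weights and activations and assumes that truncation means discarding low-order bits with an arithmetic right shift; the abstract does not specify the exact scheme, so the function names and the `drop_bits` parameter are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def partial_sum(weights: Sequence[int], activations: Sequence[int]) -> int:
    """Integer multiply-accumulate over one set of weights and activations."""
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    return sum(w * a for w, a in zip(weights, activations))


def truncate_psum(psum: int, drop_bits: int) -> int:
    """Drop the lowest `drop_bits` bits of a partial sum (assumed scheme).

    Python's >> on integers is an arithmetic shift, so negative values
    are rounded toward negative infinity.
    """
    if drop_bits < 0:
        raise ValueError("drop_bits must be non-negative")
    return psum >> drop_bits


if __name__ == "__main__":
    psum = partial_sum([3, -2, 5], [10, 20, 30])
    # A narrower value is what would be passed on to the next layer.
    print(psum, truncate_psum(psum, 3))
```

Passing fewer bits between layers reduces the width of the data that has to be stored and moved, at the cost of a bounded rounding error per partial sum.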

Patient Motion Compensation for Photogrammetric Registration

International Conference on Computer Vision Theory and Applications (VISAPP), Malta, February 2020

Hardik Jain, Olaf Hellwich, Andreas Rose, Nicholas Norman, Dirk Much, Timo Krüger

Fiagon GmbH, Hennigsdorf, Germany

Paper Presentation Oral Presentation

In this work, we address multi-view monocular imaging that captures both the body surface and reference markers. To fulfill the high accuracy requirements, the patient is not supposed to move while images are taken. One way to relax this demanding requirement is to measure small movements of the patient, e.g. with the help of an electromagnetic device, and to compensate for the measured motion prior to body surface triangulation. We present two approaches for motion compensation, disparity shift compensation and moving cameras compensation, both capable of achieving patient registration qualitatively equivalent to motion-free registration.

Passive Classification of Source Printer using Text-line-level Geometric Distortion Signatures from Scanned Images of Printed Documents

Multimedia Tools and Applications (MTAP), Springer, December 2019

Hardik Jain, Sharad Joshi, Gaurav Gupta, Nitin Khanna

Department of Electrical Engineering, IIT Gandhinagar, India

Paper Journal Paper

In this digital era, one convention that still holds is the printed archive. Printed documents are used in many critical domains, such as contract papers, legal tenders and proof-of-identity documents. As more advanced printing, scanning and image editing techniques become available, forgeries of these legal tenders pose a serious threat. The ability to easily and reliably identify the source printer of a printed document can go a long way toward reducing this menace.

An Enhanced Statistical Approach for Median Filtering Detection using Difference Image

IEEE International Conference on Identity, Security and Behavior Analysis (ISBA), India, February 2017

Hardik Jain, Joydeep Das, Hemant Kumar Verma, Nitin Khanna

Department of Electrical Engineering, IIT Gandhinagar, India

Paper Presentation

In image forensics, the detection of image forgeries involving non-linear manipulations has received a great deal of interest in the recent past. Median filtering (MF) is one such non-linear manipulation technique, often used in a number of applications, for example to hide impulse noise. Unlike linear filtering operations, the non-linear characteristics of median filtering make it harder to detect using traditional forensic methods designed for detecting linear operations.
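To make the effect of median filtering concrete, the sketch below applies a window-3 median filter to a single image row containing an impulse and computes the first-order difference of the row before and after filtering. It is a simplified 1D illustration, not the detector proposed in the paper; the function names and the edge-replication padding are assumptions made for this example.

```python
from statistics import median
from typing import List, Sequence


def median_filter_1d(signal: Sequence[int], window: int = 3) -> List[int]:
    """Median filter with an odd window; edges are padded by replication."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if not signal:
        return []
    half = window // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]


def difference_1d(signal: Sequence[int]) -> List[int]:
    """First-order difference between neighbouring samples."""
    return [b - a for a, b in zip(signal, signal[1:])]


if __name__ == "__main__":
    row = [10, 10, 200, 10, 10]           # one impulse in a flat region
    filtered = median_filter_1d(row)
    print(difference_1d(row))              # large jumps around the impulse
    print(difference_1d(filtered))         # impulse removed by the filter
```

The filter removes the impulse without averaging it into its neighbours, which is the non-linear behaviour the abstract refers to and which leaves a different statistical trace in the difference signal than a linear filter would.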