Master2: 3D Representations for Deep Learning

Context and objective
Computer vision has been transformed by the adoption of machine learning techniques, in particular deep learning, for problems such as recognition and detection in images and video. However, although the benefits of applying these techniques to 3D and 4D (dynamic scene) modelling have been anticipated, they have yet to be fully and formally investigated. In particular, they are expected to increase model precision through learned priors, simplify the acquisition process by exploiting learned information, and reduce data sizes through learned statistical models.

However, learning techniques that are successful in 2D computer vision, e.g. deep convolutional networks, do not easily generalize to 3D and 4D data, since the regular grid structure of 2D images has no straightforward equivalent in 3D and 4D. To benefit from these techniques, new representations from which the properties of moving shapes can be learned must be proposed. This is the objective of this project.

Recent related work in machine learning, computer vision and computer graphics has explored two main strategies for 3D geometry. The first, spatial or extrinsic, strategy consists in embedding the shape geometry into Euclidean structures over which standard CNN tools can be applied. These can be 2D structures, as with depth image projections [1], or 3D structures, for instance voxels [2]. The second strategy instead considers spectral or intrinsic representations that provide Fourier-like decompositions of shapes over spectral domains [3,4]. These eigendecompositions then enable well-defined convolutions and multi-scale analysis over non-Euclidean manifolds.
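To make the contrast between the two strategies concrete, the following minimal sketch shows one toy instance of each. It is an illustration under stated assumptions, not the method of any cited paper: the function names `voxelize` and `spectral_filter` are hypothetical, the extrinsic part simply bins points into an occupancy grid, and the intrinsic part uses the combinatorial graph Laplacian of a dense adjacency matrix as a stand-in for a shape's Laplace-Beltrami operator.

```python
import numpy as np

def voxelize(points, resolution):
    """Extrinsic view: bin 3D points into a regular boolean occupancy grid."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    if extent == 0.0:
        extent = 1.0  # degenerate input: all points coincide
    # Map coordinates to integer cell indices in [0, resolution - 1].
    idx = np.floor((points - lo) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

def spectral_filter(adjacency, signal, response):
    """Intrinsic view: filter a per-vertex signal in the Laplacian eigenbasis."""
    adjacency = np.asarray(adjacency, dtype=float)
    # Combinatorial graph Laplacian L = D - A (assumed symmetric).
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    # Forward transform, per-frequency scaling, inverse transform.
    coefficients = eigenvectors.T @ np.asarray(signal, dtype=float)
    return eigenvectors @ (response(eigenvalues) * coefficients)

# Extrinsic example: two points in a 4 x 4 x 4 grid.
grid = voxelize([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]], resolution=4)

# Intrinsic example: low-pass filtering on a 4-vertex cycle graph.
cycle = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]])
smoothed = spectral_filter(cycle, [1.0, 0.0, 0.0, 0.0],
                           lambda ev: np.exp(-ev))
```

The occupancy grid can be fed directly to standard 3D convolutions, whereas the spectral filter depends on the eigenbasis of each individual shape, which is one reason filters learned this way are harder to transfer across shapes.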

In this project we will explore strategies in the first category, since spectral techniques have difficulties with real, noisy captured data. To this end, the volumetric representations recently introduced by the INRIA Morpheo team for shape tracking [5] will be investigated. Such representations are regular volumetric tessellations of shapes. Unlike voxels, which are attached to the observation domain, they are attached to the shapes themselves and can be consistent over time sequences as well as across different shapes, thereby enabling spatial approaches to CNNs.
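As a rough illustration of a tessellation attached to a shape rather than to the observation domain, the sketch below computes an approximate centroidal Voronoi tessellation with Lloyd's algorithm over points sampled inside a shape. This is an assumption made for illustration: it is one common way to obtain regular cells, and it is not claimed to reproduce the framework of [5]. The function name `lloyd_cells`, the unit-ball test shape and all parameter values are hypothetical.

```python
import numpy as np

def lloyd_cells(samples, n_cells, n_iters=20, seed=0):
    """Approximate a centroidal Voronoi tessellation of a sampled shape interior."""
    samples = np.asarray(samples, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the cell sites on randomly chosen interior samples.
    sites = samples[rng.choice(len(samples), size=n_cells, replace=False)].copy()
    for _ in range(n_iters):
        # Assign every sample to its nearest site (discrete Voronoi cells).
        d2 = ((samples[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each site to the centroid of its cell; empty cells keep their site.
        for k in range(n_cells):
            members = samples[labels == k]
            if len(members) > 0:
                sites[k] = members.mean(axis=0)
    # Final assignment, consistent with the returned sites.
    d2 = ((samples[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
    return sites, d2.argmin(axis=1)

# Hypothetical shape: points sampled uniformly inside the unit ball.
rng = np.random.default_rng(1)
candidates = rng.uniform(-1.0, 1.0, size=(4000, 3))
interior = candidates[np.linalg.norm(candidates, axis=1) <= 1.0]
sites, labels = lloyd_cells(interior, n_cells=16)
```

Cells obtained this way are indexed relative to the shape, not to a fixed scene grid, so the same cell index can in principle refer to the same part of the shape as it moves.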

Advisors
Jakob Verbeek, INRIA Thoth team
Edmond Boyer, INRIA Morpheo team

Student profile
A master's student in computer science or applied mathematics.
Strong programming skills.
Knowledge of and experience in machine learning and computer vision are a plus.

Time schedule
5-7 months starting from February-March 2017.

Location
INRIA Grenoble, Thoth and Morpheo teams.

How to apply
Use the job-application page on this site to apply, and provide a full CV and, if possible, references and graduation marks.

[1] Dense Human Body Correspondences Using Convolutional Networks,
L. Wei, Q. Huang, D. Ceylan, E. Vouga, H. Li, CVPR 2016.

[2] 3D ShapeNets: A Deep Representation for Volumetric Shape Modeling,
Z. Wu, S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang and J. Xiao, CVPR 2015.

[3] Learning shape correspondence with anisotropic convolutional neural networks,
D. Boscaini, J. Masci, E. Rodolà, M. Bronstein, arXiv:1605.06437, 2016.

[4] Spectral Networks and Deep Locally Connected Networks on Graphs,
J. Bruna, W. Zaremba, A. Szlam, Y. LeCun, ICLR 2014.

[5] An Efficient Volumetric Framework for Shape Tracking,
B. Allain, J.-S. Franco, E. Boyer, CVPR 2015.