Abstract
Registration maps, or warps, form a key element in Shape-from-Template (SfT). They relate the template to the input image, which contains the projection of the deformed surface. Recently, it was shown that isometric SfT can be solved analytically if the warp and its first-order derivatives are known. In practice, the warp is recovered by interpolating a set of discrete template-to-image point correspondences. This process relies on smoothness priors but ignores the 3D geometry, which may produce errors in the warp and poor reconstructions. In contrast, we propose to create a 3D-consistent warp. This is technically very challenging, because the 3D shape variables must be eliminated from the isometric SfT equations to find differential constraints on the warp only. Integrating these constraints into warp estimation yields the isowarp, a warp that is 3D-consistent with isometric SfT. Experimental results show that incorporating the isowarp in the SfT pipeline allows the analytic solution to outperform non-convex 3D shape refinement methods and the recent DNN-based SfT methods. The isowarp can be properly initialized with convex methods and its hyperparameters can be obtained automatically with cross-validation. The isowarp is resistant to 3D ambiguities and less computationally expensive than existing 3D shape refinement methods. The isowarp is thus a theoretical and practical breakthrough in SfT.
Acknowledgements
This research has received funding from the Spanish Ministry of Education and Culture under the scholarship FPU, the Spanish Ministry of Economy, Industry and Competitiveness under the project ARTEMISA (TIN2016-80939-R) and the EU’s FP7 through the ERC research grant 307483 FLEXABLE.
Additional information
Communicated by Joachim Weickert.
Appendices
Derivation of the Image Embedding
We assume that the image plane is at \(z=1\) in camera coordinates, which is achieved by working in retinal coordinates. The perspective projection \(\Pi _{p}\) of a point \((x, y, z)\) is then given by:
\[\Pi _{p}(x,y,z) = \left( \frac{x}{z},\, \frac{y}{z}\right) .\]
The inverse of the restriction \(\Pi _{p}|_{{\mathcal {S}}}:{\mathcal {S}} \rightarrow {\mathbb {R}}^2\) of \(\Pi _{p}\) is the image embedding. It forms a depth-based parametrization \(({\mathcal {I}},X_{i})\) of the surface, expressed in terms of the depth function \(\rho :{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\):
\[X_{i}(u',v') = \rho (u',v')\,(u',v',1)^{\top },\]
where \(u'\) and \(v'\) represent the image coordinates. As an alternative to \(\rho \), we define the Euclidean distance between the camera’s projection origin and the surface point as \({\tilde{\rho }}:{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\):
\[{\tilde{\rho }}(u',v') = \rho (u',v')\,\zeta (u',v'),\]
where \(\zeta (u',v')=\sqrt{1+u'^2+v'^2}\). The perspective parametrization \(({\mathcal {I}},X_i)\) can then be expressed in terms of \({\tilde{\rho }}\) as:
\[X_{i}(u',v') = \frac{{\tilde{\rho }}(u',v')}{\zeta (u',v')}\,(u',v',1)^{\top }.\]
Now, we can define the surface \({\mathcal {S}}\) from the template parametrization domain \({\mathcal {U}}\) by composing the previous parametrization \(X_i\) with the warp function \(\eta \) as follows:
\[{\bar{X}}_{i}(u,v) = (X_{i}\circ \eta )(u,v),\]
where \(u\) and \(v\) are the template domain coordinates.
Defining the function \({\bar{\rho }}:{\mathbb {R}}^2 \rightarrow {\mathbb {R}}\) as the distance function \({\tilde{\rho }}\) expressed in \((u,v)\) coordinates through the composition \({\bar{\rho }} = {\tilde{\rho }}\circ \eta \), we obtain:
\[{\bar{X}}_{i}(u,v) = \frac{{\bar{\rho }}(u,v)}{\zeta (\eta (u,v))}\begin{pmatrix}\eta (u,v)\\ 1\end{pmatrix}.\]
Working with the parametrization \(({\mathcal {U}},{\bar{X}}_i)\) of \({\mathcal {S}}\) has two principal advantages. First, it allows us to compute the first fundamental form, also known as the metric tensor, over the same parametrization domain as the template, which is essential to obtain the isowarp equations. Second, it greatly simplifies these equations.
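As a purely illustrative numerical check of this parametrization, the following Python sketch evaluates the surface point \({\bar{\rho }}(u,v)/\zeta (\eta (u,v))\cdot (\eta (u,v),1)\) that follows from the definitions above, and approximates the first fundamental form by central finite differences. All function names, the step size and the fronto-parallel plane used as test surface are assumptions made for this example; they are not part of the original derivation or of the authors' code.

```python
import math


def zeta(up, vp):
    """Norm of the sight-line vector (u', v', 1)."""
    return math.sqrt(1.0 + up * up + vp * vp)


def project(x, y, z):
    """Perspective projection onto the image plane z = 1."""
    return x / z, y / z


def embed(eta, rho_bar, u, v):
    """Surface point at template coordinates (u, v).

    eta maps (u, v) to image coordinates (u', v'); rho_bar gives the
    Euclidean distance from the camera centre to the surface point.
    """
    up, vp = eta(u, v)
    scale = rho_bar(u, v) / zeta(up, vp)  # equals the depth of the point
    return (scale * up, scale * vp, scale)


def first_fundamental_form(eta, rho_bar, u, v, h=1e-5):
    """Coefficients (E, F, G) of the metric tensor, by central differences."""
    x_u = [(a - b) / (2 * h)
           for a, b in zip(embed(eta, rho_bar, u + h, v),
                           embed(eta, rho_bar, u - h, v))]
    x_v = [(a - b) / (2 * h)
           for a, b in zip(embed(eta, rho_bar, u, v + h),
                           embed(eta, rho_bar, u, v - h))]
    E = sum(a * a for a in x_u)
    F = sum(a * b for a, b in zip(x_u, x_v))
    G = sum(b * b for b in x_v)
    return E, F, G


# Hypothetical test surface: a flat template seen fronto-parallel at depth 2,
# so that the template point (u, v) sits at the 3D point (u, v, DEPTH).
DEPTH = 2.0


def eta_plane(u, v):
    return u / DEPTH, v / DEPTH


def rho_bar_plane(u, v):
    return math.sqrt(u * u + v * v + DEPTH * DEPTH)
```

For this undeformed plane, `first_fundamental_form(eta_plane, rho_bar_plane, 0.3, -0.2)` should return values close to \(E=1\), \(F=0\), \(G=1\), which is the metric of the flat template, and `project(*embed(eta_plane, rho_bar_plane, 0.3, -0.2))` should reproduce the warp value `eta_plane(0.3, -0.2)`.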
Derivation of the Isowarp Equations
We give Matlab code to establish the Isowarp equations (11). These equations are too lengthy to be reproduced in fully expanded form. However, we recall that, importantly, they depend only on the known template and the unknown warp \(\eta \). More specifically, they are quadratic differential equations of second order in \(\eta \).
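To indicate where constraints of this kind can come from, the following sketch writes the isometry condition for the parametrization \(({\mathcal {U}},{\bar{X}}_i)\) in a generic notation. The symbols \(n\), \(J_n\) and \(G\) are introduced here for illustration only and are assumptions of this sketch; it does not reproduce equations (11). Writing \({\bar{X}}_i = {\bar{\rho }}\, n\) with the unit sight-line vector \(n = (\eta ^{\top },1)^{\top }/\zeta (\eta )\), whose Jacobian \(J_n\) depends only on \(\eta \) and its first-order derivatives, the Jacobian of \({\bar{X}}_i\) is \(n\,\nabla {\bar{\rho }}^{\top } + {\bar{\rho }}\,J_n\). Since \(n^{\top }n=1\) implies \(n^{\top }J_n=0\), the first fundamental form reduces to:
\[J_{{\bar{X}}_i}^{\top }J_{{\bar{X}}_i} = \nabla {\bar{\rho }}\,\nabla {\bar{\rho }}^{\top } + {\bar{\rho }}^{2}\,J_n^{\top }J_n .\]
If \(G\) denotes the first fundamental form of the template over \({\mathcal {U}}\), isometry then requires:
\[\nabla {\bar{\rho }}\,\nabla {\bar{\rho }}^{\top } + {\bar{\rho }}^{2}\,J_n^{\top }J_n = G .\]
This symmetric \(2\times 2\) identity gives three scalar equations in which the 3D shape enters only through \({\bar{\rho }}\) and its two first-order derivatives. These are the shape variables that have to be eliminated in order to obtain constraints on \(\eta \) alone.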
Cite this article
Casillas-Perez, D., Pizarro, D., Fuentes-Jimenez, D. et al. The Isowarp: The Template-Based Visual Geometry of Isometric Surfaces. Int J Comput Vis 129, 2194–2222 (2021). https://doi.org/10.1007/s11263-021-01472-w
DOI: https://doi.org/10.1007/s11263-021-01472-w