Abstract
With the widespread adoption of smartphones, a large number of user-generated videos are produced every day. Embedded sensors, e.g., GPS and the digital compass, make it possible to access videos based on their geo-properties. In our previous work, we created a framework for integrated, sensor-rich video acquisition (with one instantiation implemented in the form of smartphone applications) which associates a continuous stream of location and viewing direction information with the collected videos, hence allowing them to be expressed and manipulated as spatio-temporal objects. These sensor meta-data are considerably smaller in size than the visual content and help in searching effectively and efficiently for geo-tagged videos in large-scale repositories. In this study, we propose a novel three-level grid-based index structure and introduce a number of related query types, including typical spatial queries and ones based on bounded radius and viewing direction restriction. These two criteria are important in many video applications, and we demonstrate their importance with a real-world dataset. Moreover, experimental results on a large-scale synthetic dataset show that our approach can provide a significant speed improvement of at least 30%, considering a mix of queries, compared to a multi-dimensional R-tree implementation.
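To make the idea of a grid-based lookup with radius and viewing direction restrictions more concrete, the following is a minimal sketch only. It is not the three-level structure proposed in the paper: it uses a single uniform grid over planar coordinates, and the names `GridIndex`, `angle_diff`, `cell_size`, `heading_deg`, and `tolerance_deg` are hypothetical, chosen purely for illustration. Each stored record stands for one video frame with a camera position and a compass heading.

```python
import math
from collections import defaultdict


def angle_diff(a, b):
    """Smallest absolute difference between two compass headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


class GridIndex:
    """Single-level uniform grid (illustrative assumption, not the paper's index)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (cx, cy) -> list of frame records

    def _cell(self, x, y):
        # Map a planar coordinate to the integer cell that contains it.
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, frame_id, x, y, heading_deg):
        self.cells[self._cell(x, y)].append((frame_id, x, y, heading_deg % 360.0))

    def query(self, qx, qy, radius, heading_deg=None, tolerance_deg=180.0):
        """Return ids of frames within `radius` of (qx, qy).

        If `heading_deg` is given, keep only frames whose heading lies
        within `tolerance_deg` of it.
        """
        cx0, cy0 = self._cell(qx - radius, qy - radius)
        cx1, cy1 = self._cell(qx + radius, qy + radius)
        hits = []
        # Visit only the cells overlapping the bounding box of the query circle.
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for frame_id, x, y, h in self.cells.get((cx, cy), ()):
                    if math.hypot(x - qx, y - qy) > radius:
                        continue  # outside the bounded radius
                    if heading_deg is not None and angle_diff(h, heading_deg) > tolerance_deg:
                        continue  # camera points the wrong way
                    hits.append(frame_id)
        return sorted(hits)
```

In this sketch the grid only narrows down candidate cells; the exact distance and heading checks are then applied per record. A multi-level design, such as the one the abstract describes, would presumably refine this by organizing cells at several resolutions, but those details are not given here.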
Cite this article
Ma, H., Arslan Ay, S., Zimmermann, R. et al. Large-scale geo-tagged video indexing and queries. Geoinformatica 18, 671–697 (2014). https://doi.org/10.1007/s10707-013-0199-6