Abstract:
Multiple views play a vital role in 3D pose estimation tasks. Ideally, multi-view 3D pose estimation should directly use naturally collected videos for pose estimation. However, due to the constraints of video synchronization, existing methods often rely on expensive hardware to synchronize camera triggering, which restricts most 3D pose collection scenarios to indoor settings. Some recent works train deep neural networks to align desynchronized datasets derived from synchronized cameras, but they achieve only frame-level accuracy. For fractional-frame video synchronization, this work proposes an Inter-Frame and Intra-Frame Desynchronized Dataset (IFID), which labels fractional time intervals between two video clips. IFID is the first dataset to annotate inter-frame and intra-frame intervals, with a total of 382,500 annotated video clips, making it the largest dataset to date. We also develop a novel Transformer-based model, named InSynFormer, for inter-frame and intra-frame synchronization. Extensive experimental evaluations demonstrate its promising performance. The dataset and source code of the model are available at https://github.com/yuxuan-cser/InSynFormer. Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Source: Proceedings of the AAAI Conference on Artificial Intelligence
ISSN: 2159-5399
Year: 2024
Issue: 4
Volume: 38
Page: 3828-3836
Language: English
SCOPUS Cited Count: 2
ESI Highly Cited Papers on the List: 0