Title | Authors | Venue | Cited by | Year |
Learning spatiotemporal features with 3D convolutional networks | D Tran, L Bourdev, R Fergus, L Torresani, M Paluri | 2015 IEEE International Conference on Computer Vision (ICCV), 4489-4497, 2015 | 4554 | 2015 |
A closer look at spatiotemporal convolutions for action recognition | D Tran, H Wang, L Torresani, J Ray, Y LeCun, M Paluri | Proceedings of the IEEE conference on Computer Vision and Pattern …, 2018 | 776 | 2018 |
Human activity recognition with metric learning | D Tran, A Sorokin, D Forsyth | Computer Vision–ECCV 2008, 548-561, 2008 | 404 | 2008 |
C3D: generic features for video analysis | D Tran, LD Bourdev, R Fergus, L Torresani, M Paluri | CoRR, abs/1412.0767 2 (7), 8, 2014 | 341 | 2014 |
Building an automatic vehicle license plate recognition system | TD Duan, TLH Du, TV Phuoc, NV Hoang | Proc. Int. Conf. Comput. Sci. RIVF, 59-63, 2005 | 327 | 2005 |
ConvNet architecture search for spatiotemporal feature learning | D Tran, J Ray, Z Shou, SF Chang, M Paluri | arXiv preprint arXiv:1708.05038, 2017 | 240 | 2017 |
Combining Hough transform and contour algorithm for detecting vehicles' license-plates | TD Duan, DA Duc, TLH Du | Proceedings of 2004 International Symposium on Intelligent Multimedia, Video …, 2004 | 170 | 2004 |
Cooperative learning of audio and video models from self-supervised synchronization | B Korbar, D Tran, L Torresani | arXiv preprint arXiv:1807.00230, 2018 | 147* | 2018 |
Detect-and-track: Efficient pose estimation in videos | R Girdhar, G Gkioxari, L Torresani, M Paluri, D Tran | Proceedings of the IEEE Conference on Computer Vision and Pattern …, 2018 | 129 | 2018 |
Deep end2end voxel2voxel prediction | D Tran, L Bourdev, R Fergus, L Torresani, M Paluri | Proceedings of the IEEE conference on computer vision and pattern …, 2016 | 115 | 2016 |
Video event detection: From subvolume localization to spatiotemporal path search | D Tran, J Yuan, D Forsyth | IEEE transactions on pattern analysis and machine intelligence 36 (2), 404-416, 2013 | 106 | 2013 |
Video classification with channel-separated convolutional networks | D Tran, H Wang, L Torresani, M Feiszli | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 105 | 2019 |
Large-scale weakly-supervised pre-training for video action recognition | D Ghadiyaram, D Tran, D Mahajan | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 94 | 2019 |
Max-margin structured output regression for spatio-temporal action localization | D Tran, J Yuan | Proceedings of the 25th International Conference on Neural Information …, 2012 | 67 | 2012 |
Optimal spatio-temporal path discovery for video event detection | D Tran, J Yuan | CVPR 2011, 3321-3328, 2011 | 63 | 2011 |
Transformation-based models of video sequences | J Van Amersfoort, A Kannan, MA Ranzato, A Szlam, D Tran, S Chintala | arXiv preprint arXiv:1701.08435, 2017 | 49 | 2017 |
Self-supervised learning by cross-modal audio-video clustering | H Alwassel, D Mahajan, B Korbar, L Torresani, B Ghanem, D Tran | arXiv preprint arXiv:1911.12667, 2019 | 43 | 2019 |
SCSampler: Sampling salient clips from video for efficient action recognition | B Korbar, D Tran, L Torresani | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2019 | 43 | 2019 |
What makes training multi-modal classification networks hard? | W Wang, D Tran, M Feiszli | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2020 | 36* | 2020 |
DistInit: Learning video representations without a single labeled video | R Girdhar, D Tran, L Torresani, D Ramanan | Proceedings of the IEEE/CVF International Conference on Computer Vision, 852-861, 2019 | 23 | 2019 |