Multimedia cloud computing W Zhu, C Luo, J Wang, S Li IEEE Signal Processing Magazine 28 (3), 59-69, 2011 | 589 | 2011 |
Florence: A new foundation model for computer vision L Yuan, D Chen, YL Chen, N Codella, X Dai, J Gao, H Hu, X Huang, B Li, ... arXiv preprint arXiv:2111.11432, 2021 | 415 | 2021 |
End-to-end semi-supervised object detection with soft teacher M Xu, Z Zhang, H Hu, J Wang, L Wang, F Wei, X Bai, Z Liu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 300 | 2021 |
An empirical study of training end-to-end vision-and-language transformers ZY Dou, Y Xu, Z Gan, J Wang, S Wang, L Wang, C Zhu, P Zhang, L Yuan, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 198 | 2022 |
GIT: A generative image-to-text transformer for vision and language J Wang, Z Yang, X Hu, L Li, K Lin, Z Gan, Z Liu, C Liu, L Wang arXiv preprint arXiv:2205.14100, 2022 | 153 | 2022 |
Scaling up vision-language pre-training for image captioning X Hu, Z Gan, J Wang, Z Yang, Z Liu, Y Lu, L Wang Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 144 | 2022 |
An empirical study of GPT-3 for few-shot knowledge-based VQA Z Yang, Z Gan, J Wang, X Hu, Y Lu, Z Liu, L Wang Proceedings of the AAAI Conference on Artificial Intelligence 36 (3), 3081-3089, 2022 | 137 | 2022 |
SEED: Self-supervised distillation for visual representation Z Fang, J Wang, L Wang, L Zhang, Y Yang, Z Liu arXiv preprint arXiv:2101.04731, 2021 | 132 | 2021 |
Order preserving hashing for approximate nearest neighbor search J Wang, J Wang, N Yu, S Li Proceedings of the 21st ACM international conference on Multimedia, 133-142, 2013 | 123 | 2013 |
Optimized Cartesian k-means J Wang, J Wang, J Song, XS Xu, HT Shen, S Li IEEE Transactions on Knowledge and Data Engineering 27 (1), 180-192, 2014 | 116 | 2014 |
TAP: Text-aware pre-training for Text-VQA and Text-Caption Z Yang, Y Lu, J Wang, X Yin, D Florencio, L Wang, C Zhang, L Zhang, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2021 | 110 | 2021 |
Facial age estimation with age difference Z Hu, Y Wen, J Wang, M Wang, R Hong, S Yan IEEE Transactions on Image Processing 26 (7), 3087-3097, 2016 | 101 | 2016 |
Hierarchically structured reinforcement learning for topically coherent visual story generation Q Huang, Z Gan, A Celikyilmaz, D Wu, J Wang, X He Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 8465-8472, 2019 | 88 | 2019 |
Anchor box optimization for object detection Y Zhong, J Wang, J Peng, L Zhang Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2020 | 85 | 2020 |
MM-REACT: Prompting ChatGPT for multimodal reasoning and action Z Yang, L Li, J Wang, K Lin, E Azarnasab, F Ahmed, Z Liu, C Liu, M Zeng, ... arXiv preprint arXiv:2303.11381, 2023 | 69 | 2023 |
Prompting GPT-3 to be reliable C Si, Z Gan, Z Yang, S Wang, J Wang, J Boyd-Graber, L Wang arXiv preprint arXiv:2210.09150, 2022 | 55 | 2022 |
Injecting semantic concepts into end-to-end image captioning Z Fang, J Wang, X Hu, L Liang, Z Gan, L Wang, Y Yang, Z Liu Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 50 | 2022 |
A distance-computation-free search scheme for binary code databases J Song, HT Shen, J Wang, Z Huang, N Sebe, J Wang IEEE Transactions on Multimedia 18 (3), 484-495, 2016 | 50 | 2016 |
UFO: A unified transformer for vision-language representation learning J Wang, X Hu, Z Gan, Z Yang, X Dai, Z Liu, Y Lu, L Wang arXiv preprint arXiv:2111.10023, 2021 | 45 | 2021 |
Compressing visual-linguistic model via knowledge distillation Z Fang, J Wang, X Hu, L Wang, Y Yang, Z Liu Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021 | 45 | 2021 |