Filip Pavetic
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by 3499; year 2023
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
G Team, P Georgiev, VI Lei, R Burnell, L Bai, A Gulati, G Tanzer, ...
arXiv preprint arXiv:2403.05530, 2024
Cited by 1355; year 2024
Scaling vision transformers to 22 billion parameters
M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ...
International Conference on Machine Learning, 7480-7512, 2023
Cited by 608; year 2023
PaLI-X: On scaling up a multilingual vision and language model
X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...
arXiv preprint arXiv:2305.18565, 2023
Cited by 179; year 2023
Object scene representation transformer
MSM Sajjadi, D Duckworth, A Mahendran, S Van Steenkiste, F Pavetic, ...
Advances in neural information processing systems 35, 9512-9524, 2022
Cited by 111; year 2022
FlexiViT: One model for all patch sizes
L Beyer, P Izmailov, A Kolesnikov, M Caron, S Kornblith, X Zhai, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023
Cited by 106; year 2023
PaLI-3 vision language models: Smaller, faster, stronger
X Chen, X Wang, L Beyer, A Kolesnikov, J Wu, P Voigtlaender, B Mustafa, ...
arXiv preprint arXiv:2310.09199, 2023
Cited by 86; year 2023
The auto arborist dataset: a large-scale benchmark for multiview urban forest monitoring under domain shift
S Beery, G Wu, T Edwards, F Pavetic, B Majewski, S Mukherjee, S Chan, ...
Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022
Cited by 49; year 2022
LCSk++: Practical similarity metric for long strings
F Pavetić, G Žužić, M Šikić
arXiv preprint arXiv:1407.2407, 2014
Cited by 11; year 2014
A study of autoregressive decoders for multi-tasking in computer vision
L Beyer, B Wan, G Madan, F Pavetic, A Steiner, A Kolesnikov, AS Pinto, ...
arXiv preprint arXiv:2303.17376, 2023
Cited by 8; year 2023
On scaling up a multilingual vision and language model
X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by 6; year 2024
Using machine learning to detect which part of the screen includes embedded frames of an uploaded video
F Pavetic, KHT Leung, D Tochilkin
US Patent 10,586,111, 2020
Cited by 6; year 2020
Multi-step sequence alignment
PG Anders, F Pavetic
US Patent 9,959,448, 2018
Cited by 6; year 2018
Methods, systems, and media for detecting abusive stereoscopic videos by generating fingerprints for multiple portions of a video frame
V Zamaraiev, F Pavetic
US Patent 9,872,056, 2018
Cited by 6; year 2018
Fast and simple algorithms for computing both LCSk and LCSk+
F Pavetić, I Katanić, G Matula, G Žužić, M Šikić
arXiv preprint arXiv:1705.07279, 2017
Cited by 6; year 2017
LocCa: Visual pretraining with location-aware captioners
B Wan, M Tschannen, Y Xian, F Pavetic, IM Alabdulmohsin, X Wang, ...
Advances in Neural Information Processing Systems 37, 116355-116387, 2024
Cited by 5; year 2024
Detecting multiple parts of a screen to fingerprint to detect abusive uploading videos
F Pavetic, MR Konrad, H Pasula
US Patent 10,614,539, 2020
Cited by 4; year 2020
Detecting multiple parts of a screen to fingerprint to detect abusive uploading videos
F Pavetic, MR Konrad, H Pasula
US Patent 9,972,060, 2018
Cited by 3; year 2018
Using machine learning to detect which part of the screen includes embedded frames of an uploaded video
F Pavetic, KHT Leung, D Tochilkin
US Patent 11,093,751, 2021
Cited by 2; year 2021
Scalable and Cost-Efficient Information Retrieval Architecture for Massive Datasets
F Pavetic, D Simcha, AT Voicu, F Chern, PW Sun, R Guo, HM Pasula, ...
US Patent App. 17/886,860, 2024
Cited by 1; year 2024