Filip Pavetic

Cited by

	All	Since 2019
Citations	1053	1047
h-index	9	9
i10-index	9	9

Citations per year
Year	2019	2020	2021	2022	2023	2024
Citations	4	6	13	20	309	693

Co-authors

Xiaohua Zhai	Google Deepmind	Verified email at google.com
Alexander Kolesnikov	Google Deepmind	Verified email at google.com
Lucas Beyer	Google DeepMind, Google Brain, RWTH Aachen	Verified email at google.com
Michael Tschannen	Google DeepMind	Verified email at google.com
Sjoerd van Steenkiste	Research Scientist at Google Research	Verified email at google.com
Leonidas Guibas	Professor of Computer Science, Stanford University	Verified email at cs.stanford.edu
Thomas Kipf	Senior Research Scientist, Google DeepMind	Verified email at google.com
Mehdi S. M. Sajjadi	Senior Research Scientist, Google DeepMind	Verified email at google.com
Aravindh Mahendran	Google Deepmind	Verified email at google.com
Klaus Greff	Research Scientist at Google Brain	Verified email at usi.ch
Mario Lučić	Research Scientist, Google DeepMind	Verified email at google.com
Daniel Duckworth	Google DeepMind, Berlin	Verified email at google.com
Xiao Wang	Google DeepMind	Verified email at google.com
Mile Sikic	University of Zagreb, Faculty of Electrical Engineering and Computing	Verified email at fer.hr
Pavel Izmailov	OpenAI	Verified email at openai.com
Simon Kornblith	Anthropic	Verified email at anthropic.com
Matthias Minderer	Senior Research Scientist, Google DeepMind	Verified email at google.com
Mathilde Caron	Google	Verified email at google.com
Vivek Rathod	Software Engineer, Google	Verified email at google.com
Sara Beery	Assistant Professor at MIT CSAIL	Verified email at mit.edu

Filip Pavetic

Google

Verified email at google.com

machine learning, computer vision, algorithms, data


Title	Cited by	Year
Gemini: a family of highly capable multimodal models G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... arXiv preprint arXiv:2312.11805, 2023	485	2023
Scaling vision transformers to 22 billion parameters M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ... International Conference on Machine Learning, 7480-7512, 2023	261	2023
Pali-x: On scaling up a multilingual vision and language model X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ... arXiv preprint arXiv:2305.18565, 2023	83	2023
Object scene representation transformer MSM Sajjadi, D Duckworth, A Mahendran, S Van Steenkiste, F Pavetic, ... Advances in Neural Information Processing Systems 35, 9512-9524, 2022	74	2022
Flexivit: One model for all patch sizes L Beyer, P Izmailov, A Kolesnikov, M Caron, S Kornblith, X Zhai, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023	54	2023
Pali-3 vision language models: Smaller, faster, stronger X Chen, X Wang, L Beyer, A Kolesnikov, J Wu, P Voigtlaender, B Mustafa, ... arXiv preprint arXiv:2310.09199, 2023	23	2023
The auto arborist dataset: a large-scale benchmark for multiview urban forest monitoring under domain shift S Beery, G Wu, T Edwards, F Pavetic, B Majewski, S Mukherjee, S Chan, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022	22	2022
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context M Reid, N Savinov, D Teplyashin, D Lepikhin, T Lillicrap, J Alayrac, ... arXiv preprint arXiv:2403.05530, 2024	11	2024
Methods, systems, and media for detecting two-dimensional videos placed on a sphere in abusive spherical video content by tiling the sphere F Pavetic, M Konrad, R Vorushin US Patent 10,509,965, 2019	10	2019
LCSk++: Practical similarity metric for long strings F Pavetić, G Žužić, M Šikić arXiv preprint arXiv:1407.2407, 2014	8	2014
Methods, systems, and media for detecting abusive stereoscopic videos by generating fingerprints for multiple portions of a video frame V Zamaraiev, F Pavetic US Patent 9,872,056, 2018	6	2018
Fast and simple algorithms for computing both LCSk and LCSk+ F Pavetić, I Katanić, G Matula, G Žužić, M Šikić arXiv preprint arXiv:1705.07279, 2017	5	2017
Detecting multiple parts of a screen to fingerprint to detect abusive uploading videos F Pavetic, MR Konrad, H Pasula US Patent 10,614,539, 2020	4	2020
A study of autoregressive decoders for multi-tasking in computer vision L Beyer, B Wan, G Madan, F Pavetic, A Steiner, A Kolesnikov, AS Pinto, ... arXiv preprint arXiv:2303.17376, 2023	3	2023
Detecting multiple parts of a screen to fingerprint to detect abusive uploading videos F Pavetic, MR Konrad, H Pasula US Patent 9,972,060, 2018	3	2018
Methods, systems, and media for detecting abusive stereoscopic videos by generating fingerprints for multiple portions of a video frame V Zamaraiev, F Pavetic US Patent 10,499,097, 2019	1	2019
Using machine learning to detect which part of the screen includes embedded frames of an uploaded video F Pavetic, KHT Leung, D Tochilkin US Patent App. 18/520,532, 2024		2024
LocCa: Visual Pretraining with Location-aware Captioners B Wan, M Tschannen, Y Xian, F Pavetic, I Alabdulmohsin, X Wang, ... arXiv preprint arXiv:2403.19596, 2024		2024
Scalable and Cost-Efficient Information Retrieval Architecture for Massive Datasets F Pavetic, D Simcha, AT Voicu, F Chern, PW Sun, R Guo, HM Pasula, ... US Patent App. 17/886,860, 2024		2024
Using machine learning to detect which part of the screen includes embedded frames of an uploaded video F Pavetic, KHT Leung, D Tochilkin US Patent 11,829,854, 2023		2023
