MAVOS

Model Zoo

Environment and Settings

4 NVIDIA V100 GPUs for training and 1 for evaluation.
Auto-mixed precision was enabled in training but disabled in evaluation.
Test-time augmentations were not used.
The inference resolution was 480p, as in DeAOT.
Inference was fully online: each frame was passed through all modules one at a time.

Pre-trained Models

Stages:

PRE: the pre-training stage on static images, the same as in DeAOT.
PRE_YTB_DAV: the main-training stage with YouTube-VOS and DAVIS.

Model	Param	PRE	PRE_YTB_DAV (LVOS eval checkpoints)	PRE_YTB_DAV (LTV eval checkpoints)	PRE_YTB_DAV (DAVIS eval checkpoints)
MAVOS	34M	gdrive	gdrive	gdrive	gdrive
R50-MAVOS	41M	gdrive	gdrive	gdrive	gdrive
SwinB-MAVOS	91M	gdrive	gdrive	gdrive	gdrive

To run inference with our pre-trained models, set --model to the model type of your downloaded checkpoint and --ckpt_path to its file path when running eval.py.
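As a rough illustration, the invocation can be assembled as in the sketch below. Only the script name eval.py and the two flags --model and --ckpt_path come from this page; the helper build_eval_command, the placeholder value MODEL_TYPE, and the path path/to/checkpoint.pth are hypothetical and should be replaced with the model type and location of the checkpoint you actually downloaded. Any other arguments eval.py may need (dataset, split, and so on) are not covered here.

```python
import shlex
import sys
from typing import List


def build_eval_command(model: str, ckpt_path: str, script: str = "eval.py") -> List[str]:
    """Assemble the argument list for an evaluation run.

    Hypothetical helper: only the script name and the two flags are taken
    from this page; everything else is for illustration.
    """
    return [sys.executable, script, "--model", model, "--ckpt_path", ckpt_path]


# Placeholder values (assumptions): substitute the real model type and the
# real path of the downloaded checkpoint.
cmd = build_eval_command("MODEL_TYPE", "path/to/checkpoint.pth")

# Print the command so it can be inspected or pasted into a shell.
print(shlex.join(cmd))

# To actually launch it from the repository root, something like the
# following could be used:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when the checkpoint path contains spaces.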
