Abstract: This paper tackles the domain of multimodal prompting for visual recognition, specifically when dealing with missing modalities through multimodal Transformers. It presents two main ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results