VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning
Published: November 18, 2022