22 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent · 11 authors
21 MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens · 7 authors
7 RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis · 11 authors