Duy Nguyen, from the Interactive Machine Learning department, and colleagues from the University of Stuttgart, Oldenburg University, University of California San Diego, Stanford University, and other institutions presented a full paper on accelerating transformer models at NeurIPS 2024.
NeurIPS is considered one of the premier global conferences in the field of machine learning. The conference took place from December 10 to December 15, 2024, at the Vancouver Convention Center. This year, NeurIPS received a record-breaking 15,671 paper submissions, of which 4,037 were accepted, resulting in an acceptance rate of approximately 25.76%.
The accepted work, named PiToMe, introduces a new token-merging algorithm for accelerating Transformer-based models with spectrum preservation. The authors showed that PiToMe saves 40-60% of the FLOPs of the base models while exhibiting superior off-the-shelf performance (without any training steps) on image classification (0.5% average performance drop for ViT-MAE-H, compared to 2.6% for the baselines), image-text retrieval (0.3% average performance drop for CLIP on Flickr30k, compared to 4.5% for other methods), and analogously in visual question answering with LLaVA-7B or LLaVA-13B.
Unlike previous approaches that employ Bipartite Soft Matching (BSM) with randomly partitioned token sets to identify the top-k similar tokens, which often leads to sensitivity to token-splitting strategies, PiToMe emphasizes the preservation of informative tokens by introducing a new term called energy score, inspired by spectral graph theory. This approach runs as fast as heuristic algorithms but has a theoretically grounded guarantee that essential information will be retained.
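To make the idea more concrete, the following is a minimal sketch of energy-guided token merging in plain Python. It is not the authors' implementation: the function names, the hard similarity margin of 0.5, the alternating split of merge candidates, and the unweighted averaging are all assumptions made for illustration. It only shows the general pattern described above, where tokens that sit in dense clusters of near-duplicates get a high score and are merged, while isolated tokens are kept untouched.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def energy_scores(tokens, margin=0.5):
    """Hypothetical energy score: mean similarity to the other tokens.

    Similarities below `margin` (an assumed threshold) are ignored, so a
    token in a dense cluster scores high and an isolated token scores low.
    """
    n = len(tokens)
    scores = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if i != j:
                sim = cosine(tokens[i], tokens[j])
                total += sim if sim >= margin else 0.0
        scores.append(total / (n - 1))
    return scores


def merge_tokens(tokens, k, margin=0.5):
    """Reduce the token count by k; requires 2 * k <= len(tokens)."""
    energy = energy_scores(tokens, margin)
    # The 2k highest-energy tokens are merge candidates; the rest are protected.
    order = sorted(range(len(tokens)), key=lambda i: energy[i], reverse=True)
    candidates, protected = order[: 2 * k], order[2 * k:]
    # Split candidates by alternating in energy order instead of at random.
    src, dst = candidates[0::2], candidates[1::2]
    groups = {d: [tokens[d]] for d in dst}
    for s in src:
        # Each source token joins its most similar destination token.
        best = max(dst, key=lambda d: cosine(tokens[s], tokens[d]))
        groups[best].append(tokens[s])
    # Merge each group by a plain (unweighted) average of its members.
    merged = [[sum(col) / len(g) for col in zip(*g)] for g in groups.values()]
    return [tokens[i] for i in sorted(protected)] + merged


if __name__ == "__main__":
    # Four near-duplicate tokens and one distinct token (toy 2-D features).
    toy = [[1.0, 0.0], [0.99, 0.1], [0.98, -0.1], [1.0, 0.05], [0.0, 1.0]]
    print(energy_scores(toy))
    print(merge_tokens(toy, k=2))
```

In this toy input the distinct token has no similarity above the margin, so it is protected and passed through unchanged, while the four near-duplicates are reduced to two merged tokens. A practical implementation would operate on batched tensors inside each Transformer layer and would likely track how many original tokens each merged token represents.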
As with previous editions, NeurIPS 2024 offered a rich and diverse program featuring several invited speakers, 4,037 accepted posters, 14 tutorials, and 56 workshops. Noteworthy workshops included Responsibly Building the Next Generation of Multimodal Foundation Models, GenAI for Health: Potential, Trust, and Policy Compliance, and Fine-Tuning in Modern Machine Learning: Principles and Scalability. These workshops highlighted critical advancements and challenges in interactive machine learning (IML) and related fields. Additionally, our department has further strengthened collaborations with leading biomedical research groups at Stanford University, fostering innovation at the intersection of AI and healthcare.
Image caption: The work presented at NeurIPS 2024.