Duy Nguyen, from the Interactive Machine Learning department, and colleagues from the University of Stuttgart, Oldenburg University, University of California San Diego, Stanford University, and other institutions presented a full paper on accelerating transformer models at NeurIPS 2024.

NeurIPS is considered one of the premier global conferences in the field of machine learning. The conference took place from December 10 to December 15, 2024, at the Vancouver Convention Center. This year, NeurIPS received a record-breaking 15,671 paper submissions, of which 4,037 were accepted, resulting in an acceptance rate of approximately 25.76%.

The accepted work, PiToMe, introduces a new token-merging algorithm for accelerating Transformer-based models with spectrum preservation. We showed that PiToMe can save 40-60% of the FLOPs of the base models while exhibiting superior off-the-shelf performance (without any training steps) on image classification (a 0.5% average performance drop for ViT-MAE-H, versus 2.6% for baseline methods), image-text retrieval (a 0.3% average performance drop for CLIP on Flickr30k, versus 4.5% for other methods), and, analogously, visual question answering with LLaVA-7B and LLaVA-13B.
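To make the FLOPs figure concrete, here is a rough, back-of-the-envelope estimate (not taken from the paper) of how shrinking the token sequence layer by layer reduces compute. The model sizes (197 tokens, width 768, 12 layers) and the merge rate of 16 tokens per layer are assumptions chosen only for illustration.

```python
# Back-of-the-envelope FLOPs estimate for a ViT-B-like encoder.
# All sizes and the per-layer merge rate are illustrative assumptions.

def layer_flops(n_tokens: int, d: int) -> float:
    """Rough FLOPs of one transformer layer: attention plus a 4x-wide MLP."""
    attn = 4 * n_tokens * d * d + 2 * n_tokens * n_tokens * d  # QKV/output projections + QK^T and AV
    mlp = 8 * n_tokens * d * d                                 # two linear layers of width 4d
    return attn + mlp

def encoder_flops(n_tokens: int, d: int, n_layers: int, merge_per_layer: int = 0) -> float:
    total, n = 0.0, n_tokens
    for _ in range(n_layers):
        total += layer_flops(n, d)
        n = max(n - merge_per_layer, 1)  # token merging shrinks the sequence after each layer
    return total

base = encoder_flops(n_tokens=197, d=768, n_layers=12)
merged = encoder_flops(n_tokens=197, d=768, n_layers=12, merge_per_layer=16)
print(f"FLOPs kept: {merged / base:.1%}")  # ~55% of the baseline, i.e. roughly 45% saved
```

Varying the assumed per-layer merge rate moves the estimated savings across the 40-60% range reported above.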

Unlike previous approaches that employ Bipartite Soft Matching (BSM), which randomly partitions the token set to identify the top-k most similar tokens and is therefore sensitive to the token-splitting strategy, PiToMe emphasizes the preservation of informative tokens by introducing a new quantity called the energy score, inspired by spectral graph theory. The approach runs as fast as heuristic algorithms but comes with a theoretically grounded guarantee that essential information is retained.
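As a loose illustration of this idea (a minimal sketch, not the authors' exact formulation), the snippet below scores each token by its mean cosine similarity to all other tokens, treats the highest-scoring tokens as redundant because they sit in dense neighborhoods, and averages each of them into its most similar kept token. The function name, the energy definition, and the merge rule are all simplifying assumptions made for this example.

```python
# Minimal sketch of energy-guided token merging (illustrative only).
# The energy definition and merge rule below are simplifying assumptions,
# not the exact procedure from the PiToMe paper.
import torch
import torch.nn.functional as F

def energy_guided_merge(x: torch.Tensor, r: int) -> torch.Tensor:
    """x: (batch, n_tokens, dim). Merge away the r most redundant tokens."""
    xn = F.normalize(x, dim=-1)
    sim = xn @ xn.transpose(-1, -2)             # pairwise cosine similarities
    energy = sim.mean(dim=-1)                   # high score = dense, redundant neighborhood
    merge_idx = energy.topk(r, dim=-1).indices  # tokens selected for merging
    keep_mask = torch.ones_like(energy, dtype=torch.bool)
    keep_mask.scatter_(-1, merge_idx, False)

    out = []
    for b in range(x.size(0)):
        kept, merged = x[b, keep_mask[b]], x[b, ~keep_mask[b]]
        # Absorb each merge candidate into its most similar kept token, then average.
        target = (F.normalize(merged, dim=-1) @ F.normalize(kept, dim=-1).T).argmax(dim=-1)
        pooled = kept.clone()
        pooled.index_add_(0, target, merged)
        counts = torch.ones(kept.size(0), device=x.device)
        counts.index_add_(0, target, torch.ones(merged.size(0), device=x.device))
        out.append(pooled / counts.unsqueeze(-1))
    return torch.stack(out)

# Usage: drop half of a 196-token ViT sequence in a single merging step.
tokens = torch.randn(2, 196, 768)
print(energy_guided_merge(tokens, r=98).shape)  # torch.Size([2, 98, 768])
```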

As with previous editions, NeurIPS 2024 offered a rich and diverse program featuring several invited speakers, 4,037 accepted posters, 14 tutorials, and 56 workshops. Noteworthy workshops included Responsibly Building the Next Generation of Multimodal Foundation Models; GenAI for Health: Potential, Trust, and Policy Compliance; and Fine-Tuning in Modern Machine Learning: Principles and Scalability. These workshops highlighted critical advances and challenges in interactive machine learning (IML) and related fields. Additionally, our department has further strengthened collaborations with leading biomedical research groups at Stanford University, fostering innovation at the intersection of AI and healthcare.

The work presented at NeurIPS 2024