Results 101 - 120 of 7,008

101

SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Li, Xiangchen ; Spatharakis, Dimitrios ; Ghafouri, Saeid ; et al.

Distributed, Parallel, a... Artificial Intelligence Machine Learning Networking and Internet... 68T07, 68M14 I.2.6; C.2.4; C.1.4
Report
102

Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding
Xiao, Zikai ; Wang, Ziyang ; Ma, Wen ; et al.

Computer Science - Compu...
Report
103

Tokenized Bandit for LLM Decoding and Alignment
Shin, Suho ; Yang, Chenghao ; Xu, Haifeng ; et al.

Computer Science - Machi... Computer Science - Artif...
Report
104

AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism
Wei, Zhepei ; Chen, Wei-Lin ; Zhu, Xinyu ; et al.

Computer Science - Compu...
Report
105

Advancing Decoding Strategies: Enhancements in Locally Typical Sampling for LLMs
Sen, Jaydip ; Sengupta, Saptarshi ; Dasgupta, Subhasis

Computer Science - Compu... Computer Science - Artif...
Report
106

Accelerating Diffusion LLMs via Adaptive Parallel Decoding
Israel, Daniel ; Van den Broeck, Guy ; Grover, Aditya

Computation and Language Artificial Intelligence Machine Learning Performance
Report
107

Learn from the Past: Fast Sparse Indexing for Large Language Model Decoding
Yao, Feiyu ; Wang, Qian

Machine Learning Artificial Intelligence Computation and Language
Report
108

Active Layer-Contrastive Decoding Reduces Hallucination in Large Language Model Generation
Zhang, Hongxiang ; Chen, Hao ; Chen, Muhao ; et al.

Computation and Language Artificial Intelligence Machine Learning
Report
109

Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism
Wei, Jinhui ; Huang, Ye ; Zhou, Yuhui ; et al.

Computer Science - Distr...
Report
110

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding
Wu, Chengyue ; Zhang, Hao ; Xue, Shuchen ; et al.

Computation and Language
Report
111

Leveraging Large Language Models in Visual Speech Recognition: Model Scaling, Context-Aware Decoding, and Iterative Polishing
Liu, Zehua ; Li, Xiaolou ; Guo, Li ; et al.

Computer Science - Compu... Computer Science - Sound Electrical Engineering a...
Report
112

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding
Jung, Chaeyoung ; Jang, Youngjoon ; Chung, Joon Son

Computer Vision and Patt...
Report
113

Speculative Decoding Reimagined for Multimodal Large Language Models
Lin, Luxi ; Lin, Zhihang ; Zeng, Zhanpeng ; et al.

Computer Science - Compu... Computer Science - Artif...
Report
114

Semi-Clairvoyant Scheduling of Speculative Decoding Requests to Minimize LLM Inference Latency
Li, Ruixiao ; Chen, Fahao ; Li, Peng

Computer Science - Compu... Computer Science - Artif... Computer Science - Machi...
Report
115

On Next-Token Prediction in LLMs: How End Goals Determine the Consistency of Decoding Algorithms
Trauger, Jacob ; Tewari, Ambuj

Statistics - Machine Lea... Computer Science - Compu... Computer Science - Machi...
Report
116

Automatic Task Detection and Heterogeneous LLM Speculative Decoding
Ge, Danying ; Gao, Jianhua ; Jiang, Qizhi ; et al.

Computer Science - Compu... I.2.7
Report
117

SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models
Wu, Hang ; Zhu, Jianian ; Li, Yinghui ; et al.

Computer Science - Machi... Computer Science - Distr...
Report
118

Sparse Attention Remapping with Clustering for Efficient LLM Decoding on PIM
Fan, Zehao ; Gagnon, Garrett ; Liu, Zhenyu ; et al.

Computer Science - Compu... Computer Science - Machi...
Report
119

PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding
McDanel, Bradley ; Zhang, Sai Qian ; Hu, Yunhai ; et al.

Computer Science - Artif... Computer Science - Distr...
Report
120

Helix Parallelism: Rethinking Sharding Strategies for Interactive Multi-Million-Token LLM Decoding
Bhatia, Nidhi ; More, Ankit ; Borkar, Ritika ; et al.

Distributed, Parallel, a... Artificial Intelligence
Report