Charlie Ruan
Preprint
WebLLM: A High-Performance In-Browser LLM Inference Engine
Advancements in large language models (LLMs) have unlocked remarkable capabilities across various domains. However, deploying these …
Charlie F. Ruan, Yucheng Qin, Xun Zhou, Ruihang Lai, Hongyi Jin, Yixin Dong, Bohan Hou, Meng-Shiun Yu, Yiyan Zhai, Sudeep Agarwal, Hangrui Cao, Siyuan Feng, Tianqi Chen
MicroServe: A System for Microserving of LLMs
The recent advances in LLMs bring a strong demand for efficient system support to improve overall serving efficiency. As LLM inference …
Hongyi Jin, Ruihang Lai, Charlie F. Ruan, Yingcheng Wang, Todd Mowry, Xupeng Miao, Zhihao Jia, Tianqi Chen
XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models
The applications of LLM Agents are becoming increasingly complex and diverse, leading to a high demand for structured outputs that can …
Yixin Dong, Charlie F. Ruan, Yaxing Cai, Ruihang Lai, Ziyi Xu, Yilong Zhao, Tianqi Chen
Local deployment of large-scale music AI models on commodity hardware
We present MIDInfinite, a web application capable of generating symbolic music using a large-scale generative AI model locally on …
Xun Zhou, Charlie Ruan, Zihe Zhao, Tianqi Chen, Chris Donahue
Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development
Deploying machine learning (ML) on diverse computing platforms is crucial to accelerate and broaden their applications. However, it …
Siyuan Feng, Jiawei Liu, Ruihang Lai, Charlie F. Ruan, Yong Yu, Lingming Zhang, Tianqi Chen