At QCon San Francisco 2024, Ye (Charlotte) Qi of Meta spoke about scaling large language model (LLM) serving ...