Can a Student Large Language Model Perform as Well as Its Teacher?
Abstract
The burgeoning complexity of contemporary deep learning models, while achieving unparalleled accuracy, has introduced deployment challenges in resource-constrained environments. Knowledge distillation, in which a compact student model is trained to reproduce the behavior of a larger teacher model, offers a way to ease this tension. Through meticulous examination, the authors elucidate the critical determinants of successful distillation, including the architecture of the student model, the caliber of the teacher, and the delicate balance of hyperparameters. While acknowledging its profound advantages, they also delve into the complexities and challenges inherent in the process. The exploration underscores knowledge distillation's potential as a pivotal technique for optimizing the trade-off between model performance and deployment efficiency.
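To make the hyperparameter balance concrete, the sketch below shows a standard knowledge-distillation objective of the kind the abstract alludes to: a temperature-softened KL term that pushes the student toward the teacher's output distribution, mixed with ordinary cross-entropy on the ground-truth labels. This is a minimal illustration assuming a PyTorch setup; the function name and the temperature and alpha values are illustrative, not taken from the chapter.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 as in standard KD.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard-target term: ordinary cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Alpha balances imitating the teacher against fitting the labels.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

Raising the temperature smooths the teacher's distribution so the student also learns from the relative probabilities of incorrect classes, while alpha controls how much the student relies on the teacher versus the labeled data.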