Early exits for LLMs

Q: Can we speed up generation with early exiting?

Idea: Pre-train models with a classification head after every layer, so that generation can stop at an intermediate layer instead of always running the full stack
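A minimal sketch of what this could look like, in plain Python with untrained random weights. Everything here is an assumption made for illustration: the class name ToyEarlyExitModel, the tanh "layer", averaging the per-head cross-entropy as the training loss, and exiting as soon as a head's top softmax probability reaches a threshold. The idea itself does not specify an exit rule or a loss weighting.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class ToyEarlyExitModel:
    """Hypothetical toy stack in which every layer has its own classification head."""

    def __init__(self, num_layers, hidden, vocab, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

        self.layers = [rand(hidden, hidden) for _ in range(num_layers)]
        self.heads = [rand(vocab, hidden) for _ in range(num_layers)]

    def per_layer_probs(self, h):
        """Lazily yield the next-token distribution predicted after each layer."""
        for layer, head in zip(self.layers, self.heads):
            # Toy stand-in for a transformer block: add the layer output to the
            # input and squash with tanh.
            h = [math.tanh(a + b) for a, b in zip(h, matvec(layer, h))]
            yield softmax(matvec(head, h))

    def training_loss(self, h, target):
        """Assumed objective: cross-entropy averaged over all per-layer heads."""
        losses = [-math.log(p[target]) for p in self.per_layer_probs(h)]
        return sum(losses) / len(losses)

    def predict(self, h, threshold):
        """Return (token, layers_used), exiting at the first confident head."""
        for depth, probs in enumerate(self.per_layer_probs(h), start=1):
            token = max(range(len(probs)), key=probs.__getitem__)
            # Because per_layer_probs is a generator, returning here means the
            # remaining layers are never computed.
            if probs[token] >= threshold or depth == len(self.layers):
                return token, depth


model = ToyEarlyExitModel(num_layers=4, hidden=4, vocab=5)
h0 = [0.1, -0.2, 0.3, 0.4]
print("loss over all heads:", model.training_loss(h0, target=2))
print("token, layers used:", model.predict(h0, threshold=0.5))
```

The speedup in this sketch comes only from skipping the layers after the exit point. A real decoder would also have to decide what to do with the key/value cache entries of the skipped layers for later tokens, which the sketch does not address.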

Notes:

I really feel that this has been done, but I could not find an example

In CV, these early exits were used before we had residual streams (see Figure 3 of GoogLeNet, https://arxiv.org/abs/1409.4842)

There's a connection to multi-token prediction (MTP, https://arxiv.org/abs/2404.19737), which was first scaled up in DeepSeek-V3 (https://arxiv.org/abs/2412.19437v2)

Could be a nice connection to continuous CoTs

If you are inspired by this idea, you can reach out to the authors for collaboration or cite it:

@misc{heineman-early-exits-for-2025,
  author = {Heineman, David},
  title = {Early exits for LLMs},
  year = {2025},
  url = {https://hypogenic.ai/ideahub/idea/NAjgyTEHZWbgWBRdPslf}
}