In this paper, we address the different conceptual and technical difficulties encountered when embedding CHR into an imperative host language. We argue that a tight, natural integration leads to a powerful programming language extension, intuitive to both CHR and imperative programmers. We show how to compile CHR to highly optimized imperative code. To this end, we first review the well-established CHR compilation scheme, and survey the large body of possible optimizations. We then show that this scheme, when used for compilation to imperative target languages, leads to stack overflows. We therefore introduce new optimizations that considerably improve the performance of recursive CHR programs. Rules written using tail calls are even guaranteed to run in constant space. We implemented systems for both Java and C, following the language design principles and compilation scheme presented in this paper, and show that our implementations outperform other state-of-the-art CHR compilers by several orders of magnitude.