An intercomparison of different approaches for the construction and calibration of lumped conceptual rainfall-runoff models is made based on two case studies with unrelated meteorological and hydrological characteristics, located in Belgium and Kenya. While a model with a predefined "one-size-fits-all" structure is traditionally used in lumped conceptual rainfall-runoff modeling, this paper shows the advantages of inferring the model structure from data or field evidence in a case-specific and step-wise way, using non-commensurable measures derived from the observed series. The step-wise model structure identification method does not lead to higher accuracy than the traditional approach when evaluated using common statistical criteria such as the Nash-Sutcliffe efficiency. The method is, however, well suited to producing a well-balanced calibration that gives accurate results for a wide range of runoff properties: total flows, quick and slow subflows, cumulative volumes, peak flows, low flows, frequency distributions of peak and low flows, and changes in quick flows for given changes in rainfall. It is furthermore shown that model performance evaluation procedures that account for the serial dependency and homoscedasticity of the flow residuals are to be preferred. Explicit evaluation of model results for peak and/or low flow extremes, and for changes in these extremes, makes the models useful for impact investigations on such hydrological extremes.
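For reference, the Nash-Sutcliffe efficiency named above is commonly defined as one minus the ratio of the sum of squared flow residuals to the sum of squared deviations of the observations from their mean. The following is a minimal sketch of that standard definition, not the evaluation code used in this study; the function name `nash_sutcliffe` and the four-step flow series are hypothetical and chosen only for illustration.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean_obs)^2)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("observed series has zero variance; efficiency is undefined")
    return 1.0 - residual_ss / total_ss


# Hypothetical series (made-up numbers): one peak surrounded by low flows.
observed = [1.0, 1.0, 10.0, 1.0]
# The simulation matches the peak but doubles every low flow.
simulated = [2.0, 2.0, 10.0, 2.0]

nse = nash_sutcliffe(observed, simulated)
print(f"NSE = {nse:.3f}")
```

In this made-up series the efficiency is about 0.95 even though every low flow is overestimated by 100%, because the squared residuals are dominated by the variance around the peak. This is one way to see why a single aggregate criterion may not reveal whether a calibration is balanced across peak flows, low flows, and volumes.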