A new learning algorithm for kernel-based topographic map formation is introduced. The kernel parameters are adjusted individually so as to maximize the joint entropy of the kernel outputs. This is done by maximizing the differential entropies of the individual kernel outputs while minimizing the map's output redundancy, which is caused by kernel overlap. The latter is achieved by minimizing the mutual information between the kernel outputs. As a kernel, the (radial) incomplete gamma distribution is taken since, for a gaussian input density, the differential entropy of the kernel output is then maximal. Since the theoretically optimal joint entropy performance can be derived for the case of nonoverlapping gaussian mixture densities, a new clustering algorithm is suggested that uses this optimum as its "null" distribution. Finally, it is shown that the learning algorithm is similar to one that performs stochastic gradient descent on the Kullback-Leibler divergence for a heteroskedastic gaussian mixture density model.
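
The claim that an incomplete gamma kernel maximizes the differential entropy of its output for a gaussian input can be checked numerically. The sketch below is illustrative only and rests on assumptions that the abstract does not state: the kernel is taken to be the complement of the regularized incomplete gamma function, Q(d/2, r^2 / (2 sigma^2)), with r the distance to the kernel center; only even dimensions d are covered, because Q then has a closed form; and the names `kernel_output`, `sample_outputs`, and `histogram_entropy` are hypothetical helpers. When the kernel radius matches the input standard deviation, the output should be close to uniform on [0, 1], which is the maximum-entropy density on a bounded interval (differential entropy 0).

```python
import math
import random

def kernel_output(x, center, sigma):
    """Assumed kernel form: Q(d/2, r^2 / (2 sigma^2)) for even dimension d."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("this closed form only covers even dimensions")
    k = d // 2
    r2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    z = r2 / (2.0 * sigma ** 2)
    # For integer k: Q(k, z) = exp(-z) * sum_{j=0}^{k-1} z^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= z / j
        total += term
    return math.exp(-z) * total

def sample_outputs(sigma_kernel, sigma_input, d=4, n=20000, seed=0):
    """Kernel outputs for n isotropic gaussian inputs centered on the kernel."""
    rng = random.Random(seed)
    center = [0.0] * d
    return [
        kernel_output([rng.gauss(0.0, sigma_input) for _ in range(d)],
                      center, sigma_kernel)
        for _ in range(n)
    ]

def histogram_entropy(samples, bins=20):
    """Histogram (plug-in) estimate of differential entropy on [0, 1], in nats."""
    counts = [0] * bins
    for s in samples:
        counts[min(int(s * bins), bins - 1)] += 1
    n = len(samples)
    h = 0.0
    for c in counts:
        if c:
            p = c / n
            h -= p * math.log(p * bins)  # density in the bin is p / (1 / bins)
    return h

if __name__ == "__main__":
    matched = sample_outputs(sigma_kernel=1.5, sigma_input=1.5)
    mismatched = sample_outputs(sigma_kernel=0.75, sigma_input=1.5)
    print("matched kernel entropy:   ", histogram_entropy(matched))
    print("mismatched kernel entropy:", histogram_entropy(mismatched))
```

In this sketch the matched kernel gives an entropy estimate near the upper bound of 0, while a kernel whose radius is too small concentrates its outputs near 0 and yields a clearly lower estimate. This is the property that makes the kernel output entropy a usable objective for adapting the kernel radius.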