Sparse coding schemes are employed by many sensory systems and implement efficient coding principles. Yet, the computations yielding sparse representations are often only partly understood. The early auditory system of the grasshopper produces a temporally and population-sparse representation of natural communication signals. To reveal the computations generating such a code, we estimated 1D and 2D linear-nonlinear models. We then used these models to examine the contribution of different model components to response sparseness. 2D models were better able to reproduce the sparseness measured in the system: while 1D models captured only 55% of the population sparseness at the network's output, 2D models accounted for 88% of it. Looking at the model structure, we could identify two types of computation that increase sparseness: first, a sensitivity to the derivative of the stimulus and, second, the combination of a fast, excitatory feature and a slow, suppressive feature. Both were implemented in different classes of cells and increased the specificity and diversity of responses. The two types produced more transient responses and thereby amplified temporal sparseness. Additionally, the second type of computation contributed to population sparseness by increasing the diversity of feature selectivity through a wide range of delays between the excitatory and the suppressive feature. Both kinds of computation can be implemented through spike-frequency adaptation or slow inhibition, mechanisms found in many systems. Our results from the auditory system of the grasshopper are thus likely to reflect general principles underlying the emergence of sparse representations.
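
The second type of computation, a fast excitatory feature combined with a slow suppressive one, can be illustrated with a toy simulation. The Python sketch below is not the model fitted in this study; everything in it is an assumption chosen for illustration: the exponential filter shapes and their time constants, the rectangular pulse stimulus, the reduction of the 2D nonlinearity to a weighted sum followed by half-wave rectification, and the use of a Treves-Rolls-style index as the sparseness measure. It compares a 1D model (excitatory filter only) with a 2D model (excitatory minus delayed suppressive filter) on the same stimulus.

```python
import numpy as np

def sparseness(r):
    """Treves-Rolls-style sparseness index of a non-negative response
    (assumed measure; 0 = uniform response, 1 = maximally sparse)."""
    r = np.asarray(r, dtype=float)
    sq_mean = np.mean(r ** 2)
    if sq_mean == 0:
        return 0.0
    a = np.mean(r) ** 2 / sq_mean
    return (1 - a) / (1 - 1 / r.size)

def exp_filter(tau, delay, length=200):
    """Causal exponential kernel with unit area, starting `delay` samples late
    (hypothetical filter shape)."""
    t = np.arange(length)
    k = np.where(t >= delay, np.exp(-(t - delay) / tau), 0.0)
    return k / k.sum()

def ln_response(stimulus, filters, weights):
    """Linear stage: weighted sum of causally filtered stimulus.
    Nonlinear stage: half-wave rectification (a simplifying assumption)."""
    drive = sum(w * np.convolve(stimulus, f)[: stimulus.size]
                for f, w in zip(filters, weights))
    return np.maximum(drive, 0.0)

# Toy stimulus envelope: a single rectangular pulse.
stimulus = np.zeros(1000)
stimulus[200:600] = 1.0

fast_exc = exp_filter(tau=5, delay=0)     # fast excitatory feature
slow_sup = exp_filter(tau=50, delay=10)   # slow, delayed suppressive feature

r_1d = ln_response(stimulus, [fast_exc], [1.0])
r_2d = ln_response(stimulus, [fast_exc, slow_sup], [1.0, -1.0])

s_1d = sparseness(r_1d)
s_2d = sparseness(r_2d)
print(f"temporal sparseness, 1D model: {s_1d:.2f}")
print(f"temporal sparseness, 2D model: {s_2d:.2f}")
```

In this sketch the 1D model responds for the whole duration of the pulse, whereas the 2D model responds only transiently at pulse onset because the slow suppression cancels the sustained excitation. Varying the hypothetical `delay` and `tau` of the suppressive filter changes which stimulus features drive the response, which is the sense in which a range of delays could diversify feature selectivity across a population.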