Advances in neural variational inference have facilitated the learning of power- ful directed graphical models with con- tinuous latent variables, such as varia- tional autoencoders. The hope is that such models will learn to represent rich, multi-modal latent factors in real-world data, such as natural language text. How- ever, current models often assume simplis- tic priors on the latent variables — such as the uni-modal Gaussian distribution — which are incapable of representing com- plex latent factors efficiently. To over- come this restriction, we propose the sim- ple, but highly flexible, piecewise constant distribution. This distribution has the ca- pacity to represent an exponential num- ber of modes of a latent target distribution, while remaining mathematically tractable. Our results demonstrate that incorporating this new latent distribution into different models yields substantial improvements in natural language processing tasks such as document modeling and natural language generation for dialogue.