Previous work on the acquisition of consonant clusters points to a tendency for word-final clusters to be acquired before word-initial clusters (Templin, 1957; Lleó & Prinz, 1996; Levelt, Schiller & Levelt, 2000). This paper evaluates possible structural, morphological, frequency-based, and articulatory explanations for this asymmetry using a picture identification task with 12 English-speaking two-year-olds. The results show that word-final stop+/s/ clusters and nasal+/z/ clusters were produced much more accurately than word-initial /s/+stop clusters and /s/+nasal clusters. Neither structural nor frequency factors can account for these findings. Further analysis of longitudinal spontaneous production data from two children aged 1;1–2;6 provides little support for the role of morphology in explaining these results. We argue that an articulatory account best explains the asymmetries in the production of word-initial and word-final clusters.