The syllable and phonotactics


Syllabification is to do with how we chop up words.  If we can't analyse the syllable, it becomes very difficult to deal with things like word and sentence stress and almost impossible to transcribe accurately what people say.
In addition to understanding phonemic analysis at the level of vowels and consonants, we need to know how these elements combine to make comprehensible speech rather than just a series of noises.

The second concern of this guide is to outline the nature of English phonotactics.  This refers to what is and is not allowable in English in terms of how syllables may be constructed.

For the purposes of the first part of this guide, we shall define a syllable as a unit of pronunciation having one vowel sound, with or without surrounding consonants.
As we shall see, there are exceptions.


Counting syllables

Most people are able to count how many syllables there are in an individual word, even if they would be hard put to define exactly what is meant by the term 'syllable'.  They do this, when asked, often by tapping out the numbers on a table or by humming the sound of the word.  Inadvertently, they are betraying just how crucial syllable structure is to phrasing and rhythm in the language.
Roach (2010: 56), however, notes:

As a matter of fact, if one tries the experiment of asking English speakers to count the syllables in, say, a recorded sentence, there is often a considerable amount of disagreement.

This is because, in more extended speech than a single word or short phrase, what is perceived is often at odds with what one knows about the sentence and lexical structures.  Even single words, providing they are sufficiently polysyllabic, may cause problems and counting the number of syllables in a word such as denationalisation requires a little concentration.  There are seven when the word is pronounced carefully and in isolation [/ˌdiː.ˌnæ.ʃə.nə.laɪ.ˈzeɪʃ.ən/] but there are likely to be fewer if the word appears in a string because the fifth, which is unstressed, is often elided.  Note, by the way, the convention of placing a '.' between syllables in transcriptions.
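Because the '.' convention marks every syllable boundary, counting syllables in a transcription can be mechanised.  The following is only a minimal sketch: the function name is invented for illustration and it assumes the transcription already carries the boundary dots.

```python
def count_syllables(transcription: str) -> int:
    """Count syllables in a transcription that uses '.' between syllables."""
    # Remove the enclosing slashes or brackets; stress marks sit inside
    # syllables and so do not change the count.
    body = transcription.strip("/[] ")
    return len([syllable for syllable in body.split(".") if syllable])


careful = "/ˌdiː.ˌnæ.ʃə.nə.laɪ.ˈzeɪʃ.ən/"
print(count_syllables(careful))  # 7
```

The sketch says nothing about where the boundaries should fall; it only counts the ones a transcriber has already marked.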


Syllable structures

Following Roach (op cit.) we will focus on four kinds of syllables and their natures:

  1. Minimal syllables
    These are the simplest (and quite rare).  They usually consist of a single vowel or diphthong.  Examples are:
    or transcribed as /ɔː/
    a transcribed as /eɪ/ or /ə/
    uh? often transcribed as /ə/ or /ʌ/
    sh! transcribable as /ʃ/
    A simple way of representing this structure is with V – the syllable consists only of a vowel – or C – the syllable consists only of a consonant.  One transcription of the second example above (a) would be represented as VV because it is formed from a diphthong.
  2. Onset-only syllables
    These have one or more consonants before the vowel.  Examples are:
    go /ɡəʊ/
    be /bi/
    say /ˈseɪ/
    try /ˈtraɪ/
    The consonant(s) before the vowel are referred to as the syllable Onset.
    A simple way of representing this structure is with CV – the syllable consists only of a consonant and a vowel (be).  The first example above (go) and the third (say) would be represented as CVV because they both have a diphthong following the consonant, and the last example above (try) would be represented as CCVV because it has two consonants forming the Onset and a diphthong following those.
  3. Coda-only syllables
    These have no Onset but have a consonant or consonants at the end (referred to as the Coda).  Examples are:
    eat /iːt/
    ask /ɑːsk/
    angst /æŋst/
    of /ɒv/
    A simple way of representing this structure is with VC – the syllable consists only of a vowel and a consonant.  The second example above (ask) would be represented as VCC and the third (angst) as VCCC.
  4. Syllables with Onset and Coda
    Examples are:
    but /bʌt/
    start /stɑːt/
    bus /bʌs/
    shot /ʃɒt/
    hurt /hɜːt/
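    A simple way of representing this structure is with CVC – the syllable consists of a consonant, a vowel and another consonant.  The second example above (start) would be represented as CCVC because two consonants form the Onset.  In rhotic accents, in which the /r/ is pronounced, start and hurt would also have two consonants in the Coda and be represented as CCVCC and CVCC respectively.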
    When an Onset or Coda contains more than one consonant, it is described as complex.

The system looks like this, taking the syllable shrubs [/ʃrʌbz/] as our example because it has both a complex consonant cluster onset (/ʃr/) and a similarly complex coda (/bz/):
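[Figure: syllable structure.  The syllable /ʃrʌbz/ divides into an Onset (/ʃr/) and a Rhyme; the Rhyme divides into a Nucleus (/ʌ/) and a Coda (/bz/).]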
The structure is, therefore, CCVCC.
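The C and V notation used above can be derived mechanically from a transcription.  The sketch below is illustrative only: the function name and the symbol sets are assumptions covering the RP-style symbols used in this guide, with a diphthong counted as VV, a long vowel as V and an affricate as a single C.  A vowel sequence that happens to look like a diphthong across an unmarked syllable boundary would be misread.

```python
# Assumed symbol inventories for RP-style transcriptions (not exhaustive).
DIPHTHONGS = ("eɪ", "aɪ", "ɔɪ", "əʊ", "aʊ", "ɪə", "eə", "ʊə")
AFFRICATES = ("tʃ", "dʒ")
VOWELS = set("iɪeæaɑɒɔʊuʌɜə")
# Slashes, brackets, stress marks, syllable dots, the length mark and the
# syllabic diacritic carry no C or V value of their own.
IGNORED = set("/[]ˈˌ.ː \u0329")


def cv_skeleton(transcription: str) -> str:
    """Return the C/V pattern of a transcription, e.g. 'CCVCC'."""
    pattern = []
    i = 0
    while i < len(transcription):
        pair = transcription[i:i + 2]
        char = transcription[i]
        if pair in DIPHTHONGS:
            pattern.append("VV")
            i += 2
        elif pair in AFFRICATES:
            pattern.append("C")
            i += 2
        elif char in IGNORED:
            i += 1
        elif char in VOWELS:
            pattern.append("V")
            i += 1
        else:
            pattern.append("C")
            i += 1
    return "".join(pattern)


for word in ("/ʃrʌbz/", "/ɡəʊ/", "/ˈtraɪ/", "/æŋst/", "/iːt/"):
    print(word, cv_skeleton(word))
```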

The Rhyme is so called because it is this part of the syllable which allows a poetic rhyme as in hatch, match and dispatch, for example.  You may see it spelled as Rime in US texts.
In some analyses, e.g. Roach (op cit.), the Nucleus is referred to as the Peak.  Here we are following Zec (in de Lacy (2007: 171)).
In some analyses, too, especially of languages other than English, the system is seen branching to the left rather than, as here, to the right so the Onset and Nucleus are considered together as the Body and the Coda stands to the right.
Other analysts, incidentally, doubt the whole existence of the syllable as a unit of analysis.



Phonotactics

The term phonotactics comes from the Greek and refers to the arrangement of sounds in a language.  In other words, it looks at what is possible in terms of the combinations of V and C.
The possible number of ways to arrange 24 consonants in two-sound combinations or clusters is 576 (24 × 24) but no language on earth will allow anything like that number of combinations.  If we consider three-sound clusters, the number of possibilities rises to 13,824 (24 × 24 × 24) but even the most liberal of languages in this respect, such as Russian, will not exhibit more than a tiny fraction of all the possible combinations.
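The arithmetic can be checked by enumerating the combinations.  The sketch below assumes the usual 24-consonant inventory for English; the list itself is not given in this guide, so treat it as an assumption.

```python
from itertools import product

# Assumed inventory of 24 English consonant phonemes.
CONSONANTS = [
    "p", "b", "t", "d", "k", "ɡ", "tʃ", "dʒ",
    "f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "h",
    "m", "n", "ŋ", "l", "r", "w", "j",
]

# Ordered arrangements, with repetition allowed, of two and three consonants.
two_sound = list(product(CONSONANTS, repeat=2))
three_sound = list(product(CONSONANTS, repeat=3))

print(len(two_sound))    # 576
print(len(three_sound))  # 13824
```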
For example, there are no English words in the Chambers 20th Century Dictionary which begin with shm.  There are, however, some which begin with schm such as schmelz, schmooze and schmuck (all of which are loan words from German or Yiddish and none of which cause English speakers any pronunciation trouble).  English football commentators had little difficulty pronouncing the name of the Danish goalkeeper Schmeichel.  Nevertheless, the combination of /ʃ/ and /m/ as the onset of a syllable is not allowed by the phonotactic rules of English and most native speakers would reject a word like shmig without a second thought.
There are some other general rules concerning what is and is not allowed in English:

  1. No word-initial syllable may begin with more than three consonants.
  2. Any syllable can begin or end with a vowel.
  3. No syllables can end with more than four consonants (and more than three is very rare).
  4. No syllable may start with /ŋ/ and /ʒ/ in that place is vanishingly rare.
  5. No word may end with a syllable ending in /h/, /w/ or /j/.

Initial consonants

There are also some useful generalisations we can make about initial-position consonant clusters (i.e., those consisting of more than a single consonant).

  1. The consonant /s/ is followed by a limited range of single consonants.  In these cases, the /s/ is referred to as pre-initial and what follows as the initial consonant.  The pre-initial /s/ may be followed only by:
    1. /p/ (as in spend [/spend/])
    2. /t/ (as in step [/step/])
    3. /k/ (as in skip [/skɪp/])
    4. /f/ (as in sphere [/sfɪə/])
    5. /m/ (as in small [/smɔːl/])
    6. /n/ (as in snip [/snɪp/])
    7. /l/ (as in slip [/slɪp/])
    8. /w/ (as in swing [/swɪŋ/])
    9. /j/ (as in suit [/sjuːt/])
      This is becoming rare in Modern English and normally transcribed as /suːt/ rather than /sjuːt/.

It follows, therefore, that combinations including the following twelve are not permitted in English: /sb/, /sd/, /sɡ/, /sθ/, /ss/, /sʃ/, /sh/, /sv/, /sð/, /sz/, /sʒ/ and /sŋ/.
The reasons for this lie not, as many imagine, in the difficulty of pronouncing the combinations but in the historical development of English.  There is no obvious reason why a person who can pronounce skin satisfactorily cannot pronounce sgin.  For example, in Italian /sf/ and /sɡ/ both occur and /sz/, /sd/ and /sb/ are common in many other languages.  The Swedish for Sweden is Sverige and pronouncing Sri Lanka is also not difficult for most English speakers.
In many analyses only /s/ + /p/, /t/ and /k/ are considered as having pre-initial /s/.  The other combinations are described as having /s/ followed by a post-initial consonant.  For teaching purposes, this doesn't matter, of course, but it should be noted that the combinations of /s/ + /p/, /t/ and /k/ are very common in English and forbidden in many languages.
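The list of excluded /s/ clusters is simply the complement of the nine permitted ones.  As a rough sketch, assuming a 24-consonant inventory (the inventory is an assumption, not something this guide sets out), the complement can be computed directly.  Under that assumption it has 15 members: the twelve listed above plus /sr/, /stʃ/ and /sdʒ/.

```python
# Assumed inventory of 24 English consonant phonemes.
CONSONANTS = [
    "p", "b", "t", "d", "k", "ɡ", "tʃ", "dʒ",
    "f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "h",
    "m", "n", "ŋ", "l", "r", "w", "j",
]

# The nine consonants which may follow pre-initial /s/, as listed above.
ALLOWED_AFTER_S = {"p", "t", "k", "f", "m", "n", "l", "w", "j"}

# Everything else in the inventory gives an excluded /s/ + consonant onset.
disallowed = ["s" + c for c in CONSONANTS if c not in ALLOWED_AFTER_S]

print(len(disallowed))
print(disallowed)
```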

  2. Initial three-consonant clusters with a pre-initial /s/ are also a restricted set.  We can only have:
    1. /sp/ + /l/ (as in splay [/spleɪ/])
    2. /sp/ + /r/ (as in spray [/spreɪ/])
    3. /sp/ + /j/ (as in spurious [/ˈspjʊə.rɪəs/])
    4. /st/ + /r/ (as in string [/strɪŋ/])
    5. /st/ + /j/ (as in, in some people's production, stew [/stjuː/] although others may use /stuː/)
    6. /sk/ + /l/ (as, just about only, in sclerotic [/sklɪə.ˈrɒ.tɪk/])
    7. /sk/ + /r/ (as in screech [/skriːtʃ/])
    8. /sk/ + /w/ (as in squeal [/skwiːl/])
    9. /sk/ + /j/ (as in, in some people's production, skew [/skjuː/] although others may use /skuː/)

Of these nine possibilities, /sk/ + /l/ is very rare and /st/ + /j/, /sp/ + /j/ and /sk/ + /j/ only occur regularly in some varieties, notably British English.
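One way to see how restricted this set is: the nine clusters are all of the form /sp/, /st/ or /sk/ followed by /l/, /r/, /w/ or /j/, which gives twelve candidates, of which three (/spw/, /stl/ and /stw/) do not appear in the list above.  The sketch below, with variable names invented for illustration, generates the candidates and removes the gaps.

```python
from itertools import product

FIRST_TWO = ["sp", "st", "sk"]
THIRD = ["l", "r", "w", "j"]

# Candidates not found in the list of nine above.
GAPS = {"spw", "stl", "stw"}

candidates = [pair + third for pair, third in product(FIRST_TWO, THIRD)]
attested = [cluster for cluster in candidates if cluster not in GAPS]

print(len(candidates))  # 12
print(attested)
```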

  3. The occurrence of post-initial consonants in English is also restricted.  These are the consonants which can occur in the second position and the ones in question are: /l/, /r/, /w/ and /j/.
    /l/ only occurs in combination with:
    1. /p/ (play [/pleɪ/])
    2. /k/ (clay [/kleɪ/])
    3. /b/ (black [/blæk/])
    4. /ɡ/ (glum [/ɡlʌm/])
    5. /f/ (flab [/flæb/])
    6. /s/ (slab [/slæb/])
    /r/ only occurs with:
    1. /p/ (pray [/preɪ/])
    2. /t/ (tray [/treɪ/])
    3. /k/ (cry [/kraɪ/])
    4. /b/ (break [/breɪk/])
    5. /d/ (drag [/dræɡ/])
    6. /ɡ/ (grid [/ɡrɪd/])
    7. /f/ (frog [/frɒɡ/])
    8. /θ/ (throw [/θrəʊ/])
    9. /ʃ/ (shrink [/ʃrɪŋk/])
    /w/ only occurs with:
    1. /t/ (twin [/twɪn/])
    2. /k/ (quick [/kwɪk/])
    3. /d/ (dwell [/dwel/])
    4. /θ/ (thwack [/θwæk/])
    5. /s/ (swipe [/swaɪp/])
    /j/ only occurs with:
    1. /p/ (pew [/pjuː/])
    2. /t/ (tube [/tjuːb/])
    3. /k/ (cute [/kjuːt/])
    4. /b/ (beauty [/ˈbjuː.ti/])
    5. /d/ (duty [/ˈdjuː.ti/])
    6. /f/ (future [/ˈfjuː.tʃər/])
    7. /s/ (suit [/sjuːt/])
    8. /h/ (huge [/hjuːdʒ/])
    9. /v/ (view [/vjuː/])
    10. /m/ (mew [/mjuː/])
    11. /n/ (new [/njuː/])
    12. /l/ (lewd [/ljuːd/])

Some of these combinations are quite rare (e.g., /θ/ + /w/, with only three entries in most dictionaries).
This means that there are, for example, 17 consonants which cannot, in English, immediately occur before /l/ in the initial position, 14 which cannot occur before /r/, 18 which cannot occur before /w/ and 11 which cannot occur before /j/ (and /sj/ is rare in some varieties).
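These figures can be reproduced by counting, for each post-initial consonant, the other 23 consonants which are not in its permitted list.  The sketch below again assumes a 24-consonant inventory, and the names are invented for illustration.

```python
# Assumed inventory of 24 English consonant phonemes.
CONSONANTS = [
    "p", "b", "t", "d", "k", "ɡ", "tʃ", "dʒ",
    "f", "v", "θ", "ð", "s", "z", "ʃ", "ʒ", "h",
    "m", "n", "ŋ", "l", "r", "w", "j",
]

# Consonants which may precede each post-initial consonant, as listed above.
ALLOWED_BEFORE = {
    "l": {"p", "k", "b", "ɡ", "f", "s"},
    "r": {"p", "t", "k", "b", "d", "ɡ", "f", "θ", "ʃ"},
    "w": {"t", "k", "d", "θ", "s"},
    "j": {"p", "t", "k", "b", "d", "f", "s", "h", "v", "m", "n", "l"},
}


def excluded_before(second: str) -> list[str]:
    """Consonants (other than `second` itself) which cannot precede it."""
    others = [c for c in CONSONANTS if c != second]
    return [c for c in others if c not in ALLOWED_BEFORE[second]]


for second in ("l", "r", "w", "j"):
    print(second, len(excluded_before(second)))
```

Note that the counts only match the figures in the text if each post-initial consonant is left out of its own tally, which is the assumption made here.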
Again, other combinations, which are not permitted for English syllables, are not unpronounceable.  English speakers have, for example, no trouble with the Welsh name Gwen and there is no obvious reason why those who can pronounce throw and through could not pronounce the words with a voiced initial consonant (/ðrəʊ/ instead of /θrəʊ/) but /ðr/ is simply not allowed in this place.
Most of the combinations forbidden in English appear in other languages.  German allows /pf/, French /vr/, Greek /ɡn/, /vl/ and /vɡ/, Czech /tk/ and so on.

Final consonants

We can also identify phonotactic rules for final consonants in English.  The end of many words in English can be described as a vowel plus a final consonant which is either preceded by another consonant (a pre-final consonant) or followed by another (a post-final consonant) to form a cluster.
For example, in forming plurals, verb inflexions such as past tenses and other structures, English has the final consonant followed by /s/ (as in lots [/lɒts/]), /z/ (as in lads [/lædz/]), /t/ (as in sacked [/sækt/]), /d/ (as in sagged [/sæɡd/]) or /θ/ (as in seventh [/ˈsevn̩θ/]).  In these cases, the /s/, /z/, /t/, /d/ and /θ/ are the only five allowable post-final consonants.
Pre-final consonants can be identified because they do not perform these grammatical functions and there is no overlap between the classes apart from the ubiquitous /s/ which appears post-finally in inflexions and pre-finally in other environments.

  1. All consonants can come in the final position of a word except /h/, /w/ and /j/.
  2. /r/ only comes in the final position in what are called rhotic accents in which the sound is always pronounced.  In RP or BBC English the word father is transcribed as /ˈfɑːð.ə/ but in AmE it is /ˈfɑːð.ər/, for example.
  3. There are five post-final consonants appearing in clusters.
    /s/, /z/, /t/, /d/ and /θ/ are the only ones which can follow the final consonant as in, e.g.:
    tasks, heads, packed, robbed, fifth (/tɑːsks/, /hedz/, /pækt/, /rɒbd/, /fɪfθ/)
    So, formulations such as pasm are not allowed although they are in many languages.
  4. There are five pre-final consonants appearing in clusters.
    /m/, /n/, /ŋ/, /l/ and /s/ are the only ones which can precede the final consonant.  For example:
    lump, bent, bank, belt, last (/lʌmp/, /bent/, /bæŋk/, /belt/, /lɑːst/)
    Similarly, formulations of sounds such as glanb, fodg etc. are not allowable as they may be in other languages.


Phonotactic rules and implications for teaching

You may well be wondering why a site mostly concerned with teaching and learning English takes the trouble to explain all this.
There are implications for teaching because phonotactic rules are extremely language specific and also not something of which learners of English (or their teachers) may even be aware.
What results is that learners will attempt, often, to apply the rules of their first language(s) to English and that may have unpredictable results (most of them errors).
Simply knowing what syllable structures and natures are allowed in English and what are rare or forbidden altogether will direct you to some useful areas of pronunciation practice but there's more to it than that.
Some knowledge, however incomplete, of what syllable structures are permitted and where they may fall in your learners' first language(s) will be helpful in planning what to teach, where to lay the focus and, just as importantly, what you can safely assume is familiar already.
The following are just a few examples with a summary at the end.

In Spanish:
clusters of /s/ plus a consonant are not permitted at the beginning of words.  They may occur elsewhere in words so the problem is not one of pronouncing the sounds but of assuming that English follows the same rules.
The result often is that Spanish speakers will insert a sound (usually /e/) before the consonant cluster and pronounce, for example, strike as estrike and school as eschool (/estraɪk/ and /eskuːl/).
In Arabic:
no consonant clusters appear at all, except in some rare loan words.  The tendency, therefore, for Arabic speakers to insert a vowel epenthetically into English words is understandable so a word like scratch may be pronounced as sekaratesh.
In Farsi (Persian):
initial consonant clusters do not occur and the same tendency will be apparent with e.g., spin rendered as sepin.
In French:
it is permissible to have initial consonant clusters of /blj/, /klj/ or /plj/ but they are disallowed in English.  Speakers of French may be tempted, therefore, to pronounce clue as /kljuː/ or blue as /bljuː/ and that contributes to a foreign accent, of course.
French also allows /vr/ as the onset of a syllable (as in vrai) and that is forbidden in English.  The result can be that /v/ in the initial position may actually be pronounced /vr/.
In Chinese:
Chinese only allows a very restricted set of clusters and, for example, all the ones in the word clusters (/kl/, /st/ and, in rhotic accents, /rz/) are forbidden.  The tendency, until speakers of Chinese languages have mastered consonant clusters in English, is to insert a vowel, often something like a schwa or /e/, between the elements of many clusters.  The result is that a word like screw may be rendered as sekeru.
Additionally, and the language has this in common with, e.g., Thai, there are no final consonants barring /ŋ/ in most dialects.  The result is often that speakers of these languages will simply fail to produce final consonants at all.  Final consonant clusters, which may, in English, be made up of up to four consonants, are even more problematic.
In Japanese:
no consonant clusters occur except (and in a restricted way) in the middle of words.  Loan words from English into Japanese are altered in their pronunciation to make them, to the uninitiated, unrecognisable.  The word beer, for example, gets an added vowel to conform to Japanese phonotactics and becomes biiru.  Similarly, story may be rendered as sutori and so on.
Even in the middle of words, where consonant clusters are permitted, Japanese phonotactic rules mean that any consonant following a nasal must be a voiced consonant so, e.g., /nd/, is allowed but /nt/ is forbidden and /mb/ is allowable but not /mp/.  This leads to multiple mispronunciations of words like untie, empty and so on.
In Italian:
it is not possible to have the consonant clusters of /pl/ or /kl/ and they do not occur in native Italian words.  Italian speakers may have difficulty pronouncing and remembering the spelling of words like comply or unclear.
Very few Italian words end with consonants (and they are often abbreviations or loan words from other languages).  Syllables with consonant-cluster codas will be very challenging for most Italian-speakers to produce and many will insert a vowel at the end (often a schwa) resulting in the recognisable Italian accent in English.
In German:
the initial consonant cluster /ʃl/ is permitted and, indeed, common (as in schlank) but it does not exist in English.  German speakers may produce it in English instead of /sl/.
German also allows /pf/ in the initial position and elsewhere (as in Pfad) but that is another forbidden combination in English in any syllable at all so its production leads to errors.
In Greek:
/pn/ and /ps/ are both allowed in the Onset to a syllable but English forbids them (and changes the pronunciation of, e.g., pneumatic and psychotic to conform).  Greek speakers may assume that words spelled with 'ps' or 'pn' at the beginning may be pronounced fully.  The same applies to words beginning 'gn' such as gnostic.
In Russian:
because of the gradual loss of short vowels, many initial clusters are permitted which in English are forbidden.  These include /pt/, /bd/, /tk/, /kt/ and /ɡd/, for example.  While this makes it challenging for English speakers to pronounce Russian, the reverse effect is that Russian speakers may deploy these, to them, common clusters in English words so, e.g., cab driver may be pronounced with a single central consonant cluster and a final /r/ (/ˈkæbd.raɪ.vər/) rather than the normal English pronunciation which is to assimilate the /b/ into /p/ and produce /kæp ˈdraɪ.və/ or simply to elide the /b/ altogether.
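To make one of these transfer effects concrete, the Spanish pattern described above (an /e/ inserted before a word-initial /s/ + consonant cluster) can be sketched as a simple rule.  The function name and the vowel set are assumptions made for illustration, and the rule is a deliberate simplification of what learners actually do.

```python
# Assumed set of vowel symbols used in the transcriptions in this guide.
VOWELS = set("iɪeæaɑɒɔʊuʌɜə")


def spanish_style_prothesis(transcription: str) -> str:
    """Prefix /e/ when a word begins with /s/ followed by a consonant."""
    body = transcription.strip("/")
    if len(body) > 1 and body[0] == "s" and body[1] not in VOWELS:
        return f"/e{body}/"
    return transcription


print(spanish_style_prothesis("/straɪk/"))  # /estraɪk/
print(spanish_style_prothesis("/skuːl/"))   # /eskuːl/
print(spanish_style_prothesis("/seɪ/"))     # /seɪ/ (no cluster, unchanged)
```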

So, in summary, the phonotactic rules in various languages have three major effects on the learners' production of English and result in an identifiable foreign accent which may betray the speaker's first language:

  1. Syllable shapes and clusters which are forbidden in the learners' first languages will be difficult to acquire in English and may be substituted by a shape and structure which is more familiar.
  2. Consonant and consonant-cluster codas (such as /kt/ and /dz/, in particular) which are not allowed in the learners' first languages may be ignored altogether.
  3. Syllable shapes and clusters which are forbidden in English but allowed in the learners' first languages may be inserted into English with a resulting foreign accent.  In extreme cases, learners may not be comprehensible at all.


Syllabic consonants

/ðə ˈbætl̩ əv ˌwɔː.tə.ˈluː/ The Battle of Waterloo

We started this guide by defining a syllable as a unit of pronunciation having one vowel sound, with or without surrounding consonants and noted there that there are exceptions.  Here they are.
If you look at the transcription above of The Battle of Waterloo, you can see a small vertical mark below the /l/ of /ˈbætl̩/.  This is the conventional way of denoting a syllabic consonant.  The reason is that, although in careful speech, the word battle might be transcribed as /ˈbæt.əl/ with a schwa between the /t/ and /l/ sounds, in rapid speech this is often lost and the /l/ stands alone as the syllable.
Some transcriptions might also raise the schwa to denote a similar effect: /ˈbæt.ᵊl/.
There are three consonants (ignoring /m/ and /ŋ/, which are marginal cases affected by features of connected speech) that can stand alone in normal speech as syllables in their own right although, because there is no discernible vowel in such syllables, they cannot be stressed.
Here's the list:

Syllabic /l̩/
This is the commonest syllabic consonant and occurs frequently.  Examples are:
                              word        transcription in isolation   in rapid speech
alveolar consonant + /l/      bottle      /ˈbɒt.əl/                    /ˈbɒt.l̩/
                              huddle      /ˈhʌd.əl/                    /ˈhʌd.l̩/
plosive consonants + /l/      couple      /ˈkʌp.əl/                    /ˈkʌp.l̩/
                              struggle    /ˈstrʌɡ.əl/                  /ˈstrʌɡ.l̩/
/n/ + /l/                     panel       /ˈpæn.əl/                    /ˈpæn.l̩/
Syllabic /n̩/
This commonly follows alveolar sounds and is rarer in other cases.  Examples are:
                              word        transcription in isolation   in rapid speech
alveolar plosive + /n/        threaten    /ˈθret.ən/                   /ˈθret.n̩/
                              hidden      /ˈhɪd.ən/                    /ˈhɪd.n̩/
alveolar fricative + /n/      hastening   /ˈheɪs.ən.ɪŋ/                /ˈheɪs.n̩.ɪŋ/
                              frozen      /ˈfrəʊ.zən/                  /ˈfrəʊ.zn̩/
other consonants + /n/        wagon       /ˈwæ.ɡən/                    /ˈwæ.ɡn̩/
Syllabic /r̩/
This is uncommon in many British varieties of English but occurs widely in others, including AmE and the accents spoken in Scotland, the South-West of Britain and East Anglia.  Examples are:
                              word        transcription in isolation   in rapid speech
AmE                           mother      /ˈmʌð.ər/                    /ˈmʌð.r̩/
                              pillar      /ˈpɪ.lər/                    /ˈpɪ.lr̩/
Rapid speech (all varieties)  preference  /ˈpre.fə.rəns/               /ˈpre.fr̩əns/
                              deference   /ˈde.fə.rəns/                /ˈde.fr̩əns/
Syllabic consonants are common in other languages, too, but may, as in German, be considered substandard or dialect forms.  Germanic, Scandinavian and Slavic languages all exhibit syllabic consonants with /l/, /n/ and /r/ and some will also use /m/ and other consonants this way.

