ELT Concourse teacher training

Connected speech


If anything in the first part of this guide is unfamiliar to you, you should probably take a little time to refresh your memory of the essential concepts in phonology, which are covered in a separate guide.  You should also have worked through the guide to consonants and the guide to vowels before tackling this.
It is also assumed, in what follows, that you can read and write phonemic transcription.


Isolated or in a stream?

Connected speech phenomena occur where words meet.  The first distinction to get clear is between the pronunciation of a word in isolation and its pronunciation in a stream of speech.  For example, if you read the words on this list aloud, one at a time, you will probably pronounce them in what is called their 'canonical', 'citation' or 'isolation' form.  Here's the list to try.  If you can, transcribe the words on a piece of paper as you pronounce them.

are been
have that
from and
ten bottles

Now memorise this sentence and then say it aloud at normal speed, contracting any words you can.

I have been to town and here are the ten bottles of beer I said that I would get from the shop.

That probably would have been pronounced something like this:

/aɪv bɪn tə taʊn ənd hɪər ə ðə tem ˈbɒt.l̩z əv bɪər ˈaɪ ˈseðət aɪd ˈɡet frəm ðə ʃɒp/

Look at how the words from the list appear in that transcription and compare them with the transcriptions of their isolated forms.  What do you notice?


The features of connected speech

There are six main areas to understand.

Weak forms
We saw examples of these in the sentence above.  The most common weak forms use the schwa (/ə/) so, for example, for is pronounced /fə/, are is pronounced /ə/, to is pronounced /tə/ and but is pronounced /bət/ (before a vowel) or /bə/ and so on.
There are other weakenings, such as the replacement of the /iː/ in been with the shortened /ɪ/ sound.  The word our in its full form is pronounced /ˈaʊə/ in isolation but is usually weakened to /ə/ or /ɑː/ in connected speech.
Most of these weak forms affect structural words rather than meaning-carrying words but the reduction of the sound at the end of father with the elision of the /r/ before a non-vocalic sound (in British English) is also an example of weakening and another feature of connected speech (elision).
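The weak forms mentioned above can be collected into a small lookup table.  What follows is only an illustrative sketch in Python: the table holds just the forms cited in this section, and the names WEAK_FORMS and weak_forms are hypothetical, chosen for this example.

```python
# Weak forms cited in this section only; this is not a complete list.
# The names WEAK_FORMS and weak_forms are illustrative assumptions.
WEAK_FORMS = {
    "for": ("fə",),
    "are": ("ə",),
    "to": ("tə",),
    "but": ("bət", "bə"),   # /bət/ before a vowel, otherwise /bə/
    "been": ("bɪn",),
    "our": ("ə", "ɑː"),
}


def weak_forms(word):
    """Return the weak forms listed for a word, or an empty tuple if none."""
    return WEAK_FORMS.get(word.lower(), ())


if __name__ == "__main__":
    for w in ("for", "are", "but", "bottles"):
        print(w, "->", weak_forms(w))
```

A meaning-carrying word such as bottles is not in the table, so the function returns an empty tuple for it.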
Assimilation
This occurs when a sound is altered because the speaker is anticipating the following sound or influenced by a previous one (or both).
There are three possibilities:
Regressive assimilation:
In our example, ten bottles sounds like tem bottles because the speaker is anticipating the bilabial voiced consonant /b/ and changes the alveolar nasal /n/ to the bilabial /m/ to make pronouncing the /b/ sound easier.
Try saying his son and his daughter quickly.  Note that it is pronounced like this: /hɪz sʌn ənd ɪs ˈdɔː.tə/.  The 's' in his daughter is not voiced as it is in his son.  (We also drop the 'h' of the second his, which is an example of elision.)
(Regressive assimilation is sometimes, slightly confusingly, called anticipatory assimilation.)
Progressive assimilation:
Sounds may change because the speaker is influenced by the preceding sound.  For example, try saying
    There's not much cider left
quickly and focus on how the 'c' in cider is pronounced.  If you say cider individually, the 'c' is pronounced /s/ as one expects (/ˈsaɪ.də/).
However, in this environment, the influence of the /tʃ/ at the end of much means that the 'c' in cider is pronounced as if it were 'sh', as /ʃ/.  The transcription of cider here is, then, not /ˈsaɪ.də/ but /ˈʃaɪ.də/.
A simple example of progressive assimilation occurs with the pronunciation of a plural 's' in English.  For example, words ending in unvoiced consonants such as /t/, /k/ or /p/ will cause the plural 's' to be pronounced as /s/:
    hats and coats (/hæts.ənd.kəʊts/)
    talks and walks (/tɔːks.ənd.wɔːks/)
    tops and tips (/tɒps.ənd.ˈtɪps/)
but words ending with voiced consonants such as /d/, /ɡ/ or /b/ will have the 's' pronounced as /z/:
    odds and sods (/ɒdz.ənd.sɒdz/)
    lugs and mugs (/lʌɡz.ənd.mʌɡz/)
    bags and logs (/bæɡz.ənd.lɒɡz/)
It's even easier to spot the difference in
    cats and dogs (/kæts.ənd.dɒɡz/)
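The plural 's' pattern just described can be expressed as a small rule.  This is a minimal sketch in Python that covers only the six final consonants named above; the function names plural_s and pluralise are hypothetical, and other final sounds are deliberately rejected rather than guessed at.

```python
# Final consonants named in the examples above; all other sounds are
# deliberately left out of this sketch.
UNVOICED_FINALS = {"t", "k", "p"}
VOICED_FINALS = {"d", "ɡ", "b"}   # "ɡ" is the IPA symbol used in the transcriptions


def plural_s(final_sound):
    """Return the pronunciation of plural 's' after the given final consonant."""
    if final_sound in UNVOICED_FINALS:
        return "s"
    if final_sound in VOICED_FINALS:
        return "z"
    raise ValueError(f"final sound {final_sound!r} is not covered by this sketch")


def pluralise(transcription):
    """Append the plural ending to a transcription written without slashes."""
    return transcription + plural_s(transcription[-1])


if __name__ == "__main__":
    for stem in ("kæt", "dɒɡ", "hæt", "mʌɡ"):
        print(stem, "->", pluralise(stem))
```

The ending simply copies the voicing of the sound before it, which is what makes this progressive rather than regressive assimilation.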
Reciprocal assimilation:
Here, sounds influence each other and may fuse together.  For example, try saying
    Won't you come with us
quickly and note how won't you is pronounced.  It is not /wəʊnt ju/ except in slow careful speech but is actually pronounced /wəʊntʃu/.  What has happened is that the 't' and 'y' sounds have coalesced to make the /tʃ/ sound.
(Reciprocal assimilation is sometimes called coalescent assimilation.)
Assimilation, by the way, also explains the tendency in English to mess with prefixes, using 'im-' before words beginning with bilabials (so we have impossible, impolite, immobile etc. rather than *inpossible, *inpolite, *inmobile).  On the other hand, words beginning with alveolar sounds such as /t/ or /d/ or velar sounds such as /k/ and /ɡ/ will normally take either un- or in- (so we have intolerant, undefined, unconnected, ungrateful etc.).  This is not an absolute rule, unfortunately, because exceptions such as unmoved and unpleasant are common.
There are lots of possible assimilation changes in English.
Assimilation happens like this (after Field, 2008:150):
Before these sounds   this sound   assimilates to   for example    transcription
/m/, /b/, /p/         /n/          /m/              then bake it   /ðem.beɪk.ɪt/
/m/, /b/, /p/         /n/          /m/              then put it    /ðem.ˈpʊt.ɪt/
/m/, /b/, /p/         /n/          /m/              then mix it    /ðe.mɪks.ɪt/
/m/, /b/, /p/         /t/          /p/ or /ʔ/       that mixture   /ðəʔ.ˈmɪks.tʃə/
/m/, /b/, /p/         /t/          /p/ or /ʔ/       that bread     /ðəp.bred/
/m/, /b/, /p/         /t/          /p/ or /ʔ/       that paper     /ðəʔ.ˈpeɪ.pə/
/m/, /b/, /p/         /d/          /b/ or /ʔ/       mad man        /mæʔ.mæn/
/m/, /b/, /p/         /d/          /b/ or /ʔ/       mad boy        /mæʔ.ˌbɔɪ/
/m/, /b/, /p/         /d/          /b/ or /ʔ/       mad policy     /mæb.ˈpɒ.lə.si/
/k/, /ɡ/              /n/          /ŋ/              bean cakes     /biːŋ.keɪks/
/k/, /ɡ/              /n/          /ŋ/              bean good      /biːŋ.ɡʊd/
/k/, /ɡ/              /t/          /k/ or /ʔ/       that cake      /ðəʔ.keɪk/
/k/, /ɡ/              /t/          /k/ or /ʔ/       but go         /bək.ɡəʊ/
/j/                   /t/          /tʃ/             might you      /maɪtʃu/
/j/                   /d/          /dʒ/             had you        /hədʒu/
/ʃ/                   /s/          /ʃ/              glass shop     /ˈɡlɑː.ʃɒp/
/ʃ/                   /z/          /ʃ/              has shut       /hæ.ʃʌt/
Catenation
This usually occurs when the consonant sound at the end of one word joins the vowel at the beginning of the next so we get, for example, an orange pronounced as a norange (/ə ˈnɒ.rɪndʒ/) and right arm becomes something like rye tarm (/raɪ tɑːm/).  Note, too, the way the pronunciation of the boys of Eton differs from the boys have eaten in rapid speech.
A by-product of catenation, incidentally, is the phenomenon variously known as false splitting, misdivision, false separation or coalescence, in which the boundary between two words is moved.  The word apron, for example, derives from the Old French naperon: a napron was falsely separated into the Modern English an apron.  There are other examples in the guide to word formation.
Juncture
This refers to boundaries between words and awareness of it allows us to distinguish between, for example:
    I scream
    ice cream
    my turn
    might earn
Usually, the distinction between these pairs is recognisable by either stress:
    /ˈaɪ.skriːm/ vs. /ˈaɪ.ˌskriːm/
or whether a consonant is aspirated:
    /maɪtʰɜːn/ vs. /maɪtɜːn/
or by noticing the syllabic structure:
    /maɪ.tɜːn/ vs. /maɪt.ɜːn/
In practice, the detail of how we identify the juncture between words is usually redundant because the context almost invariably makes the meaning clear.
Elision
A clear example of this is the tendency in English to use contracted forms, leaving out whole sections of words (hasn't, can't, wouldn't've etc.), but there are other examples such as:
    the loss of the /d/ in sandwich (/ˈsæn.wɪdʒ/)
    the pronunciation of library as /ˈlaɪ.bri/ or comfortable as /ˈkʌmf.təb.l̩/
    the dropping of /h/ sounds in rapid speech, as in give it to him rendered as /ɡɪv.ɪt.tu.ɪm/.
Essentially, three kinds of elision are recognised (as well as the initial /h/ elision):
Function word reduction occurs when all or part of a function word such as of is elided, as in cup of coffee being pronounced cuppa coffee (/kʌpə ˈkɒ.fi/).  In many cases the word and is reduced to 'n' as in tea 'n' cakes (/tiː n̩ keɪks/).
Polysyllabic word reduction occurs in our example of library as /ˈlaɪ.bri/ and also in many other longer words such as probably (/ˈprɒbli/), comfortable (/ˈkʌmf.təb.l̩/) etc.
Cluster reduction occurs when a consonant cluster, such as the one at the end of sixths, is simply difficult to pronounce.  The result is usually something like /sɪkθs/ or even /sɪkfs/.  Learners whose languages do not allow the same clusters as English are often tempted to use cluster reduction inappropriately, for example, pronouncing crisps as /krɪps/ rather than /krɪsps/.  For more see the guides to syllables and phonotactics and the guide to teaching troublesome sounds.
A word that causes persistent problems is clothes because learners feel they should have a go at the consonant cluster at the end (/kləʊðz/).  In rapid speech, however, the word is often pronounced /kləʊz/ with the elision of the /ð/.  If learners always say it that way, they will never be misunderstood and it's a good deal easier for them.
Intrusion
This is, in contrast, the addition of sounds in connected speech.  The three sounds usually intruded are /w/, /j/ and /r/.  Consider the pronunciation of these phrases and note the transcriptions.
an intrusive /w/:
    go on (/ɡəʊw ɒn/)
    hoe in (/həʊw.ɪn/)
an intrusive /j/:
    I ate it (/aɪj et ɪt/)
    fly it (/flaɪj.ɪt/)
an intrusive /r/:
    law and order (/ˈlɔːr ənd ˈɔː.də/)
    Victoria and Albert Museum (/vɪk.ˈtɔː.rɪər.ənd.ˈæl.bət.mjuː.ˈzɪəm/)
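The six examples above suggest which sound is intruded after which final vowel.  The following Python sketch is drawn only from those examples and is not a full statement of the rule; the names INTRUSIVE and link are hypothetical, and the function assumes (without checking) that the second word begins with a vowel sound.

```python
# Final vowel of the first word -> intrusive sound, taken only from the
# examples above (go on, hoe in, I ate it, fly it, law and order,
# Victoria and Albert). Not a complete account of intrusion.
INTRUSIVE = {
    "əʊ": "w",
    "aɪ": "j",
    "ɔː": "r",
    "ɪə": "r",
}


def link(first, second):
    """Join two transcriptions, adding an intrusive sound where one is listed.

    Assumes that `second` begins with a vowel sound; this is not checked.
    """
    for ending, sound in INTRUSIVE.items():
        if first.endswith(ending):
            return f"{first}{sound} {second}"
    return f"{first} {second}"


if __name__ == "__main__":
    print(link("ɡəʊ", "ɒn"))
    print(link("lɔː", "ənd"))
```

A first word that ends in a consonant, or in a vowel not listed, is joined to the next word without any added sound.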
Erroneous intrusion:
Learners whose languages do not have many (or any) consonant clusters are often tempted to intrude a vowel, often a /ə/, /ɪ/ or a /e/, between elements of a difficult cluster.
Many Arabic speakers, for example, may pronounce screwdriver as /ˈsekəruː.dəraɪ.vər/ rather than /ˈskruː.draɪ.və/, i.e. 6 instead of 3 syllables.  Japanese speakers may do likewise.
Speakers of many other languages will produce crisps as /krɪspəs/ or /krɪspes/ instead of /krɪsps/ and clothes, discussed above, is often produced as /kləʊðez/ or /kləʊðɪz/ instead of /kləʊðz/.


Field, J, 2008, Listening in the Language Classroom, Cambridge: Cambridge University Press
Roach, P, 2009, English Phonetics and Phonology: A practical course (4th edition), Cambridge: Cambridge University Press