Transcription: a teach-yourself guide


This guide also forms Strand 6 of the Teacher Development section.

This guide concerns transcription, not a description of the sounds of English.  For a description of how the sounds of English are made and what mouth parts do, see the in-service guides to pronunciation.  This is:

  • Firstly, a guide to teaching yourself to transcribe words in what are called their citation forms, i.e., the way they are pronounced when you ask someone to read a list
  • Secondly, a guide to how things are pronounced in normal but not very rapid or mumbled connected speech


The sounds transcribed here are those of an educated southern British-English speaker.  That is not intended to imply that the dialect is somehow better than others.  It is the conventional way to do these things.


The sounds of English: phonemes, allophones and minimal pairs

The first thing to be aware of is that we are talking about English sounds.  The study of language sounds (phonemic analysis) is language specific.  This mini-course is concerned with the transcription of English sounds.

In English the sounds /p/ and /b/ are phonemes because changing one to the other affects the meaning of a word (bat or pat).  This is called the Minimal Pair Test:
If you change a single sound in a word and make a new word, the sound you have changed is a phoneme in that language.
In other languages, Arabic, for example, these two sounds are not phonemes and changing one to the other will not change the meaning of a word (but it might sound odd).
Allophones are slightly different pronunciations of certain phonemes which do not affect the meaning of what is said (although it may sound odd).  We saw above that /p/ and /b/ are allophones in Arabic as are, incidentally, /f/ and /v/ in some varieties.  Changing one for the other does not affect the meaning of what you say.
All languages have a number of allophones.  For example, in English the sound /t/ can be pronounced with and without a following /h/ sound.  Compare the sounds in track and tack.  In English, these sounds are not phonemes because you can change /t/ to /tʰ/ without changing the meaning of a word.  In some languages, Mandarin, for example, /t/ and /tʰ/ are separate phonemes and swapping them around will change the meaning of what you say.  The same applies to /k/ vs. /kʰ/ (ski vs. cat) and /p/ vs. /pʰ/ (spin vs. pot).
The /l/ sound in English also has two allophones: the light [l], as in lap, and the dark version (which has the symbol [ɫ]), which occurs at the end of words like moveable.  The word lull has one of each, the light 'l' at the beginning and the dark 'l' at the end.
Minimal pairs:
Pairs of words which are distinguished only by a change in one phoneme are called minimal pairs.  For example, hit-hat, kick-sick, fit-bit, sheep-ship, jerk-dirk, hot-cot, love-live etc. are all distinguished in meaning by a single change to a vowel or a consonant.  That's in English, of course.  It bears repeating that what is an allophone in English may be a phoneme in other languages and vice versa.
Minimal pairs can also be distinguished by where the stress falls.  For example:
If you stress the word export on the first syllable, you are referring to the noun.  Stress the second syllable and you refer to the verb.
Stress the word convict on the first syllable and you refer to a resident of a prison.  Stress the second syllable and you refer to the act of finding someone guilty of an offence.
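The Minimal Pair Test described above can be sketched in a few lines of Python. This is only an illustration, not a linguistic tool: the function name is_minimal_pair is hypothetical, the words are supplied as hand-transcribed lists of phoneme symbols, and stress-based pairs such as export are not handled.

```python
def is_minimal_pair(first, second):
    """Return True if two phoneme sequences differ in exactly one position."""
    # Words of different lengths cannot differ by a single substitution.
    if len(first) != len(second):
        return False
    differences = sum(1 for a, b in zip(first, second) if a != b)
    return differences == 1


# Hand-transcribed phoneme lists (an assumption made for this sketch).
sheep = ["ʃ", "iː", "p"]
ship = ["ʃ", "ɪ", "p"]
hit = ["h", "ɪ", "t"]
hat = ["h", "æ", "t"]

print(is_minimal_pair(sheep, ship))  # True
print(is_minimal_pair(hit, hat))     # True
print(is_minimal_pair(sheep, hat))   # False
```

Keeping each phoneme as a separate list item avoids having to decide where one symbol ends and the next begins in sounds such as /iː/ or /tʃ/.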


English phonemes

Here's the list you'll learn.  If you want to download this chart as a PDF document to keep by you as reference, click here.

[Chart: English phonemes]



The consonants are the easiest so we can start there.  Notice that most of the symbols are the same as the written letters, but be aware that spelling in English is not a reliable guide to pronunciation.

To get started, take a piece of paper and transcribe the consonants in these words, using the right-hand side of the chart.
Click on the table when you have done that.

guide 2


Voicing describes how phonemes may be different depending on whether the vocal cords vibrate or not at the time of pronunciation.  For example, the /k/ sound is made without voicing but the /ɡ/ sound is made with the mouth parts in the same place but with voice added.  If you put your hand on your throat and say the words sue and zoo, you will see what is meant and feel a slight vibration on the second word (/s/ is unvoiced but /z/ is voiced).
Of the consonants, 16 form pairs of voiced-unvoiced sounds:

Unvoiced Voiced
/p/ /b/
/tʃ/ /dʒ/
/f/ /v/
/s/ /z/
/k/ /ɡ/
/t/ /d/
/θ/ /ð/
/ʃ/ /ʒ/

You have to listen out for voicing when you are transcribing because voiced and unvoiced consonants are full phonemes in English.  The words pit and bit, char and jar, fine and vine, sing and zing, Kate and gate, tuck and duck, teeth (plural noun) and teethe (verb), ruche and rouge are all minimal pairs in English (i.e., words distinguished by a single phoneme only).
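The table of voiced and unvoiced pairs above lends itself to a simple lookup. The following is a minimal sketch; the names VOICED_COUNTERPART, voicing and differ_only_in_voicing are invented for this illustration, and consonants outside the eight pairs are simply reported as unpaired.

```python
# Unvoiced consonant -> its voiced partner, copied from the table above.
VOICED_COUNTERPART = {
    "p": "b", "tʃ": "dʒ", "f": "v", "s": "z",
    "k": "ɡ", "t": "d", "θ": "ð", "ʃ": "ʒ",
}
# The reverse lookup: voiced consonant -> its unvoiced partner.
UNVOICED_COUNTERPART = {v: k for k, v in VOICED_COUNTERPART.items()}


def voicing(phoneme):
    """Classify a consonant as 'unvoiced', 'voiced' or 'unpaired'."""
    if phoneme in VOICED_COUNTERPART:
        return "unvoiced"
    if phoneme in UNVOICED_COUNTERPART:
        return "voiced"
    return "unpaired"


def differ_only_in_voicing(a, b):
    """Return True if the two consonants form one of the eight pairs."""
    return VOICED_COUNTERPART.get(a) == b or VOICED_COUNTERPART.get(b) == a


print(voicing("s"))                         # unvoiced
print(voicing("z"))                         # voiced
print(differ_only_in_voicing("tʃ", "dʒ"))   # True
print(differ_only_in_voicing("p", "d"))     # False
```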



Here's a list of the vowels in English (authorities may differ slightly about how many there are, incidentally).

/iː/ sleep     /æ/ sat       /ɪə/ here
/ɪ/ kid        /ʌ/ blood     /ʊə/ sure
/ʊ/ put        /ɑː/ part     /ɔɪ/ boy
/uː/ goose     /ɒ/ hot       /eə/ lair
/e/ Fred       /i/ happy     /eɪ/ lace
/ə/ about                    /aɪ/ price
/ɜː/ verse                   /əʊ/ boat
/ɔː/ fought                  /aʊ/ south

What do you notice about the difference between the first two columns and the third column?

pure vowels

If you haven't already done so, to do this exercise, you may want to download the chart as a PDF document so you can have it at your elbow.  Click here to do that.

Using the chart, transcribe the following words and then click on the table to check your answers.

test 1

You did, of course, get the difference right between nurse and noose, didn't you?  If you didn't get the final vowel of ago, that doesn't matter (yet).  It was the first one, the schwa, that was important.


There are 8 diphthongs and they are combinations of pure vowels which merge together.  We have, e.g., /ɪ/ + /ə/ (the sounds we know from bid and ago) following one another to produce /ɪə/ as in merely (mee-err-ly).  You can usually work out what the diphthong is by saying the word it contains very slowly and distinctly.

Using the chart, transcribe the following words and then click on the table to check your answers.

test 2

You have now transcribed words using all the vowels and most of the consonant sounds of English.  As a check of your knowledge, try the following.

Using the chart, transcribe the following words and then click on the table to check your answers.

test 3

Did you get it right?  One thing to notice is that in rapid connected speech, the transcription of come with me would probably be /kʌm wɪ miː/ without the /ð/ because we usually leave it out.  You may also, depending on how you say things, have had /iɡ's/ or even /ik's/ at the beginning of exactly.  That doesn't matter too much but note the convention for marking the stress on multisyllabic words: it's a ' inserted before the stressed syllable.
There is also the convention of putting a stop (.) between syllables (as in, e.g., sentence: /'sen.təns/).  Your students may not need that but many find it helpful.

marking stress

As we saw, the main stressed syllable is conventionally indicated by ' before the syllable (e.g., /'sɪl.əb.l̩/).
It is sometimes helpful to mark secondary stress in longer words like incontrovertible with a lowered symbol, like this: /ɪn.ˌkɒn.trə.'vɜː.təb.l̩/.  Here the small ˌ before the /k/ sound indicates that the second syllable carries secondary stress, while the main stress falls on the fourth syllable and is shown by the 'vɜː in the transcription.  Most learners find just one stressed syllable enough to cope with.
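The stress-marking conventions above (a ' before the main stressed syllable, a lowered ˌ before a secondary stress and a stop between syllables) are regular enough to be read mechanically. The sketch below assumes a transcription written exactly in that style; the function name describe_stress is hypothetical.

```python
def describe_stress(transcription):
    """Split a transcription at the syllable stops and label each syllable."""
    body = transcription.strip("/")
    labelled = []
    for syllable in body.split("."):
        if syllable.startswith("'"):
            labelled.append((syllable[1:], "primary"))
        elif syllable.startswith("ˌ"):
            labelled.append((syllable[1:], "secondary"))
        else:
            labelled.append((syllable, "unstressed"))
    return labelled


print(describe_stress("/'sen.təns/"))
# [('sen', 'primary'), ('təns', 'unstressed')]

for syllable, stress in describe_stress("/ɪn.ˌkɒn.trə.'vɜː.təb.l̩/"):
    print(syllable, stress)
```

The second call walks through incontrovertible and reports the second syllable as secondary and the fourth as primary.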


the schwa

The most common vowel in the spoken language has no letter to represent it.
It is, of course, the humble schwa.  If you teach no other phoneme symbol, teach this one.  Including it in your transcriptions is simply a matter of listening out for it and making sure that you aren't being influenced by the spelling of words.  You should also note that the schwa only occurs in unstressed syllables.  You can't stress the schwa.
The schwa may be how any of the traditionally spelled vowels are pronounced:

spelled vowel   example word   transcription
a               asleep         /ə.'sliːp/
e               different      /'dɪ.frənt/
i               definite       /'de.fɪ.nət/
o               prosody        /'prɒ.sə.di/
u               tedium         /'tiː.dɪəm/
ou              tedious        /'tiː.dɪəs/
io              nation         /'neɪʃ.ən/

The schwa also occurs routinely in function words like and, of, for, to etc. which can be transcribed as /ənd/, /əv/, /fə/, /tə/ etc. as that is how they are produced in connected speech.  This is called weakening.
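Weakening of function words can be pictured as a simple lookup. The sketch below uses only the four weak forms given above; the names WEAK_FORMS and mark_weak_forms are hypothetical, and the code ignores context (for example, whether the next word begins with a vowel), so treat it as an illustration rather than a transcription tool.

```python
# Weak forms taken from the paragraph above; a fuller list would be longer.
WEAK_FORMS = {"and": "ənd", "of": "əv", "for": "fə", "to": "tə"}


def mark_weak_forms(sentence):
    """Replace the listed function words with their weak-form transcriptions."""
    marked = []
    for word in sentence.lower().split():
        if word in WEAK_FORMS:
            marked.append("/" + WEAK_FORMS[word] + "/")
        else:
            marked.append(word)  # other words are left in ordinary spelling
    return " ".join(marked)


print(mark_weak_forms("fish and chips for tea"))
# fish /ənd/ chips /fə/ tea
```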

How many schwa sounds can you detect when you say and transcribe this sentence?  Click on the bar when you have an answer.

schwa test


connected speech

intrusive sounds

There are three sounds which speakers insert between vowels in connected speech.  They need to be included in your transcriptions.  They are:

intrusive /r/
Try saying law and order.  You will hear an /r/ sound like this: /lɔːr ənd 'ɔː.də/.  Now transcribe The media are and I saw uncle Fred and you'll get the same phenomenon (/ðə 'miː.dɪər ɑː/ and /'aɪ 'sɔːr 'ʌŋk.l̩ fred/).
intrusive /w/
Try saying I went to evening classes and note what happens between to and evening.  The transcription is: /'aɪ 'went tuw 'iːv.n.ɪŋ 'klɑː.sɪz/.  Try transcribing do it and do or die and you'll see the same effect (/duːw ɪt/ and /duːw ɔː daɪ/).
intrusive /j/
Try saying I agree.  You will hear a /j/ sound between the words.  The transcription is: /'aɪj ə.'ɡriː/.  This effect is common with words ending in 'y'.  Standing alone, the transcriptions of fly, lay and they are /flaɪ/, /leɪ/ and /'ðeɪ/ but in combination with following vowels we get the intrusion.  Now transcribe fly over, lay it down and they aren't and you will get /flaɪj 'əʊv.ə/, /leɪj ɪt 'daʊn/ and /'ðeɪj ɑːnt/.

You may see an intrusive sound put in superscript (ʳ ʷ ʲ) and that's a good way to draw your learners' attention to the sounds.  There is, however, a case to be made that you don't have to teach these at all because they are the inevitable effects of vowel-vowel combinations in speech.  They aren't, of course, only applicable to English.
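Because the three intrusive sounds depend on which vowel ends the first word, the pattern in the examples above can be sketched as a small rule table. This is a simplification under stated assumptions: the groupings of final vowels are inferred from the examples in this guide, the function name add_intrusive_sounds is hypothetical, and each word is supplied as a separate transcription without slashes.

```python
# Symbols that can begin a vowel sound in the transcriptions used here.
VOWEL_STARTS = set("iɪeæʌɑɒɔʊuəɜa")

# Final vowel of the first word -> sound inserted before a following vowel.
LINKS = (
    (("ɔː", "ɑː", "ɜː", "ə"), "r"),   # law and, media are
    (("uː", "ʊ", "u"), "w"),          # do it, to evening
    (("iː", "ɪ", "i"), "j"),          # I agree, lay it
)


def add_intrusive_sounds(words):
    """Join word transcriptions, inserting r, w or j between adjacent vowels."""
    result = []
    for index, word in enumerate(words):
        following = words[index + 1] if index + 1 < len(words) else ""
        # Ignore stress marks when checking how the next word begins.
        starts_with_vowel = following.lstrip("'ˌ")[:1] in VOWEL_STARTS
        if starts_with_vowel:
            for endings, link in LINKS:
                if word.endswith(endings):
                    word += link
                    break
        result.append(word)
    return " ".join(result)


print(add_intrusive_sounds(["lɔː", "ənd", "'ɔː.də"]))   # lɔːr ənd 'ɔː.də
print(add_intrusive_sounds(["duː", "ɪt"]))              # duːw ɪt
print(add_intrusive_sounds(["leɪ", "ɪt", "'daʊn"]))     # leɪj ɪt 'daʊn
```

The sketch makes no distinction between an /r/ that is present in the spelling and one that is not; it only looks at the sounds on either side of the word boundary.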

Try this mini-test.  As before, click on the table to get the answer.

intrusion test

the glottal stop

A glottal stop is formed by briefly blocking the airflow in the throat by closing the glottis (the gap between the vocal cords, hence the name).

In rapid speech a glottal stop is sometimes substituted for a consonant.  For example, the usual transcriptions for football and Batman are /'fʊt.bɔːl/ and /'bæt.mən/ but many people will pronounce them /'fʊʔ.bɔːl/ and /'bæʔ.mən/, using the stop, /ʔ/, instead of the /t/.
Try transcribing put on, pick up, hit him as they might sound in casual rapid speech and you'll get: /'pʊʔ ɒn/, /pɪʔ ʌp/ and /hɪʔ ɪm/ instead of the more careful forms of /'pʊt ɒn/, /pɪk ʌp/ and /hɪt ɪm/.
We can even have butter as /'bʌʔ.ə/ rather than /'bʌt.ə/, or hit him as /ɪʔ ɪm/ rather than /hɪt ɪm/, in some common dialects (London and Scots, for example).
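The commonest case above, a /t/ at the end of a syllable or word becoming a glottal stop, can be sketched as a single substitution. The function name glottalise is hypothetical, the sketch relies on the stop (.) being used to mark syllable boundaries, and it deliberately ignores other consonants such as the /k/ in pick up.

```python
import re


def glottalise(transcription):
    """Replace a syllable-final or word-final /t/ with a glottal stop."""
    # A /t/ counts as final when it is followed by a syllable stop, a space,
    # a closing slash or the end of the string.
    return re.sub(r"t(?=[.\s/]|$)", "ʔ", transcription)


print(glottalise("'fʊt.bɔːl"))   # 'fʊʔ.bɔːl
print(glottalise("'pʊt ɒn"))     # 'pʊʔ ɒn
print(glottalise("'bʌt.ə"))      # 'bʌʔ.ə
```

A /t/ at the start of a syllable, or one that is part of /tʃ/, is left alone because it is not followed by a boundary.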

/h/ dropping and /ŋ/ to /n/ conversion

Note, too, that dropping the /h/ on him is not always sloppy speech; it is very common and widely acceptable (but not in all dialects).
The /h/ in I have, when not contracted, is often replaced by an intrusive /j/ as in /'aɪj æv/ and this happens frequently elsewhere, too (they have, we have, e.g., rendered as /'ðeɪjəv/, /'wijæv/).  Notice, too, the tendency to pronounce have as /həv/ in they have but as /hæv/ in we have.
Hello is often pronounced /hə.'ləʊ/, sometimes /hæ.'ləʊ/, but often /ə.'ləʊ/ or /æ.'ləʊ/.  It may be safer to stick with /haɪ/.

Similarly, in many dialects the final /ŋ/ in words ending with -ing is often rendered as /n/ but this is generally considered low status.  We get, e.g., /'ɡəʊɪn 'aʊt/ instead of /'ɡəʊɪŋ 'aʊt/.  Oddly, some high-status British accents also make this conversion, exemplified by the so-called huntin', fishin' and shootin' set (the /'hʌnt.ɪn 'fɪʃ.ɪn ən 'ʃuːt.ɪn set/).


/niːd mɔː 'præk.tɪs/?

You can easily get as much practice as you like by opening a book at random, selecting some words and transcribing them.
You can then go online and check your answers.  A good source for that is PhoTransEdit.

Lastly, try transcribing this sentence and then check your answer here.

The pronunciation section of the in-service index on this site has separate guides to consonants, vowels, connected speech and intonation.

For a list of the commonest weak forms in English, click here.

If you are feeling strong enough, there are three tests here.