
Dealing with pronunciation: the essential guide


Especially at the beginning of their teaching careers, some people are reluctant to treat pronunciation very thoroughly in the classroom.  This is because:

  1. It is a technical area and that's slightly intimidating
  2. They don't know what to do apart from getting learners to repeat the teacher's model
  3. Teachers who are non-native speakers of English are worried that their own pronunciation is faulty

Here are some counter arguments:

  1. It is a technical area and that's slightly intimidating
    1. Yes, it is in parts but the basics of phonemic transcription can be learned in a few hours and the terminology needn't be used with learners.  It's the what of pronunciation which is important not the what's-it-called.
  2. They don't know what to do apart from getting learners to repeat the teacher's model
    1. There are some ideas on this page for getting beyond modelling
  3. Teachers who are non-native speakers of English are worried that their own pronunciation is faulty
    1. That's often true because it is very difficult, after a certain age, to learn native-like pronunciation but the teacher's production is still going to be better than most of the learners' efforts.  Teachers who are native speakers of the learners' first language are often better placed than most to know where the problems lie.

So, don't shy away from teaching pronunciation.  Most learners need explicit help, most learners know they need help and most learners appreciate their teacher's help in this area.

For this guide it will be useful if you have followed either the guide to essential phonology terminology or the guide to the essentials of pronunciation (preferably both).
If you have completed the course for learning to transcribe English phonemes, that's even better because this guide has to use some transcription.


Learn about your learners' language(s)

If you are a native or fluent speaker of your learners' first languages, you have an immediate advantage because you know where the issues lie.
Most learners will attempt to impose the phonology of their first language on the phonology of the language they are learning.  So, for example:







This list could be considerably extended so you will need to do your own research.  If you are not a native or fluent speaker of your learners' language(s), the simplest way is just to listen to them and make a note of the issues you hear.  There are useful reference books such as Swan and Smith's Learner English (2001, Cambridge University Press), and internet sources, if handled with care and scepticism, are often very helpful.
To help a little, here is a list of languages divided by the nature of the timing of syllables and sounds.  It is not complete and disguises the fact that languages exist on a cline from highly syllable-timed to highly stress-timed, with most occupying part of the middle ground:

ARABIC (with variations)
THAI (also tonal)
VIETNAMESE (also tonal)

You also need to be aware of when languages do not differ so you don't waste time teaching and practising the already known.
For example: Spanish shares 16 consonants with English, Greek even more, German somewhat fewer and Russian shares 17 vowel sounds with English, Thai 14 but Italian only 7.


Two views of teaching pronunciation

Among teachers of English, there are two views about how to handle pronunciation:

  1. As and when
    In this view pronunciation work is only undertaken when it arises naturally out of the lesson and the learners' production.  This approach:
    1. requires the teacher to be alert to and prepared for problems as they arise
    2. means that pronunciation teaching is often unplanned
    3. does not allow for specially designed materials
    4. may interfere with the timing and pacing of a lesson by inserting unplanned phases
    5. is instant and gratifying for learners
    6. can be short, sharp and to the point
    7. takes advantage of language needs as they emerge
  2. Planning a series of dedicated pronunciation lessons
    This approach:
    1. requires the teacher thoroughly to research the learners' likely problems bearing their first language(s) in mind
    2. allows for time to plan and design targeted materials
    3. is often appreciated by learners as a sign that the teacher takes the matter as seriously as they do
    4. can be effectively combined with the as-and-when approach by having revision materials and procedures to hand

There's no right answer.  What follows can be used with either approach.



There is a separate guide to drilling on this site.  This part only concerns drilling for pronunciation work.

There are conflicting theories concerning the usefulness of drilling learners.  The debate is between those who believe that learning a language is essentially a process of acquiring new habits and those who believe that learning involves a cognitive, thinking process.  The arguments include:

In favour | Against
Most learners like it | Some learners find it embarrassing
It's essential for pronunciation work | It makes no difference to learning
It makes production automatic | It's based on an outdated learning theory
Drills give learners confidence | Drills are meaningless and non-communicative
Drills provide valuable speaking practice | Drilling is boring



Before learners can employ a listen-and-imitate strategy, they need, of course, to hear an accessible, accurate and natural model.

This means that the salient feature you are trying to get learners to imitate is obvious to them.
If the pronunciation of a single phoneme is your target, that should be the model.  If, for example, the distinction between the short /ɪ/ sound as in pip and the longer /iː/ sound as in peep is the concern (as it often is), don't bury the sounds in long tongue twisters at the modelling stage.  Just form the two vowels clearly, avoiding other sounds such as the initial /pʰ/ and final /p/ in both words which may also be difficult for some learners.  Why not use deep and dip instead, if that's easier for your learners, or just model the vowel in isolation?
Make sure that the shape of your mouth and your lips are clearly visible to everyone.  With some sounds, /ʊ/, /uː/, /ɔː/ and /əʊ/ as in put, loose, caught and coat, respectively, it is important to get the mouth shape right to be able to make the sounds accurately.  That requires lip rounding.  The amount of rounding varies and this affects the sound.  Other vowels, such as /e/ and /iː/ as in bed and bead are not rounded but the latter requires lateral stretching of the mouth – say cheese.  This should be obvious to learners so drilling from an audio recording or even a video recording is often unwise because the learners cannot see the information they need.
In terms of drilling stress and, especially, intonation, some form of exaggeration is often required so that the learners can easily identify the pattern they are being asked to imitate.  This is particularly true for learners whose first languages exhibit a narrower pitch or stress range than does English (i.e., most of them).
Obviously, you have to be drilling the sound you want the learners to produce so if you have a regional accent or your first language does not have the sound in question, you'll need to practise a little.  We should be aiming at enabling our learners to pronounce the sounds in a regionally neutral manner.  There's nothing wrong with, e.g., having a Newcastle, Mississippi or Hong Kong accent but you need to consider whether that is the accent you want your learners to acquire (and whether they want it).
Be careful with stress and intonation in this, because it takes some practice to be able to model a stress pattern over a sentence consistently in the same way so that learners are not being given conflicting messages.  Audio taped models are useful in this respect because they are unchanging.
We need to distinguish here between the connected-speech version of a sound and its citation form (i.e., when it is pronounced in isolation, perhaps in a list).  Many words change their form in connected speech and, for example, while clothes is pronounced /kləʊðz/ in isolation, it will often be pronounced as close (/kləʊz/) in connected speech.
Weak forms are another obvious case.  For example, the word for will rarely be pronounced as /fɔː/ and almost never as /fɔːr/ but will usually simply be /fə/.
Even at lower levels, it pays to avoid losing naturalness for the sake of clarity so be unafraid to practise contractions, elisions and weak forms.

An effective model is sometimes a silent model.  Just mouthing sounds (vowels especially, and some consonants such as the unvoiced /θ/ in thank) can be as effective as saying them aloud because it allows the learners to focus on mouth shape and not worry about how they sound.  You can do it like this:

[Images of mouth shapes: 'eee' /iː/, 'aaah' /ɑː/, 'sh' /ʃ/]



Actors often learn their lines by breaking up the part and learning the last section first.  Dog trainers sometimes teach the final part of a command before the beginning and so do animal trainers in circuses.  The theory, such as it is, is that the learners focus on the form not the meaning of what they are doing.

The procedure is simple.  Instead of drilling the whole of a long word, phrase or sentence from the beginning, start at the end.  E.g.:

  1. Don't drill: inde > indepen > independent > independently
    Drill instead: ently > pendently > dependently > independently
  2. Don't drill: I would > I would love > I would love t' come
    Drill instead: t' come > love t' come > would love t' come > I would love t' come
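
The build-up from the end can be sketched in a few lines of Python, which may be handy for preparing drill prompts in advance.  This is only an illustration: the function name back_chain and the way the items are split into chunks are assumptions made here, not part of any established tool, and the chunking itself is still a decision for the teacher.

```python
from typing import List


def back_chain(chunks: List[str], sep: str = "") -> List[str]:
    """Return the drill steps, starting with the final chunk and
    adding one earlier chunk at each step until the item is whole."""
    return [sep.join(chunks[i:]) for i in range(len(chunks) - 1, -1, -1)]


# A long word, split into the chunks used in the example above.
word_steps = back_chain(["in", "de", "pend", "ently"])

# A phrase: chunks are joined with a space.
phrase_steps = back_chain(["I", "would", "love", "t' come"], sep=" ")

for step in word_steps + phrase_steps:
    print(step)
```

Each list runs from the shortest prompt to the complete item, so the teacher models each step in order and the learners repeat.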

Who to drill

Many teachers confine themselves to drilling the whole class together or drilling individuals only.
The problem with drilling the whole class, especially if it's large, is that a) you can't hear everyone and b) people don't start and finish together so you get an overlapping cacophony.
Here are some alternatives:


Cognitive approaches

Drilling has its critics because it is often seen as a behaviourist throwback to a time when we believed that learners acquired the targets by a process of imitation, repetition and the acquisition of good habits.  No longer.
If it is true that the language-learning process is one of forming and adjusting hypotheses based on the input you notice, then there is no obvious reason why this should not apply to the acquisition of the phonological system as much as it does to the grammar and lexical systems.

There are three possible approaches:


An inductive approach

You can, of course, expose your learners to rich examples of how the language is pronounced and assume that, being thinking animals, they will eventually work out how to form the sounds of the language acceptably.  In other words, this is a modelling approach with the drilling that follows it.
It might work.


A deductive approach

This requires a bit more work on the teacher's part.  A deductive approach to grammar, for example, involves providing the learners with the rules and then letting them loose on the data to form acceptable syntax.
A deductive approach to pronunciation involves telling learners explicitly how the sounds are formed and getting them to follow the rules to produce the sounds.
This is easier said than done because it requires some quite sophisticated understanding of mouth part positioning and, especially, tongue positioning for vowel sounds.
The usual way for all sounds is to use a diagram like this:
[Diagram: the vocal tract]

Sorting out the difference between voiced and unvoiced sounds is the easy part because you can get learners to place their hands on their throats to feel the vibrations that voiced sounds need.

Getting learners to distinguish between aspirated and non-aspirated sounds is also a simple matter of getting them to hold a small piece of paper in front of their mouths and try to make it move on /pʰ/, /kʰ/ and /tʰ/ but not on /p/, /k/ and /t/.

Moving on to something more difficult, it is then possible to explain to learners, for example, the nature of a labio-dental voiced fricative (/v/) by telling them to position their bottom lips to touch the top set of front teeth and blow air between them while at the same time using some vocal cord vibration.  The same can be done for a range of consonant sounds that cause problems.
It is easier, usually, for people to produce the unvoiced sound before adding in voice so get people to produce /f/ before /v/, /ʧ/ before /ʤ/ and so on (i.e., fan - van, chain - Jane etc.).

Vowels are more of a challenge but some understanding of the required tongue position in terms of height above the floor of the mouth and distance from the back of the mouth can actually be quite productive.  For that, you'll need a different diagram:

If you try saying beat, bit, bet, about, verse, cup, cap, cruise, foot, hot, fought, bark you will feel the tongue position change from left to right, top to bottom of the grid.  In your mouth, it'll move up and down and forward and back depending on the vowel.  If you can do this, so can your learners.
If you start with the extremes and distinguish between the sound /iː/ in, e.g., keys and the /ɔː/ sound in caught, for example, it becomes somewhat easier for learners to identify what they should be doing.
Once learners can do some of that, you can move on to lip rounding and vowel length in a similar way.
There are four defining characteristics of vowels: tongue position (forward or back), tongue height (up or down), length (long or short) and lip rounding (or stretching laterally).
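
For teachers who like to plan this kind of explanation systematically, the four characteristics can be written down as a small lookup table.  The sketch below is a rough illustration only: the coarse labels (front/back, high/mid, and so on) are a simplification assumed here rather than a full phonetic description, only a handful of vowels are included, and the names VOWELS and contrast are invented for the example.

```python
from typing import Dict, Tuple

# Coarse, simplified labels (an assumption for illustration only).
VOWELS: Dict[str, Dict[str, str]] = {
    "iː": {"position": "front", "height": "high", "length": "long", "lips": "stretched"},
    "ɪ": {"position": "front", "height": "high", "length": "short", "lips": "neutral"},
    "e": {"position": "front", "height": "mid", "length": "short", "lips": "neutral"},
    "uː": {"position": "back", "height": "high", "length": "long", "lips": "rounded"},
    "ʊ": {"position": "back", "height": "high", "length": "short", "lips": "rounded"},
    "ɔː": {"position": "back", "height": "mid", "length": "long", "lips": "rounded"},
}


def contrast(a: str, b: str) -> Dict[str, Tuple[str, str]]:
    """Return only the characteristics on which two vowels differ,
    as (value for a, value for b) pairs."""
    fa, fb = VOWELS[a], VOWELS[b]
    return {key: (fa[key], fb[key]) for key in fa if fa[key] != fb[key]}


# The two extremes mentioned above: keys versus caught.
extremes = contrast("iː", "ɔː")
# A closer pair: peep versus pip.
close_pair = contrast("iː", "ɪ")

print(extremes)
print(close_pair)
```

A pair that differs on several characteristics is a reasonable place to start; a pair that differs on only one or two shows exactly what needs to be explained to the learners.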


A noticing approach

Now that making an audio (or even video) recording is so much simpler in classrooms, it is possible to apply a consistent noticing approach to the pronunciation issues your learners face.
A recording of your model (or someone else's) which learners can compare to their own output is often useful to help them notice the gap between their own production and the model.

A noticing approach can be taken with all aspects of pronunciation from individual sounds, word stress, sentence stress, features of connected speech up to intonation patterns across longer utterances.  Be aware, however, that the longer the targets and the more data the learners have, the harder it is for them to identify the aspects you need them to notice so be careful to guide and help.

A noticing approach can, naturally, be combined with either a deductive or an inductive approach to pronunciation work.


Other guides

This is, of course, not the end of the story by a long way, but it is somewhere to begin.

There is a range of other guides on this site to various aspects of pronunciation that you may like to consider.

Related guides
In-service section (more technical): consonants; syllables and phonotactics; word stress; connected speech; teaching troublesome sounds
Initial-plus section (slightly easier): a word-stress exercise for learners; essentials of pronunciation
Whatever your background: phonology terminology; learn to transcribe

Some more references to help:
Brinton, D, Goodwin, JM & Celce-Murcia, M, 1996, Teaching Pronunciation: A Reference for Teachers of English to Speakers of Other Languages, Cambridge: Cambridge University Press
Kenworthy, J, 1987, Teaching English Pronunciation, Harlow: Longman
There is a comprehensive bibliography of other references available at:
http://liceu.uab.es/~joaquim/applied_linguistics/L2_phonetics/Corr_Fon_Bib.html#Specific_works_on_pronunciation_teaching [accessed April 2017]