Types of languages

Depending slightly on how you define a language, there are around 6000 in the world.  And they are all different.


Why is this important?

Knowing something about the languages your learners speak is helpful in a number of ways.
How many people speak it?

See if you can identify five of the world's languages spoken by the most people.
Other estimates will vary and these figures are not precise.  French and German appear in many lists of the top 10 and are certainly in the top 20.


Classifying languages

There are fewer ways to classify languages than there are languages but still rather a lot, unfortunately.
Because this site is concerned with English language teaching rather than theoretical linguistics, we will be taking a simple and rather loose approach.
Be assured that if you want to learn more, there is plenty of literature in the field of language typology to keep you amused.


Families of languages

The most obvious way to classify languages is to consider where they are spoken and how they relate to each other.
The results of such research are often interesting and colourful maps of the world or regions with languages shown in terms of their affinities and where they are spoken.  Here's one, for example:

[Image: language family map]

We can, in fact, take a broader brush and locate these major groups:

Amerindian languages
are indigenous to North and South America.
This is no longer considered a family of related languages but is the term used to describe all American languages, north and south, and includes, therefore, enormous variety.  Examples are Na-Dene, Chinook, Totonac and Yamana.
Hamito-Semitic languages
are indigenous to North and North-East Africa and the Middle East and this group is sometimes called Afro-Asiatic.  There are some 350 languages in this group, spoken by around 350 million people.  Examples are Hausa, Amharic, Berber, Arabic, Hebrew, Somali and Cushitic.
Niger-Congo languages
are indigenous to sub-Saharan and Central Africa.  There are around 1500 languages in this group, spoken by about 350 million people.  Examples are Igbo, Ewe, Swahili, Wolof and Ashuku.
Bantu languages
are indigenous to Southern Africa and this group is often considered a sub-group of the Niger-Congo languages (above).  Estimates of the number of Bantu languages and native speakers of them vary widely (between 200 and over 500 languages, spoken by around 50 million people).  Examples are Zulu, Xhosa, Ndebele and Makhuwa.
Indo-European languages
are indigenous to Europe, Northern Asia and Southern Asia.  Although there are only 420 or so languages in this group, Indo-European languages are spoken by around 3 billion people, making it the largest group of all – 6 of the top 10 languages listed above are Indo-European.  Examples are English, Hindi, Russian, Greek and Spanish.
Ural-Altaic languages
are indigenous to Central Asia.  This is a contested grouping but, if the group exists at all as a separate family, examples would include Finno-Ugric, Turkic, Mongolic, Tungusic and Caucasian.
Sino-Thai languages
are indigenous to China and South-East Asia.  This group is frequently referred to as Sino-Tibetan.  There are around 420 languages in this group, spoken by 1.2 billion people.  Examples include the Chinese languages, Vietnamese and Thai.
Dravidian languages
are indigenous to the southern tip of India and Sri Lanka but widely spoken further north.  There are only about 30 languages in this group but they are spoken by 200 or so million people.  Examples include Telugu, Tamil, Kannada and Malayalam.
Austronesian languages
are indigenous to the Pacific Islands, the archipelagoes between Asia and Australia and Madagascar.  There are around 1200 languages in this group, spoken by about 350 million people.  Examples include Malay (Indonesian and Malaysian), Javanese, and Filipino (Tagalog).
Papuan and Australian languages
are indigenous to Australia and Papua New Guinea and this group is sometimes called Trans-New Guinea.  There's a huge range here with 500 languages spoken by 3 million people.  Examples include Warlpiri, Melpa, Enga, Ekari and Tiwi.

Hidden in the categories above is an enormous range of language types and language groupings.  As a piece of information it may be interesting and, as a sign that you can identify your learners' first languages, it may be impressive but as an aid to teaching languages, it is less useful.
A better guide is, in fact, a map of where languages are now spoken rather than the indigenous languages categorised here.  If you take that approach, it is clear that in most of both North and South America, Indo-European languages are prevalent (English, French, Spanish and Portuguese).
You can, however, use language family information to make some guesses about possible learner difficulties.  If you know the problems faced by speakers of one language, then you can be fairly sure that speakers of closely related languages will have similar issues.
Sub-families (such as Slavic, Germanic, Turkic, Semitic and so on) are more helpful in this respect.


Family trees

A subset of mapping languages is a family tree approach akin to biological taxonomy.
In this approach, an early source language is identified (often called a proto-language) and the descent of a variety of languages from it is mapped onto a family tree.  It can be helpful and informative for teaching purposes because it clearly shows which languages are most closely related and will share similar structures and lexicons.
Here's an example of a commonly drawn family tree for some Indo-European languages; similar diagrams can be drawn for other language groups.  Languages in black are extinct and other extinct languages such as Anatolian, Cornish, Prussian and other West Baltic languages and Tocharian are excluded.  Goidelic and Brythonic refer to branches of Celtic, not actual languages.


There is a much more complete diagram in Wikipedia (go to https://en.wikipedia.org/wiki/Indo-European_languages for more).

Note that this diagram is not at all complete.  It excludes all the Slavic languages along with the extant Baltic languages (Latvian and Lithuanian).  There's another, slightly different tree in the guide to the roots of English.
It's fairly easy to draw some inferences from charts like this concerning language kinship and similarities.


Word ordering

Languages, arguably, all contain common elements such as subjects, verbs, adjectives and so on and one way of classifying a language is to see what its usual word order is.  There is a guide to word order on this site from which this table is taken.  That guide also considers the ordering of other elements so little more will be said here.

The man took the money:
    Chinese languages
    German (in both lists)
The man the money took:
    German (in both lists)
    Persian (Farsi)

Note that this table contains most of the world's most widely spoken languages and those spoken by most people (not quite the same thing).
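
The two orderings in the table headings can be sketched in a few lines of Python. This is only an illustration: the labels SVO and SOV are the usual shorthand for subject-verb-object and subject-object-verb order, and the names CLAUSE and order_clause are invented for this sketch.

```python
# The three elements of the table's example sentence, keyed by role.
CLAUSE = {"S": "the man", "V": "took", "O": "the money"}


def order_clause(clause, pattern):
    """Arrange subject, verb and object following a pattern such as 'SVO'."""
    if sorted(pattern) != ["O", "S", "V"]:
        raise ValueError("pattern must use each of S, V and O exactly once")
    return " ".join(clause[role] for role in pattern)


print(order_clause(CLAUSE, "SVO"))  # the man took the money
print(order_clause(CLAUSE, "SOV"))  # the man the money took
```

The sketch reorders whole elements only; it says nothing about the inflexions or particles that a real language might also need.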


Language structure

By far the most useful way to classify languages for teaching purposes is by their structures.  To do this we need to look briefly at the concept of the morpheme.

A morpheme is a meaningful unit of a language which cannot be further broken down.
For example, in the word washing there are two morphemes: wash and ing.  The first can stand alone and retain its meaning so is an unbound or free morpheme.  The second can only function as part of a word (here denoting the action of washing or the progressive nature of washing) so it is a bound morpheme.
Bound morphemes come in two sorts:

  • those that make things like past tenses, possessives and plurals are called inflexional morphemes.  An example is the -ed ending on a verb in the past tense in English.
  • those that concern deriving one word from another are called derivational morphemes.  Examples are deriving impolite from polite with the im morpheme or deriving hopeful from hope with the ful morpheme.

As a simple test, break the following sentence down into its morphemes.

John's closeness to his brothers appeared obvious.

Although there are a number of problems with morphemic analysis, it is central to the science of language typology, as we shall shortly see.

Because no languages fall neatly into any particular category, it makes some sense to arrange them on a cline from languages which use lots of morphemes to signal grammatical and derivational relationships to those which use none or very few.

To do this we can look at the number of morphemes per word.  In, for example, the English word house, we have a single stand-alone, free morpheme.  The ratio of morphemes to words is 1:1.
In the word housemates by contrast, we have three morphemes in one word (house, mate and s).  That makes the morpheme to word ratio 3:1.

In our example above (chosen to illustrate what morphemes are) we have 11 morphemes and 7 words, a ratio of roughly 1.6:1.
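
The arithmetic can be checked with a short Python sketch. It rests on some assumptions: the example sentence is segmented by hand (nothing here identifies morphemes automatically), the segmentation follows the count of 11 given above, and the names SENTENCE and morpheme_word_ratio are invented for illustration.

```python
# Hand-segmented version of the example sentence: each inner list holds
# the morphemes of one word. The segmentation is an analyst's judgement,
# not something the code works out for itself.
SENTENCE = [
    ["John", "'s"],
    ["close", "ness"],
    ["to"],
    ["his"],
    ["brother", "s"],
    ["appear", "ed"],
    ["obvious"],
]


def morpheme_word_ratio(words):
    """Return (morpheme count, word count, morphemes per word)."""
    morphemes = sum(len(word) for word in words)
    return morphemes, len(words), morphemes / len(words)


morphemes, words, ratio = morpheme_word_ratio(SENTENCE)
print(f"{morphemes} morphemes in {words} words: {ratio:.2f} per word")

# house is a single free morpheme; housemates is house + mate + s.
print(morpheme_word_ratio([["house"]])[2])
print(morpheme_word_ratio([["house", "mate", "s"]])[2])
```

Run on a longer hand-segmented sample of any language, the same function gives a rough idea of where that sample sits between the low-ratio and high-ratio ends.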

  • Languages which have a generally low ratio of morphemes to words tend to the isolating end of the spectrum.
  • Languages which have high numbers of morphemes per word tend to the synthetic end.
  • Languages which lie somewhere in the middle, using, as English does, a low number of inflexions but having a rich variety of derivational morphemes, are usually described as analytic.

Here's the picture:

[Image: morpheme ratios]


A little language lesson

  1. analytic and isolating languages
    use fewer or almost no bound morphemes to signal grammatical relationships, preferring whole words to do the job.
    For example, in English we can say
        I will see you tomorrow
    in which all the words are single free morphemes.  In French on the other hand, that would be
        je te verrai demain.
    in which the verb form includes the signal for 1st person singular and the sense of future, not signalled by another free morpheme such as will.  French is not a very analytical language and certainly not isolating.
    Some languages will not use grammatical, inflexional morphemes but are replete with derivational morphemes.  In Mandarin there are no morphemes to signal plurals, for example, but compounding of words is frequent.
  2. synthetic languages
    use affixation and inflexion to a great extent and have a high morpheme to word ratio.  To make this clear, a short language lesson may help.
    For example, in Greek a verb ending may signal person, gender and tense at the same time and in French the verb system exhibits similar characteristics.
    For example, the past tense of we arrive in English is simply we arrived with a single -d morpheme attached to show tense.  In French and Greek, the situation is different:
    French: one translation is nous arrivâmes where the verb form shows both tense and person.
    Greek: The pronoun marker will not usually be present because the verb form φτάσαμε [ftasamay] shows tense, person and number by its form.
    In German, Russian and other related languages noun and adjective endings as well as articles will signal case and number consistently.
    For example, in English, there is no change to the articles, verbs, nouns and adjectives in
        A: The children gave the teacher the red book
        B: The teacher gave the children the red book
    We can also say
        C: The red book is now on the shelf
    and the nouns, the adjective and the articles remain unchanged.
    However, in German, the sentences show changes to word forms to signal what was given by whom to whom and where things are:
        A: Die Kinder gaben dem Lehrer das rote Buch
    in which the plural form of the verb shows number, the article forms show case [nominative children, accusative book and dative teacher]
        B: Der Lehrer gab den Kindern das rote Buch
    where the article, der, shows a singular person who did the giving, the article, den, shows who received the book and that the recipients are plural and the -e ending on the adjective shows the singular nature of book [plural would be die roten Bücher with a change to the article, the adjective ending and two changes to the noun to show number].
    The third sentence appears as
        C: Das rote Buch ist jetzt auf dem Regal
    where the articles and the adjective are all inflected to show case.
    Incidentally, if the article is absent, its neuter -s ending is transferred to the adjective and we get rotes Buch.
    As with everything in this area, however, there are degrees of synthetic tendencies.
    1. Some languages also have agglutinative characteristics which means that ideas are joined together within words to make new meanings.
      For example, in German, we can have Hauptbahnhofgaststätte which breaks down roughly as main + line + yard + guest + place and means the main railway station restaurant.  Recently, an initiative was referred to in Germany as
      Das Bundeswehrattraktivitätssteigerungsgesetz
      which is federal + army + attractiveness + increase + law and means an attempt to increase the attractiveness of the armed forces after the ending of compulsory military service.  Turkish, Japanese and many other languages exhibit agglutination and it is arguable that English does, too, with concepts such as child protection agency being treated as single words for the purposes of stress (on the first element) and pluralisation (on the last).  It is merely a convention to write this as three words.
    2. At the extreme end of the synthetic spectrum we have languages described as polysynthetic in which long strings of morphemes occur, each signifying a single meaning with a mix of derivational and grammatical morphemes.  Few languages exhibit this to any extent but there are some, such as Mohawk where it is possible to produce a single word which can only be rendered as a sentence in an analytical language like English.  E.g.,
          He ruined her dress
          He made the-thing-that-one-puts-on-one's body ugly for her
      Grateful thanks to Wikipedia for that one.

We can now refine our diagram a little to show a more accurate relationship:

[Image: morpheme ratios 2]

Let's be quite clear: no language is of one type only.  English for example, can be

  • Isolating and analytic: The boy will arrive soon (with no inflexions or derived words)
  • Synthetic: The boys are arriving soonish (with two inflexional and one derivational morphemes)
  • Agglutinating: An ongoing washing-up argument

A note on terminology:
In the above, we have used a common set of terms but you may come across these:
Isolating and Analytic languages are sometimes called Root languages.
Polysynthetic languages are sometimes referred to as Incorporating languages.
Synthetic languages are sometimes called Inflecting or Fusional languages.


Tone, stress and timing

There are three areas to consider here and they all have immediate and obvious consequences for language teachers.

Tonal languages
All Chinese languages along with related languages in the Sino-Thai family (Burmese, Thai, Vietnamese etc.) exhibit tonal distinctions between words.
Many Chinese languages have four tones, and a word may change its meaning completely depending on which of the four it is pronounced with.
For example, the sound [ta] in Szechuanese pronounced with a falling tone means to answer but when it is pronounced with a level tone, it means to beat and when it is produced with a fall-rise tone, it means big.
A few African languages and some North American ones are also tonal and some European ones such as Swedish and Norwegian also make use of tone to distinguish meaning.
Pitch-accent languages
Japanese is often referred to as a pitch-accent language where one class of words consistently has a high pitch on the stressed syllable followed by a low pitch.
Stress distinctions
In some languages, the stress always falls on the same syllable (Hungarian and French for example).  Others, such as Greek and Russian, make use of stress changes to distinguish words.  For example, in Greek πότε [POtay] means when (as in a question) but when the stress moves to the end as in ποτέ [poTAY] it means never (capitals mark the stressed syllable here).  In Russian, and some other Slavic languages, similar things happen.
In English, too, stress changes sometimes denote changes in word class (exPORT [verb] vs. EXport [noun] and inCREASE [verb] vs. INcrease [noun], for example).
Stress and syllable timing
The following is the theory.
There are, it is claimed, two fundamental forms of stressing.
In some languages, such as French, Italian, Spanish, Cantonese and Mandarin, every syllable is perceived as taking up the same amount of time.  This is the so-called 'machine gun' sound of these languages.  So we get:
    I ... went ... to ... Lon ... don ... with ... my ... bro ... ther

That's syllable timing.
In other languages, notably English, Dutch, Farsi and Scandinavian languages, some syllables take longer to utter than others and this results in a reduction of the syllables in between.  So we get
    Iwentto ... L o n d'n ... withmy ... b r o the(r)
That's stress timing.
For this reason, the preposition to is not pronounced in its full form as /tuː/ (rhyming with 'two' and 'too') but with a weak form of the vowel /tə/.
Be aware that even if this distinction exists, it is not an either-or one.  Languages will vary along a cline from syllable- to stress-timing tendencies.
(There is, in fact, a third form of timing: Mora timing.  In Japanese, e.g., a vowel (V) takes the same time to utter as a consonant (C) plus a vowel so V takes the same time as CV and CVV takes twice as long as CV.)
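
The timing rule just described can be put into a small Python sketch. It is a simplification made for illustration: it models only the V, CV and CVV patterns mentioned above, the function name mora_count is invented, and real Japanese also gives some consonants (a syllable-final nasal, for example) a timing unit of their own, which the sketch ignores.

```python
def mora_count(skeleton):
    """Count timing units in a syllable skeleton written with C and V.

    Following the description above, each vowel slot counts as one unit
    and a consonant before a vowel adds no time of its own.
    """
    if set(skeleton) - {"C", "V"}:
        raise ValueError("skeleton may contain only 'C' and 'V'")
    return skeleton.count("V")


for skeleton in ("V", "CV", "CVV"):
    print(skeleton, mora_count(skeleton))
```

V and CV each come out as one unit and CVV as two, matching the claim that CVV takes twice as long as CV.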
Because English is a stress-timed language (allegedly), many vowels are reduced in rapid, connected speech so, e.g., for is pronounced /fə/ (not /fɔː/), been is heard as /bɪn/ (not /biːn/), we is heard as /wɪ/ and so on.
Here's a list but please remember that there is a cline from stress- to syllable-timed languages.  It is not an either-or distinction.
THAI (also tonal)
VIETNAMESE (also tonal)


Interesting (a bit) but all rather theoretical, isn't it?

You are right.  Some of the more abstruse and theoretical areas of language typology research and analysis are very theoretical and they should not overly concern us as language teachers.  We certainly don't need all the technical terminology.
We do, however, need two things:

  1. A grasp of the basic concepts (even if we don't need the words)
  2. An insight into our learners' languages so we can anticipate how they will react to learning English.

Ellis (1994: 302) presents evidence to show that around half of all errors made by learners can be traced to interference from the nature of the learners' first languages.  This is not a trivial matter.
Here are some examples of classroom implications.

Consider the issues in each situation below before reading the comments that follow it.

You have a class of German, French, Spanish, Mandarin and Vietnamese speakers and you want to introduce regular past-tense endings.
This is not a problem for those speakers of synthetic-tending languages because they have no conceptual difficulty with a grammatical inflexion.  They may be surprised that English only has one and be tempted to add -s for the third person, producing, e.g., She smokeds.
The speakers of Mandarin and Vietnamese will have real conceptual difficulties because their languages just don't do this kind of thing.  If there is already a past-time marker in the sentence such as last week then they will not see any reason to add the inflexion to the verb at all and will ignore it.
You have to present the language difference to them without an adverb for time.
With a similarly mixed group, you want to emphasise the importance of subject-verb-object structures even when a sentence is passive.
German (but not French or Spanish) has a way of marking the case of the noun by an ending and a change to the article so a German speaker may well not appreciate that word order is the only way to signal case in many English sentences.
The speakers of Mandarin and Vietnamese are conceptually prepared for word order to signal who did what to whom and have no problems.  They will be confused by the tense changes and inflexions (such as is to was and build to built) that go into making a passive-voice sentence, however.  These languages also do not signal passive voice by changes to word order, using an affix instead.
Your other students should have less trouble.
You want to teach post-modification with marker sentences such as:
The house being built on the corner
The person to see is the manager
In many languages people would prefer to pre-modify the noun and use something like The building on the corner house.
The non-finite verb forms will trouble speakers of isolating languages especially as they contain two separate morphemes (-ing and -t).  They will have less trouble with the infinitive form because it better reflects an analytic language.
Speakers of synthetic languages may choose to add an inflexion to the verbs to show more than aspect and tense.
You want to focus on word stress to help a range of learners to notice the way stress sometimes moves in photograph, photography and photographer etc.
A glance at the section above will alert you to the possible issues.
Learners whose first language has consistent and predictable stress will be confused and discouraged by the fact that stress in English is so unpredictable.
For others, this will be less challenging conceptually but will still be a source of error.
Learners from tonal-language backgrounds may likewise often be looking for changes in meaning rather than word class.
You want to focus on intonation and sentence stress.
Learners whose first languages are predominately syllable timed (whether they are tonal or not) will struggle to produce natural sentence stressing and timing and may have difficulty perceiving and producing weak forms.
This will clearly be much easier for learners whose first languages are more like English in this respect.

Understanding your learners' errors is another area where some knowledge of other people's languages is a boon.  For example:

  • Learners who ignore past-tense and plural markers are often from isolating language backgrounds and need specific teaching and noticing activities.
  • Learners who overuse inflexions and even add them to modal verbs often come from synthetic-language backgrounds and also need to be helped to see how analytical and even isolating English can be.
  • Learners who ignore word-order constraints in English often come from synthetic language backgrounds which are marked for case (such as Slavic languages and German).  They need help to see that word order is often the only way to know who did what to what in English.

Related guides
cognates and false friends: for consideration of the affinities between languages and their lexicons
mini-course on comparing languages: which also contains an example lesson concerning how to use some of this information
teaching index: many of the guides to teaching particular structures contain consideration of how other languages differ from English

