The phonetics and phonology of the English language differ from one dialect to another, usually without interfering with mutual communication. Phonological variation affects the inventory of phonemes (i.e. speech sounds that distinguish meaning), and phonetic variation consists in differences in pronunciation of the phonemes. This overview mainly describes the standard pronunciations of the United Kingdom and the United States: Received Pronunciation (RP) and General American (GA). (See § Dialects, accents, and varieties, below.)
The phonetic symbols used below are from the International Phonetic Alphabet (IPA).
Most English dialects share the same 24 consonant phonemes. The consonant inventory shown below is valid for California English, and for RP.
* Conventionally transcribed /r/.
In the table, when obstruents (stops, affricates, and fricatives) appear in pairs, such as /p b/, /tʃ dʒ/, and /s z/, the first is fortis (strong) and the second is lenis (weak). Fortis obstruents, such as /p tʃ s/, are pronounced with more muscular tension and breath force than lenis consonants, such as /b dʒ z/, and are always voiceless. Lenis consonants are partly voiced at the beginning and end of utterances, and fully voiced between vowels. Fortis stops such as /p/ have additional articulatory or acoustic features in most dialects: they are aspirated [pʰ] when they occur alone at the beginning of a stressed syllable, often unaspirated in other cases, and often unreleased [p̚] or pre-glottalised [ʔp] at the end of a syllable. In a single-syllable word, a vowel before a fortis stop is shortened: thus nip has a noticeably shorter vowel (phonetically, but not phonemically) than nib [nɪˑb̥] (see below).
- lenis stops: bin [b̥ɪˑn], about [əˈbaʊt], nib [nɪˑb̥]
- fortis stops: pin [pʰɪn]; spin [spɪn]; happy [ˈhæpi]; nip [nɪp̚] or [nɪʔp]
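The distribution of fortis stop allophones described above can be sketched as a small rule function. This is a simplified illustration, not a full phonological model: the function name and its parameters (`position`, `stressed`, `alone`) are hypothetical, the "often" tendencies in the description are treated as categorical, and the unreleased variant is chosen over the pre-glottalised one at the end of a syllable.

```python
ASPIRATION = "\u02b0"   # superscript h, as in [pʰ]
UNRELEASED = "\u031a"   # combining "no audible release" mark, as in [p̚]

def fortis_stop_allophone(stop: str, position: str,
                          stressed: bool = False, alone: bool = True) -> str:
    """Pick a surface form for a fortis stop such as /p/, /t/ or /k/.

    position: "onset" or "coda" (where the stop sits in its syllable).
    stressed: whether that syllable is stressed.
    alone:    False when the stop is part of a cluster, as in "spin".
    """
    if position == "coda":
        # Syllable-final: unreleased (pre-glottalised is the other option).
        return stop + UNRELEASED
    if position == "onset":
        if stressed and alone:
            return stop + ASPIRATION   # pin [pʰɪn]
        return stop                    # spin [spɪn], happy [ˈhæpi]
    raise ValueError("position must be 'onset' or 'coda'")

# The four example words from the list above.
for word, args in {
    "pin": ("p", "onset", True, True),
    "spin": ("p", "onset", True, False),
    "happy": ("p", "onset", False, True),
    "nip": ("p", "coda", True, True),
}.items():
    print(word, fortis_stop_allophone(*args))
```

The rule order matters: the coda case is checked first so that a stop at the end of a stressed syllable is never aspirated.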
In RP, the lateral approximant /l/ has two main allophones (pronunciation variants): the clear or plain [l], as in light, and the dark or velarised [ɫ], as in full. GA has dark l in most cases.
- clear l: RP light [laɪt]
- dark l: RP and GA full [fʊɫ], GA light [ɫaɪt]
All sonorants (liquids /l, r/ and nasals /m, n, ŋ/) devoice when following a voiceless obstruent, and they are syllabic when following a consonant at the end of a word.
- voiceless sonorants: clay [kl̥eɪ̯]; snow RP [sn̥əʊ̯], GA [sn̥oʊ̯]
- syllabic sonorants: paddle [ˈpad.l̩], button [ˈbʌt.n̩]
The pronunciation of vowels varies a great deal between dialects and is one of the most detectable aspects of a speaker's accent. The table below lists the vowel phonemes in Received Pronunciation (RP) and General American (GA), with examples of words in which they occur from lexical sets compiled by linguists. The vowels are represented with symbols from the International Phonetic Alphabet; those given for RP are standard in British dictionaries and other publications.
In RP, vowel length is phonemic; long vowels are marked with a triangular colon ⟨ː⟩ in the table above, such as the vowel of need [niːd] as opposed to bid [bɪd]. In GA, vowel length is non-distinctive.
In both RP and GA, vowels are phonetically shortened before fortis consonants in the same syllable, like /t tʃ f/, but not before lenis consonants like /d dʒ v/ or in open syllables: thus, the vowels of rich [rɪtʃ], neat [nit], and safe [seɪ̯f] are noticeably shorter than the vowels of ridge [rɪˑdʒ], need [niˑd], and save [seˑɪ̯v], and the vowel of light [laɪ̯t] is shorter than that of lie [laˑɪ̯]. Because lenis consonants are frequently voiceless at the end of a syllable, vowel length is an important cue as to whether the following consonant is lenis or fortis.
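The shortening rule can be written as a one-step check on the consonant that follows the vowel. The sketch below is illustrative and rests on stated assumptions: the helper name `nucleus_allophone` is hypothetical, the fortis set is filled in beyond the consonants named in the text, and unshortened vowels are marked with the half-length sign on the first vowel element, following transcriptions such as [seˑɪ̯v] and [laˑɪ̯].

```python
from typing import Optional

# Fortis obstruents that can close a syllable. The text names only some of
# these explicitly; the full set here is an assumption of this sketch.
FORTIS = {"p", "t", "k", "tʃ", "f", "θ", "s", "ʃ"}
HALF_LONG = "\u02d1"  # IPA half-length mark

def nucleus_allophone(nucleus: str, coda: Optional[str]) -> str:
    """Return the vowel, marked half-long unless a fortis consonant follows.

    coda is the consonant right after the vowel in the same syllable,
    or None for an open syllable.
    """
    if coda in FORTIS:
        return nucleus  # shortened before a fortis consonant: no length mark
    # Otherwise mark length on the first vowel element of the nucleus.
    return nucleus[0] + HALF_LONG + nucleus[1:]

# Word pairs from the paragraph above, as (vowel, following consonant).
pairs = {
    "rich": ("ɪ", "tʃ"), "ridge": ("ɪ", "dʒ"),
    "neat": ("i", "t"), "need": ("i", "d"),
    "light": ("aɪ̯", "t"), "lie": ("aɪ̯", None),
}
for word, (vowel, coda) in pairs.items():
    print(word, nucleus_allophone(vowel, coda))
```

Because the length difference is predictable from the following consonant, it is stored nowhere in the input: the function derives it, which mirrors the distinction between phonetic and phonemic length.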
The vowel /ə/ only occurs in unstressed syllables and is more open in quality in stem-final positions. Some dialects do not contrast /ɪ/ and /ə/ in unstressed positions, so that rabbit and abbot rhyme and Lenin and Lennon are homophonous, a dialect feature called weak vowel merger. GA /ɜr/ and /ər/ are realised as an r-coloured vowel [ɚ], as in further [ˈfɚðɚ] (phonemically /ˈfɜrðər/), which in RP is realised as [ˈfəːðə] (phonemically /ˈfɜːðə/).
An English syllable includes a syllable nucleus consisting of a vowel sound. Syllable onset and coda (start and end) are optional. A syllable can start with up to three consonant sounds, as in sprint /sprɪnt/, and end with up to four, as in texts /teksts/. This gives an English syllable the following structure, (CCC)V(CCCC) where C represents a consonant and V a vowel; the word strengths /strɛŋkθs/ is thus an example of the most complex syllable possible in English. The consonants that may appear together in onsets or codas are restricted, as is the order in which they may appear. Onsets can only have four types of consonant clusters: a stop and approximant, as in play; a voiceless fricative and approximant, as in fly or sly; s and a voiceless stop, as in stay; and s, a voiceless stop, and an approximant, as in string. Clusters of nasal and stop are only allowed in codas. Clusters of obstruents always agree in voicing, and clusters of sibilants and of plosives with the same point of articulation are prohibited. Furthermore, several consonants have limited distributions: /h/ can only occur in syllable-initial position, and /ŋ/ only in syllable-final position.
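The (CCC)V(CCCC) template can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: each syllable has already been reduced by hand to a consonant/vowel skeleton string (a hypothetical representation, with a diphthong counted as one V), and only the length limits on onset and coda are tested, not the restrictions on which consonants may combine or in what order.

```python
import re

# Template from the text: up to three onset consonants, exactly one vowel
# nucleus, and up to four coda consonants.
SYLLABLE_TEMPLATE = re.compile(r"C{0,3}VC{0,4}")

def fits_template(skeleton: str) -> bool:
    """Return True if a C/V skeleton matches (CCC)V(CCCC)."""
    return SYLLABLE_TEMPLATE.fullmatch(skeleton) is not None

# Skeletons written by hand for the words cited in the paragraph above.
examples = {
    "sprint": "CCCVCC",       # /sprɪnt/
    "texts": "CVCCCC",        # /teksts/
    "strengths": "CCCVCCCC",  # /strɛŋkθs/
}
for word, skeleton in examples.items():
    print(word, skeleton, fits_template(skeleton))
```

A skeleton with no vowel, or with four onset consonants, is rejected, while a bare V is accepted because both onset and coda are optional.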
Stress, rhythm and intonation
Stress plays an important role in English. Certain syllables are stressed, while others are unstressed. Stress is a combination of duration, intensity, vowel quality, and sometimes changes in pitch. Stressed syllables are pronounced longer and louder than unstressed syllables, and vowels in unstressed syllables are frequently reduced while vowels in stressed syllables are not. Some words, primarily short function words but also some modal verbs such as can, have weak and strong forms depending on whether they occur in stressed or non-stressed position within a sentence.
Stress in English is phonemic, and some pairs of words are distinguished by stress. For instance, the word contract is stressed on the first syllable (KON-trakt) when used as a noun, but on the last syllable (kən-TRAKT) for most meanings (for example, "reduce in size") when used as a verb. Here stress is connected to vowel reduction: in the noun "contract" the first syllable is stressed and has the unreduced vowel /ɒ/, but in the verb "contract" the first syllable is unstressed and its vowel is reduced to /ə/. Stress is also used to distinguish between words and phrases, so that a compound word receives a single stress unit, but the corresponding phrase has two: e.g. a burnout versus to burn out, and a hotdog versus a hot dog.
In terms of rhythm, English is generally described as a stress-timed language, meaning that the amount of time between stressed syllables tends to be equal. Stressed syllables are pronounced longer, but unstressed syllables (syllables between stresses) are shortened. Vowels in unstressed syllables are shortened as well, and vowel shortening causes changes in vowel quality: vowel reduction.
Varieties of English vary the most in pronunciation of vowels. The best known national varieties used as standards for education in non-English-speaking countries are British (BrE) and American (AmE). Countries such as Canada, Australia, Ireland, New Zealand and South Africa have their own standard varieties which are less often used as standards for education internationally. Some differences between the various dialects are shown in the table "Varieties of Standard English and their features".
English has undergone many historical sound changes, some of them affecting all varieties, and others affecting only a few. Most standard varieties are affected by the Great Vowel Shift, which changed the pronunciation of long vowels, but a few dialects have slightly different results. In North America, a number of chain shifts such as the Northern Cities Vowel Shift and Canadian Shift have produced very different vowel landscapes in some regional accents.
Some dialects have fewer or more consonant phonemes and phones than the standard varieties. Some conservative varieties like Scottish English have a voiceless [ʍ] sound in whine that contrasts with the voiced [w] in wine, but most other dialects pronounce both words with voiced [w], a dialect feature called wine–whine merger. The unvoiced velar fricative sound /x/ is found in Scottish English, which distinguishes loch /lɔx/ from lock /lɔk/. Accents like Cockney with "h-dropping" lack the glottal fricative /h/, and dialects with th-stopping and th-fronting like African American Vernacular and Estuary English do not have the dental fricatives /θ, ð/, but replace them with dental or alveolar stops /t, d/ or labiodental fricatives /f, v/. Other changes affecting the phonology of local varieties are processes such as yod-dropping, yod-coalescence, and reduction of consonant clusters.
General American and Received Pronunciation vary in their pronunciation of historical /r/ after a vowel at the end of a syllable (in the syllable coda). GA is a rhotic dialect, meaning that it pronounces /r/ at the end of a syllable, but RP is non-rhotic, meaning that it loses /r/ in that position. English dialects are classified as rhotic or non-rhotic depending on whether they elide /r/ like RP or keep it like GA.
There is complex dialectal variation in words with the open front and open back vowels /æ ɑː ɒ ɔː/. These four vowels are only distinguished in RP, Australia, New Zealand and South Africa. In GA, these vowels merge to three /æ ɑ ɔ/, and in Canadian English, they merge to two /æ ɑ/. In addition, the words that have each vowel vary by dialect. The table "Dialects and open vowels" shows this variation with lexical sets in which these sounds occur.