Temporal Analysis of the proximity of Scots to English
Can we use a spreadsheet of rhyming words as a corpus for analysis purposes?
A few months back a person on Twitter brought up a variation of the old trope that Scots used to be a very different language to English but now it’s really similar to English.
It’s quite a compelling argument, but it does depend on the arguer knowing what the Scots language looked or sounded like over a multi-century timespan and what English looked like over a comparable timespan.
Not many people have this sort of esoteric knowledge, perhaps some kind of academic literary historian who has a preference for comparisons rather than an appreciation of the language itself. Why would such a person be on Twitter arguing from a sectarian / political perspective?
And even if the knowledge was second-hand, taken from a book or academic paper, then surely that book or paper would exist so anyone else can see the workings.
Varieties of Scots and of English
We might consider varieties of Scots to come in two dimensions, namely regional (Ulster-Scots, Central Scots, Doric, Southern, Orkney, and Shetlandic varieties) and temporal (Early Scots, Early Middle Scots, Late Middle Scots, Modern Scots). There would inevitably be a matrix here where each regional variety has a corresponding temporal variety, such as Early Doric Scots, or Late Middle Shetlandic, but we will tie ourselves in knots trying to deal with each matrix variety separately.
Are we comparing to all of Modern Scots and to all of Modern English, Middle Scots and Middle English, or Early Scots and Old English? What about the Early or Late periods of the older languages or the regional dialects? Since everyone accepts that Early Scots arose from the Northumbrian dialect of Old English, at what point was any Scots temporal variety furthest away from any English temporal variety?
We’ve already discussed how when people write “in Scots” there is quite a cohesive idea of what the Scots language looks like and sounds like (with regional variations) and how it is quite distinct from English. And this self-recognition is quite a separate thing to merely using a few Scots words in speech or writing.
Now, my corpus of 21st century Scots Texts only covers the past 24 years and isn’t very much help for comparing linguistic change over a span of several centuries. Glasgow University’s Scottish Corpus does cover a few centuries, but the texts there aren’t easily marked-up as being Scots or Scottish English.
Over the past few months I’ve been compiling a Scots Language Rhyming Dictionary by going through all the Scots poetry I can get my hands on and recording the rhyming words in a big spreadsheet, with publishing date and author details. On my laptop this spreadsheet currently has about 20,000 rhyming word-pairs, spanning four centuries of Scots poetry and more than 300 different poets. It’s reasonably balanced by author, in that no single poet makes up more than 1.3% of the total. It’s very over-weighted toward the 21st century, but there’s about 1,500 lines from each of the 18th, 19th and 20th centuries. Is that enough?
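As a rough illustration of how a spreadsheet like this can be queried, here is a minimal Python sketch. The row layout (two rhyming words, a publication year, a poet) and the sample rows are hypothetical stand-ins, since the real spreadsheet's columns are not shown here. It counts the most common rhyming words in a date range and checks how much of the collection one poet accounts for.

```python
from collections import Counter

# Hypothetical rows: (rhyme_word_a, rhyme_word_b, year_published, poet).
# These are made-up examples, not rows from the real spreadsheet.
SAMPLE_PAIRS = [
    ("day", "awa", 1790, "Poet C"),
    ("hame", "name", 1895, "Poet A"),
    ("doon", "toon", 1921, "Poet B"),
    ("hame", "stane", 1930, "Poet B"),
    ("hame", "lane", 1950, "Poet B"),
]


def top_rhymes(pairs, start_year, end_year, n):
    """Return the n most common rhyming words published in [start_year, end_year)."""
    counts = Counter()
    for word_a, word_b, year, _poet in pairs:
        if start_year <= year < end_year:
            # Both halves of a rhyming pair count as rhyming words.
            counts.update([word_a.lower(), word_b.lower()])
    return [word for word, _count in counts.most_common(n)]


def largest_poet_share(pairs):
    """Return the fraction of all pairs contributed by the most prolific poet."""
    per_poet = Counter(row[3] for row in pairs)
    return max(per_poet.values()) / len(pairs)
```

Calling `top_rhymes(SAMPLE_PAIRS, 1900, 2000, 10)` would give the 20th century list for this toy data, and `largest_poet_share` is the sort of check behind a "no single poet makes up more than 1.3%" claim.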
I don’t have a corresponding corpus of English poetry but we can muddle through without one, relying on naivety and ignorance.
A note about rhyming words
Rhyming words used in poetry are a subset of the entire language. If we compare the top 100 most frequently used words and the top 100 most commonly used rhyming words we would find that they overlap by about 15%.
The most frequently used words “the”, “a”, “wis”, “of”, are rarely used as rhyming words. Whilst in theory poetry could be written that uses these words to rhyme at the ends of lines, it’s not very common compared to all the easier to reach rhyming words.
“stane” (stone), “knee”, “rain” and “hill” are very common rhyming words, but we might not expect them to be in the top 100 most frequently used words in normal day to day writing.
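The overlap figure is a simple set comparison. A minimal sketch, using tiny made-up lists rather than the real top 100s (putting “me” in the frequent-word list is purely for illustration):

```python
def overlap_percentage(frequent_words, rhyming_words):
    """Percentage of the frequent-word list that also appears in the rhyming-word list."""
    frequent = set(frequent_words)
    shared = frequent & set(rhyming_words)
    return 100 * len(shared) / len(frequent)


# Illustrative five-word lists only; the real comparison uses two top-100 lists.
frequent_sample = ["the", "a", "wis", "of", "me"]
rhyming_sample = ["stane", "knee", "rain", "hill", "me"]
print(overlap_percentage(frequent_sample, rhyming_sample))  # prints 20.0
```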
Scots poetry in the 18th century
If we look at the top 10 most common rhyming words in Scots poetry from the 18th century in my corpus of rhyming word pairs we have the following list:-
me, care, face, head, hill, gay, day, awa, green, hand
We can see that only one of these words is distinctly Scottish (awa); the other nine wouldn’t be out of place in English language poetry. We might score this sample as being 90% English.
If we instead look at the top 25 most common rhyming words in 18th century Scots poetry:-
me, care, face, head, hill, gay, day, awa, green, hand, see, dead, fair, dear, down, mair, weel, man, play, wrang, away, plaid, hear, bride, stane
We see that five of the words are distinctly Scottish (awa, mair, wrang, weel, stane), and we might score this sample as 80% English.
And looking at the top 100 most common rhyming words in 18th century Scots poetry, we would find that 25 are distinctly Scots (awa, mair, weel, wrang, stane, din, hame, sin, syne, gane, thegither, nane, sang, morn, cauld, fain, twa, gang, tether, thrang, skaith, lass, slee, greet, kend, e’en) and 75 are shared with English, a score of 75%.
We might argue that words like “plaid”, “sang”, “morn”, “gang” and “greet” are not distinctly Scots and are shared with English, but we wouldn’t expect them to be as popular in English poetry as they are in Scots, and the meanings are subtly different in Scots writing. “Greet” means “to cry”, “gang” means “to go” and so on.
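The Englishness score used throughout is just the share of a top-N list that is not marked as distinctly Scots. A minimal sketch of that arithmetic, using the 18th century top 25 and the five words judged distinctly Scottish above (the function name is made up for this example, and deciding which words count as Scots is still a manual judgement):

```python
# Top 25 rhyming words from 18th century Scots poetry, as listed above.
TOP_25_18TH_CENTURY = [
    "me", "care", "face", "head", "hill", "gay", "day", "awa", "green", "hand",
    "see", "dead", "fair", "dear", "down", "mair", "weel", "man", "play",
    "wrang", "away", "plaid", "hear", "bride", "stane",
]

# Words hand-picked as distinctly Scottish in the text above.
DISTINCTLY_SCOTS = {"awa", "mair", "wrang", "weel", "stane"}


def englishness_score(top_words, scots_words):
    """Percentage of top_words that are not in the distinctly Scots set."""
    shared_with_english = [w for w in top_words if w not in scots_words]
    return 100 * len(shared_with_english) / len(top_words)


print(englishness_score(TOP_25_18TH_CENTURY[:10], DISTINCTLY_SCOTS))  # prints 90.0
print(englishness_score(TOP_25_18TH_CENTURY, DISTINCTLY_SCOTS))       # prints 80.0
```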
19th Century Scots
Now, stepping forward to the 19th century, we repeat the exercise, parsing the top ten most common rhyming words for any that are distinctly Scots:-
me, see, awa, be, day, sea, hame, mair, there, again
Three out of the ten are distinctly Scottish, seven are shared with English, a score of 70%. Looking at the top 25:-
me, see, awa, be, day, sea, hame, mair, there, again, brae, door, man, care, doon, dee, men, sang, face, in, lane, time, seen, een, gane
Ten out of the twenty-five are distinctly Scots, an Englishness score of 60%. Here “lane” is used as “alone” in the poems. In the top 100 words, the following forty words are distinctly Scottish, again giving an Englishness score of 60%:-
awa, hame, mair, brae, doon, dee, sang, lane, een, gane, snaw, sair, nicht, croon, wa, glen, blaw, braw, sin, lea, ben, ken, thegither, mind, lang, thee, alane, ain, din, fa, e'e, weel, twa, cauld, trow, stane, wrang, mune, ee, himsel
Scots poetry in the 20th century
Stepping forward to the 20th century, the top ten words are as follows:-
me, day, hame, doon, awa, sea, nicht, mair, ken, heid
Now a spectacular seven out of ten are distinctly Scottish, so the sample is only 30% English. When we look at the top twenty-five:-
me, day, hame, doon, awa, sea, nicht, mair, ken, heid, be, hill, men, there, name, still, lang, see, green, sang, say, door, dee, stane, weel
Twelve rhyming words are distinctly Scottish, and thirteen are shared with English, so an Englishness score of 52%.
For the top 100, the following forty-one words are distinctly Scottish:-
hame, doon, awa, nicht, mair, ken, heid, lang, sang, dee, stane, weel, toon, licht, een, oot, lane, noo, deid, syne, ben, braes, bairn, snaw, gang, sair, byre, gie, wrang, mune, doot, sicht, doun, greet, ava, wark, abön, hert, tae, geen, lea
So our Englishness score for the top 100 is 59%, much the same as the 19th century figure. Is that significant, that the proportions are jumping around?
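One rough way to think about that question is to look at how coarse each score is. In a top 10, reclassifying a single word moves the score by ten points; in a top 100 it moves it by one. The sketch below also gives a textbook standard error for a proportion. This is only a yardstick and not a proper significance test, because it assumes each word in the list is an independent sample, which a ranked top-N list is not.

```python
import math


def points_per_word(list_size):
    """Percentage points the score moves if one word in the list is reclassified."""
    return 100 / list_size


def rough_standard_error(englishness_percent, list_size):
    """Standard error of a proportion, in percentage points.

    Assumption: treats the list as list_size independent samples.
    """
    p = englishness_percent / 100
    return 100 * math.sqrt(p * (1 - p) / list_size)


for size in (10, 25, 100):
    print(size, points_per_word(size))
```

On this rough yardstick the 20th century top 10 score of 30% carries an uncertainty of about 14 points, while the top 100 score of 59% carries about 5, which fits with the top 10 figures being the ones that jump around most.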
The 21st century
In my collection of rhyming words the 21st century is over-represented: even though we are less than a quarter of the way into it, about half of the rhymes listed are from the last twenty-four years. This means we are able to slice them into different decades.
Looking at the top ten rhyming words used between 2000 and 2010:-
day, me, by, doon, mair, hame, again, see, sicht, air
Four are distinctly Scots, 60% are English. Looking at the top twenty-five most common rhyming words from that decade:-
day, me, by, doon, mair, hame, again, see, sicht, air, man, be, in, sea, nicht, heid, awa, kent, say, tea, life, tree, seen, said, pain
Here eight are distinctly Scottish, 68% English.
When we look at the top 100 from 2000 to 2010, the following are distinctly Scottish:-
doon, mair, hame, sicht, nicht, heid, awa, kent, tae, sang, aa, noo, bricht, ee, ain, wa, toon, roun, wrang, lang, deid, sin, een, stan, ower, afore, hae, stane, hoose, richt, licht, broon, gane, ken, agaen, lane, oan, braw
Only thirty-eight words are distinctly Scottish, giving an Englishness score of 62%.
2010s
The top ten most frequently used rhyming words in Scots poetry between 2010 and 2020 are as follows:-
me, heid, oot, day, doon, mair, there, noo, face, een
Six of these words are distinctly Scottish, giving an Englishness score of 40%.
Looking at the top twenty-five words:-
me, heid, oot, day, doon, mair, there, noo, face, een, sicht, aa, roon, lair, hoose, richt, again, same, nicht, place, seen, lan, lang, shore, wean
Sixteen of them are distinctly Scottish, only nine are shared with English, so the Englishness score is 36%.
Looking at the top hundred most common rhyming words, the following are distinctly Scottish:-
heid, oot, doon, mair, noo, een, sicht, aa, roon, lair, hoose, richt, nicht, lan, lang, wean, awa, ben, toon, doot, ken, ticht, aboot, hame, lane, ain, braw, ee, strang, lee, sair, licht, brae, bricht, afore, deid, weel, greet, deil, auld, wrang, dee, yin, pairt, breid, syne, bairn, sang, ava, patter, snaw, claes
Out of the top one hundred most common rhyming words, fifty-two words are distinctly Scottish, so the Englishness score is 48%.
The contemporary 2020s
Looking at the last four years, from 2020 onwards, the top ten most common rhyming words in Scots poetry are as follows:-
day, ken, see, oot, hame, nicht, me, heid, again, doon
Six words are distinctly Scottish, giving an Englishness score of 40%.
From the top twenty-five words:-
day, ken, see, oot, hame, nicht, me, heid, again, doon, mair, toon, door, noo, roon, sicht, you, deid, richt, here, be, name, sea, year, best
Thirteen words are distinctly Scottish, twelve are shared with English, so the Englishness score is 48%.
Looking at the top hundred most common rhyming words, the following are distinctly Scottish:-
ken, oot, hame, nicht, heid, doon, mair, toon, noo, roon, sicht, deid, richt, doot, stane, licht, wean, aw, haun, sark, een, awa, gaun, wrang, hert, claes, watter, ava, cloot, aboot, hoose, bricht, dae, sair, roond, thegither, ee, ye, thocht, bide, ain, flair, oan
Forty-three are distinctly Scottish, fifty-seven are shared with English, giving an Englishness score of 57%.
Data processing
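This survey of the rhyming words gives us the following table of datapoints, showing the Englishness score for each period:-
Period | Top 10 | Top 25 | Top 100
18th century | 90% | 80% | 75%
19th century | 70% | 60% | 60%
20th century | 30% | 52% | 59%
2000 to 2010 | 60% | 68% | 62%
2010 to 2020 | 40% | 36% | 48%
2020 onwards | 40% | 48% | 57%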
We can sling them all onto a graph, please excuse the X-axis:-
The three lines, showing top 10, top 25 and top 100, behave in a similar manner: a high Englishness of around 80% between 1700 and 1799, and a lower Englishness of around 50% in the years since 2000. The blue line for the Top 10 words jumps around a lot compared to the Top 25 and Top 100. If we ignore the Top 10 line then the same trend is observable.
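The “around 80%” and “around 50%” figures can be checked by averaging the scores quoted in the earlier sections. A minimal sketch, with the period labels and function name chosen just for this example:

```python
from statistics import mean

# Englishness scores (%) quoted in the sections above: (top 10, top 25, top 100).
SCORES = {
    "18th century": (90, 80, 75),
    "19th century": (70, 60, 60),
    "20th century": (30, 52, 59),
    "2000-2010": (60, 68, 62),
    "2010-2020": (40, 36, 48),
    "2020s": (40, 48, 57),
}


def average_score(periods):
    """Average every score (all three list sizes) across the given periods."""
    return mean(score for period in periods for score in SCORES[period])


print(round(average_score(["18th century"]), 1))              # prints 81.7
print(average_score(["2000-2010", "2010-2020", "2020s"]))     # prints 51
```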
We might believe that this shows undeniably that the Scots language, with respect to rhyming words, has become less similar to English over the last three centuries, and our Twitter commentator at the start of the article was mistaken.
Conclusions
However, I reckon we might be mistaken here. If we chop up the graph, ignoring the first datapoints covering 1700 to 1799, and ignoring the Top 10 words, because they are so volatile:-
The orange line, representing the top 100 most common rhyming words, is flat; it’s around 60% and has been stable for the last 220 years.
I reckon that, framed like this, the Modern Scots language as used by poets between 1800 and 2023 is stable and has a consistent level of similarity with English, with no drift away or closer.
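For anyone wanting to check how flat the top 100 line is from 1800 onwards, here is a minimal sketch that summarises the five scores quoted in the sections above (the period labels and function name are chosen just for this example):

```python
# Top 100 Englishness scores (%) quoted in the sections above, 1800 onwards.
TOP_100_SINCE_1800 = {
    "19th century": 60,
    "20th century": 59,
    "2000-2010": 62,
    "2010-2020": 48,
    "2020s": 57,
}


def summarise(scores):
    """Return (mean, lowest, highest) for a mapping of period -> score."""
    values = list(scores.values())
    return sum(values) / len(values), min(values), max(values)


average, lowest, highest = summarise(TOP_100_SINCE_1800)
print(f"mean {average:.1f}%, range {lowest}% to {highest}%")
# prints "mean 57.2%, range 48% to 62%"
```

Four of the five values sit within three points of 60%; the 2010s figure of 48% is the one that sits noticeably below the rest.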
We might speculate that those earlier 18th century poems, from Burns and Fergusson and their contemporaries, were carefully crafted to appeal to English speakers and so less Scottish than they might otherwise be, and that in later centuries a more confident cohort of poets were more comfortable writing in their native Scots tongue.





