New thoughts on standardisation
In which I think about standards and standardisation, conceptualise the publishing industry and produce some graphs that require a huge amount of linguistic data to prove there is a standard.
The other week there was an event about Scots at the Royal Society of Edinburgh: The Haill Clanjamfrie. I was able to watch it broadcast live on YouTube, and the video might still be there.
Professor Jennifer Smith of Glasgow University explores Scots as a powerful tool for creative expression and social commentary, joined by writers Len Pennie, Chris McQueer, and Michael Pedersen.
Most Scots writers
In my corpus of 21st century Scots texts, I have works by over 600 contemporary Scots writers. The three at the Royal Society event aren’t merely representatives of the field of Scots writers, they are particularly successful writers who have sold many books.
I need to state clearly at this point, this here is my Substack channel where I write and publish, and I am my own publisher and editor. I don’t have a huge audience, a few dozen people at most who might be interested in a niche topic that doesn’t otherwise have mainstream appeal. I am not out to sell subscriptions or make money. Writing these articles is an exercise to gather my thoughts, and other people might just be interested in reading them.
I also need to state that I am not a writer (I am a so-called Scots language activist, a bad linguist, and at best a thief according to some sources, although I did have a piece published in The Herald newspaper last year). I’m not competing to be a more successful writer than Len Pennie, Chris McQueer or Michael Pedersen. My book sales are not in the same league, my collection of short stories has not been published yet and is inherently barred from mainstream success.
So, if any of this article appears to be criticism, it is punching up at my betters, rather than punching down.
Most Scots writers are not very successful. Most of those six hundred or so in the corpus have poetry published in other people’s collections, short stories in Lallans magazine, or small print runs of a few hundred copies. I don’t doubt that many of them have boxes of unsold books stashed away in lofts, I certainly do. Across the whole publishing industry the average number of book sales per title is around twelve individual copies per year, so any book that sells hundreds is well above average.
Those writers who manage to sell thousands of copies, or tens of thousands make up a tiny proportion of the total published writing community. They are not normal or representative.
During the Haill Clanjamfrie discussion, the issue of standardisation cropped up a couple of times. The writers on the panel were opposed to the concept, calling it an “imaginary orthography” or saying that “Scots doesn’t have a standardised orthography”.
Standards and standardisation
These are two separate concepts.
We might imagine that standardisation is a process by which many individual writers change the way they write, change the spellings they use, aligning themselves to a single consistent form, a “standard”.
This standardisation process might be the action of a generation of school teachers compelling their children to use the same spellings as everyone else. It might be writers voluntarily indicating that their words are aligned to this “standard” form. Either way, standardisation is the process, whether or not it’s successful.
Conversely a standard is a fixed thing. It’s not the process or the end result. It’s the nebulous theoretical form that the standardisation process is heading towards, but it exists independently of the standardisation process.
To understand how a standard can exist without a standardisation process, I’ve compiled a short list of Scots standards.
De Facto Scots - organically used by the majority of Central Scots writers
De Facto Doric - organically used by the majority of Doric writers
Shaetlan - as published by I Hear Dee
The Manual of Modern Scots (1921) - Grant and Dixon
Focurc - conlang from the Falkirk region
North Carolina Scots Wikipedia - discredited orthography used in pre-2020 Wikipedia
Iain WD Forde
Gordon Morrison
Scottish Languages Bill Stage 1 Report (2024)
These are all Scots standards that exist, none of them are imaginary, and they all describe the same language, but in slightly different ways, with slightly different spelling preferences.
We can see that standards exist, it’s just that there is no standardisation process, no education authority that can choose one standard and then enact it, and compel people to adhere to that one standard form. No one is getting punished for not using the same spellings as everyone else, and nor should they be.
If a standard emerges, somehow, as an organic process, does it still count as standardisation?
The Facebook discussion
The other week there was a discussion on the Facebook Scots Language Forum, where I brought up the matter of proof-reading in Scots: to what extent would people expect subeditors and proof-readers to correct or nudge spellings into more common forms?
I kept a wee note of the range of different views and opinions, so I could write a Substack article in the future.
Not many people were really in favour of standardising the Scots language, maybe around 30%. Some of the reasons for supporting standardisation are that it would make teaching Scots easier, and that libraries and schools and bookshops would be more inclined to stock books written in standardised Scots than they currently are. Books written in standard Scots would naturally outsell non-standard versions.
The views against it were generally supportive of local dialects. Writers should be free to spell words how they specifically speak the words in their local area, and no one has any right to tell people they are spelling Scots words wrong. “Weird non-standard spellings” are the hallmarks of regional dialects. There is a general view that Scots has no standard orthography, and from this position, it shouldn’t be standardised.
My controversial view is that, whilst academia is about researching the truth, objective reality, how the universe actually works - all these views on Facebook, whilst not academic, have a more tangible effect on perceptions of the Scots language, and these views are almost all wrong.
Is it standardised?
It could be merely a matter of different conceptions of what “standardised” means.
Scots isn’t taught in schools so there is no form that children are taught. If people conceive “standardised” to mean the form taught in schools, then they are correct.
But, somehow a standardisation process has taken place.
It’s not a wild west, anything goes, heap of phonetic and regional spellings. From compiling my frequency dictionaries, about 70% of Scots writers use exactly the same spellings for about 70% of words (the method is set out at the end of this article).
If we take the word ISNAE as an example, this might be rendered in English as ISN’T. In my corpus of 21st century Scots writing, I’ve got around 200 people using some spelling of the word.
ISNAE is the most common spelling. ISNA is the second most common spelling, and it happens to be the most common spelling in Doric writing. ISN’T is the third most common, and it’s shared with English.
We can imagine all the other possible spellings of the same word using our standard computer keyboard - ISNY, ISNIE, IZNA, IZNAE, IZIE, ISNEE, and so on - but I couldn’t find any writers using these spellings at all. There might be some Scots writers out there who do use them, but they aren’t represented in the general Scots writing community, and there are certainly not 113 writers using any of these forms.
We might categorise ISNAE and ISNA as “conventional spellings”, spellings that are well-attested, and the other spellings “unconventional”. Writers are free to use these unconventional spellings, but they choose not to. I would argue that this is what standardisation looks like. It’s not a schoolteacher, cane in hand, demanding which spellings are used, it’s writers freely choosing to use the same spellings as other writers.
We will return to ISNAE later.
The majority of Scots books are written in a standard variety of Scots. The 600 or so writers in the corpus mostly use the same spellings for most words. This standard usage hasn’t led to book sales. In fact, books that use non-standard Scots spellings occasionally become huge best-sellers.
Best-sellers
Occasionally there are best-selling books written in Scots, selling upwards of 10,000 copies or even over 100,000 copies. In the years when this happens, these individual books are probably outselling all other Scots language book sales combined.
We might classify publishers using two vectors - expertise in Scots and size of marketing budgets.
The publishers just happen to fall into two mutually exclusive groups.
Books by Itchy Coo, Luath, Doric Books and so on, tend to use consistent spellings, well-formed Scots and Doric. These publishers are naturally Scots publishers and have a consistent “house style”, possibly editors and proof-readers who marshal the writer’s spellings into something more conventional.
However, “The Young Team” by Graeme Armstrong, published by Picador, is a huge best-seller, selling hundreds of thousands of copies, and it uses non-standard Scots spellings, HIED for HEID, YI for YE, etc.
If you are a writer with a bit of talent and you want to sell thousands of books, then you need a publisher with a huge marketing budget to increase your chances of success.
It’s not necessarily the case that books written in a conventional Scots orthography can’t be best-sellers, just that those books with large marketing budgets nearby tend to have a less conventional Scots orthography.
This means that there is a difference in perceived standardisation.
If you picked ten published books written in Scots (or even ten submissions to a writing competition, as a level playing field), about seven out of ten would be in the standard orthography, they would have been proof-read or at least glanced at by editors who have an innate conception of how Scots writing conventionally looks.
But if you picked an average ten books sold in Scots, that is to say representative of best-sellers, about nine out of ten would be in the non-standard orthography.
This isn’t the fault of the writers not using a standard orthography, most of them do. It’s the fault of the publishers for not proof-reading and correcting the copy of books they know are going to become best-sellers. I’m not sure if “fault” is the correct word here, this is merely the process that leads to this result.
In the Royal Society of Edinburgh Haill Clanjamfrie, one of the panellists said as much: the proof-reader at their major publisher had no experience with Scots and relied on the writer for spellings.
And so it becomes a self-perpetuating thing. The successful writers, the ones who get asked to be panellists at the Royal Society of Edinburgh, have been selected by the publishers and groomed into non-standardisation. The less successful writers are the ones without huge marketing budgets who have abided by standard or conventional spellings, but while they are the majority of writers, they don’t get invited to air their views in front of esteemed audiences.
There is an additional point I ought to make, in that some of the English language publishers are overtly “anti-Scots” - Canongate in particular. William McIlvanney’s original three “Laidlaw” books, set in Glasgow, were populated with Scots-speakers: between 30% and 50% of speaking characters were depicted speaking Scots. When McIlvanney died, manuscripts and notes for a fourth Laidlaw book were passed by Canongate to Sir Ian Rankin, who in turn churned out a finished book. This book, co-written, as it were, by McIlvanney and Sir Ian, was undoubtedly going to be a best-seller, probably best-selling on advance sales alone. In this fourth Laidlaw book, there is not one Scots-speaking character in all of Glasgow. They had been figuratively cleansed, and no one noticed.
This was an active choice by the publisher and Sir Ian. Readers were expecting a book written in the established McIlvanney style, and they chose to eliminate all the Scots speakers.
There are other occasions where we can see Canongate actively reducing the amount of Scots in best-selling books that might otherwise have contained a consistent proportion of Scots writing.
We can return to the views expressed in the Facebook discussions.
Dialects
A few public libraries, and indeed the Scottish Languages Act (2025), have expressed support for the view that book acquisitions and teaching materials should be based on the local dialects of Scots, rather than some pan-dialectical standard.
This is the opposite view to what the standardisation supporters believe, that authorities will buy standardised Scots materials.
Some libraries refuse to buy Scots books unless they are explicitly written in the local dialect. And if no books exist in that dialect, then no Scots books are acquired.
After eighteen months of urging Scottish public libraries to stock more Scots books, representative of local proportions of Scots speakers, and making no progress, I have a very low opinion of the libraries. Except for a couple of activist librarians, the public library service doesn’t give a crap about a language spoken by almost a third of taxpayers.
The Chartered Institute of Library and Information Professionals might harp on about diversity and literacy and culture, but it’s all lip-service with nothing behind it. Actions speak louder than words and the Scottish public library service is a boot on the neck of Scots speakers.
A result of this is that the librarians in general have no expertise in Scots (excluding those activist librarians). An opportunist writer / publisher could put out a book written in any non-standard variety of Scots with spellings not used anywhere else, basically making it up, and if they slapped a label on the cover, proclaiming it was written in the Kirkcaldy accent, or the Rutherglen variety of Scots, the librarians wouldn’t know any better, they’d be able to tick the box in diversity and the Scottish Languages Act, and buy a handful of copies.
If we now turn to those who oppose standardisation, whilst they are more numerous, they’re on the losing side. Scots is already standardised to some extent, the same spellings are used across mainland Scotland by the majority of individual writers.
In the Facebook discussion, I raised the spellings DRIECH and DREICH, a typical example of a word where most people use the same spelling.
My view is that it’s simply people not knowing how to spell. It was argued that perhaps DRIECH (with I before E) is a regional dialect spelling, somewhere it’s pronounced with a DRY sound, and conversely DREICH is pronounced with a DREE sound.
This sounds reasonable, but imagine a reader: will they make the distinction? If they are reading it silently then maybe not, but if they’re reading it aloud, will they substitute their own usual pronunciation of the word for some other pronunciation? How often are people expected to read Scots prose aloud? Do they do it on trains on the commute to work, or during quiet time at school, or before bed? Or do they read in silence?
There is a more practical way to look at this. If it is a regional dialect spelling, then what is the region? Luckily in my corpus I keep a record of where spellings are used.
Here we see that the writers using this spelling are based in Aberdeen, Edinburgh and Ards in Northern Ireland. This poses an absurd question, if we heard someone say that it was a bit DRIECH outside (not DREICH), would we assume that they were speaking the local Scots dialect of Ards, Edinburgh or Aberdeen?
We should set this against the fact that all the other writers in Aberdeen, Edinburgh and Ards, the majority of writers in these regions, use the more common DREICH spelling. Are they not using the local dialect?
Returning to ISNAE, one of the commentators on Facebook happened to use the spelling ISNIE, which isn’t in my corpus. I was curious, and they explained that they were from Glasgow and ISNIE is what they’d just naturally write, it’s the form that fits their speech, and “different places settle on different spellings”. This sounds reasonable. Except in my corpus I have dozens of other writers from Glasgow, and none of them use the ISNIE spelling, ISNAE is preferred.
It might be the case that ISNIE is the “settled” Glasgow spelling for a small dialect area within the city. But does it make any difference to the reader? They might notice that this writer isn’t using the same spelling as everyone else, but most readers will treat ISNAE and ISNIE as homophones, pronounced in the same way.
Now we are getting into the realms of Lewis Carroll’s Humpty Dumpty.
The whole point of written communication is that the reader understands what the writer has written. It’s not just the writer randomly stringing together letters until they subjectively look pretty.
Linguist Professor Emily Bender posted a thing on BlueSky the other day, in the context of chatbots and LLMs, an extract from her 2021 paper on the Dangers of Stochastic Parrots.
That phrase “human-human communication is a jointly constructed activity” is important in the context of Scots orthography. It’s not just about how the writer wants to spell things, it’s also about how the reader will interpret those spellings.
In the general sense we can expect a reader to infer that ISNIE means the same thing as ISNAE, and that DRIECH means the same thing as DREICH.
Will the reader infer that the “different” spelling is intentional or accidental? Was it sloppy proof-reading and subediting, or is it some little-known regional dialect from one specific Glasgow street? Is it an explicit rejection of the conventional spelling, the writer trying to convey that they are rebelling against orthodoxy?
Or was it just random chance, no thought has been given to the Scots-reader?
Competence
But with Scots in particular there’s another aspect: competence.
If those librarians, reluctant to stock Scots books at the best of times, were somehow compelled by statute to acquire books and materials written specifically in the local dialect, would they choose the majority writers who use the same spellings as everyone else, or would they choose the writers who use spellings that other writers refrain from using?
Would they choose to ignore all the standard writers, buy a single copy of the unconventional writer’s work, and then get back to ignoring Scots speakers, writers and readers?
The Scottish public library sector has a book buying budget of around £5.2 million, of which around £1.6 million comes from Scots-speaking taxpayers. They acquire around 600,000 books each year, of which about 300 are written in Scots.
If they were spending their budget in a linguistically proportionate manner, then they would be buying around 200,000 Scots books, more than six hundred times as many as they currently buy.
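As a rough check on that arithmetic, here is a minimal sketch using only the figures quoted above. The variable names are mine, and the assumption that purchases should mirror the share of the budget is the same simplification the argument already makes.

```python
# Figures quoted in the article (all approximate).
budget_total = 5_200_000        # annual book-buying budget, in pounds
budget_from_scots = 1_600_000   # portion attributed to Scots-speaking taxpayers
books_per_year = 600_000        # books acquired each year
scots_books_now = 300           # of which written in Scots

# Assumption: a "linguistically proportionate" spend mirrors the budget share.
scots_share = budget_from_scots / budget_total
proportionate_scots_books = books_per_year * scots_share
multiple = proportionate_scots_books / scots_books_now

print(f"Scots share of budget: {scots_share:.0%}")
print(f"Proportionate Scots books per year: {proportionate_scots_books:,.0f}")
print(f"Multiple of current purchases: {multiple:,.0f}x")
```

On these figures the proportionate number comes out at roughly 185,000 books a year, which is the "around 200,000" order of magnitude, and the multiple at a little over six hundred.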
I think a problem is that writers using really specific local dialect spellings are indistinguishable from writers who don’t know how other people spell words in Scots. And the “Scots isn’t standardised” argument just reinforces this indistinguishability.
It’s a self-perpetuating thing. It’s not easy to find books written in Scots, and not many people are “well-read” when it comes to Scots literature. While I have more than two hundred Scots books in my living room, most public libraries in Scotland have literally six books written in Scots. Typically three will be children’s books, one is a dictionary, one is poetry from the 1980s, and one is that best-seller using non-standard unconventional Scots spellings, but it’s marked as English anyway and is dispersed among ten thousand English books.
I don’t have all the answers, but maybe the solution lies with a topic covered previously on this Substack, the lack of a Scots proficiency test. There’s no objective way to establish whether someone is fluent, literate or proficient in Scots. No Standard grades, Highers, GCSEs or A-levels, nothing that fits into the CEFR Common European Framework of Reference for comparing language proficiency.
There is the CEFR self-assessment grid
But self-assessment isn’t acceptable here. We need to objectively distinguish between someone who is making a conscious choice about the spelling of regional dialects, and someone who just has no idea how Scots writers usually spell words.
In addition to testing whether people can read or write Scots, it might be useful to test whether people can correctly distinguish different varieties of Scots, and can distinguish between conventionally spelled Scots and unconventional or rarely used spellings. Is the writer just writing mince?
Occasionally people of an anti-Scots language persuasion on social media refer to the language as SCOATS - do these people genuinely think this is the commonly used term in Scots writing or do they know they are taking the pish?
Having such a Scots proficiency test applied to librarians would at least qualify them to select Scots books to stock, rather than leave it to chance or chaos. The test could also be applied to publishers’ subeditors, to find people with the experience and expertise to guide and correct those talented writers with a Scots-leaning.
Local dialects
I’m reluctant to use the word “preserve” when writing about local dialects, they are in a constant transient state whether we like it or not. Is it right to somehow freeze them in time, selecting which terms to set in stone, so they appear in textbooks like the works of Ozymandias?
My interest in the Scots language started with the Scots Wikipedia thing back in 2020. Some American teenager had written about 40% of the Scots version of Wikipedia, despite not really speaking Scots themself. They just relied on dictionary translations, online orthographies and making up spellings to the best of their ability.
Native Scots writers and readers took a look at it and recognised that it wasn’t right. In the uproar that followed, there was a project to fix the Scots Wikipedia, and many of the weird spellings were removed. It would be absurd to argue that those pages should have been preserved as exemplars of the North Carolina regional dialect of Scots.
Similarly in the 1964 Disney movie “Mary Poppins”, actor Dick Van Dyke did a pretty lousy Cockney accent. Despite the film’s enduring popularity, it’s difficult to say that his rendition of Cockney did permanent damage to that variety of English. The film ought to be preserved as a cultural artefact, but not because of the actor’s accent.
So when a writer comes up with unconventional spellings for words, claiming they are regional spellings, what they’d just naturally write, the form that fits their speech, we shouldn’t take it at face value. It needs to be attested to by other writers in the same area. Someone with expertise needs to take an objective look and decide if it’s mince.
In some respects the presence of unconventional spellings proves that the language is standardised - otherwise you wouldn’t be able to point to a non-standard spelling and say “that isn’t right”. There has to be a standard to reject, as happened with the Scots Wikipedia.
In closing
I fear that those who claim that Scots isn’t standardised are simply voicing a perception rather than some objective truth.
It’s difficult to measure objectively. I could do another survey, but what’s the point?
There was a commentator in the Discord chatroom a few months back, who in response to the lack of standardisation had put together a new orthography. I asked them about this lack of standardisation, which writers in particular they found to be particularly hard to read, and they merely pointed to other people in the chat room, confessing that they hadn’t read any books written in Scots at all.
At this point I glance up at the two hundred Scots books on the shelves in my living room, and wonder what’s the point.
Whilst I just take it as fact that most writers use the same spellings for most words, it might be necessary to demonstrate or prove it. This is a convoluted process, but it is repeatable.
First, gather Scots writing by a substantial number of writers, I’d recommend at least fifty different writers. Then somehow sort all the writing into a word frequency list, counting the number of writers to use each word. Then collect together spelling variants of each individual word, and calculate the proportion of writers who use the most common spelling of each word form.
There’s a free online pdf version of my Scots frequency dictionary here, that a bored researcher could go through. The first page looks like this:-
When we take the first word, THE, while there are eight different spellings, the top spelling is used by 511 writers in the corpus. The sum of all the writers is 715 (which is a bit ropey, because there were only 600 writers in the corpus so we’re double-counting, but I’m ignoring that). 511 divided by 715 equals 0.71 or 71%.
If we go through each word and scribble on the percentage of writers who use the top wordform spelling, the page would look like this:-
If we go through the entire 2,500 words listed then the average is around 70%. This would also give us a fascinating measure of how standardised or unstandardised the Scots language is.
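The counting steps described above can be sketched in a few lines of Python. This is a minimal sketch and not the code used to build the frequency dictionary: the four "writers" and their texts are invented toy data, the `variant_of` lookup is a hypothetical hand-made table standing in for the "somehow" of grouping spelling variants, and splitting on whitespace is a simplifying assumption about tokenising.

```python
from collections import defaultdict

# Toy data, invented for illustration: writer -> text.
texts = {
    "writer_a": "it isnae dreich the day",
    "writer_b": "it isnae driech the day",
    "writer_c": "it isna dreich the day",
    "writer_d": "it isnae dreich",
}

# Hypothetical hand-made lookup: spelling -> headword it is a variant of.
# Spellings not listed are treated as their own headword.
variant_of = {"isnae": "ISNAE", "isna": "ISNAE",
              "dreich": "DREICH", "driech": "DREICH"}

def top_spelling_share(texts, variant_of):
    """For each headword, return (top spelling, share of writers using it)."""
    # headword -> spelling -> set of writers who use that spelling
    writers_by_spelling = defaultdict(lambda: defaultdict(set))
    for writer, text in texts.items():
        for token in text.lower().split():
            headword = variant_of.get(token, token.upper())
            writers_by_spelling[headword][token].add(writer)

    result = {}
    for headword, spellings in writers_by_spelling.items():
        counts = {s: len(w) for s, w in spellings.items()}
        top = max(counts, key=counts.get)
        # The denominator is the sum of writers over all spellings, so a
        # writer who uses two spellings is counted twice (the "ropey" bit).
        result[headword] = (top, counts[top] / sum(counts.values()))
    return result

shares = top_spelling_share(texts, variant_of)
for headword, (top, share) in sorted(shares.items()):
    print(f"{headword}: {top} {share:.0%}")

# Average share across all headwords, the single figure quoted at people.
average = sum(share for _, share in shares.values()) / len(shares)
print(f"Average: {average:.0%}")

# The worked example for THE, using the two counts given in the text.
print(f"THE: {511 / 715:.0%}")
```

On the toy data, ISNAE and DREICH each come out at 75% and the average at 90%. A real run would need proper tokenising and a much larger variant table, which is where most of the work lies.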
If our bored researcher was really into it, they could have a go at calculating similar sums for my forthcoming Doric frequency dictionary, which can be found in PDF form here. From a random sample, the proportion is about the same in Doric as it is for pan-dialectical Scots.
And it’s about the same for Ulster-Scots, an aw.
Here we are doing original research. If we rank the words by the proportion of writers who use the same spelling, and then plot a graph, it’s a pretty picture to look at.
If we look at the horizontal line at the top and note where it starts to fall, around 38%, this means that 38% of words have a single spelling used by all writers.
If we read off the 50% mark on the x-axis at the bottom, it reads about 85% of writers. So, 50% of words have a single spelling that is used by roughly 85% of writers or more.
To get a kind of optimum value that we can quote at people, the point where the percentage of words and the percentage of writers are both as high as possible is around 68%.
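For anyone who wants to reproduce the plot, the ranking and the "optimum" can be sketched as below. The ten percentages are invented toy values, the function names are mine, and reading the optimum as the point where the smaller of the two percentages is largest is my assumption about one reasonable way to pin it down.

```python
def standardisation_curve(shares_pct):
    """Rank words from most to least agreed-upon spelling.

    shares_pct holds one value per word: the percentage of writers who use
    that word's top spelling. Returns (percent_of_words, percent_of_writers)
    points, where the first value is the position in the ranking.
    """
    ranked = sorted(shares_pct, reverse=True)
    n = len(ranked)
    return [((k + 1) * 100 / n, share) for k, share in enumerate(ranked)]

def optimum(curve):
    """Assumed definition: the point where min(words %, writers %) is largest.
    If two points tie, the earlier one in the ranking is returned."""
    return max(curve, key=lambda point: min(point))

# Toy values, invented for illustration (one percentage per word).
toy_shares_pct = [100, 100, 100, 90, 80, 70, 60, 50, 40, 30]

curve = standardisation_curve(toy_shares_pct)
words_pct, writers_pct = optimum(curve)
print(f"{words_pct:.0f}% of words have a top spelling "
      f"used by at least {writers_pct:.0f}% of writers")

# Where the flat line at the top starts to fall: the share of words whose
# single spelling is used by every writer.
unanimous_pct = 100 * sum(1 for s in toy_shares_pct if s == 100) / len(toy_shares_pct)
print(f"{unanimous_pct:.0f}% of words have one spelling used by all writers")
```

Feeding in the per-word percentages from a real frequency list, rather than the toy values, gives the points to plot with words on the x-axis and writers on the y-axis.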
By way of comparison, we might imagine that British English is “very” standardised. There are standard spellings for most words, that everyone uses, and only a small number of words with spelling variants, GREY / GRAY, BLOND / BLONDE, REALISE / REALIZE. There are also some Americanisms that might creep into a corpus of British English to muddy the waters - TAP / FAUCET, NAPPY / DIAPER and so on, but on the whole we might estimate that 97% of English words have standard spellings.
A similar “standardisation plot” for English might look like this:-
And based on my data for north eastern writing the plot for Doric looks like this:-
I don’t expect these graphs to persuade anyone that there is a standard form of Doric, or Ulster-Scots. They exist merely to demonstrate that the objective empirical data exists, and is not “made up” or imaginary, or a personal opinion. It is something we can objectively measure.
We might personally judge that 68% for Ulster-Scots and 75% for Doric isn’t enough, and that these figures reflect an “anything goes” approach to spellings.
We could even prune the corpus, selecting a smaller group of writers whose spellings are more closely aligned, and then do a plot with a better optimum; it might be 90% or 95%.
But then if, in twenty-five years’ time, the linguistic data analysts of 2051 repeated this exercise and found that Doric and Ulster-Scots then had values of 85% or 90%, we could say that some progress had been made in making the varieties of Scots more standard than they were in the first quarter of the century. Or, if the figures were lower, that they had become less standardised.















