
2024-02-12

Gudhót'n - if a Slavic language was spoken in Mordor


⚞Ⰳ⃝

(lazy Unifont pixelart)


Foreword

This is an English grammar for a dialect of a mostly a posteriori, semi-constructed, semi-planned language, Budhót'n, which was inspired by Palaeogetmanic, adapted so that it fits with Neogetmanic and OSloJ. It was supposed to be an auxlang, given its undue emphasis on foreign words; however, the phonology and orthography got too out of hand for it to ever function as one, and Slavic grammar isn't exactly conducive to auxlangs. I call this dialect of mine Gudhót'n, which is a reference to "git gud" in conlanging, and also G follows V, which follows B, in Glagolitic.

Furthermore, all existing documentation of Budhót'n was in Czech, which is itself a Slavic language. You couldn't learn a foreign-word-heavy Slavic auxlang unless you knew a Slavic natlang, at which point you would be learning Old Church Slavonic instead of this. Conlang reviewers cannot be expected to learn Czech, an indescribable gobbledygook, just to make a review, which is another reason for writing this grammar in English. Also, the Neogetmanic grammar is still in Czech; that isn't an auxlang, however, so this grammar doubles as an English version of that too.

Given the similarity of Slavic languages, this might not be a natlang ripoff per se: there are a lot of Polish, Russian, and BCMS words. Still, in conlanging it's better to keep away from your natlang's language family as much as possible, and that's why I invented the word modlanging. Were it not for prescriptivism, there would be no Palaeogetmanic; it would just be a Czech idiolect.

There is to be a website, but yuck Wordpress: https://www.budhotn.cz/

Linked open data for this abomination may or may not be the subject of my thesis.

This is canonical content and pronunciation: https://www.youtube.com/watch?v=RVKbzEPLnNY


Phonology and orthography

The original Budhót'n alphabet consists of 91 letters arranged linearly, with some inconsistent assignments made because the Windows Czech keyboard driver is limited to CP1250 with dead keys. This grammar is made on Linux, where this restriction doesn't hold, so Gudhót'n is free to use accents consistently. Windows Character Map can't even Unicode properly. A custom .XCompose may be needed.

Legacy WinLatin2 alphabet:

'  A  Á  À    Ă  Å  Ą  B  C  Ć  Ç  Cz Č  D  Ď  Đ Dth Dz Dž E  Ë  É  È  Ê  F  G  Ğ  H  Ch  İ  I  Ï  Í  Ì  Π J  K  Kh L  Ĺ  Ľ  Ł  Lh M  N  Ń  Ņ  Ň  O  Ö  Ó  Ò  Ô  Ő  P  Q  R  Ŕ  Rh Rz  Ŗ  Ř  S  Ś  Sz Ş  Š  T  Ť  Ţ  Th U  Ü  Ú  Ù  Û  Ů  Ų  Ű  V  W  X  Y  Ÿ  Ý  Z  Ż  Ź  Ž  Zh

Neogetmanic mapping (different collation order):

'  A  Á  Ǝ̇  Ȧ  Ą    Ǫ  B  C  Ć  Ċ  Č  Č̇  D  Ď  Ʒ̇  Đ  Ʒ  Ǯ  E  Ä  É  Ë  Ė  F  G  Ġ  H  Ȟ  I̋  I  Ï  Í  Į  Ẏ  J  K  Ɣ  L  Ĺ  Ľ  W  Ꝇ  M  N  Ŋ  Ṅ  Ň  O  Ǝ  Ó  Ö  Ȯ  Ǝ́  P  Q  R  Ŕ  R̂  Ṙ  Ř̇  Ř  S  Ś  Š  Ṡ  Ṧ  T  Ť  Ṫ  Ŧ  U  Ü  Ú  Ų  U̇  Ô  U̇́  Ű  V  V́  X  Y  Ÿ  Ý  Z  Ž  Ź  Ž̇  Ż

IPA:

ʔ  a  aː ɐ  ɑ  ã  oa ɔ̃  b  t͡s t͡sː t͡ʂ t͡ʃ t͡ɕ d  ɟ  d͡ʐ ð  d͡z d͡ʒ ɛ  æ  ɛː e  ɜ  f  g  ɢ  ɦ  x  iː ɪ  i  ɪː ĩ  ɥ  j  k  ɣ  l  lː ʎ  w  ɬ  m  n  ŋ  ɴ  ɲ  o  ə  oː ø  ɔ  əː p  q  r  rː ɹ  ʐ̝  ʀ̝  r̝  s  sː ʃ  ʂ  ɕ  t  c  ʈ  θ  ʊ  y  uː ũ  ɵ  uo ɵː yː v  vː ks ɨ  ʏ  ɨː z  ʒ  zː ʑ  ʐ

BuSloJ flattening:

'  A  Á  A  A  A  OA A  B  C  Č  Č  Č  Č  D  Ď  Ǯ  Ʒ  Ʒ  Ǯ  E  E  É  E  E  F  G  G  H  H  Í  I  I  Í  I  I  J  K  H  L  Ĺ  Ľ  V  L  M  N  N  N  Ň  O  Y  Ó  Y  O  Ý  P  K  R  Ŕ  R  Ř  Ř  Ř  S  S  Š  Š  Š  T  Ť  Ť  C  U  Y  Ú  U  U  UO Ú  Y  V  V  KS I  Y  Í  Z  Ž  Z  Ž  Ž

GeSeL correspondence:

'  A  A  A  A  A  A  A  B  C  J  J  J  J  D  D  J  C  C  J  A  A  A  I  E  F  G  G  H  H  I  I  I  I  I  I  Y  K  G  L  L  L  W  L  M  N  N  N  N  O  E  O  E  O  E  P  Q  R  R  R  J  J  J  S  S  X  X  X  T  T  T  C  U  I  U  U  U  U  U  I  V  V  KS I  I  I  Z  X  Z  X  X

Neogetmanic table by accent:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ʒ Ǝ 
Á   Ć   É       Í     Ĺ     Ó     Ŕ Ś   Ú V́     Ý Ź   Ǝ́ 
    Č Ď       Ȟ       Ľ   Ň       Ř Š Ť           Ž Ǯ   
    Č̇                             Ṙ́ Ṧ   U̇́         Ž̇     
Ȧ   Ċ   Ė   Ġ             Ṅ Ȯ     Ṙ Ṡ Ṫ U̇       Ẏ Ż Ʒ̇   
                           Ô     R̂                     
Ä       Ë       Ï           Ö           Ü       Ÿ       
                I̋           Ő           Ű               
Ą     Đ     Ɣ   Į     Ꝇ   Ŋ Ǫ         Ŧ Ų               


Accents work as follows:

Sign  Description  Effect                 Letters                    Decomposition
´  Acute          Lengthening             Á Ć É Í Ĺ Ó Ŕ Ś Ú V́ Ý Ź Ǝ́  doubling
 ̇́  Acute and dot  Alternate lengthening   U̇́                          doubling
ˇ  Caron          Middle palatalization   Č Ď Ȟ Ľ Ň Ř Š Ť Ž Ǯ        cz dj ch lj nj
 ̌̇  Caron and dot  Soft* palatalization    Č̇ Ř̇ Ṧ Ž̇ 
˙  Dot (vowel)     Alternate               Ȧ Ė Ȯ U̇ Ẏ
˙  Dot (consonant) Hard palatalization     Ċ Ġ Ṅ Ṙ Ṡ Ṫ Ż Ʒ̇
^  Circumflex     Velarization            Â Ô R̂
¨  Umlaut         Vowel offset            Ä Ë Ï Ö Ü Ÿ
˝  Acute umlaut   Offset vowel length     I̋ Ő Ű    
˛  Tail           Short nasalization      Ą Į Ŋ Ǫ Ų
-  Strikethru     Eth and Thorn           Đ Ŧ
   Variation      Ghain and Llan          Ɣ Ꝇ

* except Ř̇, see below

This system is modular; in fact, it's a 2D orthography, just like Brahmic abugidas or vocalized Arabic, but due to the sheer amount of legacy data, upwards modularity is restricted to Neogetmanic. It seems to work better than IPA for me. All script conversions as described in Neogetmanic hold here too. There's an original mapping to Cyrillic, but mine better fits the Neogetmanic <-> DOSloJ span.

Several letters had to be made that were not originally in Neogetmanic, namely the angled candrabindu series and the exclamation point letters. This is how crazy the phonology is. As it turns out, Polish Ć Ś Ź are not the same as Czech Č Š Ž, despite sounding way too soft to other Slavs. Dot, caron, and acute were all taken, so I came up with combining caron and dot. There is a precomposed Ṧ in Unicode; good luck generalizing that to the rest of the letters, and even better luck typing this with Windows' ancient keyboard driver. The other oddities are Ř̇ and U̇́. Ř̇ is logically a long Polish Rz; however, it can also be a long Czech Ř, which I write Ř́, or a Getmanian Ř, which I write Ř̇, or Ř̇́ when long. In fact, Polish Rz is actually Ż in places where there used to be a sort of R sound, yet the orthography claims to be fully phonetic. U̇́ is supposed to be the Black Speech long Û, the short Û being written U̇, which is Czech Ů in Neogetmanic. This is where the subtitle about a Slavic language spoken in Mordor comes from.

There are some "long consonants". In the old, wrong IPA, Ř̇ is transcribed the same as ̇´R and Ṫ is transcribed the same as Ť´:

CC = Ć (Ć), RR = Ŕ (Ŕ), RzRz ~ Ř̇ (Ŗ), SS = Ś (Ś), ŤŤ ~ Ṫ (Ţ), VV = V́ (W)
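
The "doubling" decomposition listed for the acute in the accent table, together with the exact pairs above (CC = Ć, RR = Ŕ, SS = Ś, VV = V́), amounts to a mechanical rule, so here is a minimal Python sketch of it. The function names are made up for illustration, and the sketch assumes that every combining acute (U+0301) means length and nothing else. The approximate pairs marked with ~ (Ř̇ and Ṫ) carry no acute and are deliberately left alone.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def split_clusters(text):
    """Split text into base letters with their combining marks attached."""
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch  # attach the mark to the preceding base letter
        else:
            clusters.append(ch)
    return clusters


def expand_acute(text):
    """Replace every acute-marked letter by two copies of the unmarked letter."""
    out = []
    for cluster in split_clusters(text):
        if ACUTE in cluster:
            plain = cluster.replace(ACUTE, "")
            out.append(unicodedata.normalize("NFC", plain) * 2)
        else:
            out.append(unicodedata.normalize("NFC", cluster))
    return "".join(out)


print(expand_acute("Ś"))         # SS
print(expand_acute("Gudhót'n"))  # Gudhoot'n
```

Because the work happens on decomposed text, stacked accents come along for free: U̇́ loses only its acute and comes out as U̇U̇, while Ő and Ű are untouched, since the double acute is a different combining mark (U+030B).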

For actual auxlang application purposes, a flattening into OSloJ called BuSloJ is available. BuSloJ can further be decomposed into mainline OSloJ and DOSloJ. BuSloJ isn't directly compatible with Visual OSloJ, as it puts the palatalization on the consonant instead of the vowel. There is a "simplified" original Budhót'n orthography, but it is not nearly simplified enough. You can't have an auxlang with an alphabet more than 60 letters long, unless it was made to increase the entertainment value of the Phonology section of Conlang Critic. Whilst Palaeogetmanic had an undue amount of letters, they were all inverse contraptions.

Phonology in Gudhót'n remains generally unchanged from Budhót'n, aside from me pronouncing Rs and Řs like [ʀ] instead of [r] and Ls like [ʟ] instead of [l]. Nonetheless, the official "IPA" is completely wrong because of a crappy keyboard layout, so I provide a more accurate one, though it is not completely stable, as I am still figuring out what was meant. There are no specific orthography rules; everything is written phonetically, although some supposed phonemes are indistinguishable to me. The phonetic spelling allows automatic generation of IPA with sed scripts; however, because Google Sheets still can't λ, 128 nested SUBSTITUTEs have to be used, which doesn't work in LibreOffice. Just use AWK, sed, tr, and cut on TSVs.
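
For anyone who prefers one script over a pile of sed expressions, the same letter-by-letter substitution can be sketched in Python. This is only an illustration, not the actual tooling: the `IPA` dictionary is a small subset read off the Neogetmanic and IPA rows above by position, and `to_ipa` is a hypothetical name. Letters missing from the subset pass through unchanged (in upper case, so the gaps are easy to spot).

```python
import unicodedata

# Subset of the Neogetmanic -> IPA rows above, matched by position.
# Keys are upper case and NFC-normalized; the last two are letter + combining marks.
IPA = {
    "'": "ʔ", "A": "a", "Á": "aː", "B": "b", "D": "d", "Ď": "ɟ",
    "E": "ɛ", "É": "ɛː", "G": "g", "H": "ɦ", "Ȟ": "x", "N": "n",
    "O": "o", "Ó": "oː", "R": "r", "S": "s", "Š": "ʃ", "T": "t",
    "U": "ʊ", "Ú": "uː", "Z": "z", "Ž": "ʒ",
    "U\u0307": "ɵ",          # U with dot above
    "U\u0307\u0301": "ɵː",   # U with dot above and acute
}

LONGEST_KEY = max(len(key) for key in IPA)


def to_ipa(text):
    """Transliterate letter by letter, always trying the longest key first."""
    s = unicodedata.normalize("NFC", text.upper())
    out = []
    i = 0
    while i < len(s):
        for size in range(min(LONGEST_KEY, len(s) - i), 0, -1):
            chunk = s[i:i + size]
            if chunk in IPA:
                out.append(IPA[chunk])
                i += size
                break
        else:
            out.append(s[i])  # letter not in the subset: keep it as is
            i += 1
    return "".join(out)


print(to_ipa("Gudhót'n"))  # gʊdɦoːtʔn
```

The longest-match loop matters because letters with stacked accents are several code points long: without it, U̇́ would be read as a plain U followed by two stray accents. The printed transcription is simply what the tables give; it has not been checked against the recording linked in the foreword.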

There are some funny names for punctuation characters, inspired by INTERCAL and my attempt at translation, but let's not worry about them now.


Morphology

Standard Slavic parts of speech are followed.


Nouns

There are all the typical 3 genders, masculine, feminine, and neuter. Based on the nominative, genitive and accusative singular, each of them can also be animate (acc=gen), inanimate (acc=nom), ADHD (all distinct) or zombie (all same). There is no gender inclusivity in Budhót'n and Gudhót'n, only a nonbinary indication particle, see Neogetmanic for true "naplěvať" gender.
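
The four classes can be read off mechanically from three singular forms. Below is a minimal Python sketch of that test. Since the paradigm tables are not filled in, the example forms are ordinary Czech nouns used as stand-ins, not Gudhót'n words, and the function name and the "unclassified" fallback (for nominative = genitive with a different accusative, a case the description does not cover) are my assumptions.

```python
def animacy_class(nom, gen, acc):
    """Classify a noun by its nominative, genitive and accusative singular."""
    if nom == gen == acc:
        return "zombie"       # all same
    if acc == gen:
        return "animate"      # acc = gen
    if acc == nom:
        return "inanimate"    # acc = nom
    if nom != gen:
        return "ADHD"         # all distinct
    return "unclassified"     # nom = gen but acc differs: not covered above


# Czech stand-ins (nom, gen, acc), NOT actual Gudhót'n forms
print(animacy_class("pán", "pána", "pána"))           # animate
print(animacy_class("hrad", "hradu", "hrad"))         # inanimate
print(animacy_class("žena", "ženy", "ženu"))          # ADHD
print(animacy_class("stavení", "stavení", "stavení")) # zombie
```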

There is no dual number, fortunately.

The ending determines the declension paradigm; however, it seems random. Furthermore, the endings change depending on the suffix. Prefixes have no impact on declension. Both prefixes and suffixes are ripped off from Czech and Slovak.


Masculine

   god   lock   guest  castle  youngman  forest  people   mate

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Feminine

  goddess   power  girl  pain  woman  machine  kingdom

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Neuter

  army   beer   chicken   sea

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Adjectives

In addition to gradation, adjectives also have gender. Luckily, there aren't gazillions of declensions.

There are 2 types of gradation. The older one from way less Slavic times of Budhót'n consists of prepending prefixes (oddly reminiscent of Icelandic), the new one consists of a suffix and a prefix, with the same irregularities.

  masculine       feminine   -ness        neuter

1
2
3
4
5
6
7

1
2
3
4
5
6
7



Pronouns

Slavic pronouns are more complex than they seem. There are about 200 of them belonging to different classes, and just like nouns, they have genders and declensions, in addition to the usual person.

A gender-neutral pronoun for genders not known at compile time is missing, just like in Czech. There is, however, a particle for nonbinarity. Neogetmanic has onǝ and ěǒ.

To complicate matters further, all 4 politeness levels have distinct pronouns, a feature of Japanese and Korean no one should ever copy. Politeness is for subs anyway.

Personal

Since regular English is insufficient to differentiate plural and genders, some cringe slang terms were used.

   I   U   he   she   it   we  y'all   dem boiz  dem grlz  dem thangz

1
2
3
4
5
6
7

1
2
3
4
5
6
7

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Possessive

   mine  ur  his  hers  its   our   yer  boiz'  grlz'  thangz'

1
2
3
4
5
6
7

1
2
3
4
5
6
7

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Query

There is no table of correlatives like in Esperanto. There are the prefixes ňe- for some-, ňi- for no-, and gde- for -ever, and the suffix -koꝇiv for any-.

   who   what     which  which  whose

1
2
3
4
5
6
7

1
2
3
4
5
6
7


Non-declining adverbial pronouns:

where  when  how  whence  whither



Pointing

   dat boi   dat grl   dat thang    that

1
2
3
4
5
6
7

1
2
3
4
5
6
7



Quantifiers

   each   every  all   some   one   alone   same

1
2
3
4
5
6
7

1
2
3
4
5
6
7



Numbers


There is a special modification of the long system, in that it uses its own numbers instead of Latin ones. So instead of billion you have dvaljon, and so on and so on.


There are suffixes for fractions, multiples, and grouping.

SI prefixes are adapted, including the latest ronna and quetta.



Verbs


There are 5 tenses: legendary, past, present, future, and apocalyptical. Legendary and apocalyptical tenses are used for events that didn't or won't happen, just like in many religious scriptures. For more atheist applications, they can be used for the very remote past and future, like the Big Bang or the Heat Death of the Universe.

The endings are strictly concatenative; there is no stem change like in Czech. The only exception is the verb to be. Determining a stem from the dictionary is, however, not as straightforward.

   legendary   past   present    future   apocalyptical

1
2
3

1
2
3


Informal conjugation is available, which is a feature imported from Finnish.


   legendary   past   present    future   apocalyptical

1
2
3

1
2
3


There are verbal prefixes, but they seem to work the same as in Czech.


Adverbs

These were undocumented until I mined the painstakingly hand-input data. As per the author's comments, there seems to be a class that is indistinguishable from znamenje type nouns, and another class that is indistinguishable from neuter adjectives. Also, many preposition and noun sequences are written without the space; in Czech syntax classes this is called adverbial determination (příslovečné určení).



Prepositions

There are some a priori prepositions.


Conjunctions

These parts of speech are actually a priori, or stolen from languages I don't know.


Particles

This category is considered the "other" or "miscellaneous". There are multiple special syntax elements.


Interjections

They can be appended with an ending to form a noun, adjective, verb, or adverb. Bare interjections are generally undocumented and made on the spot with the broad phoneme repertoire.

Some phrasal greetings are considered interjections. 


Syntax

Except for particles, the syntax in Budhót'n and Gudhót'n is a complete ripoff of Czech, because it wasn't documented, so the natlang stepped in.


There are many phrasal sentences written as a single long word.


Lexicon

There are 4 levels of politeness: literary, colloquial, unliterary, and expressive. In addition to that, there are some 3k archaisms I personally collected from the old website. These names are taken from Czech vocabulary layers, but should be thought of as being nondescript steps. The literary level consists of early words, as it's the original level.

There are multiple packages, mostly copied from other dictionaries, each having its own global politeness level. Altogether this language is supposed to have around 500k words. Quite a lot for a conlang made by 2 people. There is no point in shoving words in if your application doesn't use them. Google Sheets just gets laggy.

Punctuation and letters - literary, 200

Basic words - all 4 levels, 4*10k

Phrases - literary, 1k

Foreign words - literary, 20k of projected 100k

Unforeign words - impromptu literary additions, 5k

Getmanisms - unliterary, also some impromptu, 3k

Czech 2.0 - colloquial, 10k of projected 30k

Conversation mini-dictionaries - archaic, 3k * 12 = 36k

Vulgarisms - expressive, 500

GeSeLisms - unliterary, projected 4*4k

Autosteal transcription - unliterary, projected 50k
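
As a sanity check on the package sizes, here is a small Python tally of the list above. It is a sketch under two assumptions of mine: "k" means a thousand, and a package listed with a single figure is treated as already complete (current equals projected). Packages that only state a projected size are left out of the current count.

```python
# (package, words so far, projected words); None = not stated in the list above
PACKAGES = [
    ("Punctuation and letters", 200, 200),
    ("Basic words", 4 * 10_000, 4 * 10_000),
    ("Phrases", 1_000, 1_000),
    ("Foreign words", 20_000, 100_000),
    ("Unforeign words", 5_000, 5_000),
    ("Getmanisms", 3_000, 3_000),
    ("Czech 2.0", 10_000, 30_000),
    ("Conversation mini-dictionaries", 3_000 * 12, 3_000 * 12),
    ("Vulgarisms", 500, 500),
    ("GeSeLisms", None, 4 * 4_000),
    ("Autosteal transcription", None, 50_000),
]


def totals(packages):
    """Return (stated current total, projected total) in words."""
    current = sum(cur for _, cur, _ in packages if cur is not None)
    projected = sum(proj for _, _, proj in packages)
    return current, projected


current, projected = totals(PACKAGES)
print(f"stated so far: {current:,}")    # stated so far: 115,700
print(f"projected:     {projected:,}")  # projected:     281,700
```

Under these assumptions the listed packages add up to roughly 282k projected words, so the "around 500k" figure presumably counts something beyond this list.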



