
2023-05-31

Subjectively Relevant Unicode Subset 64K

There's a limit of 65535 glyphs per OpenType font, and Unicode 15.1 has 150249 graphic characters, of which 98682 are Han (you need 2 fonts for that alone), which leaves 51567 for everything else. However, Indic scripts may require many ligatures, and (U)CSUR is tacitly endorsed as a method to use IP-encumbered scripts.
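The budget is plain arithmetic and can be rechecked in a few lines. A minimal sketch in Python, using only the figures quoted in the paragraph; the helper name `fonts_needed` is made up for illustration.

```python
GLYPH_LIMIT = 65535      # glyphs per OpenType font, as quoted above
GRAPHIC_TOTAL = 150249   # graphic characters quoted for Unicode 15.1
HAN = 98682              # Han share quoted above


def fonts_needed(glyphs: int, limit: int = GLYPH_LIMIT) -> int:
    """Smallest number of fonts that can hold `glyphs` glyphs (ceiling division)."""
    return -(-glyphs // limit)


non_han = GRAPHIC_TOTAL - HAN
headroom = GLYPH_LIMIT - non_han   # slots left over for ligatures, PUA, etc.

print(non_han)            # 51567
print(fonts_needed(HAN))  # 2
print(headroom)           # 13968
```

The headroom is what the ligatures and the PUA scripts have to fight over, assuming one glyph per character for everything else.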

Alternatively, one can pretend that Unicode 3.0 from 1999, with 49168 graphic characters all in the BMP, is the last version, or that Plane 2 never existed until 9.0 from 2016, when the annoyingly large Tangut was added to Plane 1, but that gets outdated pretty quickly. If you ignore Tangut, Hieroglyphs, Cuneiform, and Bamum Supplement, you get basically Unifont-EX, which plateaus at 15.1 BMP and 11.0 SMP, but with some glyph deduplication could also fit the 16.0 Symbols for Legacy Computing and its Supplement.

In future Unicode versions there will be more than 64K of non-Han characters, so it is better to choose the interesting scripts and blocks now.

Hangul is best served by an advanced OpenType syllable-composing system, because there are way more syllables than the 11172 precomposed ones: 125*95*137=1626875, or 128*99*141=1786752 with North Korean extensions. That amount precomposed would need 25 or 28 fonts and doesn't even fit into Unicode, which is limited to 10:FFFFh, but may just fit into at-most-4-byte UTF-8, which ends at 1F:FFFFh. Time to define usage of FDD0..FDEF as UTF-16 super-surrogate 8-plets to reach the full 32 bits of UCS-4.
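The Hangul numbers can be reproduced the same way. This is only a sketch: I am assuming the three factors are the counts of initial, medial, and final jamo slots (the paragraph gives just the products), and the function names are hypothetical.

```python
GLYPH_LIMIT = 65535            # glyphs per OpenType font
UNICODE_SIZE = 0x10FFFF + 1    # code points in Unicode, 0..10FFFFh
UTF8_4BYTE_SIZE = 0x1FFFFF + 1 # values reachable with at most 4 UTF-8 bytes


def syllable_count(initials: int, medials: int, finals: int) -> int:
    """Every combination of one initial, one medial and one final slot."""
    return initials * medials * finals


def fonts_needed(glyphs: int, limit: int = GLYPH_LIMIT) -> int:
    """Ceiling division: fonts required to hold all glyphs."""
    return -(-glyphs // limit)


base = syllable_count(125, 95, 137)       # figure from the text
extended = syllable_count(128, 99, 141)   # with North Korean extensions

for n in (base, extended):
    print(n, fonts_needed(n), n <= UNICODE_SIZE, n <= UTF8_4BYTE_SIZE)
# 1626875 25 False True
# 1786752 28 False True
```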

PUA assignment is based on Fairfax and Constructium. The E000~F8FF range should be considered mostly (inter)nationalized according to UCSUR, as this contains the much needed tlhIngan pI'qaD and Tengwar. The Trekkies and Tolkienists are a stronger user base than medievalists and linguists. Most of MUFI, CYFI and SIL has been incorporated into Unicode, and the leftovers are mostly ligatures, variations, stylistic sets, or precomposed characters. There is a SMuFL PUA agreement, but that is mostly getting into Unicode too, and I'm a tracker, pianoroll, and ASCII tab guy anyway. Also, Nerd Fonts have finally fixed their overflow into Arabic Presentation Forms by moving the icons to astral PUA, so they no longer mess with Quran text from tanzil.net; however, Powerline conflicts with Tengwar (not Cirth though).


Begin   End     Name                                        Size  Stot  

000000  0033FF  Lower BMP                                   3400  3400  

004DC0  004DFF  Yijing Hexagrams                            0040  3440  

00A4D0  00ABFF  Middle BMP                                  0730  3B70  

00D7B0  00D7FF  Hangul Jamo Extended-B                      0050  3BC0  

00E000  00EDFF  Lower UCSUR                                 0E00  49C0  

00EF00  00EFFF  Hex Byte Pictures                           0100  4AC0  

00F000  00F1FF  Kamakawi                                    0200  4CC0  

00F200  00F27F  Box Drawing Ext, Fill Patterns, Shade Quads 0080  4D40  

00F400  00F43F  C1 Control Pictures                         0040  4D80  

00F4C0  00F4EF  Ath                                         0030  4DB0  

00F500  00F54F  Kodo Symbols                                0050  4E00  

00F550  00F55F  Mathematical Symbols Appendix               0010  4E10  

00F560  00F56F  Camp Duodecimal Numerals                    0010  4E20  

00F580  00F58F  Geomantic Figures                           0010  4E30  

00F590  00F5FF  C64-OS and Commander X16 Symbols            0070  4EA0  

00F600  00F7FF  Adobe: LGC Compatibility Forms              0200  50A0  

00F800  00F83F  Apple: Hoefler Ornaments                    0040  50E0  

00F880  00F89F  Adobe: Thai Compatibility Forms             0020  5100  

00F8A0  00F8FF  UCSUR: Aiha and Klingon                     0060  5160  

00FB00  00FFFF  Upper BMP                                   0500  5600  

 

010000  012FFF  Lower SMP, Cuneiform                        3000  8600

013000  015AFF  Egyptian, Anatolian, and Mayan Hieroglyphs  2B00  B170

016000  0160FF  Cirth and Tengwar (no Mandombe)             0100  B270

016140  0161FF  Sarati, other Tolkien scripts, and Moon     00C0  B330

016200  0167FF  Blissymbols                                 0600  B900

016EF0  016EFF  Bopomofo Ext-A, Kanbun Ext-A, IdeoSym&Punc  0060  B960

01A760  01A77F  Rejang Extended                             0020  B980

01AFD0  01AFFF  Kana Extended-C and B                       0030  B9B0

01B000  01B16F  Kana Supplement, Kana Ext-A, Small Kana Ext 0170  BB20

01BA00  01BCFF  Indus, Shorthands (RIP Rongorongo)          0300  BE20

01CC00  01CEFF  Symbols for Legacy Computing Supplement     0300  C120

01D100  01D24F  Musical Symbols, Ancient Greek Music Not.   0150  C270

01D2C0  01D2FF  Kaktovik and Mayan Numerals                 0040  C2B0

01D300  01D37F  Tai Xuan Jing Symbols, Counting Rod Nums    0080  C240

01D380  01D7FF  Mathematical Alphanumerical Symbols         0400  C640

01D800  01DAAF  Sutton SignWriting                          02B0  C7F0

01DF00  01E08F  Latin Ext-G, Glagolitic Sup, Cyrillic Ext-D 0190  C980

01E7E0  01E7FF  Buginese Sup, Lontara B-B, Ethiopic Ext-B   0090  CA10

01E900  01E95F  Adlam                                       0060  CA70

01EC00  01FFFF  Upper SMP                                   1400  DE70

 

0F0000  0F1C3F  Upper UCSUR                                 1C40  FBB0

0FF030  0FF0DF  Domino Tiles Extended, Powerline Symbols    00B0  FC60

0FE000  0FE07F  Tengwar Presentation Forms                  0080  FCE0

0FE680  0FE6DF  Ewellic Presentation Forms                  0060  FD40

0FF380  0FF3FF  Tahano Veno and Aliphbeph                   0080  FDC0

0FF400  0FF51F  Voynich                                     0120  FEE0

0FF700  0FF7FF  7 Segment Display Patterns                  0100  FFE0

0FF900  0FFEFF  Sitelen Pona Presentation Forms-A,B         0300  02E0

0FFF00  0FFFFF  Symbols for Legacy Computing Appendix       0100  03E0


I am 3E0h=992 codepoints over 65536 in blocks, but some blocks are intentionally oversized for potential expansion (Cuneiform, Hieroglyphs), some are only proposals (Indus, Blissymbols), there are gaps in the allocated space (1300 codepoints in BMP alone), and some characters look the same and can share a glyph. However, there are also the Indic scripts, which would have to make do with viramas instead of ligatures.
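Since the Size and Stot columns are kept by hand, a small checker helps. The sketch below is my own helper, not part of any font tool: it recomputes each row's size from Begin and End and compares the running total against the previous row's listed Stot, wrapping at 64K the way the table does. Only three rows are copied in as sample data; the starting total 50E0 is the Stot of the row just before them.

```python
ROWS = [
    # (begin, end, name, size, stot), all hexadecimal, copied from the table
    ("00F880", "00F89F", "Adobe: Thai Compatibility Forms", "0020", "5100"),
    ("00F8A0", "00F8FF", "UCSUR: Aiha and Klingon", "0060", "5160"),
    ("00FB00", "00FFFF", "Upper BMP", "0500", "5600"),
]


def check(rows, start_total):
    """Return (name, column, expected_hex) for every value that does not add up."""
    problems = []
    previous = start_total
    for begin, end, name, size, stot in rows:
        span = int(end, 16) - int(begin, 16) + 1
        if span != int(size, 16):
            problems.append((name, "size", f"{span:04X}"))
        expected = (previous + int(size, 16)) & 0xFFFF  # Stot wraps at 64K
        if expected != int(stot, 16):
            problems.append((name, "stot", f"{expected:04X}"))
        previous = int(stot, 16)  # resync, so each row is judged on its own
    return problems


print(check(ROWS, 0x50E0))
# [('Upper BMP', 'stot', '5660')]
```

On these sample rows the first two pass and Upper BMP is flagged: 5160 plus 0500 gives 5660, not the listed 5600. So the running totals, and the overshoot read off the last row, are best treated as approximate until the whole table has been run through a check like this.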


OSloJ - Generic Slavic Language


OSloJ - Obecni Slovanski Jazik

"You fucking donkey!" -- Gordon Ramsay, also Shrek probably
Welcome to Jackass
Novaja-Semlja-fjordene
Pons asinorum GeSeL <-> Budhót'n


Outline

This is a total conversion of GeSeL and a simplified version of Neogetmanic, which itself is a description of my non-literary variety of Czech, acting like some sort of satire on Slavic languages. Visual appearance is close to a je-kavian Slovene with the option of volapük encoding. Unlike Interslavic, this zonlang is supposed to be more playful and filled with memes. Unlike Budhót'n, the orthography is dead simple, and the lexicon doesn't deal with non-Slavic words.

This semi-constructed zonal language specification is written in English, as the language has a very simplified Slavic grammar, which non-Slavs may stand a chance of fathoming.

In the Getmanian ecosystem, OSloJ serves as a donkey's bridge (using the Czech meaning) to Budhót'nska. It elaborates on the spelling reform Gudhotn by reforming the grammar too, and makes the GeSeL dialect Slavitic come alive with a more fitting grammar. The lexicon, which is at a far more advanced stage on the Budhót'n side, is to be borrowed in its entirety with some custom thematic jackassery. Since the V (slaVic) root pack for GeSeL is still nonexistent, it's very likely that it will be algorithmically derived from Budhót'n's lexicon too. Some of the derived lexicon will serve as a base for Satemic, which will take additional sources from Lithuanian and Sanskrit.

  .···········>   Neogetmanic   <-   Palaeogetmanic
  :                    ^                    |
  :                    :                    V
GeSeL -> Slavitic -> OSloJ <- Gudhotn <- Budhót'n
                       :
                       V
                    Satemic

Depending on the orthography, there are some variants:

* DOSloJ - underlying ASCII form for DOS and databases, not GeSeL-compatible

* mainline OSloJ - QWX are shown as ČŠŽ for aesthetic reasons

* BuSloJ - interface to Gudhót'n and Budhót'n, only consonant accents, acutes

* VOSloJ - "Visual OSloJ", preference for vowel accents, no acutes

* CySloJ, GreSloJ, ArSloJ, GruSloJ, RuSloJ, GlaSloJ - alternate scripts


Alphabets

Compared to GeSeL, certain letters have been modified:

* J has the sound of former Y, as in Esperanto and Slavic languages, and also acts as a palatalizer

* Q has the sound of former J in unvoiced affricate form, that is Č in most Slavic languages and usually CH in English, being reminiscent of Cyrillic Ч in lowercase

* W has the sound of former X in unvoiced form, to restore the historical Shin and to align with Cyrillic Ш

* X now has only the voiced sound, and looks like Cyrillic Ж

* Y is reserved for Schwa and Yers, like Bulgarian Ъ

Therefore there is no longer any representation for Qaf, which is flattened into KV, and Waw or Wynn, which is flattened into V. Thorn and Eth may not necessarily be transcribed as C.

QWX can be written ČŠŽ for increased familiarity, however Latin-1 has only ǧ². Collation order remains unaffected because this is a visual hack.
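The "visual hack" can be sketched as a display-only translation layered over the ASCII form: sorting happens on the underlying Q, W, X and the carons are applied only when printing. A minimal illustration with a hypothetical `show` helper; the sample words are respelled from forms used elsewhere in this spec.

```python
# Display-only mapping; the stored text stays plain ASCII (DOSloJ).
DISPLAY = str.maketrans("QWXqwx", "ČŠŽčšž")


def show(dosloj: str) -> str:
    """Render an ASCII DOSloJ string in mainline OSloJ spelling."""
    return dosloj.translate(DISPLAY)


words = ["znamenje", "xena", "qo", "west"]

# Collation uses the underlying ASCII, so Q sorts after P, W after V, X before Y.
for word in sorted(words):
    print(word, "->", show(word))
# qo -> čo
# west -> šest
# xena -> žena
# znamenje -> znamenje
```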

There are special semi-contraptions, mostly related to heavy iotation and palatalization in Slavic languages, with corresponding non-ASCII letters:

DZ, DX; DJ, LJ, NJ, RJ, TJ; JA, JE, JI, JO, JU, JY
Ʒ , Ǯ ; Ď , Ľ , Ň , Ř , Ť ; Ǎ , Ě , Ǐ , Ǒ , Ǔ , Y̌
Ѕ , Џ ; Ђ , Љ , Њ , Ҏ , Ћ ; Я , Е , І , Ё , Ю , Ӥ

AA, EE, II, OO, UU, YY; RR, LL;JAA,JEE,JII,JOO,JUU,JYY;RJJ,LJJ,RRJ,LLJ
Á , É , Í , Ó , Ú , Ý ; Ŕ , Ĺ ; Ǎ́ , Ě́ , Ǐ́ , Ǒ́ , Ǔ́ , Y̌́ ; Ř́ , Ľ́ , Ŕ̌ , Ĺ̌
А́ , Э́ , И́ , О́ , У́ , Ъ́ ; Р́ , Л́ ; Я́ , Е́ , І́ , Ё́ , Ю́ , Ӥ́ ; Ҏ́ , Љ́ , Р́Ь, Л́Ь

Iotated vowels have priority to align with Cyrillic. If limited to Windows Latin, place caron over consonants when caron can't be placed on top of vowel, which should be possible thanks to Pinyin. Ř is not supposed to be the West Slavic one (rjeka = rěka = řeka). Ľ is supposed to have a caron over it, if it looks like an acute or apostrophe, that font sucks, same goes for lowercase ľ,ď,ť. Alternative accents can be used as long as they don't conflict with acute and apostrophe, detailed later.
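The priority rule can be read as a two-pass rewrite: first fold J plus vowel into a caroned vowel, then fold whatever consonant plus J remains into a caroned consonant. The sketch below is my reading of the rule for lowercase VOSloJ only, not a reference converter; it leaves out JY and the DZ, DX digraphs, and the function name is made up.

```python
import re

IOTATED = {"a": "ǎ", "e": "ě", "i": "ǐ", "o": "ǒ", "u": "ǔ"}
PALATAL = {"d": "ď", "l": "ľ", "n": "ň", "r": "ř", "t": "ť"}


def to_vosloj(word: str) -> str:
    """Convert a lowercase digraph spelling to caron letters, vowels first."""
    # Pass 1: J + vowel becomes an iotated (caroned) vowel.
    word = re.sub(r"j([aeiou])", lambda m: IOTATED[m.group(1)], word)
    # Pass 2: any remaining consonant + J becomes a caroned consonant.
    word = re.sub(r"([dlnrt])j", lambda m: PALATAL[m.group(1)], word)
    return word


for sample in ("rjeka", "pisenj", "morje", "moj"):
    print(sample, "->", to_vosloj(sample))
```

With this ordering rjeka comes out as rěka rather than řeka, which is the behaviour the paragraph asks for, while a J that is not followed by a vowel (pisenj, moj) either palatalizes the preceding consonant or stays as it is.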

Y̌, Ǎ́ thru Y̌́, Ř́, and Ľ́ are theoretical. Schwa is not usually iotated, and there is no orthography level that uses both acutes like BuSloJ and vowel carons like VOSloJ (BuVOSloJ?). Vocalic, let alone long, Ř and Ľ are purely fictitious in the context of this language, but may be useful in BuSloJ. Besides, their support is quite lackluster, as they require combining accents. Consolas under Win11 in the Blogger editing window does support them quite well, though.

Certain Slavic languages make use of the -W class of diphthongs AU, EU, IU, OU. These are flattened into AV, EV, IV, OV (avto, evro). They aren't diphthongs across morphological borders. OU is often replaced with OJ (ženoj) or something else.

Due to a convenient legacy feature, H doubles for both H and KH, and G doubles for both G and GH. Certain Slavic languages have changed G into H, which OSloJ doesn't. This contributes to a Yugoslav and Polish feel at the expense of a Czech and Ukrainian feel. The suggested compromise pronunciation of G would be GH, which is found in Arabic and maybe French and is a sonorant.

Glottal stop (') isn't used phonemically; it acts more like punctuation. R and L can be vocalic. In some Slavic languages, M is supposedly vocalic in the numerals 7 and 8, but no one actually pronounces them like that. Vowel length is not marked, like in South Slavic, but marks do exist for onomatopoeia and BuSloJ. All scripts shall use the acute; if that is unsupported somehow, double the letter. Stress is always on the 1st syllable, so there is no need to mark it.

Alternative scripts shall respect the phonetic value and may assign special characters to the semi-contraptions above. There are 26 mandatory and 13 optional letters for a total of 39 in VOSloJ. BuSloJ uses only 7 of those 13, for a subtotal of 33. With the 8 acutes it's a grand total of 47, as in AK-47. In BuSloJ, which doesn't use the caroned vowels, the total is just 41. Don't mind those 10 caron-acute letters; they are mostly for conversions from Neogetmanic and Gudhót'n, and would bring the grand total up to 57.

A B C D E F G H I J K L M N O P Č R S T U V Š Ž Y Z Ʒ Ǯ Ď Ľ Ň Ř Ť Ǎ Ě Ǐ Ǒ Ǔ Y̌
А Б Ц Д Э Ф Г Х И Й К Л М Н О П Ч Р С Т У В Ш Ж Ъ З Ѕ Џ Ђ Љ Њ Ҏ Ћ Я Е І Ё Ю Ӥ
Α Β Θ Δ Ε Φ Γ Χ Ι Ϳ Κ Λ Μ Ν Ο Π   Ρ Σ Τ Ω       Υ Ζ Ξ Ψ             Η
TODO Armenian
TODO Georgian
ᚨ ᛒ ᚳ ᛞ ᛖ ᚠ ᚷ ᚺ ᛁ ᛇ ᚴ ᛚ ᛗ ᚾ ᛟ ᛈ ᛃ ᚱ ᛋ ᛏ ᚢ ᚹ ᛲ ᛪ ᚣ ᛎ
Ⰰ Ⰱ Ⱌ Ⰴ Ⰵ Ⱇ Ⰳ Ⱈ Ⰹ Ⰻ Ⰽ Ⰾ Ⰿ Ⱀ Ⱁ Ⱂ Ⱍ Ⱃ Ⱄ Ⱅ Ⱆ Ⰲ Ⱋ Ⰶ Ⱏ Ⰸ Ⰷ Ⱟ Ⰼ         Ⱑ     Ⱖ Ⱓ

Note that in Cyrillic, J is written Й only after a vowel, since for other positions there is a special palatalized or iotated letter. Ď and Ť can be pronounced like soft Ǯ and Č. If the palatalized consonants are unsupported, Ь can be used after consonants (ДЬ, ЛЬ, НЬ, РЬ, ТЬ). If the iotated vowels are unsupported, Serbian Ј can be prepended (ЈА, Е, ЈИ, ЈО, ЈУ, ЈЪ), should Й appear too ugly. To save you from typographical horrors like "її", І doesn't need the dots.

Greek is provided for writing before Glagolitic. Someone in the 10th century stole a Kerch amphora from Harun (ΓΟΡΟΥΝΑ), or repurposed one after Harun received the contents, and changed his declined name in Greek letters to mustard (ГОРОУХЩА) in Proto-Cyrillic.

Runic is provided as a method to perform some pagan "črti i rězi" before Slavs were co-opted into Christianity and had to modify Latin or Greek, or watch what Georgians and Armenians were doing. There is evidence Western Slavs near the Thaya river were trying to learn Futhark. There may also be something going on with the Alekanovo inscriptions, but that's probably proto-writing like Vinča/Tărtăria. A true writing system is something that spreads, and all alphabets, abjads and abugidas come from Proto-Sinaitic, or were at least inspired by it, depending on the level of creativity involved. Runic doesn't define the optional letters (yet). Germanic languages are not iotation-heavy.


Morphology


OSloJ is back from transfixive to concatenative.


Nouns

There are still genders, but at least it's not utter chaos like Neogetmanic. Ideally, each case would have the same ending in all declensions.


Neuter and Indeterminate

  town     sea      sign

1 sjelo    morje    znamenje
2 sjela    morja    znamenje
3 sjelu    morju    znamenje
4 sjelo    morje    znamenje
5 sjelo    morje    znamenje
6 sjelu    morju    znamenje
7 sjelom   morjom   znamenjem

1 sjela    morje    znamenje
2 sjel     morji    znamenje
3 sjelam   morjim   znamenjem
4 sjela    morje    znamenje
5 sjela    morje    znamenje
6 sjelah   morjah   znamenjah
7 sjelama  morjama  znamenjema

Words like kurje (chicken) decline like morje in singular, but sjela in plural. The -et- infix seen in many Slavic languages is removed. Indeterminate nouns use the pattern morje.
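Since the language is concatenative, a paradigm boils down to a stem plus a list of endings. A minimal sketch for the sjelo pattern, with the endings read off the table above; the function and constant names are made up, and I only claim it reproduces this one column.

```python
# Endings for cases 1..7 of the sjelo pattern, taken from the table above.
SINGULAR = ["o", "a", "u", "o", "o", "u", "om"]
PLURAL = ["a", "", "am", "a", "a", "ah", "ama"]


def decline(stem: str):
    """Return (singular forms, plural forms) for cases 1..7."""
    return ([stem + ending for ending in SINGULAR],
            [stem + ending for ending in PLURAL])


singular, plural = decline("sjel")
for case, (sg, pl) in enumerate(zip(singular, plural), start=1):
    print(case, sg, pl)
```

For a word like kurje one would pair the morje singular endings with this plural list, as described above.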

Feminine

  woman   rose    song      bone

1 žena    ruža    pisenj    kost
2 ženi    ruže    pisnji    kostji
3 ženje   ruži    pisnji    kostji
4 ženu    ružu    pisenj    kost
5 ženo    ružo    pisnji    kostji
6 ženje   ruži    pisnji    kostji
7 ženoj   ružoj   pisnjoj   kostjoj

1 ženi    ruže    pisnje    kostji
2 žen     ruži    pisnji    kostji
3 ženam   ružim   pisnjim   kostjim
4 ženi    ruže    pisnje    kostji
5 ženi    ruže    pisnje    kostji
6 ženah   ružih   pisnjih   kostjih
7 ženama  ružema  pisnjema  kostjima

Pisenj and kost could be merged, were it not for the fact that pisenj drops the E because it's really a yer. I have to resist the urge to make the instrumental plural of kost kostma to maintain simplicity.


Masculine

Is divided into animate and inanimate. Basically the only difference is that animate has the accusative same as genitive, and inanimate same as nominative. There is also a zombie masculine sudca where all 3 cases are the same. Predseda is a transgender Feminine-to-Masculine.


  sir     castle   man     machine   donkey  chairman    judge

1 pan     grad     muž     stroj     osel    predseda    sudca
2 pana    grada    muža    stroja    osla    predsedi    sudca
3 panu    gradu    mužu    stroju    osloj   predsedoj   sudci
4 pana    grad     muža    stroj     osla    predsedu    sudca
5 pane    grade    muži    stroji    osle    predsedo    sudca
6 panu    gradu    muži    stroji    osloj   predsedoj   sudci
7 panom   gradom   mužom   strojom   oslom   predsedoj   sudcom

1 panji   gradi    muži    stroje    osloj   predsedoj   sudci
2 panu    gradu    mužu    stroju    osloj   predsedu    sudcu
3 panom   gradom   mužom   strojom   oslom   predsedom   sudcom
4 pani    gradi    muže    stroje    osli    predsedi    sudce
5 panji   gradi    muži    stroje    osloj   predsedoj   sudci
6 panoh   gradoh   mužoh   strojoh   osloh   predsedoh   sudcoh
7 panama  gradama  mužema  strojema  oslama  predsedama  sudcema


As you can see, OSloJ means "to the donkey" or "donkeys". More loosely, "dobrodošli v osloj" means "welcome to jackass". The osel pattern contains alternatives to pan, and drops an E, just like the pisenj pattern over on the feminine side. Again I have to resist the urge to make the instrumental plural of stroj strojma, to maintain simplicity, and leave that to Neogetmanic.

OSloJ also leans quite heavily towards the -om ending. This is to facilitate meditation - kvjetOMMM lotosovOMMM. Atheism doesn't exclude nontheist religions, and that's why the 2nd largest religion in the Atheism capital of Europe, Czechia, is Buddhism (or maybe Jediism if enough people make fun of the census).


Adjectives

They decline, grade, and have gender. Not all forms are distinct; in particular the plural is identical for all genders, which suggests declension is a decorative façade. The 2nd grade consists only of the infix -š-, and the 3rd grade is just the naj- prefix. Not nearly as horrible as verbs.

            young                        younger

1 mladoj    mlada    mlade    mladši     mladši   mladši
2 mladego   mlade    mladego  mladšego   mladše   mladšego
3 mlademu*  mlade    mlademu  mladšemu*  mladše   mladšemu
4 mladego*  mladu    mlade    mladšego*  mladšu   mladše
5 mladoj    mlada    mlade    mladši     mladši   mladše
6 mladom    mlade    mladom   mladšom    mladše   mladšom
7 mladom    mladoj   mladom   mladšom    mladšoj  mladšom

1           mladi                        mladši     
2           mladih                       mladših
3           mladim                       mladšim
4           mladi                        mladši
5           mladi                        mladši
6           mladih                       mladših
7           mladima                      mladšima

* mlad(š)oj for animate dative (mladoj osloj) and inanimate accusative
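Grading can be sketched as two string operations on the stem: the -š- infix for the 2nd grade and the naj- prefix on top of it for the 3rd. The helper below is hypothetical, covers only the masculine nominative singular of hard stems like mlad-, and assumes the 3rd grade reuses the 2nd-grade form as the rule suggests.

```python
def grades(stem: str):
    """Masculine nominative singular in the three grades of a hard-stem adjective."""
    positive = stem + "oj"          # mladoj
    comparative = stem + "š" + "i"  # infix -š- plus the 2nd-grade ending
    superlative = "naj" + comparative
    return positive, comparative, superlative


print(grades("mlad"))
# ('mladoj', 'mladši', 'najmladši')
```

Soft adjectives such as jarni clearly need a different infix (jarnjejši), so this does not generalize without a second ending table.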


         spring                   *springer

1 jarni  jarni  jarni  jarnjejši  jarnjejši  jarnjejši
2
3
4
5
6
7

1
2
3
4
5
6
7


Not all adjectives necessarily grade; for some it doesn't make logical sense. Possessive adjectives, however, never grade.

1 otca  otca  otca   matčin  matčina  matčino
2
3
4
5
6
7

1
2
3
4
5
6
7



Pronouns


ja  mi
ti  vi
on  oni
ona
ono
one


"One" can be used either for 4th person, transgender, or a gender unknown at the time of writing. The declension of agreeing sentence elements is the same as for neuter.


1 ja  ti  on  ona  ono  one  mi  vi  oni
2
3
4
5
6
7


moj   naš
tvoj  vaš
jeho  jih
jeji
ho
jejo


1 moj  tvoj  jeho  jeji  ho  jejo  naš  vaš  jih
2
3
4
5
6
7


Interrogative pronouns are used as a help for determining the correct case. There is also a special reflexive "self" pronoun.

  who    what   which           which one      whose    self

1 kto    čo     kakoj  kaki     kteroj  kteri  či       -
2 kogo   čego                                           sa
3 komu   čomu                                           si
4 kogo   čo                                             sa
5  -      -                                             -
6 o kom  o čom                                          sě
7 kim    čim                                            soj

Interrogative pronouns for adverbs: kterak, kak, kama.



Numbers

To save brain resources, numerals do not decline. Also no mixed endian.

jedan/raz dva tri štiri pjatj šest sedem vosem devjatj deset

weak LE: jedanadeset dvanadeset trinadeset štirinadeset pjatjnadeset šestnadeset sedemnadeset vosemnadeset devjatjnadeset dvadeset/desetnadeset

BE: deset jedan ... deset devjatj

BE: dva deset, dva deset jedan, ..., dva deset devjatj, tri deset, ..., devjatj deset

LE: jedanadvadeset, dvaadvadeset, ...

BE: sto, dve sto, deset sto, deset jedan sto, devjatj deset devjatj sto

LE: devjatjadevjatjdeset a devjatjadevjatjdeset sto

tjiseč, miriada, lah, milijon, kror, mirijon, milijarda
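The big-endian (BE) pattern for 1 to 99 is regular enough to generate. A minimal sketch that follows the BE examples above (deset jedan, dva deset jedan); the function name is made up, and the LE forms and everything from sto upward are left out.

```python
UNITS = ["", "jedan", "dva", "tri", "štiri", "pjatj",
         "šest", "sedem", "vosem", "devjatj"]


def big_endian(n: int) -> str:
    """Spell 1..99 with the tens first and the units last."""
    if not 1 <= n <= 99:
        raise ValueError("only 1..99 are covered by this sketch")
    tens, units = divmod(n, 10)
    parts = []
    if tens == 1:
        parts.append("deset")               # 10..19: plain "deset"
    elif tens > 1:
        parts += [UNITS[tens], "deset"]     # 20..99: multiplier, then "deset"
    if units:
        parts.append(UNITS[units])
    return " ".join(parts)


for n in (7, 10, 11, 19, 20, 21, 90, 99):
    print(n, big_endian(n))
```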


Long system is preferred, but between milliard (billion in short system or ten myrions in myriad-based long system) and quintilliard (decillion in short system), SI units are preferable.

deka, hekto, kilo, mirija, mega, giga, tera, peta, eksa, zeta, jota, rona, kveta

deci, centi, mili, mirijo, mikro, nano, piko, femto, ato, zepto, jokto, ronto, kvekto

Binary prefixes:

16    128   1024  16384 1048576 2^30 2^40  2^50  2^60 2^70  2^80  2^90  2^100
debi, hebi, kibi, mirbi, mebi, gibi, tebi, pebi, ebi, zebi, jobi, robi, kvebi

We still need 2 more prefixes to reach 2^128. With SI as it currently stands, the memory limit of 128-bit addressing is 340282366 kvetabajtu (3,403e38 bytes) or 268435456 kvebibajtu.
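The 128-bit figures are easy to recheck with Python's big integers. The exponent table simply restates the binary prefix row above; the dictionary name is mine.

```python
# Binary prefixes from the table above, as powers of two.
BINARY_PREFIX_EXPONENT = {
    "debi": 4, "hebi": 7, "kibi": 10, "mirbi": 14, "mebi": 20,
    "gibi": 30, "tebi": 40, "pebi": 50, "ebi": 60, "zebi": 70,
    "jobi": 80, "robi": 90, "kvebi": 100,
}

LIMIT = 2 ** 128    # bytes addressable with 128-bit addresses
KVETA = 10 ** 30    # largest decimal SI prefix listed above
KVEBI = 2 ** BINARY_PREFIX_EXPONENT["kvebi"]

print(LIMIT // KVETA)   # 340282366 (whole kvetabajtu)
print(LIMIT // KVEBI)   # 268435456 (exactly 2^28 kvebibajtu)
print(f"{LIMIT:.3e}")   # 3.403e+38
```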

Above SI units, scientific notation is preferable.

jedan e sto (gugol)


(fractions TODO)


Verbs

There are like 5 verb conjugation classes in Slavic languages, each with multiple patterns. The tenses are slightly more regular. Aorist and imperfect are imported from Serbo-Croatian and Lusatian. I have missed these in Czech ever since I got a firm grasp of the English tense system, and they are also gender-invariant.

Person, Number, Tense, Indicative/Conditional/Imperative, Active/Passive, Aspect

Actually, Czech has been becoming quite analytical with its verbs; there are many auxiliary verbs for some of the more obscure tenses.

1
2
3

1
2
3



Adverbs

Adverbs only grade.

-nje  -njej  nej-njej


Prepositions

There are some nonsyllabic prepositions that phonetically glue onto the following word. They all have a syllabic variant with a preceding I.


k o s u v z

ik is iv iz

 

Syllabic prepositions tend to eat the stress of the following word. This may be the reason why it's considered bad typographic practice to end lines with prepositions. Furthermore, the lack of unstressed syllables at the beginning is bad for writing iambic pentameters, but iamb sucks, dactyl-trochee rules. Same goes for GeSeL.


Conjunctions


NOT  ne

AND  a i

OR   či

XOR  abo

IMP  diž - tak

PMI  diž

EQV  prted


Conditional clauses can be made with ne - či, like !a || b, and equivalence clauses with ne-či-a-ne-druge-či-prve, like (!a || b) && (!b || a). This is more convenient than Hilbert system.
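The two constructions can be checked against a truth table. A small sketch, with function names borrowed from the conjunctions above (the mapping of `ne` to NOT and `či` to OR is from the list; the rest is my own scaffolding):

```python
from itertools import product


def ne(a: bool) -> bool:
    """NOT"""
    return not a


def ci(a: bool, b: bool) -> bool:
    """či = OR"""
    return a or b


def imp(a: bool, b: bool) -> bool:
    """ne a či b, i.e. !a || b"""
    return ci(ne(a), b)


def eqv(a: bool, b: bool) -> bool:
    """(!a || b) && (!b || a)"""
    return imp(a, b) and imp(b, a)


for a, b in product((False, True), repeat=2):
    print(a, b, imp(a, b), eqv(a, b))
```

The implication is false only for a true and b false, and the equivalence is true exactly when both sides agree.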


Particles


-li - if

za - for, yea

jo - yeah

da - yes

ne - no

njet - not, ain't

proti - against, nay

kurva - universal swear word

pjerdol - another universal swear word


Most fucking language learning courses tryna be so fucking posh and don't teach how to fucking swear, but if you fucking learn the fucking swear words among the first fucking words, you can fucking achieve real fucking spoken language comprehension of up to 50 fucking % straight fucking away, especially if you fucking travel to fucking Poland.

There's 12x fucking in a 57-word paragraph, that makes 21 % of words alone. Posh bourgeoisie may complain about poor vocabulary, but you have just started learning, and if you later go full Oxford debate on the NPC normies all around there, they wouldn't understand you anyway. More often than not, natives aren't C1 or C2 level in any language.


Syntax


For legacy reasons as well as my personal preference, OSloJ is still SOV. This also enables Reverse Polish Notation, assuming the human brain has a decent stack size. Questions are VSO, commands are SVO.


Osel čevapi jedol.

Jedol osel čevapi?

Osle, jez čevapi!



Vocabulary


To conserve brain capacity, OSloJ is somewhat oligosynthetic in addition to being flective. There are only genuine Slavic roots, which just so happen to be oddly similar to Sanskrit.


Single letter words

These are: a, ˇa, e, ě, i, ˇi, k, o, ˇo, č, s, u, ˇu, v, š, y, z.

a, e, i, o, u, y - letter names

ˇa, ě, ˇi, ˇo, ˇu - pronouns

k, o, s, u, v, z - prepositions

a, i - conjunctions

ě - verb

ˇo - particle

a, e, o, y, č, š - interjections


Months

There are 2 systems, a Slavic and a Latin one. The Slavic one has been specifically mixed up so that no misunderstanding can take place amongst Baltoslavs still using them. There are 4 endings: en, anj, enj, ik. The Latin one is mostly taken from Slovak, with some more Latin forms to utilize the special declension (marta, julia, augusta), and December with some Russian flavor due to Dekabrists.

However, in practice months are referred to with their ordinal numbers in the nominative. Not to be confused with numbers from the Latinate system.


Ordinal       Slavic      Latinate   English    Slavic Sources
prvi          leden       januar     January    Czech, Polabian
drugi         gromničnik  februar    February   Kashubian
treti         sakavik     mars       March      Belarussian
štvrti        duben       april      April      Czech
pjati         svibanj     maj        May        Croatian
šesti         červenj     jun        June       Czech, Ukrainian, Belarussian
sedmi         mjodovnik   julius     July       Kashubian
vosmi         žnivenj     augustus   August     Belarussian
devjati       vrješenj    september  September  Polish, Ukrainian, Belarussian
desati        paždžernik  oktober    October    Polish
jedanadesati  nazimnik    november   November   Old Sorbian, Polabian
dvanadesati   snježanj    dekaber    December   Belarussian

 

Statistics: 5x Belarussian, 3x Czech, 2x Polabian, 2x Kashubian, 2x Ukrainian, 2x Polish, 1x Old Sorbian, 1x Croatian

Praise be to Belarussian, but the worst situation was with November. Also see (in Czech): https://getmania.blogspot.com/2022/05/slovansky-kalendarni-pizdec.html


Days of Week

Compared to Interslavic, this is again more Slovak. Muslims may call Friday džuma. Pagan god days are also available, in Western, Eastern, and Germanic variants. Since OSloJ isn't intended to compete with Russian, and Germans are guilty of exterminating Polabians and Pomeranians, the Western set is preferred.

pondjelok, utorok, srjeda, štvrtok, pjatok, šabat, nedjelja

devanok, radgoštok, velesok, perunok, ladok, moranok, dažbožok

hortok, svarožok, velesok, perunok, mokošek, stribožok, dažbožok

mesjacok, tirok, odinok, torok, frejok, vsjitok, slunek


Budhót'n words

Only formal forms, only Slavic etymology. About 5000 expected.


Interslavic words

Language purism mode. Barevna, njet kolorovana ptica.


GeSeL-V (Slavitic) words

Eternally pending.