Ashraf Ras

Posted on May 3, 2024 • Edited on Dec 22, 2024

Technical Arabic

#ashras #arabic #techarabic

Technical Arabic is a specialized adaptation of the Arabic language, designed to complement rather than replace the original. This adaptation incorporates modifications that, based on extensive experience in programming and code writing, I have deemed essential. Each modification has been carefully considered and rigorously tested in practical applications.

No change was implemented without a clear and pressing need, thorough deliberation, and a cautious approach developed over years of experimentation and refinement. The goal has consistently been to introduce only minimal and lightweight adjustments. However, in some instances, more significant changes were unavoidable to address inherent challenges.

While Arabic is a language renowned for its literary and poetic richness, this very complexity makes it less suitable for technical applications, which demand simplicity, clarity, and adaptability. These modifications, though sometimes controversial, aim to bridge this gap, enabling Arabic to function effectively in technical and programming contexts without sacrificing its intrinsic beauty.

This article will outline all the modifications I have implemented, accompanied by the rationale and underlying motivations for each adjustment.

Non-Cursive Script

Non-cursive script is highly effective for technical applications, particularly in typography and machine use, where clarity and consistency are paramount. While cursive writing is well-suited for handwriting due to its smooth, continuous strokes that eliminate the need to lift the pen frequently, its practicality diminishes in technical contexts.

In typography, non-cursive script ensures that letters remain distinct and separate, meeting the requirements of a technical language. Cursive fonts, though visually appealing and appropriate for literary expression, lack the simplicity and functionality necessary for technical applications.

Although I believe that technology should adapt to the language rather than forcing the language to change, flexibility within the language significantly enhances its adaptability. An ideal language framework would support both styles: cursive for traditional handwriting and non-cursive for technical typography, as demonstrated by Latin-based languages.

The elegance of Arabic calligraphy is undeniably exceptional, but this aesthetic does not translate into practical value in technical settings. In programming and technical writing, the focus shifts from appreciating artistic beauty to prioritizing efficiency, simplicity, and clarity. Decorative elements often add unnecessary complexity, which hinders functionality in these contexts.

Adopting a non-cursive script resolves several challenges, the most significant of which is achieving a consistent letterform that remains unchanged regardless of its position in a word. This uniformity simplifies both the learning process and technical implementation. In a forthcoming article, I will explain how to adapt any font to a technical Arabic script, where each letter retains a single, clear shape while preserving readability. An example of this approach is provided below:

Plural and Dual Forms

Arabic plurals are highly diverse, with numerous forms that defy straightforward rule-setting. This complexity often forces learners to memorize both the singular and the plural form of each word. Without fluency in Arabic, moving between these forms can become a significant challenge.

In programming, the ability to automatically generate plural forms of nouns is often essential, particularly when working with database models. Achieving this functionality in Arabic requires considerable effort, and in some cases, the use of artificial intelligence. Such complexity feels disproportionate, given that pluralization is a fundamental linguistic feature.

For instance, if a non-native speaker seeks to learn Arabic quickly for programming purposes and asks for a set of pluralization rules to memorize, it would be nearly impossible to provide a concise, comprehensive answer. The rules are so numerous and varied that they cannot be easily distilled into a few simple guidelines. This complexity can result in individuals mastering programming concepts without ever fully grasping Arabic pluralization.

There is also an issue in Arabic with the non-human plural جمع غير العاقل, where feminine rules are often applied even if the plural is masculine, for example:

Singular: هذا صاروخ سريع
Plural: هذه صواريخ سريعة

Notice how the demonstrative changed from masculine هذا to feminine هذه, and the adjective from masculine سريع to feminine سريعة: the feminine singular rule was applied to the plural of a masculine noun. Linguists explain this by assuming an unwritten (hidden) word مجموعة ("group"), so that the underlying sentence is:

هذه مجموعة صواريخ سريعة

But why is this rule not applied to the human plural جمع العاقل such as:

Singular: هذا طفل كسول
Plural: هؤلاء أطفال كسالى

If the word مجموعة were assumed to be present here as well, the sentence would read:

هذه (مجموعة) أطفال كسولة

Yet that is not what we say. The human plural clearly agrees with أطفال rather than with مجموعة, so we say:

مجموعة أطفال كسالى

And we do not say:

مجموعة أطفال كسولة

In the same vein, we should have said:

مجموعة صواريخ سريع

This is because صواريخ is masculine, and the adjective should agree with صواريخ rather than with مجموعة, just as it did with أطفال.

In addition, the non-human plural جمع غير العاقل takes the feminine plural for many masculine words such as:

إجراء => إجراءات
مشروع => مشروعات
تعديل => تعديلات

There are many examples of this. So why does the feminine have only one fixed plural rule, namely ات, while the masculine has many plural rules? Can't you see that the Arabian masculine always loves multiplicity?

Furthermore, I looked into the plural in Hebrew, a sister language of Arabic, and found that it has a consistent rule for the masculine plural: adding a yā’ and a mīm يم. The consistency of a language's grammar reveals its evolution, and simplifying grammar is important in the age of speed and information.

We already have a ready-made rule that applies only to the human plural جمع العاقل, so why not also apply it to non-human plural جمع غير العاقل?

معلم => معلمين
صاروخ => صاروخين

Once you know that Technical Arabic drops the case endings الإعراب, so that every word ends in a sukūn as if it were jussive مجزومة (more details later), it follows that the plural always keeps the form ين and never changes to ون. For example:

رأيتْ المعلمينْ في القسمْ
دخلْ المعلمينْ إلى القسمْ

This is not unusual for Arabs, as most dialects have abandoned case marking الإعراب. In the Egyptian dialect, for example, we say:

المعلمين داخلين القسم
المهندسين عملو إضراب

I am not Egyptian, but I use it as an example because it is the most familiar to the Arab ear. The same rule applies to the Maghreb dialects I know, and to nearly all dialects: when we drop case marking الإعراب, the plural takes the form ين. Applied to the rocket example, this gives:

هاد الصاروخين سريعين

Note that the demonstrative pronouns هذا, هذه, and هؤلاء have been replaced with هاد to avoid odd expressions such as:

هؤلاء الصاروخين سريعين

It is better to adopt a single demonstrative pronoun that is sufficient for all nouns, whether singular or plural, masculine or feminine, and this is also practiced in many dialects.

The Dual Form

I lean towards replacing the dual form with the words زوج or إثنين, as in:

زوج صاروخين سريعين
إثنين صاروخين سريعين

However, for those who feel nostalgic about the dual form, we still have the "alif" and "noon" available to say:

صاروخان سريعان

In summary:

الألف والنون للمثنى => صاروخان سريعان
الياء والنون للجمع المذكر => صاروخين سريعين
الألف والتاء للجمع المؤنث => سيارات سريعات

And so we have solved the problem of pluralization with simple rules.
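The three rules in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `plural` and `dual` are hypothetical, a word is treated as feminine when it ends in tā’ marbūṭa (ة), and the feminine dual (ة becoming ت before ان) is borrowed from standard Arabic rather than stated in this article.

```python
TA_MARBUTA = "ة"  # assumption: a trailing ة marks a feminine word


def plural(word: str) -> str:
    """Pluralize with the fixed rules: ات for feminine, ين for masculine."""
    if word.endswith(TA_MARBUTA):
        # Feminine plural: drop the ة and append alif + tā’.
        return word[:-1] + "ات"
    # Masculine plural: always ين, never ون, since case endings are dropped.
    return word + "ين"


def dual(word: str) -> str:
    """Optional dual: append alif + nūn (ان)."""
    if word.endswith(TA_MARBUTA):
        # Assumption taken from standard Arabic: ة opens to ت before ان.
        return word[:-1] + "تان"
    return word + "ان"


if __name__ == "__main__":
    for noun in ("صاروخ", "معلم", "سيارة"):
        print(noun, "=>", plural(noun))
```

Because the rule is purely a matter of suffixes, a framework that derives table or collection names from model names could apply it without a dictionary of irregular plurals.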

Extra Letters

Consider the following letters: ث, ذ, ظ. What do they have in common?

What they have in common is that each has a counterpart in Arabic that resembles it in shape and pronunciation almost to the point of being identical: ت, د, and ض respectively. They are so similar that many Arabic dialects have abandoned the first set altogether.

Furthermore, if you take all the words that contain one of these letters and replace it with its similar counterpart, you will never end up with a word that conflicts with an existing word in the language. For example:

ذئب => دئب
ظل => ضل
ثعلب => تعلب
لحظة => لحضة

This means that these letters are merely redundant and do not play any significant role in the language. Since substituting them with their counterparts creates no conflicts between words, what is their function other than to create confusion? Throughout my years working with technical Arabic, I replaced these letters with their counterparts without any issues or conflicts arising. What is the benefit of keeping letters that are difficult for native Arabic speakers to pronounce, let alone non-Arabic speakers? I believe (though I do not assert) that the presence of such letters is merely a remnant of historical differences between Arabic dialects, where some pronounced the letter د as ذ, ت as ث, and ض as ظ, and these variations have persisted in the Arabic language as a whole.
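The substitution described above amounts to a fixed character mapping. A minimal Python sketch, where the names `EXTRA_LETTERS` and `drop_extra_letters` are hypothetical and chosen only for illustration:

```python
# Map each "extra" letter to the counterpart named in the text.
EXTRA_LETTERS = str.maketrans({
    "ث": "ت",
    "ذ": "د",
    "ظ": "ض",
})


def drop_extra_letters(text: str) -> str:
    """Replace ث, ذ, ظ with ت, د, ض and leave everything else untouched."""
    return text.translate(EXTRA_LETTERS)


if __name__ == "__main__":
    for word in ("ذئب", "ظل", "ثعلب", "لحظة"):
        print(word, "=>", drop_extra_letters(word))
```

The mapping is one-way: once the letters are merged, the original spelling cannot be recovered from the text alone.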

The Hamza

There is no disagreement that the hamza is a complex issue in the Arabic language; it is neither a letter nor a diacritic, and it does not have a single form. It can be written above various letters, which creates difficulties in typing. Moreover, the issue does not stop there; the hamza, like plural forms, has many rules, with differences among scholars. Consider the following example:

هيأة المحلفين
هيئة المحلفين

Which one is correct? Each spelling is supported by a rule in Arabic. هيأة is justified because the hamza carries a fatha while the yā’ before it carries a sukūn; the fatha is the stronger of the two, so the hamza is written on an alif. هيئة is justified because the hamza is preceded by a yā’, so it is written on the seat that corresponds to the yā’.

In general, the writing of the hamza depends on the vowel markings of the surrounding letters. Although we can establish rules for this (unlike plurals), it creates problems when we want to use Arabic as a language for science, mathematics, and programming.

I would overlook all of this, but what struck me the hardest is the hamza on the line; it is very small and can easily be missed when reading code. Sometimes, it can be mistaken for a dot or some mathematical symbols. It also gives the impression that the language before you is not technical at all and was not created to be that way. Consider the following example of code (written in non-technical Arabic):

دع إجراء = إجراء جديد(إجراء.إسم_مسار)؛
دع ء = 1؛
دع إجراءات = [إجراء]؛
إجراء = ء × إجراء

If you are a programmer, do not try to understand the meaning of the code above; it has none. I used the detached hamza in the code only to illustrate the problems it poses. If the font is not large and clear, it looks like a speck of dirt on your screen that you might mistakenly try to wipe away with your finger (and this has actually happened to me). In practice, we do not always have the luxury of changing fonts; sometimes you find yourself programming on primitive machines with a black screen and a default font, or in a black-and-white terminal.

So I started to think that if I found a way to eliminate the hamza, I would do away with many difficult rules, along with several letters that would no longer need to exist on the keyboard, namely أ, إ, ؤ, ء.

The challenges associated with the hamza in Arabic may largely arise from the cursive writing system characteristic of the language. We previously discussed and analyzed this issue. In a non-cursive script, the hamza would appear as an isolated mark without a definitive position. After careful examination, I found that its only stable placement is on the yā’ (ئـ), which serves as its natural position, similar to many other Arabic letters.

Arabic consists of a limited number of fundamental shapes, with letters changing primarily through the addition of dots above or below. This raises the question: why not choose a form that remains consistent regardless of its position within a word? I found no suitable alternatives to the yā’. If the hamza were to be placed above letters other than the yā’ (such as the alif or the wāw), it could result in visually unappealing outcomes (see the example below). Additionally, creating a new letterform for the hamza would necessitate introducing an entirely new shape into the Arabic script, a complex and challenging endeavor.

Our objective should be to utilize existing resources, prioritizing simplicity over complexity.

Now compare three spellings of the word إجراءات, each using a single hamza form throughout:

إجراأات
ؤجراؤات
ئجرائات

Which one seems more natural?

Assuming a non-cursive font, the hamza becomes an independent letter like all the others, following the same rules and keeping a fixed shape regardless of its position in the word. The previous code example then becomes:

دع ئجرائـ = ئجرائـ جديد(ئجرائـ.ئسم_مسار)؛
دع ئـ = 1؛
دع ئجرائات = [ئجرائـ]؛
ئجرائـ = ئـ × ئجرائـ

Note that I intentionally added an elongation stroke (tatweel) after the word-final hamza to keep it from turning into the shape ئ, a problem that does not arise in a non-cursive font.
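The rewrite from standard spelling to the single hamza form can be sketched as a character mapping, with an optional step that appends the tatweel when the text will be shown in a cursive font. This is a rough sketch rather than a complete tool: `unify_hamza` and its `cursive_font` flag are hypothetical names, only the four forms listed earlier (أ, إ, ؤ, ء) are mapped, and the alif with madda (آ) is left untouched because the article does not discuss it.

```python
import re

# The four hamza forms the article wants off the keyboard, all mapped to ئ.
HAMZA_FORMS = str.maketrans({"أ": "ئ", "إ": "ئ", "ؤ": "ئ", "ء": "ئ"})

TATWEEL = "\u0640"  # the elongation stroke ـ

# A ئ that is NOT followed by another Arabic letter (or a tatweel),
# i.e. one that a cursive font would draw in its final shape.
_FINAL_HAMZA = re.compile("ئ(?![\u0621-\u064A])")


def unify_hamza(text: str, cursive_font: bool = False) -> str:
    """Rewrite every hamza form as ئ; optionally protect word-final ones."""
    text = text.translate(HAMZA_FORMS)
    if cursive_font:
        # Append a tatweel so the letter keeps its initial/medial shape.
        text = _FINAL_HAMZA.sub("ئ" + TATWEEL, text)
    return text


if __name__ == "__main__":
    print(unify_hamza("إجراءات"))
    print(unify_hamza("إجراء.إسم_مسار", cursive_font=True))
```

The lookahead range U+0621 to U+064A includes the tatweel itself, so running the function twice does not stack a second stroke on text that is already protected.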

I firmly oppose altering the shape of a letter based on its position within a word, as this practice is neither standard nor technical. My objective is to achieve consistency, allowing for a single fixed representation of each letter:

Key press = One letter = One shape
زر مفتاح = حرف واحد = شكل واحد

Thus far, we have successfully implemented this approach with the previous modifications, with the exception of the alif. As you are aware, Arabic features both the long alif الألف الممدودة and the short alif الألف المقصورة.

I have yet to find a compelling rationale for retaining the short alif in the non-cursive script. Consequently, in technical Arabic, the alif should always be represented in its elongated form, irrespective of its position in a word. Examples:

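منى => منا
إلى => ئلا
إلا => ئلا ئو ئللا (تكرار ئختياري للحرف للتعبير عن الشدة، ئنضر ئسفله)

(The note in parentheses reads: optional repetition of the letter to express the shadda, see below.)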
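The alif rule is a single character replacement. A small Python sketch, assuming that every ى (U+0649) in the input really is a short alif and not a dotless final yā’, as some typing habits produce; the name `lengthen_alif` is hypothetical:

```python
ALIF_MAQSURA = "\u0649"  # ى, the short alif
ALIF = "\u0627"          # ا, the long alif


def lengthen_alif(text: str) -> str:
    """Write every short alif as a long alif, whatever its position."""
    return text.replace(ALIF_MAQSURA, ALIF)


if __name__ == "__main__":
    print(lengthen_alif("\u0645\u0646\u0649"))  # the name منى
```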

The Shadda

It is generally agreed that writing diacritics on letters in code is impractical, as it quickly leads to a visually appealing but unreadable script. Personally, I have not encountered any issues with omitting diacritics, as this is deeply rooted in the Arabic language and, in my opinion, cannot be changed. Attempting to modify this would result in a new language that bears no relation to Arabic.

The only challenge I have faced pertains to the shadda. In some cases, its inclusion is essential to distinguish between forms such as فَعَل, فِعل, and فَعّل. Consider the following example:

كائن.وجه()؛
كائن.وجه()؛

Since we do not write diacritics or the shadda in code, how can one differentiate between وَجه ("face") and وَجِّه (the imperative "direct" or "route")? This posed a dilemma for me, especially since I am firmly against including the shadda itself, as it could create ambiguities and confusion in code naming conventions. Anyone with programming experience would likely agree with this perspective. Thus, I have made the following decision regarding the shadda:

I propose repeating the consonant that carries the shadda, unless the verb's second and third root letters (its ain عين and lam لام) are identical, in which case the word retains its original form. For example, consider the following imperative verbs:

مستند.فععل()؛
مستند.وججه()؛
مستند.حدد()؛
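The shadda rule can be sketched as a conversion from a vocalized word to a bare identifier: diacritics are stripped, and a consonant that carried a shadda is written twice unless the next letter is already the same consonant. The function name `expand_shadda` is hypothetical, and checking "the next letter is identical" is my simplified stand-in for the ain/lam condition, which a real implementation would have to test on the verb's root.

```python
import re

SHADDA = "\u0651"

# One base character followed by any run of Arabic diacritics
# (U+064B to U+0652, which covers the short vowels, sukun and shadda).
_CLUSTER = re.compile("([^\u064B-\u0652])([\u064B-\u0652]*)")


def expand_shadda(word: str) -> str:
    """Strip diacritics, doubling any consonant that carried a shadda."""
    clusters = _CLUSTER.findall(word)
    out = []
    for i, (letter, marks) in enumerate(clusters):
        out.append(letter)
        if SHADDA in marks:
            next_letter = clusters[i + 1][0] if i + 1 < len(clusters) else ""
            # Exception: the following letter already repeats this one,
            # so the word keeps its original spelling.
            if next_letter != letter:
                out.append(letter)
    return "".join(out)


if __name__ == "__main__":
    # Escapes are used so the diacritics are unambiguous in source form.
    print(expand_shadda("\u0648\u064E\u062C\u0651\u0650\u0647"))  # imperative "direct"
    print(expand_shadda("\u0648\u064E\u062C\u0647"))              # noun "face"
```

With this conversion the two readings of وجه produce different identifiers, which is exactly the distinction the example with كائن was missing.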

In general, you have the flexibility to repeat the consonant to denote shadda in technical Arabic, particularly in imperative forms such as فَععِل. Consequently, it is also acceptable to write sentences in technical Arabic as follows:

صننف الئتحاد الددولي اللاعب اللي سججل الهدف كئفضل لاعب
صنف الئتحاد الدولي اللاعب اللي سجل الهدف كئفضل لاعب

Both forms are correct; repeating the consonant to express shadda is merely a choice aimed at enhancing the readability of the sentence. Based on my experience, I have found this necessary only in certain imperative verbs.

DEV Community