Arabic ChatGPT vs. English ChatGPT

by Amira Saadeh - May 18, 2023
Reading time: 3 min

When comparing English ChatGPT and Arabic ChatGPT, it is important to consider the complexities of the Arabic language and the challenges it presents for AI.

ChatGPT may not be suitable for certain texts such as legal documents, medical reports, scientific studies, and literary works.
Arabic poses specific challenges for AI translation, including the difficulty of tokenization due to diacritics and inflections.

I’m not going to sit here and pretend like Arabic, specifically Classical Arabic, isn’t complicated. It is beautiful, especially the poetry, but difficult to learn and navigate. I mean, despite it being my mother tongue, I struggle with it, A LOT. But theoretically, an AI should not have such struggles. It can recommend you a thorough investment plan but isn’t adequately trained in the fifth most-spoken language globally behind Mandarin, Spanish, English, and Hindi? Arabic ChatGPT is not on the same level as the English one.

Lost in Translation?

Translation apps out there are not the greatest; we can agree on that much. They tend to translate word for word rather than by meaning. ChatGPT is an elite AI tool, so you would expect it to have prowess in that area. In a recent article, ChatGPT for Arabic-English Translation: Evaluating, the author pointed out that the bot lacks training in domain-specific terminology and cultural context. They compared its outputs to professional translations of various text genres. They acknowledge that OpenAI’s ChatGPT has merit as a translator, largely due to its proficiency in managing complex and uncommon language combinations, performing simultaneous translation for time-critical tasks, and its capacity to learn from user feedback and enhance translation quality. They concluded, however, that although ChatGPT generally provides accurate translations, its limitations make it unsuitable for some texts:

Legal documents
Medical reports
Scientific studies
Literary works

Arabic and Its Challenges

When I asked Arabic ChatGPT “Who is Crown Prince Mohammad bin Salman?” in Arabic, it took 1 minute and 41 seconds to generate a 63-word paragraph (15 seconds of which were spent “thinking”). But when I asked that same question in English, it took less than 10 seconds to get a 76-word response. Looks like I’m not the only one who struggles with the language. The paper found that the AI struggled on several fronts.

AI relies on something called tokenization to break down a string of text or speech into identifiable units. Think of your child dividing words into syllables to learn how to read them. Same concept, different “species.”
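To make the idea concrete, here is a minimal Python sketch that splits the same short sentence into units at three granularities: words, characters, and raw UTF-8 bytes. It is only an illustration. The helper name token_views and the sample sentences are my own assumptions, and the tokenizers behind real chatbots typically use learned subword vocabularies, which this sketch does not reproduce.

```python
def token_views(text: str) -> dict:
    """Split text at three granularities: words, characters, UTF-8 bytes."""
    return {
        "words": text.split(),
        "characters": list(text),
        "utf8_bytes": list(text.encode("utf-8")),
    }


# Hypothetical sample sentences chosen for illustration only.
english = "he wrote"
arabic = "\u0647\u0648 \u0643\u062A\u0628"  # "huwa kataba", written without diacritics

for label, sample in (("English", english), ("Arabic", arabic)):
    views = token_views(sample)
    # Show how many units each view produces for the same sentence.
    print(label, {name: len(units) for name, units in views.items()})
```

Both sentences have two words, but the Arabic one is shorter in characters and longer in bytes, because each Arabic letter takes two bytes in UTF-8. How much that matters depends on the tokenizer in use.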

Diacritics

Turns out the small marks above or below Arabic letters (e.g., fatḥah, ḍammah, and kasrah) are called diacritics, and they make tokenizing written text more difficult. You might think they are insignificant, but they signal the short vowel sounds, and they make a world of difference. It’s the difference between Adam having written (كَتَبَ /kataba/) and Adam having been written (كُتِبَ /kutiba/).
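The kataba/kutiba pair can be inspected directly. The sketch below, using only Python's standard unicodedata module, shows that the two words share the same three base letters and differ only in the combining marks attached to them. The function name strip_diacritics is a hypothetical helper of mine, and dropping every mark of Unicode category Mn is a simplifying assumption, not a description of how ChatGPT handles Arabic.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), which include fatḥah, ḍammah, and kasrah."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# kaf + fatḥah, ta + fatḥah, ba + fatḥah  ->  kataba ("he wrote")
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"
# kaf + ḍammah, ta + kasrah, ba + fatḥah  ->  kutiba ("it was written")
KUTIBA = "\u0643\u064F\u062A\u0650\u0628\u064E"

# With the marks, the two words are different strings.
print(KATABA == KUTIBA)

# Without them, both collapse to the same three letters.
print(strip_diacritics(KATABA) == strip_diacritics(KUTIBA))

# List every code point in kutiba to see letters and marks interleaved.
for ch in KUTIBA:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Each word is six code points long, three letters and three marks, so a system that reads text with diacritics sees twice as many symbols, while one that ignores them can no longer tell the two verbs apart.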

Inflections

The Arabic language is highly inflected. An inflection is a change in the form of a word that expresses a grammatical function or attribute such as tense, mood, or number. Think of how the plural of “chicken” in English is “chickens.” In Arabic, however, it gets complicated, very complicated, and that, again, affects tokenization. A simple example: say you bought 2 chickens. In Arabic, “2 chickens” is a single word, made of the base word for “chicken” plus a suffix for “2,” and that suffix changes depending on where the word falls grammatically.
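The chicken example can be sketched as a toy rule in Python. The function below is a hypothetical helper, not a real morphological analyzer: it assumes a regular noun, swaps a final ta marbuta for a plain ta, and appends one of two dual suffixes depending on grammatical case. It ignores diacritics, irregular nouns, and the other ways the dual changes in a sentence.

```python
TA_MARBUTA = "\u0629"  # the feminine ending on the word for "chicken"
TA = "\u062A"

# Dual suffixes by grammatical case (simplified to two forms).
DUAL_SUFFIX = {
    "nominative": "\u0627\u0646",  # -an, e.g. when the word is the subject
    "oblique": "\u064A\u0646",     # -ayn, e.g. when the word is the object
}


def dual(noun: str, case: str) -> str:
    """Form the dual of a regular noun for the given case (toy rule)."""
    stem = noun[:-1] + TA if noun.endswith(TA_MARBUTA) else noun
    return stem + DUAL_SUFFIX[case]


CHICKEN = "\u062F\u062C\u0627\u062C\u0629"  # dajajah, "chicken"

for case in DUAL_SUFFIX:
    # One base word, two different surface forms for "2 chickens".
    print(case, dual(CHICKEN, case))
```

The point is that a single dictionary entry fans out into several written forms, so a tokenizer either needs to have seen each form or has to break the word into smaller pieces.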

Final Thought

I get it. I do. The language is difficult. But is that reason enough to take about 313 million Arabic speakers out of the discourse? Leave them behind? Arabic ChatGPT needs to be on par with the English one.
