Linguistic Advancements: A Paradigm Shift in Natural Language Understanding
Chapter 1: A New Era in Linguistics
Recently we have witnessed a significant advancement in linguistics that streamlines Natural Language Understanding (NLU). Such breakthroughs can appear unexpectedly, yet this one provides robust support for a model rooted in the Role and Reference Grammar (RRG) framework.
Today's AI systems predominantly rely on statistical data manipulation rather than being grounded in linguistic or cognitive theories. In contrast, the RRG framework offers a principled theoretical model. By examining a variety of human languages and positing that human brains utilize similar constructs for meaning representation, we can uncover parallels between languages and cognitive functions. The pivotal insight here is the identification of consistent semantic categories across different languages, contributing to the development of a universal semantic dictionary that forms the basis of a super knowledge graph (SKG).
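To make this concrete, here is a minimal sketch, with entirely hypothetical names and structures rather than PAT's actual implementation, of how one universal semantic dictionary entry might anchor language-specific signs to a single language-independent concept in an SKG:

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A language-independent node in a super knowledge graph (SKG)."""
    concept_id: str                 # stable identifier for the meaning
    semantic_category: str          # e.g. "event", "thing", "property"
    signs: dict = field(default_factory=dict)  # language -> surface signs

# One meaning, many language-specific signs.
destroy = Concept(
    concept_id="CAUSE_RUIN",
    semantic_category="event",
    signs={
        "en": ["destroy", "destruction"],
        "de": ["zerstören", "Zerstörung"],
        "pt": ["destruir", "destruição"],
    },
)

print(destroy.signs["pt"])  # ['destruir', 'destruição']
```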
To illustrate the necessary requirements for establishing NLU, I will provide an overview of a segment from the forthcoming "Role and Reference Grammar Handbook." As a science communicator, my role is to elucidate these concepts, while my function as a scientist and engineer is to implement them. The RRG Handbook delves into these matters comprehensively, which accounts for its extensive detail.
Section 1.1: Understanding RRG's Evolution
RRG has evolved since the 1980s, and while some may find the material challenging, I believe that an NLU specialist lacking familiarity with RRG is akin to a rocket scientist unaware of gravitational theory. A global consortium of scholars has assessed and validated numerous languages against the RRG framework, including linguistically diverse tongues such as the Native American language Lakota. An 83-page bibliography of RRG-related papers, organized by language, is available for exploration.
This exploration extends beyond theoretical frameworks for future NLU systems. My company, Pat Inc. (PAT), has been applying this approach for several years to enhance sentence recognition, facilitate conversational exchanges, and generate multilingual outputs based on meaning. This application demonstrates that the theory is not merely academic.
Section 1.2: The Complexity of Language
Modern linguistics encompasses specific, well-articulated concepts, and an initial encounter with RRG may resemble navigating a 1960s NASA technical specification for rocket science—unless one possesses an expert vocabulary in linguistics. To assist those who may not, I will elucidate some of these concepts herein.
I anticipate that future developers of NLU systems will begin their journey by consulting RRG resources. Leveraging the vast linguistic expertise provided by RRG is a logical step forward, promising more meaningful interactions with emerging IT systems founded on centralized meaning.
Chapter 2: The Essence of Natural Language Understanding
The goal of NLU is embedded in its name—understanding. Human language represents an intricate system of precision and nuance; a single word altered in a sentence can convey drastically different meanings, yet native speakers can effortlessly discern the variations.
Reflecting on my past experiences with multilingual students while integrating content for Portuguese, German, and Spanish alongside English examples, I recall a Portuguese sentence where words intricately fused gender and location. After several hours of effort, I presented the outcomes to my students with enthusiasm. One student, Maria, immediately identified an error when the sentence was shown, suggesting a correction that demonstrated her deep understanding.
This scenario exemplifies the complexity of language and the remarkable proficiency with which humans navigate it. Emulating this capability allows for improved human interaction with language, especially given that the fundamental principles are increasingly understood, albeit not yet widely utilized.
In a recent discussion, Gary Marcus and Elliott Murphy highlighted some deficiencies in current AI systems, including (a) the disconnect between words, sentences, and their real-world references (a problem of semiotic theory, in my terms), (b) the absence of cognitive models that maintain a "persistent and dynamic understanding of the world," which could be realized as a lossless super knowledge graph, and (c) issues surrounding compositionality, that is, how complex expressions are interpreted from their parts.
Section 2.1: The Role of Theoretical Frameworks
Theoretical frameworks enable us to comprehend and model phenomena. Without understanding, accurate modeling becomes impossible. Current AI systems lack genuine comprehension and often utilize models that do not align with the intricacies of human language. For clarity, consider how to model the following sentences:
- Who was the fish eaten by?
- I thoroughly enjoyed Collins' book, but found Grisham's dull.
Before diving into parsing these sentences, let's explore the RRG model. The following diagram illustrates the RRG structure for a sentence, featuring pre-core slots (PrCS) and pre-detached phrases. A predicate (PRED) licenses its arguments, and one of them may occupy the PrCS (only one exists per clause), which places it in narrow focus.
If this model holds true, it directly impacts the two example sentences. Let's highlight the PrCS usage; a small structural sketch follows the examples:
- Who was the fish eaten by? ("Who" occupies the PrCS; the in-core equivalent is "The fish was eaten by whom?")
- I thoroughly enjoyed Collins' book, but found Grisham's dull. (In-core, on my interpretation: "I really enjoyed Collins' book" and "I found Grisham's book boring.")
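As a rough structural sketch (my own simplified encoding in Python, not the handbook's notation), the first sentence can be represented with the displaced wh-word in the PrCS and the remaining material in the core:

```python
# A simplified rendering of the RRG layered clause for:
#   "Who was the fish eaten by?"
# The wh-word leaves the core and occupies the single pre-core slot
# (PrCS), receiving narrow focus; the in-core equivalent is
#   "The fish was eaten by whom?"
clause = {
    "pre_core_slot": "who",            # at most one PrCS per clause
    "core": {
        "nucleus": {"predicate": "eat", "voice": "passive"},
        "arguments": ["the fish"],     # the other argument stays in the core
    },
}

def narrow_focus(c):
    """The PrCS element, if present, carries the clause's narrow focus."""
    return c.get("pre_core_slot")

print(narrow_focus(clause))  # who
```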
The experimental rule-based approaches of the past (1960s-1990s) proved inadequate for this kind of linguistic complexity, leading experts to deem such models NP-hard for NLU, an unfortunate verdict for anyone attempting to build effective NLU systems. A shift towards a more robust theoretical model is essential, and RRG offers that foundation.
Section 2.2: Innovations in Dictionary Design
The dictionary played a pivotal role even in the earliest AI systems. Today's focus is on how RRG enhances the semiotic model, demonstrating how signs and their phrases are language-specific while meaning can remain language-independent. This paradigm shift moves the traditional dictionary categories off the interpretant (the meaning) and attaches them to the sign instead.
What should a dictionary encompass? Should it include definitions in the source language, encyclopedic knowledge, word forms, or part-of-speech classifications?
As a cognitive science student, I often found it perplexing that parsing and word classifications were fragmented, leading to redundancy. For instance, the word "open" serves as both a verb and an adjective. Consider these sentences:
- The door was wide open (adjective—indicating a state of being).
- The doors open swiftly (verb—denoting action).
Isn't the underlying concept essentially the same? The word's identical pronunciation in both uses points the same way.
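A minimal sketch of that intuition, under an invented representation: both uses of "open" share one stored meaning, and the category difference is recorded on the sign rather than duplicated in the meaning:

```python
# One stored meaning for "open"; the signs differ only in category.
OPEN_MEANING = {"predicate": "open", "gloss": "not shut / become not shut"}

signs = [
    {"form": "open", "category": "adjective", "use": "state",
     "meaning": OPEN_MEANING},
    {"form": "open", "category": "verb", "use": "action",
     "meaning": OPEN_MEANING},
]

# Both categories resolve to the identical meaning object.
assert signs[0]["meaning"] is signs[1]["meaning"]
```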
In the case of "destroy," the term signifies causing ruin, and its noun form is "destruction." Depending on context, "destruction" can refer either to the material result or to the event itself, as in:
- "The city's destruction was cleared away by trucks."
- "The city's destruction took two hours."
Utilizing a unified meaning for various forms of "destroy" can resolve queries succinctly.
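As a toy disambiguator (my own assumptions, not PAT's method), both sentences can share the single "destroy" predicate while context selects the event reading or the result reading:

```python
# "The city's destruction" shares one underlying predicate, destroy(x, city).
# Context then selects between two readings of the nominal:
#   - event reading:  the destroying itself ("took two hours")
#   - result reading: the rubble left behind ("cleared away by trucks")
DESTRUCTION = {"predicate": "destroy", "undergoer": "city"}

def reading(context_verb: str) -> str:
    temporal = {"take", "last", "begin"}    # verbs that time an event
    material = {"clear", "haul", "remove"}  # verbs acting on material stuff
    if context_verb in temporal:
        return "event"
    if context_verb in material:
        return "result"
    return "ambiguous"

print(reading("take"))   # event  -> "took two hours"
print(reading("clear"))  # result -> "cleared away by trucks"
```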
The Breakthrough in Linguistic Theory
Scientific breakthroughs occur when new theories efficiently explain observations. Ockham's Razor posits that theories should avoid unnecessary concepts while providing sufficient clarity for accurate predictions of new observations. For human language, a theory must elucidate commonalities across the world's diverse linguistic landscapes.
Employing semiotic terminology, a sign represents what we express in a language, while its meaning (the interpretant) is what we interpret it to signify.
What then constitutes a dictionary? Where are additional features cataloged? The distinction between "destroy" and "destroys" lies in subject-verb agreement in English. The third-person singular present tense employs "destroys" in "he/she/it destroys," while all other forms utilize "destroy" in "I/you/we/they destroy." To maintain simplicity in the model, this differentiator is treated as part of the sign.
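A toy encoding of that decision: the inflectional differentiator lives on the sign, so every surface form resolves to one untouched meaning (the two participle forms mentioned later are included as well):

```python
# Inflection is a property of the sign, not of the meaning.
# All four English forms point to one shared meaning entry.
MEANING = "CAUSE_RUIN"  # shared, language-independent meaning id

signs = {
    "destroy":    {"meaning": MEANING, "agreement": "base / non-3sg present"},
    "destroys":   {"meaning": MEANING, "agreement": "3sg present"},
    "destroying": {"meaning": MEANING, "agreement": "progressive participle"},
    "destroyed":  {"meaning": MEANING, "agreement": "past / past participle"},
}

# Every surface form resolves to the same meaning.
assert len({s["meaning"] for s in signs.values()}) == 1
```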
If languages share a common representation due to the brain's recognition of concepts similarly, we can facilitate translation by converting the source language into its meaning, subsequently locating phrases and vocabulary in the target language to generate the intended meaning.
This industry application allows us to operate machines based on language-independent meanings, generating interactions in the required target language—eliminating the need for ad-hoc translations and yielding cost savings for industry.
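A minimal sketch of that pipeline, with invented lookup tables standing in for real analysis and generation components:

```python
# Translate by pivoting through language-independent meaning:
#   source sign -> meaning -> target sign.
TO_MEANING = {("en", "destroy"): "CAUSE_RUIN", ("en", "city"): "CITY"}
FROM_MEANING = {("CAUSE_RUIN", "de"): "zerstören", ("CITY", "de"): "Stadt"}

def translate(word: str, src: str, tgt: str) -> str:
    meaning = TO_MEANING[(src, word)]    # source-specific analysis
    return FROM_MEANING[(meaning, tgt)]  # target-specific generation

print(translate("destroy", "en", "de"))  # zerstören
```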
Section 2.3: Visualizing Natural Language Understanding
To illustrate how machines comprehend language, I often demonstrate an interactive conversation. It's important to clarify that delivering the correct response does not imply understanding; conversely, providing an incorrect answer signifies a lack of comprehension.
In our previous discussion about "destroy" as a verb, the following illustration breaks down the semantic representation. The predicate in the nucleus is "destroy(ed)," and with the additional argument "the Vandals," it is evident they are the agents causing the city's ruin.
The sentence utilizing the noun form "destruction" retains the same meaning. Current displays require an extra click to unveil the embedded form—something we aim to eliminate in the JSON interface for consumers of meaning.
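For the running example, "The Vandals destroyed the city," a consumer of meaning might receive a payload shaped something like this (a hypothetical JSON layout, not PAT's actual interface), with the embedded form already flattened so no extra click is needed:

```python
import json

# Hypothetical JSON payload for a consumer of meaning:
# the nucleus predicate plus its semantic arguments, fully expanded.
meaning = {
    "nucleus": {"predicate": "destroy", "tense": "past"},
    "arguments": [
        {"role": "actor",     "referent": "the Vandals"},
        {"role": "undergoer", "referent": "the city"},
    ],
}
print(json.dumps(meaning, indent=2))
```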
With the introductory overview complete, we can delve into the handbook for explanations regarding the English adjective category and how other languages approach it differently while permitting shared meanings.
Section 2.4: Lexical and Syntactic Categories in Context
To understand the fundamentals of language operations, I encourage you to explore my introductory articles on Medium, accompanied by relevant videos. In those discussions, I introduce the RRG layered model in detail and compare English with the Australian Dyirbal language and Abkhaz from Georgia.
The prior section highlighted pre-core slots, post-core slots, and both pre- and post-detached phrases. The following illustration shows how the RRG layered model accommodates these additional core slots and detached phrases.
The nucleus of a sentence may not always be a verb. Traditional teachings suggest that a verb phrase heads a sentence. However, RRG presents a superior model, which is the focal point of this discussion. The nucleus forms the foundation around which a sentence is constructed, and it can be identified in various forms, including reference phrases (RP), modifier phrases (MP), or even prepositional phrases that resolve to appropriate semantic categories.
This model diverges from the endocentric requirement that every phrase in a sentence have a head. For instance, "a good lawyer" centers on "lawyer," "extremely tall" centers on "tall," and "in the house" is governed by its predicate, "in." Yet RRG does not mandate a head for every unit, one mark of a principled theory.
Observing the features of these English sentences reveals that the predicator can be a noun, adjective, or preposition—not solely a verb or auxiliary as often taught. This understanding allows for a generalized model, which is the essence of today's discussion—the breakthrough that necessitates analysis of other languages to draw comprehensive conclusions.
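A sketch in the same toy encoding: the nucleus slot accepts any predicator, so a clause whose predicating element is the preposition "in" (as in "The book is in the house," my example) patterns exactly like a verbal one:

```python
# The nucleus need not be a verb. In "The book is in the house,"
# the predicating element is the preposition "in"; "is" is an auxiliary.
clauses = [
    {"nucleus": {"predicator": "eat", "category": "verb"},
     "arguments": ["the child", "the apple"]},
    {"nucleus": {"predicator": "in", "category": "preposition"},
     "arguments": ["the book", "the house"]},
]

for c in clauses:
    n = c["nucleus"]
    print(f'{n["predicator"]!r} heads the clause as a {n["category"]}')
```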
Chapter 3: The Implications of Language-Specific and Universal Features
Throughout my work, I have grappled with "common sense" knowledge, as illustrated in this example. When adding terms to a dictionary, "car" and "table" are straightforward. However, the terms "destruction" and "dancer" present complexities. Is "dancer" simply a verb in disguise?
In the case of "destruction," it embodies an event where the arguments represent either a destroyer or the thing being destroyed, functioning similarly to a predicate before it acts as a noun. "Dancer" exhibits comparable attributes. If the conversation is restricted to: "He is a dancer," one can immediately respond to another inquiry: "What does he do?" The answer is, "He dances." Thus, the predicate "dance" is integrated within the meaning of "dancer."
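RRG captures such meanings with logical structures. As a rough sketch in simplified notation (my own encoding of the idea), "dancer" can carry the activity structure do'(x, [dance'(x)]), which is exactly why "He is a dancer" licenses the answer "He dances":

```python
# "dancer" packages the activity predicate dance' inside a referring
# expression. Storing the logical structure with the noun lets the system
# answer "What does he do?" directly from "He is a dancer."
LEXICON = {
    "dancer": {"category": "noun",
               "logical_structure": "do'(x, [dance'(x)])"},
    "dance":  {"category": "verb",
               "logical_structure": "do'(x, [dance'(x)])"},
}

def what_does_he_do(noun: str) -> str:
    ls = LEXICON[noun]["logical_structure"]
    # Pull the inner predicate out of the activity logical structure.
    inner = ls.split("[")[1].split("'")[0]  # -> "dance"
    return f"He {inner}s."

print(what_does_he_do("dancer"))  # He dances.
```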
Not all human languages deploy nouns and verbs the way English does. Here the semiotic model becomes relevant again: the sign can assume a category (noun or verb, as seen in "destruction" and "destroy," respectively), while both retain the same interpretant, the accomplishment of causing ruin.
This clarification can be addressed in the lexicon if done correctly. The distinction lies in that the meaning does not solely associate the lexical item with a noun or verb; rather, it is the semiotic sign—the word itself.
In practice, decisions arise regarding whether to attribute features to the word form, as in inflected forms (like "destroy," "destroys," "destroying," and "destroyed") or the meaning (such as its predicate class, where "say" has phrase properties that define its allowable arguments).
The Conclusion: RRG's Breakthrough Insights
RRG identifies that certain languages, like English, maintain a distinct category for adjectives, whereas others, like Lakota, use subclasses of verbs, and languages such as Dyirbal extend the noun class for the same purpose. Each language makes this work through "morphophonological" or "morphosyntactic" means, that is, changes in sound pronunciation or alterations in syntax, respectively.
"Lexical categories are language-specific but with a universal semantic foundation."
Why is this a breakthrough for NLU? For a dictionary to serve all global languages, it should be grounded in the shared building blocks (universal semantic foundation) that reside in the meaning layer—predicates and referents—independent of any specific language.
In the sign layer, language-specific nuances exist. Nouns and verbs are proposed to be present in all languages, but additional predicates can manifest differently, as in English with adjectives or by extending the noun class in Dyirbal or the verb class in Lakota, along with other language-specific variations.
By separately storing language-independent information, such as in a super knowledge graph, we can align the semiotic distinctions. To construct phrases from signs, language-specific details must be provided (identifying whether the sign is a noun, verb, or another classification and determining the specific rendering needed in the target language).
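Putting the two layers together in one last sketch (invented structures again): the SKG holds the universal predicate, and each language contributes its own sign plus the category it uses to realize it:

```python
# Language-independent layer: one semantic building block in the SKG.
TALL = {"predicate": "tall", "semantic_type": "property"}

# Language-specific layer: how each language renders the same predicate.
# (Category assignments follow the discussion above; surface forms elided
# rather than invented.)
renderings = {
    "English": {"sign": "tall", "category": "adjective"},
    "Lakota":  {"sign": "...",  "category": "verb subclass"},
    "Dyirbal": {"sign": "...",  "category": "noun class"},
}

for lang, r in renderings.items():
    print(f'{lang}: renders {TALL["predicate"]!r} as a {r["category"]}')
```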
In summary, human languages comprise both a language-specific set of signs and phrases, as well as a language-independent layer built upon semantic building blocks.
Let's conclude with a discussion on "adpositional predicates" that can serve as arguments, as illustrated in the following example.
Summary of Key Insights
Today's exploration of the forthcoming RRG Handbook aimed to provide a glimpse into the advancements within the field of linguistics. RRG, in particular, continues to delve into the profound understanding of human language and, in my view, represents the only active linguistic model capable of propelling NLU forward today. It seeks to comprehend how languages function and centers its model around these observations. This focus has garnered popularity among speakers of many languages that defy traditional theoretical explanations.
From a scientific perspective, their initial inquiry is commendable: "What would a theory of language entail if we began by considering the workings of the world's diverse languages?" This encapsulates the essence of RRG today.
As a cognitive scientist with extensive experience in the relevant sciences for NLU development, I recognize that I would require a lifetime of expertise to match the true specialists in RRG linguistics. My approach is to strive for a deeper understanding of their work and apply it accordingly.
If you encounter aspects that seem flawed, challenge them scientifically. However, where RRG's findings surpass all others, continue to study their implications. The crucial message is to educate current students on our learnings to empower them in shaping the future.
By emphasizing RRG and associated linguistic studies, I envision a world within the next decade where our current systems evolve into next-generation models driven by human language, ensuring accuracy and enjoyment for all.