Solving Terminology Bloat via a Semantic Web and LLM-based Chat Bot Symbiosis

The computer industry has long been plagued by a problem that seems to grow alongside its innovations: the emergence and evolution of terminology. With each new technological advancement, a slew of terms is born, often leading to more confusion than clarity. This confusion is typically the result of marketing-driven territorial battles, where different entities push their own terms to gain an edge.

The concept of a Semantic Web was supposed to offer a solution to this problem. It aimed to clearly define terms from the outset, making them easily searchable and referenceable courtesy of HTTP-based hyperlinks. However, the road to a Semantic Web has been anything but smooth, with numerous obstacles hindering its progress.

Solution Part 1: LLM-based Natural Language Processors

Today, we’re witnessing the rise of Large Language Model (LLM)-powered natural language processors, such as Chat Bots, which are the key to unlocking the full potential of a Semantic Web. These advanced tools can process natural language prompts and provide clear, concise answers, helping to disentangle today’s web of confusing terminology.

To illustrate this, I’ve conducted an experiment using a simple concept scheme (Defined Term Set) and presented the same prompt to various LLM-based Chat Bot interfaces, including Microsoft's CoPilot, OpenLink’s Personal Assistant (OPAL), and OpenAI’s native ChatGPT interface.

Prompt-based Experiment.

The following prompts are designed to explore what terms such as Smart Agent, Assistant, and CoPilot denote:

What’s the difference between a Smart Agent, Assistant, and a CoPilot?
How would you represent this in a taxonomy?
Can you express the taxonomy in JSON-LD as a Defined Term Set using terms from Schema.org?
Finally, enhance this by incorporating prose-based descriptions of each term, along with language tag values associated with names and descriptions.
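For readers unfamiliar with the target format of the last two prompts, here is a minimal Python sketch of what a Schema.org Defined Term Set with language-tagged names and descriptions might look like in JSON-LD. It is not any Chat Bot's actual output. The base IRI, the set name, and the one-line descriptions are illustrative placeholders, and the helper names lang_value and defined_term are hypothetical.

```python
import json

# Hypothetical base IRI for the term set; replace with a real, dereferenceable one.
BASE = "https://example.org/terms/ai-helpers"


def lang_value(text, lang="en"):
    """Wrap a string as a JSON-LD language-tagged value."""
    return {"@value": text, "@language": lang}


def defined_term(code, name, description):
    """Build one schema:DefinedTerm that points back to its term set."""
    return {
        "@type": "DefinedTerm",
        "@id": f"{BASE}#{code}",
        "termCode": code,
        "name": lang_value(name),
        "description": lang_value(description),
        "inDefinedTermSet": {"@id": BASE},
    }


# Placeholder descriptions for illustration only, not taken from the experiment.
term_set = {
    "@context": "https://schema.org",
    "@type": "DefinedTermSet",
    "@id": BASE,
    "name": lang_value("AI Helper Terminology"),
    "hasDefinedTerm": [
        defined_term("SmartAgent", "Smart Agent",
                     "Placeholder description of a Smart Agent."),
        defined_term("Assistant", "Assistant",
                     "Placeholder description of an Assistant."),
        defined_term("CoPilot", "CoPilot",
                     "Placeholder description of a CoPilot."),
    ],
}

# Serialize to the JSON-LD text that would be published or uploaded.
print(json.dumps(term_set, indent=2))
```

Giving each term its own "@id" under a shared base is what makes the definitions individually referenceable by hyperlink.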

Response-based Results.

Each Chat Bot provided its own unique take on the prompts, showcasing the capabilities of LLMs in clarifying terminology. The responses were as follows:

Microsoft CoPilot UI for ChatGPT

Response in plain English

Basic Taxonomy Representation

JSON-LD based Taxonomy Representation

OpenLink Personal Assistant (OPAL) UI for ChatGPT using GPT 4.0 Turbo

Response in plain English

Basic Taxonomy Representation

JSON-LD based Taxonomy Representation

OpenAI Native UI for ChatGPT using GPT 4.0

Note: this UI doesn’t currently allow sharing links at the level of a specific prompt and response, so the same link is used across each item below.

Response in plain English

Basic Taxonomy Representation

JSON-LD based Taxonomy Representation

Unfortunately, other LLMs such as Claude 3 and Mistral do not currently offer a way to share prompt and response combinations via a hyperlink, hence their exclusion from this exercise.

Solution Part 2: Contributing to a Semantic Web

After producing the terminology definitions, we can proceed with uploading representations based on JSON-LD for each term to a designated, publicly accessible data space on the Web, thereby progressively enriching a burgeoning Semantic Web.

The task can be efficiently accomplished using specific tools designed for ease of use and effectiveness:

The OpenLink Structured Data Sniffer (OSDS) browser extension, which facilitates JSON-LD discovery, extraction, and upload, streamlining the process of making your definitions accessible.

Identifying the API endpoint for JSON-LD upload to a Knowledge Graph hosted by the target Virtuoso instance

A publicly accessible Virtuoso DBMS instance, such as the live URIBurner Service, which serves as an ideal upload destination. This platform provides robust support for managing and sharing structured data expressed in a variety of formats.

Automatically generated HTML-based Exploration Page, following successful upload. Click on the following link to view the live version of what's depicted: https://tinyurl.com/4yfer2jx

Defined Term Set description page provided by the Virtuoso instance. Click on the following link to view the live version of what's depicted: https://tinyurl.com/264y55h2
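OSDS performs the upload through its own UI, but the same step could in principle be scripted. The sketch below only prepares an HTTP POST carrying a JSON-LD document and does not send it. UPLOAD_URL is a hypothetical placeholder, not a documented Virtuoso or URIBurner endpoint, and the helper name build_upload_request is an assumption too. The actual endpoint, HTTP method, and authentication requirements depend on the target instance.

```python
import json
import urllib.request

# Hypothetical placeholder; NOT a documented Virtuoso or URIBurner endpoint.
UPLOAD_URL = "https://example.org/upload"


def build_upload_request(doc, url=UPLOAD_URL):
    """Prepare (but do not send) a POST request with a JSON-LD body."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # application/ld+json is the registered media type for JSON-LD.
        headers={"Content-Type": "application/ld+json"},
    )


# A tiny stand-in document; in practice this would be the full Defined Term Set.
doc = {
    "@context": "https://schema.org",
    "@type": "DefinedTermSet",
    "@id": "https://example.org/terms/ai-helpers",
}

request = build_upload_request(doc)
print(request.get_method(), request.full_url)

# To actually send it, replace UPLOAD_URL with the endpoint identified for the
# target instance, add any credentials it requires, and then call:
#     urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the payload and headers before anything is written to a public data space.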

Conclusion

This exercise showcases how a Semantic Web and Chat Bot symbiosis provides an intelligent and productive way to resolve ambiguities and inconsistencies in terminology, so that new innovations can emerge without being held back by confusing terminology conflicts.

Tools Used

Microsoft's CoPilot UI
OpenAI's Native UI for ChatGPT
OpenLink Personal Assistant (OPAL) UI for ChatGPT
OpenLink Structured Data Sniffer (OSDS) Browser Extension – for Chrome, Firefox, and Safari
OpenLink Virtuoso
URIBurner SPARQL Query Service Endpoint