A collection of facts from which inferences can be made is called data. It is the basis from which factual information is derived, providing relevant results to end users. Data is the cornerstone of contemporary society and is crucial to many facets of people’s lives. In order to gain knowledge and make wise decisions, […]
Generative AI and Semantic Compliance
Only GPT and its peers know how many statements have been made based on results from generative AI. But there are loads of them. My background as a data modeler over many years makes me shiver a little bit, because what the friendly AI helpers help us produce is subjected to cognitive processes, where we, the readers, process […]
Should You Consider a Unified Data Model?
A unified data model allows businesses to make better-informed decisions. How? By providing organizations with a more comprehensive view of the data sources they’re using, which makes it easier to understand their customers’ experiences. A singular, interrelated network that’s connected to one source of truth gives organizations a more efficient, accurate, and comprehensive analysis of […]
Modeling Modern Knowledge Graphs
In the buzzing world of data architectures, one term seems to unite some previously contending buzzy paradigms. That term is “knowledge graphs.” In this post, we will dive into the scope of knowledge graphs, which is maturing as we speak. First, let us look back. “Knowledge graph” is not a new term; see for yourself […]
Connecting the Three Spheres of Data Management to Unlock Value
Many organizations have mapped out the systems and applications of their data landscape. Many have documented their most critical business processes. Many have modeled their data domains and key attributes. But only very few have succeeded in connecting the knowledge of these three efforts. The remainder of this point of view will explain why connecting […]
2023: Mitigating Data Debt by Knowing or by Guessing?
One of the newer data buzzwords is “data debt.” Actually, it is approximately 10 years old, and it has been popular ever since agile people realized that postponing things creates not only technical debt, but certainly also data debt. Will we, in 2023, be better at not creating so much data debt, and will it be […]
It’s All About Relations!
The new ISO 39075 Graph Query Language Standard is to hit the data streets in late 2023 (?). Then what? If graph databases are standardized pretty soon, what will happen to SQL? It will very likely stay around for a long time. Not simply because legacy SQL has tremendous inertia, but because relational database paradigms […]
The Data Engineer’s Roadmap
Data engineering is a fascinating and fulfilling career – you are at the helm of every business operation that requires data, and as long as users generate data, businesses will always need data engineers. In other words, job security is guaranteed. But, with such great power comes great responsibility. The journey to becoming a successful data engineer […]
A Primer to Optimizing Your Apache Cassandra Compaction Strategy
When setting up an Apache Cassandra table schema and anticipating how you’ll use the table, it’s a best practice to simultaneously formulate a thoughtful compaction strategy. While a Cassandra table’s compaction strategy can be adjusted after its creation, doing so invites costly cluster performance penalties because Cassandra will need to rewrite all of that table’s data. Taking […]
Say Hello to Graph Normal Form (GNF)
You thought you knew all normal forms? (And possibly also some abnormal …) Well, think again: There is also “graph normal form (GNF).” The diagram below is a fifth normal form graph concept model, which is just a few steps from GNF, so hang on. Where GNF comes from: GNF is based on serious mathematics, […]