When we talk about relational databases, we mean tabular databases whose standard query language is SQL, which is why they are also called SQL database systems. Their defining trait is that they are built on a system of related tables following the Relational Model, which the English computer scientist Edgar F. Codd proposed in 1969 and described in his 1970 paper “A Relational Model of Data for Large Shared Data Banks”.
In Codd's model, data is organized in one or more tables linked by relationships and divided into columns and rows, with a “key” that uniquely identifies each row.
Typically, each table (or relation) represents an “entity type” (such as customer or product). Rows represent instances of that entity, while columns represent the attributes whose values describe each instance. Today, this type of database is still the most widely used solution in business scenarios, even if it is not always the most suitable.
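The table, row, column and key vocabulary above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer table, its columns and the sample values are assumptions made up for the example, not taken from any real schema.

```python
import sqlite3

# In-memory database; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"  # the key that uniquely identifies each row
    " name TEXT NOT NULL,"      # an attribute of the entity
    " city TEXT)"               # another attribute
)
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Alice", "Turin"), (2, "Bob", "Milan")],
)

# Each row is one instance of the entity type; each column is one attribute.
customer_rows = conn.execute(
    "SELECT id, name, city FROM customer ORDER BY id"
).fetchall()
print(customer_rows)

# Inserting a second row with an existing key violates uniqueness.
try:
    conn.execute(
        "INSERT INTO customer (id, name, city) VALUES (1, 'Carol', 'Rome')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

The second insert is rejected because the key column must stay unique, which is what lets a row be referenced unambiguously from other tables.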
Starting in the 2000s (the 3 V model of “Big Data”, introduced by analyst Douglas Laney, dates back to 2001), the increasing complexity and volume of data flows created the need for alternative storage tools.
In fact, in a tabular database, expressing a relationship between entities requires adding new columns (foreign keys) or new tables (join tables). Queries that surface relationships across millions of records must then join those tables, which can become cumbersome and computationally expensive.
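To make this concrete, here is a minimal sketch, again with sqlite3, of how a relationship between people ends up as an extra table, and how following it for two steps takes one join per step. The person and follows tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- The relationship itself needs its own table of key pairs.
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES person(id),
    followed_id INTEGER NOT NULL REFERENCES person(id),
    PRIMARY KEY (follower_id, followed_id)
);

INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol'), (4, 'Dave');
INSERT INTO follows VALUES (1, 2), (2, 3), (3, 4);
""")

# "Who is followed by the people Alice follows?" needs one join per hop,
# plus a final join to turn the resulting id back into a name.
two_hops = conn.execute(
    """
    SELECT p.name
    FROM follows AS f1
    JOIN follows AS f2 ON f2.follower_id = f1.followed_id
    JOIN person  AS p  ON p.id = f2.followed_id
    WHERE f1.follower_id = ?
    """,
    (1,),
).fetchall()
print(two_hops)
```

Every additional hop adds another self-join on follows, which is where deep relationship queries start to become awkward to write and costly to run on large tables.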
Non-relational databases (for example NoSQL databases and XML databases) spread in those years as a way to reconstruct and manage more quickly the connections among entities stored in data lakes that had, by then, grown massively.
Graph databases, sometimes also called semantic databases, have thus spread to areas characterized by a large and rapid influx of data, such as fraud detection, social media analysis, supply chains, search engine optimization and the Internet of Things.
But how do they work, and what makes them so effective in these areas?
Graph databases owe their name to the mathematical term “graph”, which denotes a set of nodes (or vertices) connected by relations (or edges). In a graph database, each node also carries information (properties).
What distinguishes them is that data is mapped onto graphs made of points (the nodes) and lines (the edges). Because relationships are stored explicitly rather than derived at query time, the model is more flexible and can surface existing relationships through faster queries.
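As a rough illustration of nodes, properties and edges, the sketch below models a tiny graph with plain Python dataclasses. The Node and Edge classes, the labels and the sample data are all assumptions invented for this example and do not correspond to any vendor's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                       # e.g. "Person" or "Company"
    properties: dict = field(default_factory=dict)   # information held by the node


@dataclass
class Edge:
    source: str                                      # id of the starting node
    target: str                                      # id of the ending node
    kind: str                                        # type of relationship
    properties: dict = field(default_factory=dict)


nodes = {
    "alice": Node("Person", {"name": "Alice"}),
    "bob": Node("Person", {"name": "Bob"}),
    "acme": Node("Company", {"name": "Acme"}),
}
edges = [
    Edge("alice", "bob", "KNOWS", {"since": 2019}),
    Edge("alice", "acme", "WORKS_AT"),
    Edge("bob", "acme", "WORKS_AT"),
]


def neighbours(node_id, kind=None):
    """Return ids of nodes reached by outgoing edges, optionally filtered by kind.

    This scans the whole edge list for simplicity; it only shows the shape of
    the data, not how a real graph store would index it.
    """
    return [
        e.target
        for e in edges
        if e.source == node_id and (kind is None or e.kind == kind)
    ]


print(neighbours("alice"))
print(neighbours("alice", kind="WORKS_AT"))
```

Adding a new kind of relationship here means appending an edge with a new kind value; no table or column has to be created first.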
While query speed in a relational database depends on the number of tables to be joined and on the amount of data in each queried table, in a graph database it depends mainly on the number of relationships actually traversed by the query, not on the total volume of data in the database.
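The idea can be sketched with a breadth-first traversal over an adjacency list, counting how many edges are examined. This is a simplified model that assumes each node keeps direct references to its neighbours; actual performance depends on the specific engine and its storage layout. The function name reachable_within and the sample data are hypothetical.

```python
from collections import deque


def reachable_within(adjacency, start, max_hops):
    """Breadth-first traversal up to max_hops from start.

    Returns the set of visited nodes and the number of edges examined.
    """
    visited = {start}
    edges_examined = 0
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for neighbour in adjacency.get(node, ()):
            edges_examined += 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return visited, edges_examined


# A small graph, and a much larger one containing the same neighbourhood.
small = {"alice": ["bob"], "bob": ["carol"], "carol": ["dave"]}
big = dict(small)
for i in range(100_000):
    big[f"user{i}"] = [f"user{i + 1}"]  # unrelated nodes and edges

small_result = reachable_within(small, "alice", 2)
big_result = reachable_within(big, "alice", 2)
print(small_result[1], big_result[1])
```

Both calls examine the same two edges: the 100,000 unrelated nodes in the larger graph are never touched, because the traversal only follows relationships that start from nodes it has already reached.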
On the other hand, relational databases are still very useful in several respects.
First of all, because they present data in tabular form, they are easy to read and understand, while graphs may not be within everyone’s reach. Moreover, relational databases share a unified, standardized query language, SQL, on which a wide range of applications can be developed and run. Graph databases, by contrast, typically use a different language for each vendor: for example, Neo4j uses Cypher, TigerGraph uses GSQL and ArangoDB uses AQL.
In conclusion, graph databases are likely to spread more and more in companies, given their high performance on relationship-heavy queries, but they are unlikely to replace relational databases completely, since these still retain their practical usefulness in many areas.