PostgresBuild 2021 Session Highlights: Dr. Michael Stonebraker on Maximizing Database Application Performance


Among everything we do and talk about in our space, one thing is crystal clear: all of the technology and methodologies that exist serve one purpose, which is to support distinct businesses. Industries include e-commerce, banking, insurance, manufacturing, telecommunications, and everything in between. This leads us to my main focus points in this article.

Building for business application success

One of the pinnacles of the IT industry is to create applications: applications for business workers, end users, and other applications. Apart from the obvious requirement that the application has to function correctly, two major challenges reign in the space:

  1. The speed required to add new features
  2. The speed at which applications work

These are age-old application development challenges that Dr. Michael Stonebraker addressed, much to my enthusiasm, in his closing keynote at PostgresBuild 2021. Let us explore this in a little more detail.

Dr. Stonebraker remarked that “[a] cursor interface [to your database] is insanely expensive.” Using a database like Postgres as a storage mechanism that simply stores and retrieves data makes no sense. You get no benefit from all the intelligence that has been built into Postgres to help optimize application performance, and you add “insanely expensive” overhead to your application.
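To make the cost of row-at-a-time access a little more tangible, here is a minimal sketch. It is my own illustration, not something shown in the keynote. It assumes Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs anywhere, and the `orders` table and function names are hypothetical. The count of rows handed to the application is only a rough proxy for the messages that would cross the network between a real database server and an application server.

```python
import sqlite3


def build_demo_db(n_rows=1000):
    """Create an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
        [(i % 10, i) for i in range(n_rows)],
    )
    conn.commit()
    return conn


def totals_row_by_row(conn):
    """Pull every row through a cursor and aggregate in the application."""
    totals = {}
    rows_transferred = 0
    for customer_id, amount in conn.execute(
        "SELECT customer_id, amount FROM orders"
    ):
        rows_transferred += 1
        totals[customer_id] = totals.get(customer_id, 0) + amount
    return totals, rows_transferred


def totals_in_database(conn):
    """Let the database aggregate and hand back one row per customer."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    ).fetchall()
    return dict(rows), len(rows)
```

Both functions produce the same totals, but the first hands every stored row to the application, while the second hands over only one row per customer.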

There have been multiple attempts to find solutions for this so-called “impedance mismatch,” but none have taken flight thus far. My colleague Hettie Dombrovskaya wrote some interesting papers on this. Let’s dive in a bit here as well.

Applications

For many years, applications have been separated from the databases they use to store, retrieve, and manipulate data. We have grown accustomed to having applications written in specialized platforms and languages to facilitate the need for speed (as mentioned above) and maximize user experience.

Databases

Databases are a logical and practically indispensable part of applications. Relational databases will fulfill a continuously increasing role in data management. A relational database engine like Postgres is an extremely powerful and versatile platform that gives virtually unlimited possibilities not only to store and retrieve data but also, very much so, to manipulate it.

Business logic

With the split between where applications live and where data is stored and processed, a development started that focused on further decoupling these two parts of an application. Going into all of the individual elements is beyond the scope of this article.

A 3-tier model (frontend, middle tier, backend) was developed that would lay the foundation for future models of application development. We will disregard the frontend here, as this is the realm of browsers and high-end user experiences. While that is certainly important, it is less relevant or interesting for database folks. Well, up to a point, but that will have to wait for another article.

The way applications are built focuses on user functionality, which typically gets organized into objects or classes.
The way databases operate focuses on data transformation, which typically is organized in rows and columns.

The problem, addressed by both Stonebraker and Dombrovskaya, is that the transformation from rows to objects takes place where the application lives. This has three important implications that cause sorrow and much toil.

  • An incredible amount of traffic flows between the two most costly elements in the equation, the database server and the application server, because a message is sent for every data element (row).
  • Processing of data takes place at a point in the chain where information handling is the objective, creating a responsibility mashup. The application layer is responsible for interaction with information rather than transforming data into information.
  • By disregarding database functionality for integrity and transformation, you deny yourself the optimal security measures of having consistent, reliable, and idempotent data processing.
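The third point can be illustrated with a small sketch of an integrity rule that lives in the database rather than in application code. This is an assumption-laden example of my own: it uses Python's built-in `sqlite3` module as a stand-in for Postgres, and the `accounts` table, the `withdraw` helper, and the rule that a balance may never go negative are all hypothetical.

```python
import sqlite3


def build_accounts_db():
    """Create a hypothetical `accounts` table whose rule lives in the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def withdraw(conn, account_id, amount):
    """Attempt a withdrawal; return True only if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update; no application-side
        # validation was needed to keep the data consistent.
        return False


def balance(conn, account_id):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Because the rule is part of the table definition, every application that writes to this table is held to it, including ones that forget to validate.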

Making it usable

So where does this leave us?

A lot has been written and researched about this topic. Research has been referenced early on in this article and many other approaches have been considered. This includes research on “SmartDB” from Oracle’s Toon Koppelaars, where he empirically proved that the process of doing the actual data transformation in the middle layer of a modern application sets you up for imminent scalability issues.

Business logic in the database

I believe that the reason these approaches have not taken flight is the rock-solid conviction that “business logic should never be in the database.” This has been taught since the mid-1990s, and I think we have arrived at a time where that needs to be re-evaluated.

In the 25-odd years that have passed, much has changed. One of the reasons for this stance was the concept of “database agnostic applications,” which we all know today is impractical and relatively senseless. It is senseless because data management systems such as Postgres have developed to the point of delivering unparalleled speed and functionality.

Logical split

Additionally, the understanding has emerged that the concept of business logic is two-fold:

  1. There is application business logic, which handles how information inside an application is managed and which business requirements the application needs to fulfill.
  2. The remaining part of business logic is the data business logic, which deals with how data is transformed into information and defines which actions are required to maintain consistency inside the database.
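One way to picture this split is sketched here, under clearly stated assumptions: Python's built-in `sqlite3` module stands in for Postgres, and the `orders` table, the `customer_totals` view, and the discount rule are all hypothetical names of my own. The view holds the data business logic (how rows become information), while the function holds an application business rule that merely consumes that information.

```python
import sqlite3


def build_sales_db():
    """Create hypothetical sales data plus a view that encodes data logic."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (customer TEXT, amount INTEGER);
        INSERT INTO orders VALUES ('alice', 120), ('alice', 80), ('bob', 40);

        -- Data business logic: how raw rows are turned into information.
        CREATE VIEW customer_totals AS
            SELECT customer, SUM(amount) AS total
            FROM orders
            GROUP BY customer;
        """
    )
    return conn


def customers_eligible_for_discount(conn, threshold=100):
    """Application business logic: a hypothetical discount rule.

    The application asks for information (totals per customer) and never
    sees or transforms the individual order rows.
    """
    rows = conn.execute(
        "SELECT customer FROM customer_totals"
        " WHERE total >= ? ORDER BY customer",
        (threshold,),
    )
    return [customer for (customer,) in rows]
```

If the definition of a customer total changes later, only the view changes; the discount rule in the application stays as it is.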

Again, there is a lot more to discuss around the split in logic than we have room for here.

If you make this split, which is nothing more than a simple architectural decision when designing any application, you can find a wealth of opportunity.

Consider again the three challenges raised earlier, which are what prompt us to revise the original business logic stance:

  1. Excessive traffic
  2. Mashup of responsibilities
  3. Data security risks

Sparking further discussion

I am fully aware that a topic as extensive as this, one that is potentially controversial and challenging, needs more discussion and thought than a blog post can cover. There have been countless studies and debates, yet as an industry, we still trip over this.

My goal is to fan the flame of this discussion in 2022, which was sparked by Dr. Stonebraker in his keynote at PostgresBuild.

The key to applications is performance; working with a technology like the Postgres database gives you many of the tools to achieve this. Use the force wisely.