27/08/2018

We Need to Resurrect Desktop Databases

by Rahul Batra

In 1981 the IBM PC was launched with 640 KB of RAM and the world of computing was never the same again. Apple Computers and IBM raced to get a desktop computer in every home and office desk. Computing truly became personal in the 1980s.

On the business front, relational database management systems were firmly establishing their dominance as the data backbone of enterprises. By 1986 SQL was the established and standardized RDBMS query language. Other than database administrators and programmers, a new function of database users emerged who would query and interpret results themselves.

Fast forward to the present day and our business intelligence needs remain largely similar. Tabular data remains, and will remain for the foreseeable future, the core of business data. Business analysts, data analytics specialists and even data scientists spend a large amount of time exploring data and interpreting results.

In these intervening three decades, spreadsheets have become the defacto choice for using and exploring data which can be arranged in a table. There is an insider joke in the financial business that the world really runs on Microsoft Excel. If you happen to spend some time in any such organization, you’d realize that the truth is not too far away. But spreadsheet software has its limitations, especially when dealing with tables on the order of millions of rows. The spreadsheet’s very inception was to support always visible, cell based editing and tabulating - not summarizing heaps of data.

People automatically turn to centralized client/server databases, data warehouses, or in some cases even to Hadoop ecosystem projects like Hive and Spark to run queries on data in large enough volumes to be too slow for Excel. But there is a void here that I think can be perfectly serviced by desktop databases.

Let’s turn the clock back once again to the late 80s and early 90s. The birth of personal computing gave rise to personal software packages, or desktop applications as they are usually called. Desktop database software options were aplenty, some of them very worthy. One of my favorite demos of such software came from the classic TV show, The Computer Chronicles, in the episode Shareware which was broadcast in 1988. That episode showcases a desktop database software package called PC-File which seemed to do quite a lot of the same things that we expect our new technology stacks to do. It is interesting to note that PC-File was capable of holding around 65,000 records at a time when your standard desktop came with less than one megabyte of RAM.

There were other notable applications too – like dBase and Paradox. Most of them were capable and packed quite a punch for their reasonable price tag. But most importantly, they allowed you to utilize the computing capabilities of your desktop to a large extent. By and large such desktop database applications had just two limitations:

They were capable of only relatively small datasets due to hardware limits.
They all had differing ways to program and query them.

The first is not really a problem anymore; you would be hard pressed today to find a desktop computer with less than 8GB of RAM and a quad-core processor. The average machines of today are far more powerful than their counterparts three decades ago, and by multiple orders of magnitude.

As for discrepancies in programming and querying capabilities, we again have an answer in the form of our trusty SQL language that is all but ubiquitous today. SQL has proven to be such a useful query language that even many non-relational database systems try to mimic SQL syntax for commercial viability. One even sees SQL for so-called NoSQL databases. There are more than half a dozen popular SQL engines for Hadoop, with many more waiting in the wings to pick up steam.

The time is ripe for desktop database engines to not just be resurrected, but also remembered by users who need to run analytics on slowly changing or informal datasets. Yes, there is Microsoft Access, but we need more competition in this area. We also need for people to reach out for the right tool for the right job. A combination like SQLite with DB Browser for SQLite would be a very capable desktop database system. SQLite is fairly performant dealing with tables having 10-15 million rows in them, and there is a plethora of options to tune SQLite even further.

There were many great qualities of the early desktop database systems which we must not abandon. Most importantly, they were efficient, affordable and accessible. Today there are many analytics tools that run as applications native to your operating system, but most of them target enterprise users and match the price tag with the audience. I hope for a new generation of tools that keeps the affordability spirit of the shareware era alive and brings SQL-based analytics to the reach of everyone who has a need. Such tools would not only save centralized databases from rogue, long running queries, but will also fully utilize the very capable machines that our personal computers have become.

About the Author

Rahul Batra is author of the book SQL Primer. He was first introduced to programming in 1996 with GWBASIC, but he did not seriously foray into programming until 2001 when he started learning C++. Along the way, Rahul dabbled in many other languages such as C, Ruby, Perl, Python, and Lisp. He has worked with databases such as PostgreSQL, Sybase ASA, and SQLite. He is passionate about sharing knowledge, especially with those who are just starting out. Rahul currently lives and works in Gurgaon, India.

This article was contributed by Rahul Batra, author of SQL Primer: An Accelerated Introduction to SQL Basics.