We could think in more general terms about this quote from an interesting benchmarking paper. (I remember a DFG referee saying that I was no expert in the applied field. Conversely, we all see studies where everything is treated with a hammer.)
With a lot of data on my desk, however, I am more interested in the technical conclusions of the paper, and I feel quite comfortable with the authors' opinion that commercial RDBMSs are not always the best choice. These RDBMSs keep gaining features, and whatever is still missing is supplied by add-on packages from third-party vendors. This ever-growing feature set also brings useless overhead, which carries a performance penalty.
A redesign for special-purpose databases like those used in genetic epidemiology and bioinformatics therefore seems inevitable. Some may have already noticed my preference for SQLite, HDF5, and NetCDF.
- Do we really need client-server mode?
- Couldn't 90% of all tasks be done in presorted arrays (or materialized views)?
- Why can't processes run completely in virtual memory without disk I/O?
- Is there any chance to compile to machine code for better performance?
- Why not order tasks by priority, with those having minimum latency first in line?
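
To make the first three questions a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the `marker` table, the toy data, and the helper `find_by_pos` are hypothetical and not taken from the paper. SQLite runs inside the calling process (no client-server mode), the special name `":memory:"` keeps the database off disk, and the same lookup is then repeated on a presorted array with a binary search.

```python
import sqlite3
from bisect import bisect_left

# Hypothetical toy data: (marker id, chromosome position)
markers = [("rs3", 300), ("rs1", 100), ("rs2", 200)]

# SQLite is embedded in the process; ":memory:" avoids any database file.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE marker (id TEXT PRIMARY KEY, pos INTEGER)")
con.executemany("INSERT INTO marker VALUES (?, ?)", markers)
row = con.execute("SELECT id FROM marker WHERE pos = ?", (200,)).fetchone()
sql_hit = row[0]
con.close()

# The same lookup on a presorted array, using binary search.
by_pos = sorted(markers, key=lambda m: m[1])
positions = [pos for _, pos in by_pos]

def find_by_pos(pos):
    """Return the marker id at an exact position, or None if absent."""
    i = bisect_left(positions, pos)
    if i < len(positions) and positions[i] == pos:
        return by_pos[i][0]
    return None

array_hit = find_by_pos(200)
```

Both lookups return the same marker. Which one is faster on real data is an empirical question that this sketch does not answer, and the operating system may still page an in-memory database out to disk under memory pressure.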