We can have similar architecture discussions at the database layer as well. If I am looking at a simple table lookup, why pay for a robust database like Oracle 12c? Why not use something like Azure Table Storage Services and do a simple select statement from a file store. Why not put this in a free version of Oracle in Apex on the web and define a REST api to pull the data based on a simple or potentially more complex select statement. Again, 90% of the problems can be solved with simple solutions. Simple table lookups like translating a simple part name to a price can be done with Excel, MySQL, APEX, JSON processing, or REST apis. The difficulty comes up with the remaining 10%. How do I correlate multiple tables together to figure out the price of an item based on cost of inventory, cost of shipping, electrical costs, compensation costs for contractors and sales people, and other factors that determine profitability and pricing. How do I do a shortest routing algorithm for a trucking system based on traffic, customer orders, inventory in a warehouse, the size of a truck, and the salary of the driver and loading dock personnel. For things like this you need a more complex database that can handle multiple table joins, spatial data, and pulling in road conditions and traffic patterns from external sources. Products like IBM DB2, Oracle Database, and Microsoft SQL Server can address some of these issues.
We also need to look at recovery and restoration time. When a Postgress server crashes, how long does it take to recover the database and get it back online? Can I fail over to a secondary parallel server because downtime is lost revenue or lost sales. If you go to HomeDepot to order plumbing parts and their site goes down, how long does it take to go to the Ace or Lowes web site and order the same part and have it delivered by the same delivery truck to your home or office? Keeping inventory, order entry, and web services up becomes more than just answering a query. It becomes a mission critical service that can not go down for more than a few seconds. Services like Data Guard, Golden Gate, and Real Application Clustering are required to keep services up and active. MySQL, MongoDB, Amazon Aurora, and other new entry level database technologies can handle simple requests but take minutes/hours to recover information for a database. Failing over through storage to another site is typically not an answer in this case. It takes minutes/hours to recover and restart a moderate database of 20 TB or larger. First the data replication needs to finish then the database needs to be booted at a secondary site and it needs to maintain consistency in the data as it comes back up. The application server then needs to connect to the new service and recommit requests that came in during and since the system failure. As this is happening, customers are opening a new browser tab and going to your competition to find the same part on another site.
In summary, it takes more than just getting a bigger and faster application server or database. Moving the services to the cloud isn't necessarily the answer. You need to make sure that you move the two components together the majority of the time. Look at your application and ask where do you spend more of your time? It is tuning sql statements? Is it writing new queries to answer business questions? Is it optimizing your disk layout to get tables to the database faster? Take a step back and ask why is the database pounding the disk so hard. Can I cache this data in the database by adding a little more memory to the disk controller or database server? Can I cache the data at the application server by adding more memory there and keep from asking the database for the same information over and over again? In the next few days we are going to look at database options and database monitoring. We are going to look at some of these tools and refer back to the bigger picture. Yes, we can tune the storage to deliver all of the bits at the highest rate possible. Our question will not be how to do this but should we be doing this. Would something like an Exadata or an in-memory option allow us to transfer less data across the storage network and get us answers faster? Would adding memory somewhere allow us to buffer more data and reduce the database requests which reduces the amount of data needed from the disk.