
Pat Shuff's Blog

Recent Posts

IaaS

links to Corente tutorials and workshops

Rather than walk through a full install of the Corente VPN configuration, I thought I would post references to the tutorials I pulled from. The obvious place to start is the Corente Documentation, but I would not start there; I would start with the tutorials and workshops. The documentation is confusing and could lead you to locking your account and getting frustrated (trust me). Yesterday we talked about starting the install process. Rather than going through a full install here, I am going to reference three documents that I found internal to Oracle: the Corente Cloud Services Cookbook, Corente VPN for PaaS and IaaS, and the Corente VPN Workshop. My favorite of the three is the Workshop because it walks through the install and configuration with pictures and step by step instructions. It starts out by installing Linux in the Oracle Cloud and configuring it to be the App Net Manager console, then gets the orchestration up and running to start the gateway in the cloud. The single change that I would make to this configuration is to configure the on-premise system using a Linux image in VirtualBox locally rather than doing it from Firefox on your desktop. If you follow the same steps that you used to spin up a Linux image and install the packages in the cloud, you can do the same steps in VirtualBox. This deviation starts on page 26 with an install of Linux on a local VirtualBox and then rejoins the Workshop without skipping a beat. I found the 52 page Workshop easier to follow than the 76 page Cookbook. On the flip side, I do like the Cookbook's focus on network configuration for the cloud Linux instance. The net of all this discussion is that there are various ways to configure Corente. It is a time consuming project and you can't just click a button and make it work. If you want to integrate a Cisco or Juniper router in your data center, there are online instructions on configuring your router. We will not go through this because we don't have access to hardware to configure and play with.

In summary, this should be the last posting on Corente. It is a very powerful tool that allows you to create a virtual private network between computers in your home, office, or data center and computers in the cloud. It allows you to configure a typical two tier configuration in the Oracle Cloud and hide the database from the public internet while giving your DBAs and developers a direct connection to the database. It also allows you to replicate the production systems that are running in your data center and create a high availability site in the Oracle Cloud. This can be done using NFS or SMB file shares and rsync to keep files synchronized, or Data Guard to replicate database data between two servers. Corente VPN allows you to create a trusted and secure communication link between your data center and the Oracle Cloud.
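Since the post mentions using rsync over the VPN to keep an on-premises file system and a cloud replica in sync, here is a minimal sketch of what that looks like once the tunnel is up. The 192.168.200.50 address, the oracle user, and the /u01/app/data path are placeholders, not values from the workshop.

# push local changes to the cloud server across the Corente tunnel (cron this for a crude HA copy)
rsync -avz --delete /u01/app/data/ oracle@192.168.200.50:/u01/app/data/
# pulling changes back is the same command with source and destination reversed
rsync -avz oracle@192.168.200.50:/u01/app/data/ /u01/app/data/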


IaaS

Corente on VirtualBox revisited

Last week we started talking about setting up Corente and came to the conclusion that you cannot run the Corente Gateway in VirtualBox. It turns out that not only was I wrong, but I got a ton of mail from people like product managers, people who got it working, and people who generally did not agree with my conclusion. Ok, I will admit that I read the manuals, played with the suggested configurations, and tried deploying it on my own. It appears that I did a few things backwards and cornered myself into an area that caused things not to work. Today we are going to walk through the steps needed to get Corente up and running in your data center using VirtualBox as a sandbox.

The first thing that you absolutely need is a Corente admin account. Without this you will not be able to create a configuration to download and everything will fail. You should have received an account email from "no-reply-cloud@oracle.com" with the title "A VPN account was created for you". If you have multiple accounts you should have received multiple emails. This is a good thing if you got multiples; it is a bad thing if you did not get any. I received mine back on August 11th of this year, and similar emails back on April 27th for some paid accounts that I have had for a while. The email reads: "The VPN account information included in this email enables you to sign in to App Net Manager Service Portal when setting up Corente Services Gateway (cloud gateway) on Oracle Cloud, which is Step 2 of the setup process." It then lists the account details: Username: a59878_admin, Password: --not shown--, Corente Domain: a59878. The "Click here" link for additional details takes you to the documentation on how to set up a service gateway. The document was last updated in August and goes through the workflow for setting up a connection.

Step 1: Obtain a trial or paid subscription to Oracle Compute Cloud Service. After you subscribe to Oracle Compute Cloud Service, you will get your Corente credentials through email after you receive the Oracle Compute Cloud Service welcome email.

Step 2: Set up a Corente Services Gateway (on-premises gateway) in your data center. This is where everything went off the rails the first time, because this actually is not step 2. Step 2 is to visit the App Net Manager and register your gateway using the credentials that you received in the email. I went down the foolish path of spinning up a Linux 6 instance and running the verification to make sure that virtualization gets passed to the guest operating system. According to the documentation this is step 2, and VirtualBox fails all of the suggested tests. I then looked for a second way of running in VirtualBox, and the old way, CSG-VE, is not intended to be used with the Oracle Cloud. The CSG-VE is different from the gateway deployment and is for legacy Corente customers; it was never intended to be a solution for the Oracle Cloud. If you follow the cookbooks that are available internal to Oracle you can make the Corente Services Gateway work properly. I found two cookbooks and both are too large to publish in this blog, so I will try to summarize the key steps. Ask your local sales consultant to look for "Oracle Corente Cloud Services Cook Book" or "Oracle Cloud Platform - Corente VPN for PaaS and IaaS". Both walk you through installation with screen shots and recommended configurations.

Step 2a: Go to www.corente.com/web and execute the Java code that launches the App Net Manager. When I first did this it failed. I had to download a newer version of Java to get the javaws binary installed. If you are on a Linux desktop you can do this with wget http://javadl.oracle.com/webapps/download/AutoDL?BundleId=211989 or go to the web page https://java.com/en/download/linux_manual.jsp and download the Linux64 bundle. This allows you to uncompress and install the javaws binary and associate it with the jnlp file provided on the Corente site. If you are on Windows or MacOS, go to https://java.com/en/download/ and it will figure out what your desktop is and ask you to download and install the latest version of Java. What you are looking for is a Java install that contains the javaws binary. This binary is called from the web browser and executes the downloadable scripts from the Corente site.

Step 2b: When you go to the www.corente.com/web site it will download the Java code and launch the App Net Manager. The first time there will be no locations listed, so we will need to add a location. It is important to note that the physical address that you use for the location has no relevance to the actual address of your server, gateway, or cloud hosting service. I have been cycling through major league baseball park addresses as my locations. My gateway is currently located at Minute Maid Park in Houston and my desktop is at the Texas Rangers Ballpark in Arlington, with my server at Wrigley Field in Chicago.

Step 2c: Launch the New Location Wizard. The information that will be needed is the name, address, maintenance window (date and reboot option), inline configuration, dhcp, dhcp client name (optional), and lan interface. Note that it is important to know ahead of time what your lan interface is going to be. Once you get your gateway configured and connected, the only way to get back into this console is to do it from this network. When I first did this I did not write down the ip address and basically locked my account; I had to go to another account domain and retry the configuration. For the trial that I did I used 192.168.200.1 as the lan address with 255.255.255.0 as the netmask. This will become your gateway for all subnets in your data center. By default there is a dhcp server in my house that assigns IP addresses on the 192.168.1.X network. You need to pick something different from this subnet because you can't have a broadband router acting as a gateway to the internet and a VPN router acting as a gateway router on the same subnet. The implication of this is that you will need to create a new network interface on your Oracle Compute Cloud instances that talks on the 192.168.200.X network. This is easy to do, but selection of this network is important and writing it down is even more important. The wizard will continue and ask about adding the subnet to the Default User Group. Click Yes and add the 192.168.200.X subnet to this group.

Step 2d: At this point we are ready to install a Linux 6 or Linux 7 guest OS in VirtualBox and download the Corente Services Gateway software from http://www.oracle.com/technetwork/topics/cloud/downloads/network-cloud-service-2952583.html. From here you agree to the legal stuff and download the Corente Gateway Image. This is a bootable image that works with VirtualBox.

Step 2e: We need to configure the instance with 2G of RAM, at least 44G of disk, and two network interfaces. The first interface needs to be configured as active using the Bridged Adapter. The second interface needs to be configured as active using the Internal Network. The bridged adapter is going to get assigned to the 192.168.1.X network by our home broadband DHCP server. The second network is going to be statically mapped to 192.168.200.1 by the configuration that you download from the App Net Manager. You also need to mount the iso image that was downloaded for the Corente Gateway Image. When the server boots it will load the operating system onto the virtual disk and ask to reboot once the OS is loaded.

Step 3: Rather than rebooting the instance we should stop it after the shutdown happens and remove the iso as the default boot device. If we don't, we will go through the OS install again and it will keep looping until we do. Once we boot the OS it will ask us to download the configuration file from the App Net Manager. We do this by setting the download site to www.corente.com, selecting dhcp as the network configuration, and entering our login information for the App Net Manager in the next screen.

Step 4: At this point we have a gateway configured in our data center (or home in my case) and need to set up a desktop server to connect through the VPN and access the App Net Manager. Up to this point we have connected to the App Net Manager from our desktop to do the initial configuration. From this point forward we will need to do so from an ip address in the 192.168.200.X network. If you try to connect to the App Net Manager from your desktop you will get an error message and nothing can be done. To install a guest system we boot Linux 6 or Linux 7 in VirtualBox and connect to https://66.77.134.249. To do this we need to set up the network interfaces on our guest operating system. The network needs to be the internal network. For my example I used 192.168.200.100 as the guest OS ip address and 192.168.200.1, our gateway server, as the default router. This machine is configured with a static IP address because by default the 192.168.1.X server will answer the DHCP request and assign you to the wrong subnet. To get the App Net Manager to work I had to download javaws again for Linux and associate the jnlp file from the www.corente.com/web site with javaws. Once this was done I was able to add the guest OS as a new location. At this point we have a gateway server configured and running and a computer inside our private subnet that can access the App Net Manager. This is the foundation for getting everything to work. From here you can provision a gateway instance in the cloud service and connect your guest OS to computers in the cloud as if they were in the same data center. More on that later.

In summary, this was more difficult to do than I was hoping for. I made a few key mistakes when configuring the service. The first was not recording the IP address when I set everything up the first time. The second was using the default network behind my broadband router rather than a different network address. The third was assuming that the steps presented in the documentation were the steps that I had to follow. The fourth was not knowing that I had to set up a guest OS to access the App Net Manager once I had the gateway configured. Each of these mistakes took hours to overcome. Each configuration failure required starting over again from scratch, and once I got to a certain point in the install I could not go back but had to start over with another account. I am still trying to figure out how to reset the configuration for my initial account.
Hopefully my slings and arrows will help you avoid the pitfalls of outrageous installations.
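For those who prefer to script the VirtualBox side, here is a minimal sketch of the VM configuration described in steps 2d and 2e done with VBoxManage instead of the GUI. The VM name, the corente-gateway.iso file name, and the eth0 adapter name are assumptions; adjust them for your machine.

# create the gateway VM with 2 GB of RAM, a bridged NIC and an internal-network NIC
VBoxManage createvm --name corente-gw --ostype Oracle_64 --register
VBoxManage modifyvm corente-gw --memory 2048 --nic1 bridged --bridgeadapter1 eth0 --nic2 intnet --intnet2 corente-lan
# give it a 60 GB disk and attach the downloaded Corente Gateway Image as the boot DVD
VBoxManage createmedium disk --filename corente-gw.vdi --size 61440
VBoxManage storagectl corente-gw --name SATA --add sata
VBoxManage storageattach corente-gw --storagectl SATA --port 0 --device 0 --type hdd --medium corente-gw.vdi
VBoxManage storageattach corente-gw --storagectl SATA --port 1 --device 0 --type dvddrive --medium corente-gateway.iso
# after the installer finishes, detach the iso so the reboot in step 3 does not loop back into the installer
VBoxManage storageattach corente-gw --storagectl SATA --port 1 --device 0 --type dvddrive --medium emptydrive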


IaaS

Making Hadoop easier

Last week we looked at provisioning a Hadoop server and realized that the setup was a little complex and somewhat difficult. This is what people typically do the first time they want to provision a service: they download the binaries (or source if you are really crazy) and install everything from scratch. Our recommendation is to do everything this way the first time. It helps you get a better understanding of how the setup works and of the dependencies. For example, Hadoop 2.7.3 requires Java 1.8 or greater; if we go with Hadoop 2.7.2 we can get by with Java 1.7. Rather than working through all of the relationships, requirements, and libraries needed to get something working, we are going to do what we would typically do if we suddenly needed a server up and running: go to a service that provides pre-compiled and pre-configured public domain sandboxes and get everything running that way. The service of choice for the Oracle Compute Cloud is Bitnami. We can search for a Hadoop configuration and provision it into our IaaS foundation. Note that we could do the same using Amazon EMR and get the same results; the key differences between the two are configurations, number of servers, and cost. We are going to go through the Bitnami deployment on the Oracle Cloud in this blog.

Step 1: Search for Hadoop on http://oracle.bitnami.com and launch the instance into your region of choice.

Step 2: Configure and launch the instance. We give the instance a name, increase the default disk size from 10 GB to 60 GB to have room for data, go with the Hadoop 2.7.2-1 version, select Oracle Linux 6.7 as the OS (Ubuntu is an alternative), and go with a small OC3 footprint for the compute size. Don't change the security rules; a new rule will be generated for you, as will the ssh keys, when you provision through this service.

Step 3: Log into your instance. To do this you will need ssh and the keys that Bitnami generates for you. The instance creation takes 10-15 minutes and should show you a screen with the ip address along with links to download the keys.

Step 4: Once you have access to the master system you can execute the commands that we ran last week. The only key difference with this implementation is that you will need to install java-1.8 with a yum install, because by default the development kit is not installed and we need the jar tool as part of the configuration. The steps needed to repeat our tests from the previous blog entry are:

--- set up the hdfs file system
hdfs namenode -format
hdfs getconf -namenodes
hdfs dfs -mkdir input
cp /opt/bitnami/hadoop/etc/hadoop/*.xml input
hdfs dfs -put input/*.xml input
--- set up a simple test with wordcount
hdfs dfs -mkdir wordcount
hdfs dfs -mkdir wordcount/input
mkdir ~/wordcount
mkdir ~/wordcount/input
vi file01
mv file01 ~/wordcount/input
vi ~/wordcount/input/file02
hdfs dfs -put ~/wordcount/input/* wordcount/input
vi WordCount.java
--- install java-1.8 to get all of the libraries
sudo yum install java-1.8\*
--- create the wc.jar file
export HADOOP_CLASSPATH=/opt/bitnami/java/lib/tools.jar
hadoop com.sun.tools.javac.Main WordCount.java
jar cf wc.jar WordCount*.class
hadoop jar wc.jar WordCount wordcount/input wordcount/output
hadoop fs -cat wordcount/output/part-r-00000
--- download data and test pig
mkdir data
cd data
wget http://stat-computing.org/dataexpo/2009/1987.csv.bz2
wget http://stat-computing.org/dataexpo/2009/1988.csv.bz2
bzip2 -d 1987.csv.bz2
bzip2 -d 1988.csv.bz2
hdfs dfs -mkdir airline
hdfs dfs -copyFromLocal 19*.csv airline
vi totalmiles.pig
pig totalmiles.pig
hdfs dfs -cat data/totalmiles/part-r-00000

Note that we can do the exact same thing using Amazon AWS. They have a MapReduce product called EMR. If you go to the main console and click on EMR at the bottom of the screen, you can create a Hadoop cluster. Once you get everything created and can ssh into the master you can repeat the steps above. I had a little trouble with the WordCount.java program in that the library version was a little different. The JVM 1.7 libraries had a problem linking, and adding the JVM 1.8 binaries did not properly work with the Hadoop binaries. You also need to change HADOOP_CLASSPATH to point to the proper tools.jar file since it is in a different location from the Bitnami install. I think with a little tweaking it would all work. The pig sample code works with no problem, so we were able to test that without changing anything. In summary, provisioning a Hadoop server or cluster in the cloud is very easy if someone else has done the heavy lifting and pre-configured a server or group of servers for you. I was able to provision two clusters before lunch, run through the exercises, and still have time to go through it again to verify. Using a service like a private Marketplace, Bitnami, or the AWS Marketplace makes it much simpler to deploy sandbox images.
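As a quick sanity check after the ssh login, and before running the list above, it is worth confirming the versions the Bitnami image ships with. The openjdk package name below is an assumption for Oracle Linux 6; the post itself simply installs java-1.8\* with yum.

hadoop version
java -version
# the devel package provides jar and javac, which the stock image does not include
sudo yum install -y java-1.8.0-openjdk-devel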


IaaS

Hadoop on IaaS - part 2

Today we are going to get our hands dirty and install a single instance standalone Hadoop cluster on the Oracle Compute Cloud. This is a continuing series on installing public domain software on Oracle Cloud IaaS. We are going to base our installation on three components: Oracle Compute Cloud Services, Apache Hadoop 2.7.3, and Oracle Linux Release 6 Update 7. We are using Oracle Linux 6.7 because it is the easiest to install on Oracle Compute Cloud Services. We could have done Ubuntu or SUSE or Fedora and followed some of the tutorials from HortonWorks or Cloudera or the Apache Single Node Cluster guide. Instead we are going old school and installing from the Hadoop home page by downloading a tar ball and configuring the operating system to run a single node cluster.

Step 1: Install Oracle Linux 6.7 on an Oracle Compute Cloud instance. Note that you can do the same thing by installing on your favorite virtualization engine like VirtualBox, VMWare, HyperV, or any other cloud vendor. The only true dependency beyond this point is the operating system. If you are installing on the Oracle Cloud, go with the OL_67_3GB..... option, go with the smallest instance, delete the boot disk, replace it with a 60 GB disk, rename it, and launch. The key reason that we need to delete the boot disk is that by default the 3 GB disk will not hold the Hadoop binary. We need to grow it to at least 40 GB; we pad a little with a 60 GB disk. If you check the new disk as a boot disk it replaces the default root disk and allows you to create an instance with a 60 GB disk.

Step 2: Run yum to update the OS and install wget and Java version 1.8. You need to log in to the instance as opc so that you can run as root. Note that we are going to diverge from the Hadoop for Dummies book that we referenced yesterday. It suggests attaching to a yum repository and installing the bigtop package from it. We don't have that option for Oracle Linux and need to do the install from the binaries by downloading a tar or source image. The bigtop package basically takes the Apache Hadoop bundle and translates it into rpm files for an operating system. Oracle does not provide this as part of the yum repository and Apache does not create one for Oracle Linux or RedHat. We are going to download the tar file from the links provided at the Apache Hadoop homepage and follow the install instructions for a single node cluster.

Step 3: Get the tar.gz file by pulling it from http://apache.osuosl.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

Step 4: Unpack the tar.gz file with the command tar xvzf hadoop-2.7.3.tar.gz

Step 5: Add the following to the .bashrc file in the home directory to set up some environment variables. The Java location is where the yum command installed it. The location of the Hadoop code is based on downloading into the opc home directory.
export JAVA_HOME=/usr
export HADOOP_HOME=/home/opc/hadoop-2.7.3
export HADOOP_CONFIG_DIR=/home/opc/hadoop-2.7.3/etc/hadoop
export HADOOP_MAPRED_HOME=/home/opc/hadoop-2.7.3
export HADOOP_COMMON_HOME=/home/opc/hadoop-2.7.3
export HADOOP_HDFS_HOME=/home/opc/hadoop-2.7.3
export YARN_HOME=/home/opc/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin

Step 6: Source the .bashrc to pull in these environment variables.

Step 7: Edit the /etc/hosts file to add namenode to the file.

Step 8: Set up ssh so that we can loop back to localhost and launch an agent. I had to edit the authorized_keys file to add a return before the new entry. If you don't, the ssh won't work.
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
vi ~/.ssh/authorized_keys
ssh localhost
exit

Step 9: Test the configuration, then configure the Hadoop file system for a single node.
cd $HADOOP_HOME
mkdir input
cp etc/hadoop/*.xml input
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
vi etc/hadoop/core-site.xml
When we ran this there were a couple of warnings, which we can ignore. The test should finish without error and generate a long output list. We then edit the core-site.xml file so that it ends with the following configuration.
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:8020</value>
  </property>
</configuration>

Step 10: Create the hadoop file system with the command hdfs namenode -format

Step 11: Verify the configuration with the command hdfs getconf -namenodes

Step 12: Start the hadoop file system with the command sbin/start-dfs.sh
At this point we have the hadoop filesystem up and running. We now need to configure MapReduce and test functionality.

Step 13: Make the HDFS directories required to execute MapReduce jobs with the commands
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/opc
hdfs dfs -mkdir input
hdfs dfs -put etc/hadoop/*.xml input

Step 14: Run a MapReduce example and look at the output
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar grep input output 'dfs[a-z.]+'
hdfs dfs -get output output
cat output/* output/output/*

Step 15: Create a test program to do a wordcount of two files. This example comes from an Apache MapReduce tutorial.
hdfs dfs -mkdir wordcount
hdfs dfs -mkdir wordcount/input
mkdir ~/wordcount
mkdir ~/wordcount/input
vi ~/wordcount/input/file01 - add "Hello World Bye World"
vi ~/wordcount/input/file02 - add "Hello Hadoop Goodbye Hadoop"
hdfs dfs -put ~/wordcount/input/* wordcount/input
vi ~/wordcount/WordCount.java
Create WordCount.java with the following code.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Step 16: Compile and run the WordCount.java code
cd ~/wordcount
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64
export HADOOP_CLASSPATH=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el6_8.x86_64/lib/tools.jar
hadoop com.sun.tools.javac.Main WordCount.java
jar cf wc.jar WordCount*.class
hadoop jar wc.jar WordCount wordcount/input wordcount/output
hadoop fs -cat wordcount/output/part-r-00000

At this point we have a working system and can run more MapReduce jobs, look at results, and play around with Big Data foundations. In summary, this is a relatively complex example. We have moved beyond a simple install of an Apache web server or a Tomcat server and editing some files to get results. We have the foundations for a Big Data analytics solution running on the Oracle Compute Cloud Service. The steps to install are very similar to the other installation tutorials that we referenced earlier on Amazon and virtual machines. Oracle Compute is a good foundation for public domain code. Per core, the pricing is cheaper than other cloud vendors. Networking is non-blocking and higher performance. Storage throughput is faster, optimized for high I/O, and tied to the compute engine. Hopefully this tutorial has given you the foundation to start playing with Hadoop on Oracle IaaS.
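One step the walkthrough glosses over is confirming that the HDFS daemons actually came up after step 12. A quick hedged check, assuming the JDK's jps tool is on the PATH (it ships with the openjdk devel package installed in step 2):

cd $HADOOP_HOME
sbin/start-dfs.sh
# jps should list NameNode, DataNode, and SecondaryNameNode processes
jps
# dfsadmin -report shows the capacity and health of the single-node filesystem
bin/hdfs dfsadmin -report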


IaaS

Hadoop on IaaS

We are going to try a weekly series of posts about public domain code running on the Oracle Cloud IaaS platform. A good topic seems to be Big Data and Hadoop. Earlier we talked about running Tomcat on IaaS as well as WordPress on IaaS using bitnami.com. To start this process we are going to review what Big Data is and what Hadoop is. We are going to start where most people start, which is looking at what books are available on the subject and walking through one or two of them. Today we are going to start with Hadoop for Dummies by Dirk deRoos. This is not the definitive source on Hadoop but it is a good place to have terms and concepts defined for us.

Years ago one of the big business trends was to create a data warehouse. The idea was to take all of the corporate operational data, put it into one database, and grind on it to generate reports. History has shown that aggregation of the data was a difficult task, as was the processing power required to grind through reports. The task took significant resources to architect the data, to host the data, and to write select statements to generate reports for users. As retail got more and more ingrained on the web, sources outside the company became highly relevant and influential on products and services. Big Data and Hadoop come with tools to pull from non-structured data like Twitter, Yelp, and other public web services and correlate comments and reviews to products and services.

The three characterizations of Big Data according to Hadoop for Dummies are: Volume - high volumes of data ranging from dozens of terabytes to petabytes. Variety - data that is organized in multiple structures, ranging from raw text to log files. Velocity - data that enters an organization has some kind of value for a limited amount of time; the higher the volume of data entering an organization per second, the bigger the velocity challenge. Hadoop is architected to handle high volumes of data and data with a variety of structures, but it is not necessarily suited to analyzing data in motion as it enters the organization; it works on data once it is stored and at rest.

Since we touched on the subject, let's define the different data structures. Structured data is characterized by a high degree of organization and is typically stored in a database or spreadsheet. There is a relational mapping to the data and programs can be written to analyze and process the relationships. Semi-structured data is a bit more difficult to work with than structured data. It is typically stored in the form of text data or log files. The data is somewhat structured and is either comma, tab, or character delimited. Unfortunately multiple log files have different formats, so the formatting is different for each file and parsing and analysis is a little more challenging. Unstructured data has none of the advantages of the other two data types. Structure might be in the form of a directory structure, server location, or file type. The actual architecture of the data might or might not be predictable and needs a special translator to parse the data. Analyzing this type of data typically requires a data architect or data scientist to look at the data and reformat it to make it usable.

From the Dummies guide again, Hadoop is a framework for storing data on large clusters of commodity hardware. This lends itself well to running on a cloud infrastructure that is predictable and scalable. Layer 3 networking is the foundation for the cluster. An application that is running on Hadoop gets its work divided among the nodes in the cluster. Some nodes aggregate data through MapReduce or YARN, and the data is stored and managed by other nodes using a distributed file system known as the Hadoop Distributed File System (HDFS).

Hadoop started back in 2002 with the Apache Nutch project. The purpose of this project was to create the foundation for an open source search engine. The project needed to be able to scale to billions of web pages, and in 2004 Google published a paper that introduced MapReduce as a way of parsing these web pages. MapReduce performs a sequence of operations on distributed data sets. The data consists of key-value pairs and processing has two phases, mapping and reduction. During the map phase, input data is split into a large number of fragments, each of which is assigned to a map task. Each map task processes the key-value pairs it is assigned and produces a set of intermediate key-value pairs. This data is sorted by key and stored into a number of fragments that matches the number of reduce tasks. If, for example, we are trying to parse data for the National Football League in the US, we would want to spawn 32 task nodes so that we could parse data for each team in the league. Fewer nodes would cause one node to do double duty and more than 32 nodes would cause a duplication of effort. During the reduction phase each task processes the data fragment that it was assigned and produces an output key-value pair. For example, if we were looking for passing yardage by team we would spawn 32 reduce tasks. Each task would look at the yardage data for one team and categorize it as either passing or rushing yardage. We might have two quarterbacks play for a team or have a wide receiver throw a pass; the key for this team would be the passer and the value would be the yards gained. These reduce tasks are distributed across the cluster and the results of their output are stored on HDFS when finished. We should end up with 32 data files from 32 different task nodes recording passing yardage by team.

Hadoop is more than just distributed storage and MapReduce. It also contains components to help administer and coordinate servers (HUE, Ambari, and Zookeeper), data movement management (Flume and Sqoop), resource management (YARN), processing frameworks (MapReduce, Tez, Hoya), workflow engines (Oozie), data serialization (Avro), data collection (MapReduce, Pig, Hive, and HBase), and data analysis (Mahout). We will look into these systems individually later.

There are commercial and public domain offerings for Hadoop: Cloudera, EMC Pivotal HD, Hortonworks, IBM InfoSphere BigInsights, Intel Apache Hadoop, Oracle Big Data Machine, Oracle Big Data Cloud Service, and MapR.

A good way to start a small Hadoop project is log analysis. If you have a web server, it generates logs every time a web page is requested. When a change is made to the web site, logs are generated when people log in to manage the pages or change the page content. If your web page is a transactional system, orders are being placed for goods and services along with credit card transaction processing. All of these generate log files. If we wanted to look at a product catalog and correlate what people look at in relationship to what is ordered, we could do what Amazon has done for years. We could come up with recommendations on what other people are looking at as well as what other people ordered along with this item. If, for example, we are buying a pair of athletic shoes, a common purchase alongside the shoes is socks. We could give a recommendation on socks that could go with the shoes or a shoe deodorant product that yields a higher profit margin. These items could be displayed with the product in the catalog or shopping cart to sell more goods on the web. We can also look at the products that no one is looking at and reduce our inventories since they are not even getting looked at casually. We can also use Hadoop as a fraud detection or risk modeling engine. Both provide significant value to companies and allow executives to look at revenue losses as well as potential transactions that could cause a loss. For example, we might want to look at the packing material that we use for a fragile item that we sell. If we have a high rate of return on a specific item we might want to change the packing, change the shipper, or stop shipping to a part of the country that tends to have a high return rate. Any and all of these solutions can be implemented, but a typical data warehouse will not be able to coordinate the data and answer these questions. Some of the data might be stored in plain text files or log files on our return web site. Parsing and processing this data is a good job for Hadoop.

In the upcoming weeks we will dive into installation of a Hadoop framework on the Oracle Cloud. We will look at the resources required, pick a project, and deploy sample code into an IaaS solution. We will also look at other books and resources to help us understand and deploy sandboxes to build a prototype that might help us solve a business problem.
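To make the map, shuffle, and reduce phases described above a bit more concrete, here is a toy version of the passing-yardage example using nothing but shell tools. It is not Hadoop; the plays.txt file (one "TEAM play_type yards" record per line) is made up purely for illustration.

# map: emit a (team, yards) key-value pair for every passing play
awk '$2 == "pass" {print $1 "\t" $3}' plays.txt > mapped.txt
# shuffle: group the intermediate pairs by key
sort mapped.txt > shuffled.txt
# reduce: total the yards for each key, one output record per team
awk -F'\t' '{sum[$1] += $2} END {for (t in sum) print t, sum[t]}' shuffled.txt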


IaaS

Corente DataCenter Setup

Yesterday we went through the theory of setting up a VPN to connect a subnet in our data center to a subnet in the Oracle Cloud. Today we are going to go through the setup of the Corente Gateway in your data center. We will be following the Corente Service Gateway Setup. Important: this lab has problems. Corente does not work with VirtualBox.

The first step is to make sure we have a Linux server in our data center that we can install the services on. We will be installing these services on an Oracle Linux 6.7 release running in VirtualBox. To get started we install a new version from an iso image; we could just as easily have cloned an existing instance. For the installation we select the software development desktop and add some administration tools to help look at things later down the road. According to the instructions we need to make sure that our user has sudo rights and can reconfigure network settings as well as access the internet to download code. This is done by editing the /etc/sudoers file and adding our oracle user to the access rights. We then run
modprobe -v kvm-intel
egrep '^flags.*(vmx|svm)' /proc/cpuinfo
to verify that we have the type of virtualization needed to run the VPN software. It turns out that VirtualBox does not support nested virtualization, which is required by the Corente software, so we are not able to run the Corente Gateway from a VirtualBox instance. We need to follow a different set of instructions and download the binaries for the Corente Gateway Services - Virtual Environment. Unfortunately, this version was deprecated in version 9.4. We are at a roadblock and need to look at alternatives for connecting Corente Gateway Services from our sandbox to the Oracle Cloud.

I debated continuing on or showing different failed paths in this post. I decided that showing a failed attempt has as much value as showing a successful attempt. Our first attempt was to install the gateway software on a virtual instance using VirtualBox since it is a free product. Unfortunately we can't do this since VirtualBox does not support passing the virtualization features of the Intel Xeon chip into the guest operating system. The second attempt was to go with a binary specifically designed to work with VirtualBox and load it. It turns out that this version was decommitted and there really is no solution that works with VirtualBox. Tomorrow we will look for alternatives, running the gateway on a native Windows host or a MacOS host, since I use both to write this blog. Installing a gateway on a physical host is not optimum because we might need to reconfigure ethernet connections. My preference is to stay in a sandbox, but setting up an OracleVM server, VMWare server, or HyperV server would all be difficult at best. An alternative that we might look at is setting up our gateway server in another cloud instance and connecting one cloud vendor to another cloud vendor. It all depends on who exposes the hardware virtualization to their guest instances. More on that tomorrow.
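For reference, here is a minimal sketch of the two preparation steps mentioned above: granting the oracle user sudo rights and checking whether hardware virtualization is visible to the guest. The exact sudoers line is an assumption; use visudo rather than editing the file directly.

# run visudo as root and add a line like the following for the oracle user
oracle ALL=(ALL) ALL
# then, as oracle, check whether the CPU exposes the VT-x/AMD-V flags the gateway needs
sudo modprobe -v kvm-intel
egrep '^flags.*(vmx|svm)' /proc/cpuinfo || echo "no hardware virtualization exposed to this guest"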


IaaS

connecting subnets

This week we are going to focus on connecting computers. It seems like we have been doing this for a while: April 4 - connecting ssh to a compute instance; June 3 - connecting rdp to a Windows compute instance; Aug 24 - hiding a server in the cloud. We have looked at connecting our desktop to a cloud server (Linux and Windows). We have also looked at hiding a server in the cloud so that you can only get to it from a proxy host and not from our desktop or anywhere else in the cloud. In this blog we are going to start talking about changing the configuration so that we create a new network interface on our desktop and use this new network interface to connect through a secure tunnel to a second network interface on a compute cloud instance. There are three components to this: the Corente Gateway running in your data center, the Corente Gateway running in the Oracle Cloud, and the Corente App Net Manager Service Portal.

The basic idea behind this is that we are going to create a second network as we did with hiding a server in the cloud. We initially set up one instance so that we could access it through a public IP address, 140.86.14.242. It also has a private IP address inside the Oracle Cloud of 10.196.135.XXX. I didn't record the internal IP address because I rarely use it for anything. The key here is that by default a subnet range of 10.196.135/24 was configured. We were able to connect to our second server at 10.196.135.74 because we allowed ssh access on the private subnet but not on the public network. When this blog was initially written Oracle did not support defining your own subnet and assigned you to a private network based on the rack that you got provisioned into inside the cloud data center. As of OpenWorld, subnet support was announced so that you can define your own network range. One of the key pieces of feedback that Oracle got was that customers did not like creating a new subnet in their data center to match the Oracle subnet. They would rather define their own subnet in the Oracle Cloud to match the subnets that they have in their own data center.

Let's take each of these components one at a time. First, the Corente Gateway running in your data center. This is a virtual image that you download, or software components that you install on a Linux system, and run in your data center. The concept here is that you have a (virtual) computer that runs in your data center. The system has two network interfaces attached. The first can connect to the public internet through network address translation, directly, or through a router. The second network interface connects to your private subnet. This IP address is typically not routable, like a 10. or 192.168. network, so there is no chance of accidentally reaching a machine on the public internet with these addresses. The key is that the Corente Gateway has a listener that looks for communications intended for this non-routable network and replicates the packets through a secure tunnel to the Corente Gateway running in the Oracle Cloud. All of the traffic passes from your local non-routable network to another network hundreds or thousands of miles away and gives you the ability to talk to the computers on that network. This effectively gives you a private virtual network from your data center to a cloud data center. Rather than using a software virtual gateway you can use a hardware router to establish this same connection. We are not going to talk about this as we go through our setup exercises, but realize that it can be done. This is typically what a corporation does to extend resources to another data center, another office, or a cloud vendor for seasonal peak periods or cheaper resources. The benefit of this configuration is that it can be done by corporate IT and not by an individual.

The key things that get set up during this virtual private network connection are name parsing (DNS), ip routing (gateways and routers), and broadcast/multicast of messages. Most VPN configurations support layer 3 and above. If you do an ARP request, the ARP is not passed through the VPN and never reaches the other data center. Corente uses the GRE tunneling protocol, which is a layer 2 option. Supporting layer 2 allows you to route ping requests, multicast requests, and additional tunnel requests at a much lower and faster level. As we discussed in an earlier blog, Microsoft does not allow layer 2 to go into or out of their Azure cloud. Amazon allows layer 2 inside their cloud but not into and out of their cloud. This is a key differentiator between the AWS, Azure, and Oracle clouds.

The second component of the virtual private network is the Oracle Cloud Corente Gateway. This is the target side, where the gateway in your data center is the initiator. Both gateways allow traffic to go between the two networks on the designated subnet. Both gateways allow for communication between servers in the data center and servers in the Oracle Cloud. When you combine the VPN gateways with Security Lists and Security Rules you get a secure network that allows you to share a rack of servers and not worry about someone else who is using the Oracle Cloud accessing your data center, even if they are assigned an IP address on the same subnet. When you define a Security List or Security Rule, these exceptions and holes allow traffic from computers in your account to access the VPN. No one else in the same rack or the same cloud data center can access your VPN or your data center.

The third component is the App Net Manager service portal. This portal establishes connections and rules for the other two components. When you install each of the components it communicates with the admin portal to get configuration information. If you need to change a configuration, keys, or some aspect of the communication protocol, this is done in the admin portal and it pushes the update to the other two components. This service also allows you to monitor and record traffic between the Oracle Cloud and your data center. On the service installed in your data center the network resources show up with br0 as your public facing connection and br1 connecting to the subnet in your data center. A similar configuration is done in the Oracle Cloud, but this side is pre-configured and can be provisioned as a public image. The only things that you need to configure are the subnet address range and the relationship with the App Net Manager service portal.

Today was a lot of theory and high level discussion. Tomorrow we will dive into configuration of the gateway in your data center. The day after that we will look at provisioning the gateway in the Oracle Cloud and connecting the two. Just a quick reminder, we talked earlier about how to establish a connection between your desktop and a cloud server. By going to a VPN configuration we get around having to hide a server in the cloud. We can set up all of our servers to have private network links and only open up web servers or secure web servers to talk to the public internet. We can use ssh and rdp from our desktops at home or in our offices to communicate with the cloud servers. Setting up the VPN is typically a corporate responsibility along with giving you access to the resources. What you need to know is what cloud resources you have access to and how much money you have in your budget to solve your business problem.
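Since the post leans on GRE as the tunneling mechanism underneath Corente, here is a generic illustration of a GRE tunnel between two Linux hosts using iproute2. This is not the Corente setup itself; the public addresses (203.0.113.10 and 198.51.100.20) are placeholders and the subnets are chosen to echo the examples above.

# on the data center side: build a GRE tunnel to the cloud gateway's public address
ip tunnel add corente0 mode gre local 203.0.113.10 remote 198.51.100.20 ttl 255
ip addr add 192.168.200.1/24 dev corente0
ip link set corente0 up
# route the cloud private subnet through the tunnel
ip route add 10.196.135.0/24 dev corente0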


IaaS

subnets

Let's take a step back and look at networking from a different perspective. A good reference book to start with from a corporate IT perspective is CCENT Cisco Certified Entry Networking Technician ICND1 Study Guide (Exam 100-101) with Boson NetSim Limited Edition, 2nd Edition. You don't need to read this book to get Cisco certified but it does define terms and concepts well.

At the lowest level you start with a LAN, or local area network. From the study guide, "Usually, LANs are confined to a single room, floor, or building, although they can cover as much as an entire campus. LANs are generally created to fulfill basic networking needs, such as file and printer sharing, file transfers, e-mail, gaming, and connectivity to the Internet or outside world." This is typically connected with a single network hub or a series of hubs, and a router or gateway connects us to a larger network or the internet. The key services that you need on a LAN are a naming service and a gateway service. The naming service allows you to find services by name rather than ip address. The gateway service allows you to connect to services that are not on your local network. It is basically as simple as that. A gateway typically also acts as a firewall and/or network address translation (NAT) device. The firewall either allows or blocks connections to a specific port on a specific ip address. It might have a rule that says drop all traffic, or allow traffic from anywhere, from a network range, or from a specific network address. Network address translation allows you to communicate with the outside world from your desktop on a private non-routable ip address and have the service that you are connecting to know how to get back to you. For example, my home network has an internet router that connects to AT&T. When the router connects to AT&T, it gets a public ip address from the internet provider. This address is typically something like 90.122.5.12 and is routable, meaning it can be reached from anywhere on the internet. The router assigns an ip address to my desktop and uses the address range 192.168.1.0 to 192.168.1.100 to assign the addresses. This implies that I can have 101 devices in my house. When I connect to gmail.com to read my email I do a name search for gmail.com and get back the ip address. My desktop, assigned to 192.168.1.100, does an http get from gmail.com on port 80. This http request is funneled through my internet router, which rewrites the ip header, setting the transmitter ip address to 90.122.5.12. It keeps track of the assignment so that a response coming back from gmail.com gets routed back to my desktop rather than my kid's desktop on the same network. To gmail.com it looks like you are connecting from AT&T and not from your desktop.

It is important to take our discussion back to layer 2 and layer 3 when talking about routing. If we are operating on a LAN, we can use layer 2 multicast to broadcast packets to all computers on our local network. Most broadband routers support all of layer 3 and part of layer 2. You can't really take a video camera in your home and multicast it to your neighbors so that they can see your video feed, but you can do this inside your home. You can ping their broadband router if you know the ip address. Typically the ip address of a router is not mapped to a domain name, so you can't really ask for the ip address of the router two houses down. If you know their ip address you can set up links between the two houses and share video between the houses through tcp/ip or udp/ip. If we want to limit the number of computers that we can put on our home or office network we use subnet masks to limit the ip address range and program the router to look for all ip addresses in the netmask range. The study guide does a good job of describing subnetting. As an example, we can define a network with a network id of 192.168.1.64 by using netmask 255.255.255.192 (a /26), which limits the subnet to 62 usable host addresses (192.168.1.65 through 192.168.1.126). If we put a computer with an ip address of 192.168.1.200 on this network we won't be able to connect to the internet and we won't be able to use layer 2 protocols to communicate with all of the computers on this network. With this configuration we have effectively created a subnet inside our network. If we combine this with the broadcast address that is used when we create our network connection we can divide our network into ranges. The study guide goes through an exercise of setting up a network for different floors in an office and limiting each floor to a fixed number of computers and devices.

One of the design challenges faced by people who write applications is where to layer security and connectivity. Do you configure an operating system firewall to restrict the address ranges that it will accept requests from? Do you push this out to the network and assume that the router will limit traffic on the network? Do you push this out to the corporate or network firewall and assume that everything is stopped at the castle wall? The real answer is yes. You should set up security at all of these layers. When you make an assumption, things fall apart when someone opens an email and lets the trojan horse through the castle gates. If you look at the three major cloud vendors they all take the same basic approach. Microsoft and Oracle don't let you configure the subnet that you are assigned to. You get assigned to a subnet and have little choice in the ip address range for the computers that you are placed upon in the cloud solution. Amazon allows you to define a subnet and ip address range. This is good and bad. It makes routing a little more difficult in the cloud, and address translation needs to be programmed for the subnet that you pick. Going with vendors that assign an ip address range means the routing for that network is hardwired. This optimizes routing and simplifies the routing tables. Amazon faces problems with EC2 and S3 connectivity and ends up charging for data transmitted from S3 to EC2. Bandwidth is limited with these connections partly due to routing configuration limitations. Oracle and Microsoft have simpler routing maps and can put switched networks between compute and storage, which provides a faster and higher throughput storage network connection.

The fun part comes when we want to connect our network, which is on a non-routable subnet, to our neighbors. We might want to share our camera systems and record them into a central video archive. Corporations face this when they want to create a cloud presence yet keep servers in their data center. Last week we talked about hiding a server in the cloud and putting our database where you can't access it from the public internet. This is great for security, but what happens when we need to connect with SQL Developer to the database to upload a new stored procedure? We need to be able to connect to this private subnet and map it to our corporate network. We would like to be able to get to 10.10.1.122 from our network, which is mapped to 192.168.1.0. How do we do this? There are two approaches. First, we can define a secondary network in our data center to match the 10.10.1.0 network and create a secure tunnel between the two networks. Second, we can remap the cloud network to the 192.168.1.0 subnet and create a secure tunnel between the two networks. Do you see a common theme here? You need a secure tunnel with both solutions and you need to change the subnet either at the cloud host or in your data center. Some shops have the flexibility to change subnets in their corporate network or data center to match the cloud subnet (as is required with Oracle and Microsoft) while others require the cloud vendor to change the subnet configuration to match their corporate policy (which Amazon provides).

Today we are not going to dive deep into virtual private networks, IPSec, or secure tunnels. We are going to touch on the subjects and discuss them in depth later. The basic scenario is a database developer working on their desktop who needs to connect to a database server in the cloud, or a Java developer who needs to connect to a Java server in the cloud. We also need to hide the database server so that no one from the public internet can connect to it. We want to limit the connection to the Java server to port 443 for secure https from public ip addresses and allow ssh login on port 22 from our corporate network. If we set a subnet mask, define a virtual private secure network between our corporate network and our cloud network, and allow local desktops to join this secure network, we can solve the problem. Defining the private subnet in the cloud and connecting it to our corporate network is not enough. This goes back to the castle wall analogy. We want to define firewall rules at the OS layer. We want to define routing between the two networks and allow or block communication at different layers and ports. We want to create a secure connection from our SQL Developer, Java developer, or Eclipse development tools to our production servers. We also want to facilitate tools like Enterprise Manager to measure and control configurations as well as notify us of overload or failure conditions.

In summary, there are a variety of decisions that need to be made when deploying a cloud solution. Letting the application developer deploy the configuration is typically a bad idea because they don't think of all of the corporate requirements. Letting the IT security specialist deploy the configuration is also a bad idea; the solution will be so limiting that it makes the cloud services unusable. The architecture needs to be a mix of accessibility, security, and usability. Network configurations are not always the easiest discussion to have, but they are critical to have early in the conversation. This blog is not trying to say that one cloud vendor is better than another but simply trying to point out the differences so that you as a consumer can decide what works best for your problem.
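A quick way to double check the subnet arithmetic used above is the ipcalc utility that ships with most Linux distributions; the flags differ between the Red Hat and Debian versions of the tool, so treat this as a sketch.

# show the network and broadcast for the /26 example (Red Hat style flags)
ipcalc -bn 192.168.1.64/26
# by hand: block size = 256 - 192 = 64, so the subnet runs 192.168.1.64 - 192.168.1.127,
# usable hosts are 192.168.1.65 - 192.168.1.126, and 192.168.1.200 falls in the next block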


Iaas

networking differences between cloud providers

In this blog entry we are going to perform a simple task of enabling an Apache Web Server on a Linux server and look at how to do this on the Oracle Cloud, Amazon AWS, and Microsoft Azure. Last week we did this for the Oracle Cloud but we will quickly review this again. As we go down this path we will look at the different options presented to you as you create a new instance and see how the three cloud vendors diverge in their approach to services. Which version of Linux we select is not critical. We are looking at the cloud tooling and what is required to deploy and secure an instance. Our goals are:
- Deploy a Linux instance into a cloud service
- Enable port 22 to allow us to communicate from our desktop into the Linux instance
- Enable port 80 to allow us to communicate from the public internet into the Linux instance
- Disable all other services coming into this instance
- Use DHCP initially to get an ip address assigned to us, but look at static ip addresses in the end
Step 1: Deploy a Linux instance into a small compute service. Go with the smallest compute shape to save money, go with the smallest memory allocation because we don't need much for a test web server, go with the default network interfaces and have an ip address assigned, go with the smallest disk you can to speed up the process.
Step 1a - Oracle Public Cloud
We go to the Compute Console and click on Create Instance. This takes us through screens that allow us to select an operating system, core count and memory size. When we get to the instance config we have the option of defining network security rules with a Security List. We can either create a new security list or select an existing security list. We will in the end select the default that allows us to connect to port 22 and modify the security list at a later point. We could have selected the WebServer entry from the Security List because we have done this before. For this exercise we will select the default and come back later and add another access point. Once we get to the review screen we can create the instance. The only networking question that we were asked was what Security List definition we want.
Step 1b - Amazon AWS
We go to the EC2 Console and click on EC2 followed by Launch Instance. From the launch screen we select a Linux operating system and start the configuration. Note that the network and subnet menus allow you to deploy your instance into an ip address range. This is different from the Oracle Cloud where you are assigned into a non-routable ip address range based on the server that you are dropped into. Since these are private ip addresses for a single server this is really not a significant issue. We are going to accept the defaults here and configure the ports in a couple of screens. We are going to go with a dhcp public ip address to be able to attach to our web server. We accept the default storage and configure the ports that we want to open for our instance. We can define a new security group or accept an existing security group. For this example we are going to add http port 80 since it is a simple add at this point and move forward with this configuration. We could go with a predefined configuration that allows ports 80 and 22 but for this example we will create a new one. We then review and launch the instance.
Step 1c - Microsoft Azure
We go to the Azure Portal and click on Virtual Machine -> Add which takes us to the Marketplace. From here we type in Linux and pick a random Linux operating system to boot from.
We are assigned a subnet just like we were with the Oracle Cloud and have the ability to add a firewall rule to allow ports 80 and 22 through from the public internet. Once we have this defined we can review and launch our instance.
Step 2: Log into your instance and add the Apache web server. This can easily be done with a yum install httpd command. We then edit the /var/www/html/index.html file so that we can see an answer from the web server.
Step 3: Verify the network security configuration of the instance to make sure that ports 80 and 22 are open.
Step 3a: Oracle Cloud
When we created the instance we went with the default network configuration which only has port 22 open. We now need to add port 80 as an open inbound port for the public internet. This is done by going to the Compute Instance console and viewing our web server instance. By looking at the instance we can see that we have the default Security List associated with our instance. If we have a rule defined for port 80 we can just click on Add Security List and add the value. We are going to assume that we have not defined a rule and need to do so. We create a new rule which allows http traffic from the public internet to our security list WebServer. We then need to go back and add a new Security List to our instance and select WebServer which allows ports 80 and 22.
Step 3b and 3c: AWS and Azure
We really don't need to do anything here because both AWS and Azure gave us the ability to add a port definition in the menu creation system. Had we selected a predefined security list there would be no step 3 for any of the services.
Surprisingly, we are done. Simple network configuration is simple for all three vendors. The key difference that we see is that Amazon and Microsoft give you the ability to define individual port definitions as you create your instance. Oracle wants you to define this with Security Rules and Security Lists rather than one at a time for each instance. All three platforms allow you to configure firewall rules ahead of time and add those as configurations. In this example we were assuming a first time experience which is not the normal way of doing things. The one differential that did stand out is that Amazon allows you to pick and choose your subnet assignment. Oracle and Microsoft really don't give you choices and assign you an ip range. All three give you the option of static or dynamic public ip addresses. For our experiment there really isn't much difference in how any of the cloud vendors provision and administer firewall configurations.
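Pulling step 2 together on the command line, a minimal sketch for an Oracle Linux or other Red Hat style instance looks like the following; the public ip address is a placeholder and the test message matches the one used later in this series:

sudo yum install -y httpd                      # Apache package name on Oracle Linux / RHEL
echo "I am here!" | sudo tee /var/www/html/index.html
sudo service httpd start
sudo chkconfig httpd on                        # start the service again after a reboot
curl http://<instance-public-ip>/              # run from your desktop once port 80 is open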


Iaas

Instance and storage snapshot

Yesterday we went through and created an E-Business Suite 12.2.5 instance on three servers in the Oracle Public Cloud. On previous days we talked about how to protect these instances by hiding the database, removing access from the public internet, and only allowing the application server to connect to the database instance. Today we are going to assume that our work as an architect is done and we need to back up our work. We could go through the standard ufsdump and back up our system to network storage. This only solves half the problem in the cloud. We can restore our data but things are a little different in the cloud world. We need to back up our Security Lists, Security Rules, and instance configurations. We might want to replicate this environment for a secondary dev/test or QA environment, so creating a golden master would be a nice thing to do.
With the Oracle Cloud we have the option of doing an instance snapshot as well as a storage snapshot. This is equivalent to cloning our existing instance and having it ready to provision when we want. This is different from a backup. A backup traditionally assumes a fixed computer architecture and we can restore our operating system bits and application code onto a disk. If we suddenly change and add a virtual private network for communications with our on premise data center, the backup might or might not have that configuration as part of the bits on the network disk. Many customers found that this was the case with VMWare. When you can redefine the network through software defined networks and create virtual disks and virtual network interfaces, these additions are not part of a ufsdump or OS level backup. You really need to clone the virtual disk as well as the configurations. Oracle released snapshots of storage as well as snapshots of instances in the May/June update of cloud services. There really are no restrictions on the storage snapshots but there are a few on the instance snapshots. For the instance snapshot you need to make sure that the boot disk is non-persistent. This means that you don't pre-create the disk, attach it to the instance, and boot from it. The disk needs to have the characteristic of delete upon termination. This sounds very dangerous up front. If you create customizations like adding user accounts to /etc and init files to the /etc/init directory, these get deleted on termination. The key is that you create an instance, customize it, and create a snapshot of it. You then boot from a clone of this snapshot rather than a vanilla image of the operating system.
First, let's look at storage snapshots. We can find more information in the online documentation for the console or the online documentation for the REST API and command line interface. There are some features in the REST API that are worth diving a little deeper into. According to the REST API documentation you can create a snapshot in the same server to allow for faster restores by specifying /oracle/private/storage/snapshot/collocated as a property when you create the snapshot. From this you can create a storage volume from a snapshot. We can do most of these functions through the compute console. We select the storage volume and select the Create Snapshot menu item. We can now restore this snapshot as a bootable disk and can create a new instance based on this volume. We restore by going to the storage snapshot tab, selecting the snapshot, and selecting Restore Volume from the menu.
We can see the restored volume in the storage list.We can create an instance snapshot as well. The key limitation to creating a snapshot from an instance is that the disk needs to be non-persistent. This means that we have a disk that is deleted on termination rather than created and mounted as part of the instance. This is a little confusing at first. If you follow the default instance creation it creates a storage volume for you. You need to delete this storage volume and have it replaced by a ROOT disk that is deleted upon termination. If we walk through an instance creation we have to change our behavior when we get to the storage creation. The default creates a storage instance. We want to remove it and it will be automatically replaced by a nonpersistent volume. Once we have this hurdle removed, we can create an instance snapshot. We select the instance and click on the Create Snapshot from the menu item. If the menu item is greyed out we have a persistent storage volume as our boot image. We can create a bootable image from this snapshot by clicking on the menu for the snapshot and Associate Image with this snapshot. This allows us to create an instance from our image.The key to using instance snapshots is we create a bootable instance, configure it the way that we want and then create a snapshot of this instance. This gives us a golden master of not only the boot disk but of the network and customizations that we have done to the instance. You have to think a little differently when it comes to instance snapshots. It is a little strange not having a persistent root disk. It is a little strange knowing that any customizations will be lost on reboot. It is a little strange knowing that default log files will be wiped out on reboot. You need to plan a little differently and potentially reconfigure your logs, configurations, and customizations to go to another disk rather than a default root disk. If you think about it, this is not a bad thing. The root disk should be protected and not customized. Once you have the customized it should be frozen in time. One key advantage of this methodology is that you can't really insert a root kit into the kernel. These types of intrusions typically need to reboot to load the malware. Rebooting reverts you back to a safe and secure kernel and default libraries. This does mean that any packages or customizations will require a new snapshot for this customization to be persistent.In summary, snapshots are a good way of freezing storage and an instance in time. This is good for development and test allowing you to create a golden master that you can easily clone. It also adds a new level of security by freezing your boot disk with packages that you want and locks out malware that requires reboot. It does add a new layer of thought that is needed in that any package or root file customization requires a new golden image with a new snapshot. Hopefully this helps you think of how to use snapshots and create a best practice methodology for using snapshots.
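For the point above about moving logs and customizations off the non-persistent root disk, a hedged sketch for a Linux guest follows; the device name /dev/xvdc is a placeholder for whatever your attached persistent volume shows up as, and you would still need to repoint individual services at the new location:

sudo mkfs.ext4 /dev/xvdc                         # one-time format of the attached persistent volume
sudo mkdir -p /data
echo "/dev/xvdc /data ext4 defaults 0 0" | sudo tee -a /etc/fstab
sudo mount /data
sudo mkdir -p /data/log
sudo rsync -a /var/log/ /data/log/               # seed the new location, then bind mount or reconfigure services to use it

Remember that edits like the fstab entry live on the root disk, so take a new instance snapshot after making them, otherwise they disappear on the next reboot just as the post describes.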


Iaas

E-Business Suite in the Oracle Cloud

For the last three days we talked about deploying multiple servers and securing the different layers by hiding services in the cloud. Today we are going to go through the installation process and procedures to install E-Business Suite 12.2.5 (EBS) across multiple compute cloud instances. We could use this instance for development and test, quality assurance, or even production work if we wanted to. Note that there are many ways to install EBS. We could install everything into Oracle Compute Cloud Services (IaaS) and install WebLogic and the Oracle Database onto Linux servers. We could install the database component into Oracle Database Cloud Services (DBaaS) and the rest on IaaS. We could also install the database into DBaaS, the application into Oracle Java Cloud Services (JaaS), and the rest in IaaS. The current recommended deployment scenario is to deploy everything into IaaS and bring your own licenses for EBS, Database, WebLogic, and Identity Servers. We are going to go through the tutorial for installing EBS on IaaS for this blog. We are going to go down the multi-node install which first requires installing the provisioning tools to boot all of the other images into standalone instances. We will need at least four compute instances with 500 GB of disk storage to deploy our test. The individual requirements are shown in the diagram below.
Before we can start deploying we must first go to the Oracle Cloud Marketplace and download five EBS bootable images. We start by going to the marketplace and searching for "e-business" images. A list of the images that we need is shown in the diagram below.
Step 1: Download EBS 12.2.5 Fresh Install DB Tier Image. This is done by selecting the image that is returned from the search. When we get to the image page we click on "Get App". This brings up a usage terms screen that we need to accept by clicking OK. Once we have accepted the terms we are presented with a list of cloud instances that we can deploy into. If you don't see a list of servers you need to go into the preferences for your instance and click the checkbox that allows you to provision marketplace apps into your instance. You will also need the Compute_Admin role to provision these boot images. You don't need to go to the compute instance after you download the image. You are mainly trying to copy the DB Tier Image into your private images.
Step 2: Download EBS 12.2.5 Demo DB Tier Image. Unfortunately there is no go back feature so you need to go to the marketplace page, search again for e-business, and select the Demo DB Tier Image.
Step 3: Download EBS 12.2.5 Application Tier Image.
Step 4: Download EBS OS-Only Image.
Step 5: Download EBS Provisioning Tools Image.
Step 6: Verify that all of the images are ready. You should get an email confirmation that the image is ready. You should also be able to create a new instance and see the images in the private images area. You should have five images available and we could create a bootable instance for all of them.
Step 7: Create a compute instance using the Provisioning Tool image. We are going to go with an OC3 instance and accept the defaults. We will create a new security list and rule that allows http access. We do have to select the boot image from the private image list. You get to review this before it is provisioned. This will create an Orchestration that will create the bootable disk and boot the instance.
It will take a few minutes to do this and once it is done we should have all of the provisioning tools ready to execute and deploy our multi-node EBS instance.Step 8:Connect to the server via ssh using opc. Get the ip address from the previous screen. When I first tried to connect I had to add default to the Security List otherwise the connection timed out. Once I added the ssh rule, everything worked as expected.Step 9:change user to oracle and execute knife oc image listYou will need the compute endpoint of the compute service because you will be prompted for it. To find this you need to go to the Compute Dashboard and look at the Compute Detail. The RESTapi Endpoint is shown but for our instance we need to change it a little bit. We have two zones associated with this domain. We want to connect to the z16 instead of the z17 zone. Once we enter the endpoint, identity domain, account id, and account password, we get a list of images that we can boot from. At the bottom of the list we see the EBS images and should be good to go. It is important to repeat that using the z17 zone will not show the images so we had to change over to the z16 zone. This is due to a Marketplace configuration that always deploys images into the lowest numbered zone for your instance.Step 10:Edit /u01/install/APPS/apps-unlimited-ebs/ProvisionEBS.xml and replace the id-domain and user name with the output of the knife command. It is important to note that your substitute command will be a little different from the screen shot below. I also had to change the OS-Image to include the date otherwise the perl script that we are about to execute will fail as well. The file name should be /Compute-obpm44952/pat.shuff@oracle.com/Oracle-E-Business-Suite-OS-Image-12032015 but your instance and user will be different. Step 11:Run perl /u01/install/APPS/apps-unlimited-ebs/ProvisionEBS.pl to start the install. This will ingest the xml file from the previous section and present you a menu system to install the other instances. The system will again ask for the restAPI Endpoint for the compute server, your restAPI Endpoint for storage (go to Dashboard and click on Storage to get this), your identity domain, account, and password again. For our test installation we selected option 3 for a multi-node single application server installation. The perl script then installs chef, pulls cookbooks, and installs the database, app server, and forms server instances into compute instances. This step will take a while. I recommend playing around with all of the options and configurations until you get comfortable with what you are installing. We were going for the demo installation rather than a dev/test installation. We went for a single app node and a single database node. We could have gone for multiple app nodes and gone with demo or dev deployments. Some of the screen shots from this process are below. We called our installation prsEBS so if you see references to this it relates to our installation. The process deploys orchestrations to the cloud services then starts these services in the Oracle Cloud. We can confirm that this is doing what is expected by looking at the Orchestration page under the compute console.When it is complete we will see four instances are running in compute. In summary, we are able to provision multiple instances that comprise a single application, E-Business Suite. This process is well documented and well scripted. Hopefully these screen shots and steps help you follow the online tutorial mentioned earlier. 
What is needed next is to apply the security principles that we talked about in the past few days to secure the database and hide it from the public.
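Condensing steps 8 through 11 into one terminal session, the flow looks roughly like the sketch below; the ip address and identity domain are placeholders and the file paths are the ones used in this example:

ssh opc@<provisioning-instance-public-ip>
sudo su - oracle
knife oc image list        # prompts for the compute endpoint, identity domain, account and password
vi /u01/install/APPS/apps-unlimited-ebs/ProvisionEBS.xml    # set the identity domain, user name, and dated OS-Image name
perl /u01/install/APPS/apps-unlimited-ebs/ProvisionEBS.pl   # option 3 gives the multi-node, single application server install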


Iaas

hiding a server in the cloud

There was a question the other day about securing a server and not letting anyone see it from the public internet. Yesterday we talked about enabling a web server to talk on the private network and not be visible from the public internet. The crux of the question was: can we hide the console and shell access and only access the system from another machine in the Oracle Cloud? To review, we can configure ports into and out of a server by defining a Security Rule and Security List. The options that we have are to allow ports to communicate between the public-internet, sites, or instances. You can find out more about Security Lists from the online documentation. You must have the Compute_Operations role to be able to define a new Security List. With a Security List you can drop inbound packets without acknowledgement or reject packets with acknowledgement. The recommended configuration is to Drop with no reply. The outbound policy allows you to permit, drop without acknowledgement, or reject the packet with acknowledgement. The outbound policy allows you to have your program communicate with the outside world or lock down the instance. By default everything is configured to allow outbound and deny inbound.
Once you have a Security List defined, you create exceptions to the list through Security Rules. You can find out more about Security Rules from the online documentation. You must have the Compute_Operations role to manage Security Rules. With rules you create a name for the rule and either enable or disable communications on a specific port. For example, defaultPublicSSHAccess is set up to allow communications on port 22 with traffic from the public-internet to your instance. This is mapped to the default Security List which allows console and command line login to Linux instances.
For our discussion today we are going to create a new Security List that allows local instances to communicate via ssh and disables public access. We will create a Security Rule that creates the routing locally on port 22. We define a port by selecting a Security Application. In this example we want to allow ssh which corresponds to port 22. We additionally need to define the source and destination. We have the choice of connecting to a Security List or to a Security IP List. The Security IP List is either to or from an instance, the public internet, or a site. We can add other options using the Security IP List tab on the left side of the screen. If we look at the default definitions we see that instance is mapped to the instances that have been allocated into this administrative domain (AD). In our example this maps to 10.196.96.0/19, 10.2.0.0/26, 10.196.128.0/19 because these three private ip address ranges can be provisioned into our AD. The public internet is mapped to 0.0.0.0/0. The site is mapped to 10.110.239.128/26, 10.110.239.0/26, 10.110.239.192/26. Note that the netmask is the key difference between the site and instance definitions.
Our exercise for today is to take our WebServer1 (or Instance 1 in the diagram below) and disable ssh access from the public internet. We also want to enable ssh from WebServer2 (or Instance 2) so that we can access the console and shell on this computer. We effectively want to hide WebServer1 from all public internet access and only allow proxy login to this server from WebServer2.
The network topology will look like the diagram below.
Step 1: Go through the configuration steps (all 9 of them) from two days ago and configure one compute instance with an httpd server, ports open, and firewall disabled. We will call this instance WebServer1 and go with the default Security List that allows ssh from the public internet.
Step 2: Repeat step 1, call this instance WebServer2, and go with the default Security List that allows ssh from the public internet.
Step 3: The first thing that we need to do is define a new Security List. For this security list we want to allow ssh on the private network and not on the public network. We will call this list privateSSH.
Step 4: Now we need to define a Security Rule for port 22 and allow communication from the instance to the privateSSH Security List that we just created. We are allowing ssh on port 22 on the 10.x.x.x network but not the public network.
Step 5: We now need to update the instance network options for WebServer1, add the privateSSH Security List item, and remove the default Security List. Before we make this change we have to set up a couple of things. We first copy the ssh keys from our desktop to the ~opc/.ssh directory to allow WebServer2 to ssh into WebServer1. We then test the ssh by logging into WebServer2 and then ssh from WebServer2 to WebServer1. We currently can ssh into WebServer1 from our desktop. We can do this as opc just to test connectivity.
Step 6: We add the privateSSH list, remove default, and verify the Security List is configured properly for WebServer1.
Step 7: Verify that we can still ssh from WebServer2 to WebServer1 but can not access WebServer1 from our desktop across the public internet. In this example we connect to WebServer1 as opc from our desktop. We then execute step 6 and try to connect again. We expect the second connection to fail.
In summary, we have taken two web servers and hidden one from the public internet. We can log into a shell from the other web server but not from the public internet. We used web servers for this example because they are easy to test and play with. We could do something more complex like deploy PeopleSoft, JDE, E-Business Suite, or Primavera. Removing ssh access is the same and we can open up more ports for database or identity communication between the hidden and exposed services. The basic answer to the question of "can we hide a server from public internet access" is yes. We can easily hide a server with Security Lists, Security Rules, Security IP Lists, and Security Applications. We can script these in Orchestrations or CLI scripts if we wanted to. In this blog we went through how to do this from the compute console and provided links to additional documentation to learn more about using the compute console to customize this for different applications.
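Step 5 is the easiest place to lock yourself out, so here is a hedged sketch of the key staging and the connectivity test to run before swapping the security lists; the addresses are placeholders and both instances are assumed to use the default opc account. Forwarding your agent with ssh -A is an alternative that avoids copying the private key at all:

scp ~/.ssh/id_rsa opc@<WebServer2-public-ip>:.ssh/id_rsa    # run from the desktop that holds the key
ssh opc@<WebServer2-public-ip>
chmod 600 ~/.ssh/id_rsa
ssh opc@<WebServer1-private-10.x-address>                   # must work before you remove the default list from WebServer1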


Iaas

Networking 102 - part 2

Yesterday we looked at what it takes to start an Apache Web Server on a Linux instance in the Oracle Cloud. We had to create a security rule and security list, associate them with a running instance, and configure the firewall on the operating system. We took a castle defense strategy, turned off the firewall on the operating system, and are trusting that the external cloud firewall into our server is good enough. Today we are going to drop that assumption, spin up a second server in the same compute zone, and configure secure communication between the two servers. The idea is that if we have a database server as a back end to a shopping cart, we want to hide the database from public access. We want to be able to store credit card information, customer addresses, and phone numbers and do so securely. We don't want to expose port 1521 to the public internet but only expose it to our Web/Application server and keep it secure in our cloud instance from anyone hacking into it. To achieve this level of security we will again look at our network diagram and realize that we can communicate on the private network interface rather than the public interface that is facing the public internet. What we want to do is change the network configuration on instance 1 to only listen for traffic from instance 2 and have instance 2 open to the public internet. We will do this with an Apache Web Server because it is easy to configure and test.
Step 1: Go through the configuration steps (all 9 of them) from yesterday and configure one compute instance with an httpd server, ports open, and firewall disabled. We are going to fix the security issues in the upcoming steps and make this instance only accessible from the second instance that we are about to spin up.
Step 2: Create a second Oracle Linux instance on the same compute cloud. This is done by going into Create Instance, selecting Oracle Linux 6.6, and accepting the default configurations. A few minutes later we should have a second Linux instance that we can play with. Note that we could have cheated by creating a snapshot of our first instance and spinning up a new instance based on the first instance. This would save us a few steps and configuration options if our installation were more complex. We will save that for another day. Today we will provision an Oracle Linux 6.6 instance with WebServer as the security list, which opens up ports 80 and 22 for the instance. We accept all of the other defaults.
Step 3: Log into our instance as opc by getting the public ip address from the compute console and using ssh on a Mac or putty on Windows. Once we log in we install wget as a package so that we can read web pages from WebServer1.
Step 4: We can now read the web page from WebServer1 by getting the index.html page from the public ip address as well as the private ip address. We find these ip addresses from the compute console.
Step 5: Now that we can read from the private ip address, we can turn off the public ip address for WebServer1 and communicate on the 10.196 network. This is done by changing the Security List from WebServer back to default for WebServer1. We add the default entry to the security list and remove the WebServer entry from the security list.
Step 6: We can test the interface by repeating the read from the http server. We will get a timeout on the public ip address and a timeout on the private ip address as well.
We will need to create a new security list that allows network communications from the 10.196 network on port 80 to get to the server.
Step 7: We need to define a new Security List that allows port 80 on the 10.196 network. This is done by going to the Network tab on the compute console and defining a new security list. We will call it privateHttp. Once we have this defined we will allow http on port 80 on the private network but not the public network. We create a security rule called privateWebServer that allows us to go from an instance using port 80 to local instances. Once we have this defined we need to add privateHttp to the security list for the WebServer1 instance.
Step 7a - add privateHttp to the security list
Step 7b - add privateWebServer to the security rule
Step 7c - associate the new security list with the instance
Step 8: Verify that we have connectivity on the private network but no connectivity on the public network.
In summary, we created a new compute instance and reconfigured the network for our two compute instances. The goal was to set up our WebServer2 instance so that we could serve the public internet with an Apache Web Server. Note that we did not go through these steps because we did this yesterday. We wanted to have WebServer2 talk to WebServer1 but do it on the private network and not have WebServer1 accessible from the public internet. We used an Apache Web Server as the example because it is easy to configure. We could have made this an identity server, a database, a file server, or any service that we want. The key difference would be the port that we create for the communication and the security rule/list. Think of running E-Business Suite or JD Edwards. We really don't want port 1521 of our database exposed to the public internet but we do want the http server exposed. If we run the ERP database on a separate server we need a secure way of communicating with our WebLogic server that is running the ERP logic without exposing driver's license numbers, credit cards, or private information to the public cloud. Hopefully this example allows you to take this concept with web servers and deploy more complex systems into the public cloud securely. It is important to note that we didn't fix the iptables issue and still have the firewall turned off for the Linux instance on WebServer1. This is not best practice but we will leave that for another day.
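A hedged version of the step 8 verification, run from WebServer2 (wget was installed back in step 3); the addresses are placeholders for the private and public interfaces of WebServer1:

wget -qO- --timeout=10 http://<WebServer1-private-ip>/      # should return the index.html contents
wget -qO- --timeout=10 http://<WebServer1-public-ip>/       # should now time out from everywhere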


Iaas

Networking 102

This week we are going to go through some basic networking tutorials. It is important to understand how to do some simple stuff before we can do more complex stuff. We are going to start out by deploying an Oracle Linux instance, installing the Apache httpd service, and opening up port 80 to the world. This is a simple task but it helps us understand how to find the ip address of your instance, how to open a port in the operating system as well as the cloud compute console, and how to connect to the instance from your desktop and from a second instance that we will spin up. Later this week we will spin up two Oracle Linux instances and configure an Apache httpd service on one but configure it only to talk to the other instance in the cloud.
Step 1: Open the Oracle Compute Console and Create Instance. We are going to create an Oracle Linux 6.6 instance. We are not going to do anything special but accept the defaults. We will call our server WebServer1 and give it the default network connection so that we can ssh into the instance. After a few minutes we should have a Linux instance that has just port 22 open and we can ssh into the server. We don't have an Apache Web Server installed, and if we did, port 80 is locked down in the operating system and cloud networking interfaces.
Step 2: Connect to our instance with ssh to verify that we have command line access. We connect as opc so that we can execute commands as root. In this example we do this with the ssh command in a terminal window from MacOS. We could have just as easily used putty from a Windows box to make the connection.
Step 3: Add the Apache httpd software to the Linux instance with yum. We could just as easily have downloaded the software from apache.org and installed it that way but yum allows us to do this quickly and easily in one step. You need to make sure that you logged in as opc in the previous step because the sudo command will not work if you logged in as oracle. Note that the first time that you run this command it will take a while because you have to download all of the manifests for the different kernel versions and check for dependencies. The httpd package does not need many extras so the install is relatively clean. It will take a while to download the manifests but the actual install should not take long.
Step 4: Configure the httpd software to run by editing the index.html file and starting the service. Note that this will not allow us to see the service anywhere other than on this computer because we need to enable port 80 in the operating system and in the cloud service to pass the requests from the client to the operating system.
Step 5: Configure the cloud service to pass port 80 from the public internet to our instance. This is done in the Compute Console by clicking on the Networking tab and creating a new Security List. In this example we are going to create a new list that includes http and ssh as the protocols that we will pass through. We first create a Security List. We will call it WebServer.
Step 6: Configure port 80 as a Security Rule for the Security List that we just created. We create a rule for http and a rule for ssh. We then verify that the new rule has been created. Note that our instance is associated with the default rule. We need to change that in the next step.
Step 7: Associate our new rule with our instance. This is done by going into the Instance tab and clicking on View instance. We want to see what Security List is associated with our instance and change it.
We are initially connected to the default list which only contains ssh. We want to add the WebServer list and then delete the default list. The resulting list should only contain our WebServer list which enables ssh and http. We can now easily add https or sftp if we wanted to, to help maintain our web server without affecting any other instances that are using the default rule/list.
Step 8: We now need to open up the ports in the operating system. This is done by modifying the SELINUX interface and the iptables interface. We want to let traffic come into the server on port 80 so we can either turn off these services or add an iptables rule to allow everything on port 80 to pass through. We can disable all firewall rules by turning off the SELINUX services and iptables as shown below. It is not recommended to do this because it opens up all ports and makes your operating system vulnerable to attacks if other ports are open to this machine or other machines inside the same rack that you are running in. You can either watch a video or execute the commands shown on a tutorial web site that disables SELINUX and iptables. The important thing is to set SELINUX=disabled and turn off the iptables services for all of this to work.
Step 9: To test the changes, open a browser and try to attach to the Apache server. We should be able to go to the public ip address with a simple web client and get the index.html file. We should get back the message "I am here!" on the web page. Again, this is the insecure way of doing this. We really want to customize iptables to allow port 80 to pass and deny everything else that is not ssh.
In summary, we configured a Linux server, installed the Apache httpd, and configured the network rules at the cloud console and at the operating system to allow traffic to pass from the public internet into our compute instance. We are blocking all traffic at the cloud interface other than ports 80 and 22. Even though it is poor practice, we disabled the firewall on the compute operating system, are allowing all traffic in, and are using our cloud instance as a firewall. This is not good practice because other compute services in the data center can access these open ports. We will dive deeper into that tomorrow and look at turning the operating system firewall back on and configuring it properly. We will also look at inter-server communications inside the data center to allow hiding services from public access but allowing our front end public facing server to access the services securely.
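For the tighter iptables configuration mentioned in step 9, a minimal sketch run as root on the instance would admit only ssh and http and drop everything else; treat it as a starting point rather than a hardened policy:

iptables -F INPUT
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT    # ssh
iptables -A INPUT -p tcp --dport 80 -j ACCEPT    # http
iptables -P INPUT DROP
service iptables save                            # persist the rules across reboots on Oracle Linux 6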


Iaas

New features in the Oracle Compute Cloud

Today Oracle updated the Oracle Compute Cloud Service by adding three new features:
- Integration of the Oracle Marketplace into the Compute Cloud Console to make it easier to deploy custom solutions
- Expanding the functionality of Backup Services to snapshot Compute Instances and clone them from snapshots
- Making it easier to import existing virtual machines into the Oracle Cloud and making these images available to the Public and Private Marketplace
We talked earlier this week on pulling an image from the Oracle Marketplace. Previous to today you had to go to cloud.oracle.com/marketplace, setup preferences to link your account to your compute account, get an app from the marketplace, and provision the instance through the compute cloud. Today we just need to go into the create instance menu system from the Compute Console and select an image from the Marketplace to provision into a compute instance. This modification reduces the number of steps required to use the Marketplace as well as making it easier to provision preconfigured solutions into a compute instance or set of compute instances. Note the Marketplace tab below the Private Images. A list of instances in the Marketplace along with a search engine are integrated into the provisioning of a new compute instance.
Compute instances now also have a backup tab similar to the monitor tab that was introduced a few weeks ago. This allows you to create snapshots of whole instances, save the snapshot as a bootable image, and provision new instances based on this snapshot. This allows you to provision a predefined instance from a default OS or Marketplace instance, customize it, take a snapshot, then provision new instances from the customized installation.
The third new feature release is for users to have the ability to import VMWare instances directly into the Compute Cloud private images. The goal of this release is to allow users to import VMDK formatted images into the Oracle Cloud and run them with little or no modifications. This includes defining multiple network interfaces at import time rather than having to go back and configure multiple interfaces after the fact. The import does not require the users to modify the network drivers before importing but leverages the experience of the Ravello team for translating VMWare definitions into Oracle Compute Cloud definitions using Orchestration to create the network definition and provision it as the compute instance is started.
In summary, three new features were released today to make it easier to use the Oracle Compute Cloud Service. This is an ongoing improvement of services to help allow for frictionless migration of services from your data center into the cloud. These improvements along with those that will be announced between now and Oracle OpenWorld in September will help users treat the Oracle Public Cloud as an extension of their own data center for capacity expansion, disaster recovery, and development and test.


Iaas

Ravello cloud virtualization

Yesterday we talked about what it would take to go from a bare metal solution or virtualized solution in your data center to a cloud vendor. We found out that it is not only difficult but it requires some work to make it happen. There are tools to convert your VMDK code to AMI with Amazon or VHD with Microsoft or tar.gz format with Oracle. That's the fundamental problem. There are tools to convert. You can't just simply pull a backup of your bare metal install or the VMDK code and upload and run it. Java ran into this problem during the early years. You could not take a C or C++ bundle of code, take the binary and run it on a Mac or Linux or Windows. You had to recompile your code and hope that the libc or libc++ library was compatible from operating system to operating system. A simple recompile should solve the problem but the majority of the time it required a conditional compile or a different library on a different operating system to make things work. The basic problem was that things like network connections or reading and writing from a disk were radically different. On Windows you use a backslash and on Linux and MacOS you use a forward slash. File names and lengths are different and can or can't use different characters. Unfortunately, the same is true in the cloud world. A virtual network interface is not the same between all of the vendors. Network storage might be accessible through an iSCSI mount, an NFS mount, or only a REST API. The virtual compute definition changes from cloud vendor to cloud vendor, thus creating a need for a virtualization shim similar to the programming shim that Java provided a few decades ago. Ravello stepped in and filled this gap for the four major cloud vendors.
Ravello Systems stepped in a few years ago, took the VMDK disk image proposed by VMWare, and wrote three components to virtualize a cloud vendor to look like a VMWare system. The three components are nested virtualization, software defined networking, and virtual storage interfaces. The idea was to take not only a single system that made up a solution but a group of VMWare instances and import them into a cloud vendor unchanged. The user took a graphical user interface, mapped the network relationships between the instances, and deployed these virtual images into a cloud vendor. The basis of the solution was to deploy the Ravello HVX hypervisor emulator onto a compute instance in the cloud vendor for each instance, then deploy the VMWare VMDK on top of the HVX instance. Once this was done the storage and network interfaces were mapped according to the graphical user interface connections and the VMDK code could run unchanged.
Running a virtual instance unchanged was a radical concept. So radical that Oracle purchased Ravello Systems early this spring and expanded the sales force of the organization. The three key challenges faced by Ravello were that 50% of the workloads that run in customer data centers do not port well to the cloud, many of these applications utilize layer 2 IP protocols which are typically not available in most cloud environments, and VMWare implementations on different hardware vendors generate virtual code and configurations different enough to make it difficult to map them to any cloud vendor. The first solution was to virtualize the VMWare ESX and ESXi environment and layer it on top of multiple cloud vendor solutions. When an admin allocates a processor does this mean a thread as it does in AWS or a core as it does in Azure and Oracle?
When a network is allocated and given a NAT configuration, can this be done on the cloud infrastructure or does it need to be emulated in the HVX?The nested virtualization engine was designed to run VMWare saved code natively without change. Devices from the cloud vendor were exposed to the code as VMWare devices and virtual devices. The concept was to minimize the differences between different cloud solutions and make the processor and hypervisor look as much like ESX and ESXi as possible. HVX employs a technology called Binary Translation to implement high-performance virtualization that does not require these virtualization extensions. When virtualization extensions are available, the easiest way to implement the illusion is using "trap and emulate" .Trap and emulate works as follows. The hypervisor configures the processor so that any instruction that can potentially "break the illusion" (e.g., accessing the memory of the hypervisor itself) will generate a "trap". This trap will interrupt the guest and will transfer control to the hypervisor. The hypervisor then examines the offending instruction, emulates it in a safe way, and then it will allow the guest to continue executing. HVX, the Ravello hypervisor, uses a technology called binary translation. Unlike the trap-and-emulate method, binary translation does work when virtualization extensions are not available.Pure L2 access is difficult and VLANs, span ports, broadcast/multicasting usually do not work. Ravello allows you to run existing multi-VM applications unmodified in the cloud, not just single virtual machines. To make this possible, Ravello provides a software-defined network that virtualizes the connectivity between the virtual machines in an application. The virtual network is completely user-defined and can include multiple subnets, routers, and supplemental services such as DHCP, DNS servers and firewalls. The virtual network can be made to look exactly like a datacenter network. The data plane of the virtual network is formed by a fully distributed virtual switch and virtual router software component that resides within HVX. Network packets that are sent by a VM are intercepted and injected into the switch. The switch operates very similar to a regular network switch. For each virtual network device, the virtual switch creates a virtual port that handles incoming and outgoing packets from the connected virtual NIC device. Ravello’s storage overlay solution focuses on performance, persistence and security. It abstracts native cloud storage primitives such as object storage and various types of block devices into local block devices exposed directly to the guest VMs. Everything from the device type and controller type to the location on the PCI bus remains the same. Hence it appears to the guest as-if it was running in its original data-centre infrastructure. This allows the guest VM to run exactly as is with its storage configuration as if it was running on premises. Cloud storage abstraction (and presentation as a local block device), coupled with the HVX overlay networking capabilities allows for running various NAS appliances and their consumption over network based protocols such as iSCSI, NFS, CIFS and SMB. These block devices are backed by a high performance copy-on-write filesystem which allows us to implement our multi-VM incremental snapshot feature.We could walk through a hands on lab developed by the Ravello team to show how to import a Primavera on site deployment into the Oracle Compute Cloud. 
The block diagram looks like the picture shown below. We import all of the VMDK files and connect the instances together using the GUI based application configuration tool. Once we have the instances imported we can configure the network interfaces by adding a virtual switch, virtual gateway, virtual nic, assigning public IP addresses, and adding a VLAN to the configuration.Ravello allows us to define features that are not supported with cloud vendors. For example, Amazon and Microsoft don't allow layer 2 routing and multicast broadcasting. VMWare allows for both. The HVX layer traps these calls and emulates these features by doing things like ping over TCP or multicast broadcasts by opening connections to all hosts on the network and sending packets to each host. In summary, Ravello allows you to take your existing virtualization engine from VMWare and deploys it to virtually any cloud compute engine. The HVX hypervisor provides the shim and even expands some of the features and functions that VMWare provides to cloud vendors. Functions like layer 2 routing, VLAN tagging, and multicast/broadcast packets are supported through the HVX layer between instances.


Iaas

uploading custom boot image for compute cloud

One of the things that we talked about yesterday was uploading an image for a bundled solution into the Oracle Cloud Marketplace. Partners can register to bundle and sell solutions on the Oracle Cloud by providing a bootable image and having that image run in the Oracle Compute Cloud. Some examples of this are:
- JDE 9.2 Trial Edition by Oracle
- WordPress on Oracle Linux 6.7 by Bitnami
- Chef Server by Chef.io
All of these solutions started with a base operating system and layered other products onto the operating system. They then took the resulting virtual machine configuration, bundled it into a tar.gz file, and uploaded it to the Marketplace. Once this is done it can be offered as a product through the search engine and tracked, marketed, and sold by the partner. The resulting boot image is loaded into your private images and allows you to boot these instances as a compute service where you select the processor, memory, disk, and network configurations. The beauty of this interface is that it is the same one that you would use if you wanted to create a custom boot image of your own. This enables features like snapshots of instances so that you can clone instances, replicate instances for scaling, and back up instances to a common storage location for fast disaster recovery.
Before we dive into how to create a bootable image for the Oracle Compute Cloud, let's review some basics so that we have the same terminology and a common language to discuss how to take an instance from your data center and run it in the Oracle Compute Cloud. We will also look at the steps needed for AWS and Azure and what it takes to upload an image to their compute clouds. Some of the different ways that you can run an application in your data center are:
- Bare Metal - load an operating system on a computer and load applications on the operating system. Back up the instance and restore it into the cloud
- Oracle VirtualBox - create a virtual instance on your desktop/laptop/server and convert the resulting OVA/VDI into an uploadable format
- VMWare ESX - create a virtual instance on your VMWare cluster and convert the resulting VMDK images into an uploadable format
- Citrix or public domain Xen Server - create a virtual instance on your Xen server and convert the VMDK images into an uploadable format
- Microsoft HyperV - create a virtual instance on your HyperV server and convert the VHD image into an uploadable format
The key difference between all of these steps is the conversion into an uploadable format for the target cloud provider. Oracle Compute Cloud currently requires that you create a tar.gz binary from a supported operating system and load it into your private image space. Documentation on creating your own image is available from Oracle and goes into detail on what operating systems are currently supported and references a Tutorial on building a LAMP Stack instance. In this example the tutorial uses VirtualBox to boot from an iso that is downloaded from edelivery.oracle.com, which is the traditional way of building an image on bare metal or a virtual server. We will not go through the installation process with images as we usually do because the tutorial does a very good job of showing you how to do this. The process takes a while (budget a half day) to set up VirtualBox, download the iso image, load and boot the Linux instance, patch the instance, and install the Apache Web Server, MySQL, and PHP. The most difficult part of this is configuration of the network. The tutorial suggests that you turn off selinux and iptables.
We suggest that you go to the extra effort to enable these services and open up only the ports needed to communicate with your services. Turning these protections off inside any cloud provider's internal network and relying upon the external firewall and routing rules is a slippery slope to insecurity. We suggest multiple layers of security at the cloud admin layer, at the network configuration layer with whitelist routing, and at the operating system layer. You might need to first allow connections to port 80 from anywhere at the operating system layer and tighten up these rules once you know where your clients are coming from, but turning off security in the operating system is not recommended.
Microsoft provides a writeup on how to import a Linux image which could be configured with the LAMP stack. It is important to note that the article assumes that you are starting with a VHD file or have converted your existing file format into VHD for upload. The writeup also assumes that you have the Azure Command Line or PowerShell configured and installed, which again assumes that you are starting with a Windows desktop to do the heavy lifting. There are a few blogs that detail how to do this. Most recommend using Bitnami to load images, or loading a Linux image and rebuilding the instance in Azure rather than building one of your own and uploading it. Most of the blog entries talk about having the para-virtual drivers installed to enable HyperV to work properly once the image is running.
Amazon provides good documentation and tutorials on importing images from VMWare, Windows, and other formats. Their tools allow for either command line or web based imports and convert them to the Amazon AMI format. It is important to note that once you have imported the images you do need to reconfigure networking and security, as must be done with almost all of the other cloud vendor solutions. The only true exception to this is Ravello Systems which is available in the Amazon, Google, Azure, and Oracle Clouds. Ravello allows you to import your VMWare images unchanged and run them in all four cloud providers. Note that this is different from converting it to a runnable image in the cloud provider format. Ravello uses a hypervisor emulator that uploads a shim to translate VMWare calls into cloud provider calls and interfaces.
In summary, all of the cloud providers allow you to import existing in house images into their cloud. All use a format translator to translate the original format into the cloud provider format. The single exception is using the Ravello import utility that takes the VMWare format, imports it unchanged, and runs it in the cloud provider of your choice. The key difference between the different import mechanisms is what tools are needed. Do you need to start with a Windows desktop and use PowerShell to import the image? Do you need to install a command line utility that runs on most operating systems and converts your image to a custom image? Can you upload and download the image from the cloud provider to maintain version control in your data center? Does the cloud provider have the ability to run these images in your data center as if they were cloud services on a machine under your security umbrella? The more we dive into these topics, the fewer answers we seem to get and the more questions we find.
Hopefully today's blog gives you more insight into running existing applications on different cloud vendor platforms and what control you give up, what options you have, and what it takes to import virtual images from your data center into a cloud solution.


Iaas

Cloud Marketplace part 2

The Oracle Cloud Marketplace is a location where customers can discover partner provided applications and services that complement the Oracle Public Cloud services. It is also a location where partners can list and promote cloud based applications and services that extend, integrate with, or build on Oracle's Public Cloud services. This is an interesting concept that has various ways of joining customers to partners and giving partners a view into what customers are looking at, downloading, and executing. There are over 1500 apps, both commercial and public domain compiled source, available from over 1500 partners and system integrators. Some applications are helper apps like CloudBerry Explorer that assist you in consuming cloud storage services. Other applications are Oracle provided deployments of E-Business Suite running on infrastructure as a service for development and test. The Marketplace also lists system integrators that specialize in cloud services. Deloitte is an example of a Cloud Elite Systems Integrator who provides a variety of consulting services to help build custom cloud solutions for customers.
For partners, there are tools that allow you to develop, market, and sell applications. The development tools allow you to build binaries that you can download to use as tools to access cloud services, bundles to upload to the cloud, or images that can be launched in the cloud. Tools also exist to help you market your application once it is built and uploaded. You can look at page views, who downloaded your application, and geographic data about the customers to help target marketing campaigns. There are also tools to help with revenue capture and making money from your application, either through the Marketplace page or by redirection to your own page. There are lead generation integrations into the partner portal to help with follow up and calling campaigns for customers that are downloading and using applications. Partners must be silver level or above and get this service for free by signing up through the Oracle Partner Portal. Partners also have access to BI reports to look at trending, usage, and utilization of the applications that they have listed. These reports are designed with partners like Bitnami in mind, who list hundreds of public domain compiled images on the Oracle Cloud for compute resources. The reports help them look at the most popular downloads, packages that are trending, feedback from customers both positive and negative, as well as the age of packages and which ones might need updating.
Customers can search for applications based on key words, categories of applications, and company names. You do need to enable the Compute_Operations role before you deploy an image into the Oracle Cloud, as well as go into the Properties tab at the top right of the Cloud Console and create a linkage between the Marketplace and your cloud account. In summary, tools exist for customers and partners to help link people who want to create applications with people who want to use applications. There is a review mechanism to help with feedback as well as notifications of updates and changes to applications. This tool is constantly being updated and changing, so don't be surprised if the screen shots that you see here are a little different when you visit the pages.


Iaas

Cloud Marketplace

At OpenWorld 2013, Oracle announced the Cloud Marketplace. This is a place where customers can download or purchase solutions that are not stock out of the box images to boot from. For example, if you want a LAMP stack (Linux, Apache, MySQL, PHP) you could boot a Linux image, install the three other components, and update the operating system with the latest patches. Alternatively you could search for LAMP in the Marketplace and get an image in a few minutes rather than building one.
The logic behind the Marketplace is that you have an account that is authorized to create a Compute Instance and you give the account permission to upload bootable images from the Marketplace. You also have to go into the Preferences menu and check the box that allows you to use this account to upload images. Once you have a user that has permission to create compute instances and a user that has permission to upload images, you can select an image and upload it to a cloud account. This is done easily by searching for an image. There are free images and images that you pay for. The pay images are typically bundled solutions that consist of one or more services with production quality software to solve a specific business problem. Oracle has many partners that provide both pay and free services for customers to use. If, for example, we wanted to run a LAMP image, we would search for LAMP in the Marketplace. We have the option of selecting a variety of Linux implementations (Oracle Linux, Ubuntu, etc) and loading this selection as a boot image. We will select an Oracle Linux implementation as an example.
When we select Get App it asks us to agree to legal terms and select the instance that we want the app to run in. It is important to note that the instance will default drop into the first zone available in your list. If your account has multiple zones associated with it, the bootable image will be dropped into the first zone and might or might not be available in the other zones. This is a known bug with the Marketplace that should be fixed, if it has not already been fixed, at the time of this posting. Once you are done uploading the bootable disk you get a link to launch the compute console. We can either continue with the provisioning of the instance by following the wizard or exit out and launch a new instance. It is important to note that the new boot image will be listed as a private image since we uploaded it from the Marketplace. This allows us to load public or private images from a public or private Marketplace.
In June the Marketplace was changed to take advantage of Orchestrations. We talked about how these work two weeks ago and went through the configuration files. This week we will just accept the fact that these files will spin up our instance with an Apache Web Server, MySQL, and PHP. The entire process took just a few minutes and we have a running server. We can spin up new instances by copying the json Orchestration files and launching as many instances as we would like (see the sketch at the end of this post).
In summary, we used the Cloud Marketplace to load a pre-configured instance and launch it through either a compute wizard or by launching a new instance. You can learn more about the Marketplace in the online Cloud Marketplace documentation. Tomorrow we will talk about how to register as a partner to create images that you can sell or use as marketing tools to promote the name of your company.
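As promised above, here is a rough Python sketch of what copying an Orchestration file to clone an instance might look like. The file and label names are hypothetical and the real files contain more fields than shown here; this just illustrates the copy and rename step.

import json

# Load the orchestration file that the Marketplace generated for us.
with open("lamp_instance.json") as f:
    plan = json.load(f)

# Give the copy a unique name and label so it can run alongside the original.
plan["name"] = plan["name"] + "_copy1"
for oplan in plan.get("oplans", []):
    oplan["label"] = oplan.get("label", "") + "_copy1"

with open("lamp_instance_copy1.json", "w") as f:
    json.dump(plan, f, indent=2)

# The new file can then be uploaded and started from the compute console
# or the orchestration command line / REST interface.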


Iaas

networking wrap up

Today we are going to wrap up our detailed discussion with networking. We will probably revisit network performance at a later time but we are going to get out of the weeds today and talk about things at a slightly higher level. Yesterday we looked at screen shots of the Oracle, AWS, and Azure security lists and rules. It is important to note that Azure only offers TCP and UDP as rules. You can't define firewall and routing rules for other protocols. Amazon allows for ICMP firewall and routing rules, which Oracle also allows. The big question in all of this becomes "so what". So what if you don't support ICMP? What functionality do you lose if you don't provide this packet header on top of IP?
First, let's review what ICMP is. ICMP stands for Internet Control Message Protocol and is defined in RFC 792. It is primarily used for diagnostics and control as well as discovery of services on a network. ICMP packets are treated differently from normal IP packets because they typically require a response to a query or an error code to be returned as part of the response. These packets are good for testing latency and connectivity between machines that need to traverse complex networks. Oracle uses this protocol as part of the keep alive heartbeat of a Database RAC configuration. The networking requirements of RAC also require ARP support as well as multicast within the same subnet. This basically means that RAC will never work on an Azure compute cluster because ICMP and multicast are not supported. Amazon has written a whitepaper on running RAC on AWS but it is not recommended to run RAC with a multi-host configuration that simulates shared storage. This looks like a good science experiment but not a production solution. Amazon basically engineered around the multicast requirements needed for shared storage by creating a message protocol at the operating system layer.
ICMP is also good for network discovery and monitoring network integrity. Oracle also uses this protocol with the Oracle Advanced Support Gateway. The support gateway is a tool that Oracle uses to manage services inside a customer data center. The Oracle Cloud Machine and Exadata/SuperCluster Managed Services products also use ICMP to verify the integrity of the network and report timing and connectivity issues when something happens on the network. The typical structures that are used with the gateway are the type 8 echo request (the standard ping) and the type 0 echo reply that comes back, which together show network viability and are what tools like ping and traceroute build on. Error codes returned in type 3 (destination unreachable) messages are also inspected to see if network configurations have changed or been modified. Oracle Enterprise Manager also uses ICMP packets as part of beacon communication and network discovery operations. This does imply that you can not use Enterprise Manager to manage host targets on Azure but you can on AWS. I have successfully configured Linux hosts as well as database instances in Amazon RDS and connected them to Enterprise Manager. This is a very powerful feature that allows you to schedule backups, replicate data from on premise to the cloud, and examine changes in a dev/test environment in the cloud and apply them to your production environment. Without ICMP you lose the ability to connect Enterprise Manager and these higher level functions. According to some Microsoft msdn blogs you can add ICMP to a Windows VM inside their firewall so you can ping between VMs, but going across the internet is not necessarily supported.
A second msdn blog suggests using alternate applications like TCPing and NMap to work around not having ICMP. This does not solve connecting with Enterprise Manager unless you run an OEM instance in Azure to monitor and measure all servers running in Azure.
The second protocol that Oracle supports that Amazon and Microsoft do not support is the GRE protocol. GRE stands for Generic Routing Encapsulation and is defined in RFC 2784 and RFC 2890. This protocol is used for point to point tunneling and with IPSec for passing routing information between connected networks. Oracle currently uses this protocol to create Corente VPN services. The protocol was designed by Cisco Systems, so connecting a Cisco router for a VPN connection is relatively easy with this protocol. You can connect with other protocols for a VPN connection but this layer has the mechanisms to keep routing tables in sync without having to run applications to talk to routers and update maps. This is more a matter of networking efficiency than the functionality gap that we saw with ICMP support. We will talk about VPN services in a later blog. The general concept behind a VPN is that you would like the computers in your data center to talk to computers in the cloud as if they were on your corporate network. If you create a virtual private network, the ip addresses that the computers in your data center use are the same ip addresses that are used in the cloud. The VPN creates a routing protocol that translates the virtual ip addresses, which are typically non routable addresses, to an actual address that gets routed across the internet to the cloud provider. A VPN server on the cloud side then translates these packets back to the non routable ip address and it looks like the request came from a machine on the local network. This simplifies network topology and configuration because we can extend our corporate network into a cloud network and have them operate as if they were on the same network. The trick with this solution is that network changes need to be constantly updated and latency between your data center and the cloud data center can kill performance. The GRE protocol helps solve the route table update problem. Products like FastConnect help reduce latency by providing a fast path across the internet that you pay for on a monthly basis.
In summary, the protocols that cloud vendors support have important implications when architecting a solution. Going with Azure basically prohibits you from using all of Enterprise Manager to manage servers in the Microsoft cloud. You can look at the database by connecting to port 1521 (with firewall rules set properly) but you will get host down when looking for operating system and host information. You can also see higher level services like WebLogic servers or App services because these protocols all run on TCP and connect to ports. The basic host information will not be available. Not supporting the GRE protocol is less of a functionality issue and more of a performance issue. Many Oracle customers are looking at FastConnect as a way of getting 1 GigE or 10 GigE connectivity, but simple workloads like database backup to storage operate well enough without the additional network cost. Again, this blog is not intended to say that one cloud vendor is superior to another. It is intended to help you decide which cloud provider will give you the service that you want. Feedback and comments are welcome.
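As a practical footnote to the TCPing suggestion above, here is a minimal Python sketch (host names and ports are hypothetical) that checks whether a service port answers when ICMP ping is blocked. It only proves that a TCP listener is reachable, not general host health, which is exactly the limitation discussed above.

import socket

def port_is_reachable(host, port, timeout=3.0):
    # Try a plain TCP connect; if it succeeds the port is open and routed,
    # even on networks where ICMP echo (ping) is filtered.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Hypothetical targets: a web listener and a database listener.
for host, port in [("demo.example.com", 443), ("demo.example.com", 1521)]:
    state = "reachable" if port_is_reachable(host, port) else "unreachable"
    print(host, port, state)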


Iaas

networking - practical

Today we are going to explore how to configure and set up the basics of networking for a Linux compute instance in the Oracle Cloud. If you would like to read more about network configurations you should refer to the Oracle Compute Cloud Services (IaaS) documentation. We are specifically interested in chapter 7, Configuring Network Settings. Terms like security lists, security rules, and roles play a part in the configuration. By default security is locked down and no traffic can be received from outside the host. It is important to note that the demo accounts that you get when you click the Try Me button on http://cloud.oracle.com do not allow you to create a list of valid ip addresses but only allow you to either share ports with the public internet or not. This is mainly for simplicity when running and configuring the demo accounts. When you get a commercial paid account you get full access to restrict access by ip address, ip range, or list of computers.
If we log into our compute console we can see a list of instances that exist for an account. In our example we have four servers defined. One is Oracle Linux, one is CentOS7, one is a database server, and the fourth is a WebLogic server. If we click on the Network tab we see the Security Rules, Security Lists, Security Applications, and Security IP Lists. It is important to realize that Oracle takes a different approach when provisioning servers. The server is first provisioned with only SSH or RDP as the default rule, or with a security list that you create. In this example we see four different lists. The bitnami-moodle definition on the list opens up ports 80 and 443 for a web server. The database definition prs12cHP opens up port 1521. The prsJava definitions open up administration ports as well as ports 80 and 443. The default security list only opens up port 22 for ssh connections.
If we look at the default security list, the default operation is to deny all inbound traffic for computers not in the security list and drop the packets with no reply. We could configure reject with reply but this might lead to a denial of service attack, with someone constantly sending TCP/IP requests to our server just to overload the server and network with TCP ack packets. By default the configuration is to drop packets and this typically happens at the border gateway rather than at our compute server. The outbound definition gives you the option of allowing packets, rejecting packets with an ack, or dropping packets with no ack. It is important to communicate to your users how you configure your server. If you configure outbound as deny with no reply, they might end up troubleshooting network connection issues when the packets are being dropped by design and it is not a router or connection issue.
Note that the concept of a security list is a little misleading. For all of our instances we have an inbound policy of deny and an outbound policy of permit. Why not go with one security list and map all instances to this configuration? The key is in the security rules definition. We create a definition of a rule that maps security applications to a source and destination. By application we really mean a port number for an application. The source is where the packet is coming from and the destination is where the packet is going to. Since we permit all outbound traffic we only need to define the exceptions to the rule for inbound traffic. If, for example, we defined a deny inbound and a deny outbound policy, we would need to define exceptions for both directions.
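As a rough mental model of what these rules boil down to (a Python illustration only, not the actual Oracle API or payload format), a rule ties a security application (a port), a source, and a destination together, and the default policy decides what happens when no rule matches.

# Hypothetical rules matching the moodle example: ssh, http, and https
# are open from the public internet; everything else falls through to
# the default inbound policy of deny.
rules = [
    {"enabled": True, "application": 22,  "source": "public-internet", "destination": "bitnami-moodle"},
    {"enabled": True, "application": 80,  "source": "public-internet", "destination": "bitnami-moodle"},
    {"enabled": True, "application": 443, "source": "public-internet", "destination": "bitnami-moodle"},
]

def inbound_allowed(port, source, destination, default_policy="deny"):
    # The first enabled rule that matches wins; otherwise fall back to
    # the default policy (deny with no reply in our configuration).
    for rule in rules:
        if (rule["enabled"] and rule["application"] == port
                and rule["source"] == source
                and rule["destination"] == destination):
            return True
    return default_policy == "permit"

print(inbound_allowed(443, "public-internet", "bitnami-moodle"))   # True
print(inbound_allowed(1521, "public-internet", "bitnami-moodle"))  # False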
If you look at the security rule definitions we are defining the source as the public-internet and the destination as each of our servers. Security rules are essentially firewall rules. A rule permits traffic to or from your compute instance and can be used in different security lists as well as in specific definitions between instances and external hosts. Yesterday we talked about turning off public ssh for a database server and only allowing ssh into the database server from our Java server. We would do this by turning off public-internet access over port 22 into the database server and allowing port 22 from our Java server to our database server. To access the database we would have to have public access over port 22 into the Java server, require the user to log in to the command line, then ssh across to the database server using port 22 from the Java server to the database server. With this we can hide our database instance from the public internet but still allow access to the console to manage it. We will need to define an outbound rule that allows the database server to reach out and pull down patches if we want, or require staging patches on the Java server by turning off all outbound traffic from the database server and only allowing port 1521 to and from the Java server.
Note that we create a rule association by defining the security application and associating it with a source and destination. When we create a security rule we define whether it is enabled or disabled as well as the port or port ranges that we want to open. We can identify the source either with a security list or with specific ip lists. If we go with a Security IP List we can define a specific instance, a subnet (site), or the public internet. We can do the same for the destination and specify a security list or specific ip lists. This effectively creates a virtual software defined network that maps packet routing to and from your instance. If we look at the moodle server that we have running, for example, we have three security applications open. The first is ssh which allows us to connect to a shell and configure services. The second is http which maps to port 80 if we look at the Security Applications. The third is https which maps to port 443. These three ports are the only ports that are open and they are open to the public-internet as the source. We have a permit outbound rule so that the moodle server can pull in images from our storage servers, get updates with command line tools from other web servers, and download updates to the moodle server as needed from bitnami. We could just as easily have set the outbound policy to deny and only allow http, https, and ssh connections to this server inbound and outbound.
Note that this process and procedure is very similar to the way that Amazon AWS and Microsoft Azure define network rules. With AWS you go through the VPC Dashboard and define a VPN Connection. You create Security Groups that define the ports and access rights. For example the bch-shared-firewall-34877 group opens up ports 22, 80, and 443 to the public internet. The source of 0.0.0.0/0 is associated with the public internet. Note that we also have another rule that maps us to the 184.72.221.134 server for management. Once we define the inbound rules we can associate them with a VPN connection or gateway and define the inbound and outbound rules as we do on the Oracle Compute Cloud. Azure does something similar and allows you to define ports or sets of ports when you create the instance.
Note that TCP and UDP are the only protocols that are allowed. This tends to imply that ICMP and other protocols are restricted in the Microsoft network. This typically is not a big deal but does have implications on how and what you can deploy in the Microsoft network. Amazon, like Oracle, appears to allow ICMP as a rule definition. In summary, it appears that all three cloud vendors provide basic inbound and outbound rules. Microsoft limits the protocols to TCP and UDP and does not allow ICMP rules. This might or might not matter when selecting a cloud vendor. Once you have the rules defined you effectively have a secure system and the flexibility to define subnets, netmasks, routing tables, and layers of security with software defined networks. All three vendors appear to address this basic networking issue in the same way, with one small difference with Azure. Now that we know how to configure networks it might be important to talk about speed, blocking, and throttling of networks. More tomorrow.


Iaas

TCP, UDP, and IP

For the last two days we have been going through TCP/IP Illustrated Volume I. We are going to shift gears a little bit and look at the OSI stack from the perspective of another book. Today we are going to look at VPN Illustrated: Tunnels, VPNs, and IPSec by Jon C. Snader. We are shifting to another book because TCP/IP Illustrated is an excellent book for Computer Scientists who want to know the nuts and bolts of how computers talk to each other. Things have changed since this book was published. We are no longer bound to a computer with an ethernet connection or two, and network connections have become more of a virtual connection and less of a physical connection. When we talk about a network connection in the cloud we are not talking about a physical wire connected to a physical computer connected to a physical router. We are typically talking about a software defined network (SDN) where we have a virtual network connected to a virtual router that goes through a border router and gets us to the internet. We covered the IP protocol in a previous blog where we were concerned with a source and destination address and talked about different classes of networks, subnets, and netmasks. We skipped what it takes to figure out a routing map and the shortest hop connection between two computers. For most deployments we point to a default router and the default router deals with this complexity. If we look at the layer communication (Figure 2.2 from VPN Illustrated) we see the different layers of the OSI stack represented.
Today we are going to be talking about the transport layer (layer 4 of the OSI stack). An example of an application would be a web browser communicating with a web server. The web browser would connect to the ip address of the web server and make an http request for a file. The http request is an example of an application layer request. At the TCP layer we have to define the handshake mechanism to request and receive the file as well as the port used for the request. Ports are a new concept: we not only talk to the ip address of a server but we specifically talk to it through a specific port where the server has a listener ready and available for requests. In our web browser example we read clear text web pages typically on port 80 and secure web pages on port 443. The secure web page not only can accept a file download request but does it without anyone else on the network knowing what is being asked, because the communication between the web browser and web server is encrypted and encoded to prevent anyone from snooping the traffic that is being exchanged. This is needed if you want to transmit secure information like credit card numbers, social security numbers, or any other financial related keys that assist in doing commerce across the internet.
Tools like ifconfig on Linux (ipconfig is the similar tool on Windows) allow you to control the status of a network connection and bring network interfaces up and down. You can go to a higher level with commands like ifup and ifdown on Linux to do more than just bring an ether connection up or down, by reading the configuration files for netmasks, firewall settings, and network services to start. We just mentioned a new term here, a firewall. A firewall is a program that runs on a server and either allows traffic through or disables traffic on a specific port. For example, if we want to allow anyone on our subnet access to our web page, we open up port 80 to the same subnet that we are on.
If our corporate subnet consists of more than just one subnet we might want to define an ip address range that we want to accept requests from. A firewall takes these connection requests at the TCP layer and opens up the TCP header, inspecting the source and destination address as well as the port that is used for communications. If the port is open and allowing traffic from a subnet or ip address range, the request is then passed to the web server software. If the port is open but the traffic is coming from outside of the ip address range, the request is either rejected with an error or the tcp/ip packet is silently dropped, based on our firewall rules. The same is true for all ports that attach to compute engines on the internet. By default most cloud vendors open up port 22 which is the ssh port that allows you to connect to a command line on a Linux or Unix server. Microsoft Azure typically opens up port 3389 which is the remote desktop connection port. This allows you to connect to a Windows desktop using the RDP application on Windows desktops. It is typically a good idea to restrict the addresses that are allowed to connect to your compute cloud server rather than accepting connections from any address.
We could consider a router to be an implementation of a firewall. A router between your subnet and the corporate network would be a wide open firewall. It might not pass UDP headers and most likely does not pass multicast broadcasts. It will not typically pass the non routable addresses that we talked about yesterday. If we have a 192.168.1.xxx address we typically don't route this outside of our local network by definition since these are local private addresses. A router can block specific addresses and ports by design and act like a firewall. For example, Oracle does not allow ftp access from inside of the corporate network to outside servers. The ftp protocol transmits user names and passwords in the clear, which means that anyone using tools like tcpdump, ettercap, and ethereal can capture and display the passwords. There are more secure programs like sftp that perform the same function but not only encrypt the username and password but every data byte transmitted to and from the server.
Many routers, like the wifi routers that most people have in their homes, allow for network address translation (NAT) so that you are not presenting the 192.168.1.xxx address to the public internet but rather the address of the router/modem that connects you to the internet. Your desktop computer is at address 192.168.1.100, for example, but resolves to 66.29.12.122 with your internet provider. When you connect to port 80 at address 157.166.226.25, which correlates to http://cnn.com, you connect to port 80 with a TCP/IP header source address of 66.29.12.122 and a destination address of 157.166.226.25. When your router gets a response back it knows that it needs to forward the response to 192.168.1.100 because it keeps a translation table of which internal address opened the connection. The router bridges this information back to you so that you don't need to consume more ip addresses on the internet for each device that you connect with from your home. The router/modem translates these requests using NAT, bridged links, or actual IP addresses if you configure your back end server to request a direct mapping. If we put all of this together along with the route command on Windows or Linux, we can define a default router that will take our IP packets and forward them to the right path.
It might take a hop or two to get us to our eventual destination but we should be able to use something like Figure 2.9 from VPN Illustrated to represent our access to the Internet and use a tool like traceroute to look at the hops and hop cost for us to get to the different cloud servers. Note in this diagram that if we are on Host 4 we set our default router to be router 2. We then trust that router 2 will know how to get to router 1 and router 1 will take us to our desired destination, cnn.com or whatever web site we are trying to connect to. All cloud vendors provide a default router configuration. All cloud vendors will give you a way of connecting to the internet. All cloud vendors will give you a way of configuring a firewall and subnet definitions. We might want to create a database server that does not have an internet connection, where we need to connect to our application server through ssh and then ssh into our database server through a private network. We might not have a public internet connection for our database but hide it in a subnet to keep it secure. In our routing map from VPN Illustrated we might want to put our database on host 4 and disable any connection to the internet. We might only want to allow traffic from the 200.10.4.xxx network to connect to the database. We might want to allow ssh, port 80, and port 443 connections to host 1 and allow only host 1 to connect via ssh to host 4. All cloud vendors allow you to do this and configure virtual networks, subnets, firewalls, and netmasks.
We recommend that you get an IaaS account on AWS, Azure, and Oracle IaaS and play. See what works. See what you can configure from the command line. See what requires console configuration and what your options are when you provision a new operating system. See what you can automate with Orchestration scripts or python scripts or chef/puppet configurations. Automation is the key to a successful deployment of a service. If something breaks it is important to be able to automate restarting, sizing up, and sizing down services, and this begins at the compute layer. It is also important to see if you can find a language or platform that allows you to change from one cloud vendor to another. Vendor lock in at this level can cause you to stick with a vendor despite price increases. Going with something like bitnami allows you to select which vendor is cheapest, has the best network speeds and options, has the fastest chips and servers, as well as the best SLAs and uptime history.
We didn't dive much into UDP. The key difference between TCP and UDP is the acknowledgement process when a packet is sent. TCP is a stateful transmission. When a web request is asked for by a browser the client computer sends a TCP/IP packet. The web server responds with an acknowledgement packet that it received the request. The web server then takes the file that was requested, typically something like index.html, and sends it back in another TCP/IP packet. The web browser responds that it received the file with an acknowledgement packet. This is done because at times the Internet gets busy, there is a chance for collision of packets, and a packet might never get delivered to the destination address. If this happens and the sender does not receive an acknowledgement it resends the request. With a UDP packet the handshake does not happen. The sender sends out a packet and assumes that it was received. If there was a collision and the packet got dropped it is never retransmitted.
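To make the difference concrete, here is a minimal Python sketch (the host and ports are hypothetical) showing the two styles side by side: the TCP connection is set up with a handshake and acknowledged end to end, while the UDP send is fire and forget.

import socket

# TCP: connect() completes a handshake, and send/recv are acknowledged
# and retransmitted by the protocol stack if packets are lost.
with socket.create_connection(("demo.example.com", 80), timeout=5) as tcp:
    tcp.sendall(b"HEAD / HTTP/1.0\r\nHost: demo.example.com\r\n\r\n")
    print(tcp.recv(1024))

# UDP: there is no handshake and no acknowledgement; if this datagram
# is dropped along the way nothing will resend it.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello", ("demo.example.com", 9999))
udp.close()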
Applications like Internet Radio and Skype use this type of protocol because you don't need a retransmission of audio signals if the time to listen to them has passed. The packet is dropped and the audio is skipped and picked up at the next packet transmitted. Most cloud vendors support UDP routing and transmission. This is optional and typically a firewall configuration. It might or might not make sense for a database to send and receive using UDP, so it might not be an option when you get the Platform as a Service. Most Infrastructure as a Service vendors provide configuration tools to allow or block UDP.
In summary, we have covered basic addressing, routing, and firewalls, and touched briefly on the TCP and UDP headers. We don't really need to get into the depths of TCP and how packets are transmitted, how congestion is handled, and how collisions are compensated for. With a cloud vendor you typically need to ask if the network is oversubscribed or bandwidth limited. You also need to ask if you have configuration limitations and restrictions on what you can and can not transmit. One of the risks of an unlimited network is a noisy neighbor and getting congestion from another virtual machine provisioned on the same hardware. On the other hand, if your network is oversubscribed you will be bandwidth limited and accessing your storage can limit your application speed. Our advice is to know your application, know if you are network limited, and know the security model and network configuration that you want ahead of time. Every cloud vendor differentiates their services but few offer service level agreements on bandwidth and compute resources. Read the fine details and play with all options.


Iaas

link layers 2 and 3

We are going through the OSI 7 layer stack and looking at the different layers. Yesterday we started the discussion by looking at Kevin Fall and Richard Stevens' book TCP/IP Illustrated Volume 1. In this book they describe the different layers and look at the how, what, and why of the design. Today we will focus on layers 2 and 3, the link layer and network layer. Alternate sources of information about these layers can be found at
http://tcpipguide.com
How Stuff Works - a great podcast in my opinion
ietf.org Tutorial on Layers 2 and 3
Layer 2 is basically a way of communicating between two neighbors. How many milliseconds a bit of data is kept on the wire, physical addressing, and aggregation of data packets are defined here. If you have ever wondered what a MAC Address is, this is where it is defined. Vendors are given a sequence of bits that indicate the address of a device that they create. Note that this is not your ip address but a physical sequence of bits as defined by the Institute of Electrical and Electronic Engineers (IEEE) 802 definition. The MAC address consists of six octets, with the first three octets identifying a corporation or manufacturer and the last three octets representing a unique sequence number for a device that the vendor manufactured. An example of this would be the MAC address on my MacBook Pro, 00:26:b0:da:c8:10. Apple is assigned 00:26:b0 as the identifier for their products. My specific laptop gets the identifier da:c8:10. When a data packet is placed on the internet through a hard wired cable or wifi it is placed there with the unique MAC Address of my laptop. When data was generated and consumed by physical hardware these addresses meant something. With virtualization and containers the MAC Address has become somewhat meaningless because these values are synthetic. You really can't determine if something came from an Apple product because we can map the above MAC address to a virtual machine by defining it as a parameter. It is best practice not to use the same MAC address twice on a physical network because all of the computers with that address will pick up the packet off the wire and decode it.
Layer 3 is the communication protocol that is used to create and define packets. Apple, for example, defined a protocol called AppleTalk so that you could talk between Apple computers and devices. This protocol did not really take off. Digital Equipment Corporation did something similar with VAX/VMS and DECnet. This allowed their computers to talk to each other very efficiently and consume a network without regard for other computers on the network. Over the years the IP protocol has dominated. The protocol is currently in transition from IPv4 to IPv6 because the number of devices attached to the internet has exceeded the available addresses in the protocol. The IPv4 protocol uses a dotted-quad or dotted-decimal notation with four fields that denote the network and host. For example, 129.152.168.100 is a valid ip address. All of the four fields can range from 0 to 255 with some of the values reserved. For example, 0.0.0.0 is not considered to be a valid host address and neither is 255.255.255.255 because they are reserved for special functions. IPv6 uses a similar notation but addresses are denoted as eight blocks of 16 bit values. An example of this would be 5f05:2000:80ad:5800:58:800:2023:1d71. Note that this gives us 128 bits rather than 32 bits to represent an address.
IPv4 has 4,294,967,296 possible addresses in its address space, and IPv6 has 340,282,366,920,938,463,463,374,607,431,768,211,456.
With IPv4 addressing there is something called classes of networks. A class A network consists of a leading zero followed by seven bits to define a network and 24 bits to define a specific host. This is typically not used when talking about cloud services. A class B network consists of a leading 1 and 0 followed by 14 bits to define a network and 16 bits to define a host. Data centers typically use something like this because they could have thousands of servers in a data center. A class C network consists of a leading 110 followed by 21 bits to define the network and 8 bits to define a host. This allows 256 addresses (254 usable hosts) on one network, which could be a department or office building. A class D network starts with 1110 and is considered to be a multicast address. If something is written to this address, the packets are sent to all hosts on the network. All hosts should, but are not mandated to, pick up this packet and look at the data element. A class E network starts with 1111 and is considered to be reserved and not to be used. The image from Chapter 2 of TCP/IP Illustrated Volume I shows the above visually.
This comes into play when someone talks about netmasks. If you are talking about a 0.0.0.0/16 it means that the first 16 bits identify the network and the remaining 16 bits identify hosts on that network. You might also see 0.0.0.0/24, which means that the first 24 bits identify the network and the last 8 bits identify hosts. If you set your netmask to 255.255.255.0 on a class B network, the first 16 bits define the corporate network, the next 8 bits define the subnet in the company, and the last 8 bits define the specific host. This means that you can have 256 subnets in the company and 254 usable host addresses on each subnet. A netmask of 255.255.255.0 also means that you are not going to route outside of your subnet when the first three octets of the destination are the same as your own. What this means is that a router either passes the packets through or does not pass the data through based on the netmask and the ip address of the destination. You might hear the term CIDR (Classless Inter-Domain Routing). This term refers to assigning addresses and routing on arbitrary prefix lengths rather than the old class boundaries, which lets routes be aggregated. We will not get into this, but netmasks are good ways of limiting routing tables and spanning trees across networks. This is typically a phrase that you need to know about if you are looking at limiting communication and flow of addresses across a data center.
Earlier we talked about reserved networks and subnets. Some of the network definitions for IPv4 are defined as private and non-routable networks. A list of these addresses includes
0.0.0.0/8 - Hosts on the local network. May be used only as a source IP address.
10.0.0.0/8 - Addresses for private networks (intranets). Such addresses never appear on the public Internet.
127.0.0.0/8 - Internet host loopback addresses (same computer). Typically only 127.0.0.1 is used.
169.254.0.0/16 - "Link-local" addresses, used only on a single link and generally assigned automatically.
172.16.0.0/12 - Addresses for private networks (intranets). Such addresses never appear on the public Internet.
192.168.0.0/16 - Addresses for private networks (intranets). Such addresses never appear on the public Internet.
224.0.0.0/4 - IPv4 multicast addresses (formerly class D); used only as destination addresses.
240.0.0.0/4 - Reserved space (formerly class E), except 255.255.255.255.
255.255.255.255/32 - Local network (limited) broadcast address.
Multicast addressing is supported by IPv4 and IPv6. An IP multicast address (also called a group or group address) identifies a group of host interfaces rather than a single one. Most cloud vendors don't allow multicast and restrict communications to unicast from one server to another. Some additional terms that come up around networking discussions are network address translation (NAT), the Border Gateway Protocol (BGP) running on border routers, and firewalls. We will defer these conversations to higher layer protocols because they involve more than just the ip address. A border router can be as simple as a device that just drops packets and does not pass them outside the corporate network, independent of the netmask that the source host uses. If, for example, we want to stop someone from connecting to an ip address outside of our network and force traffic to go through a firewall or packet filter device, a border router can redirect all traffic through these devices or drop the packets.
In summary, we skimmed over routing. This is a complex subject. We mainly talked about layers 2 and 3 to introduce the terms MAC address, IP address, IPv4, and IPv6. We touched on CIDR and routing tables as well as reserved addresses, BGP, and NAT. This is not a complete discussion of these subjects but an introduction of terms. Most cloud vendors do not support multicast or anycast broadcasts inside or outside of their cloud services. Most cloud vendors support IPv4 and IPv6 as well as subnet masking and multiple networks for servers and services. It is important to understand what a router is, how to configure a routing table, and the dangers of creating routing loops. We did not touch on hop count and hop cost because for most cloud implementations the topology is simple and servers inside a cloud implementation are rarely more than a hop or two away unless you are trying to create a highly available service in another data center, zone, or region. Up next, the data layer and the IP datagram.
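Before moving on, if you want to check the netmask and reserved-range arithmetic above for yourself, Python's standard ipaddress module makes it easy. The addresses below are just the examples used in this post.

import ipaddress

net = ipaddress.ip_network("192.168.1.0/24")
print(net.netmask)          # 255.255.255.0
print(net.prefixlen)        # 24
print(net.num_addresses)    # 256 addresses, 254 usable hosts

# Reserved and private ranges from the table above are flagged for us.
print(ipaddress.ip_address("192.168.1.100").is_private)    # True
print(ipaddress.ip_address("129.152.168.100").is_global)   # True
print(ipaddress.ip_address("224.0.0.1").is_multicast)      # True

# Membership tests show what a prefix length actually groups together.
print(ipaddress.ip_address("10.1.2.3") in ipaddress.ip_network("10.0.0.0/8"))  # True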


Iaas

TCP/IP Illustrated Vol 1

Back in 1998 I was working for Sun Microsystems and took an introductory class on networking. One of the big benefits of working for Sun is that it had a very strong affiliation with Stanford University and employees could take classes at no cost. An early rumor was that Sun really stood for Stanford University Network since two of the founders of the company were living in the Stanford dorms during the early years of Sun. Stanford for years has offered CS 144 - Introduction to Computer Networking. The class is based on Kevin Fall and Richard Stevens' book TCP/IP Illustrated Volume 1. I was in an internal training class about cloud services last week and terms and phrases that I remotely remembered kept coming up. As I talked to more and more people, they also knew most of the terms but not all of them. In the next few days we will go through TCP/IP Illustrated and provide a quick tutorial on networking for those of us that have been out of college more than ten years (much more for some of us) and don't work with this on a daily basis.
TCP/IP Illustrated starts out by talking about the history of computer connectivity and the evolution of the 7 layer OSI stack. The seven layers consist of physical (1), link (2), network (3), transport (4), session (5), presentation (6), and application (7). Each of these layers has different protocols, methodologies, and incantations that make it unique and worthy of selection for different problems.
The physical layer is the actual connection between two computers. This might be a copper cable, fiber optic cable, or wireless network. The physical connection media is the definition for this layer. Most of us are familiar with a cable that comes out of the wall, switch, or router and plugs into our server or wifi hub. We are also familiar with a wifi or bluetooth connection that allows us to connect without a physical wire connecting us to other computers. We are not going to focus on this layer but assume that we are wirelessly or ethernet connected to the internet and that the cloud servers we are connecting to are wired to an internet connection. We then use the nebulous internet to route our requests to our cloud server and responses back to us. This will require higher layers of the stack to make this happen, but the default is that we are connected to a network in some manner, as is the server that we want to connect to.
The link or data link layer includes protocols for connecting to a link and exchanging data. Links can be multi-access layers with more than just two computers talking to each other. WiFi and Ethernet networks are examples of a multi-access layer. We can have more than two computers on these networks and all of them can operate at the same time on the network. Not all of the computers can talk at once but they can time slice the network and share the common physical layer together.
The network or internetwork layer (layer 3) is the protocol layer where we frame packets of information and define communication protocols. Protocols like IP are defined at this layer. We can put a data analyzer on the physical cable and look at bits streaming by on the wire (or wifi) and decode these packets into data and control blocks. The IP or internet protocol is defined here as well as other protocols for creating data packets.
The transport layer (layer 4) is the layer where we describe how data is exchanged and deal with collisions, addresses, and different types of services.
TCP, for example, exists at this layer and has mechanisms for dealing with packets that are lost to collisions or congestion on the network. If two computers are talking at the same time, bits can get overwritten and listeners can not properly read the packets. The TCP layer defines how to request retransmission of data as well as how to back off and avoid collisions in the short term. Other protocols like UDP and multicast are defined at this layer and allow us to do things like broadcast messages to all hosts on a network and not wait for a response or acknowledgement. We might want to do this for a video broadcast from a single source where we know that we have one transmitter and multiple receivers on a network.
The session layer (layer 5) provides handshaking mechanisms to maintain state between data packets. An example of this would be a cookie in a web browser to maintain a relationship between a client and web server. Server affinity and route preferences are also defined at this layer. If we have a pool of web servers and want to send a client back to the web server that it went to previously, this layer helps create this affinity.
The presentation layer (layer 6) is responsible for format conversions and is typically not manipulated or used for internet protocols or communications.
The application layer (layer 7) is where most of the work is done. A web server, for example, uses http as the communication protocol and defines how screens are painted inside a browser and what files are retrieved from a web server. There are hundreds of protocols defined at this layer and we will go into a few examples in future blogs.
If we take an overview of TCP/IP Illustrated Volume I we see that chapter 1 covers the OSI stack and introduces networking and the history of networking as well as layer 1 options. Chapter 2 covers layer 3 and all networking options and touches on the differences between IPv4 and IPv6. Chapter 3 covers the link layer or layer 2, focusing on ethernet, bridges, switches, wireless networks, point to point protocols, and tunneling options. Chapter 4 dives into the ARP protocol, which maps layer 3 addresses to the layer 2 addresses of computers on a network. Chapter 5 covers the IP definition and discusses packet headers and formats. Chapter 6 goes into addressing more and talks about the dynamic host configuration protocol (DHCP) for assigning addresses dynamically. Chapter 7 discusses firewalls and routers as well as network address translation (NAT) concepts. This is the area that typically gets confusing for cloud vendors and leads to different configurations and options when it comes to protecting servers in the cloud. Chapters 8 and 9 deal with the internet control message protocol, broadcasting, and multicasting. Most cloud vendors don't deal with this and just prohibit its use. Chapter 10 focuses on UDP and IP fragmentation. Chapter 11 centers on Domain Name Services. Each cloud vendor addresses this differently with local and global naming services. We will look at the major cloud vendors and see how they address local naming and name resolution. Chapters 12 through 17 deal with the TCP structure, management, and operation. The Stanford class spent most of the semester on this and on ways of optimizing around errors and issues. Most cloud vendors do this for you and don't really let you manipulate or modify anything presented in these chapters. The book finishes with Chapter 18 by talking about security in all of its flavors and incantations.
We will spend a bit of time talking about this topic since it is of major concern for most users. In review, we are going to go back and look at networking terms, concepts, and buzzwords so that when someone asks whether a cloud service provides xyz you have a strong context for what they are asking. We are not trying to make everyone a networking expert, just trying to level set the language so that we can compare and contrast services between different cloud vendors.


Iaas

Orchestration vs CloudFormation

Today we are going to do a compare and contrast of Oracle Orchestration and Amazon CloudFormation. The two have the same objectives and can perform the same operations when provisioning an instance. The key difference is the way that they both operate and define the elements needed to create an instance. In the past few days we have gone through and looked at the three files needed to provision a WordPress blog. Information on Oracle Orchestration can be found in the documentation section and tutorial section. Information on Amazon CloudFormation can be found at the home page and tutorial section. We will dive into the WordPress example and look at the json file that is used to provision the service. The key components of the json file are
{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : " ... ",
  "Parameters" : { ... },
  "Mappings" : { ... },
  "Resources" : { ... },
  "Outputs" : { ... }
}
We can create a simple storage element in S3 with the following file
{
  "Resources" : {
    "HelloBucket" : {
      "Type" : "AWS::S3::Bucket"
    }
  }
}
Note that the only thing that we truly need is the definition of a resource. The resource has a label of "HelloBucket" and the resource consists of an element defined as "AWS::S3::Bucket". Note that the Type is very specific to AWS. We could not take this generic definition and port it to any other platform. We don't know how much storage to allocate because S3 is typically an open ended definition. This is radically different from our storage creation from a few days ago where we had to define the storage_pool, the size of the disk, and properties of the instance like whether it is bootable, what image to boot from, and what account it is associated with. The CloudFormation interface assumes account information because it is run from a web based or command line based interface that has your account information embedded into the user interface.
We could get a little more complex and define an instance. With this instance we reference an AMI that predefines the content and operating system. We also define the security ports and connection keys for this instance in the definition.
{
  "Resources" : {
    "Ec2Instance" : {
      "Type" : "AWS::EC2::Instance",
      "Properties" : {
        "SecurityGroups" : [ { "Ref" : "InstanceSecurityGroup" }, "MyExistingSecurityGroup" ],
        "KeyName" : "mykey",
        "ImageId" : "ami-7a11e213"
      }
    },
    "InstanceSecurityGroup" : {
      "Type" : "AWS::EC2::SecurityGroup",
      "Properties" : {
        "GroupDescription" : "Enable SSH access via port 22",
        "SecurityGroupIngress" : [ {
          "IpProtocol" : "tcp",
          "FromPort" : "22",
          "ToPort" : "22",
          "CidrIp" : "0.0.0.0/0"
        } ]
      }
    }
  }
}
In this example we are going to provision an EC2 instance from ami-7a11e213. We will be using the existing security group labeled MyExistingSecurityGroup and open up port 22 for ssh access. We don't know what version of the operating system this is unless we look up the characteristics of the ami. This is different from Oracle Orchestration where we define the storage element and what operating system to boot from. They both define security groups, a little differently, but with the same effect. We can also define some of the characteristics of the application itself.
For CloudFormation we can configure WordPress with the following parameters
"Parameters": {
  "KeyName": {
    "Description" : "Name of an existing EC2 KeyPair to enable SSH access into the WordPress web server",
    "Type": "AWS::EC2::KeyPair::KeyName"
  },
  "WordPressUser": {
    "Default": "admin",
    "NoEcho": "true",
    "Description" : "The WordPress database admin account user name",
    "Type": "String",
    "MinLength": "1",
    "MaxLength": "16",
    "AllowedPattern" : "[a-zA-Z][a-zA-Z0-9]*"
  },
  "WebServerPort": {
    "Default": "8888",
    "Description" : "TCP/IP port for the WordPress web server",
    "Type": "Number",
    "MinValue": "1",
    "MaxValue": "65535"
  }
},
Note that we define these parameters based on the application and pass them into the operating system as it boots. Oracle Orchestration takes a different tactic when it comes to adding parameters to a configuration. Rather than having parameters defined for each application, customizations like this are done with a post install script that is executed at boot time. These configurations can be done from a snapshot or from a post install script based on how you like to initialize systems. This functionality started with Enterprise Manager and the scripts that you use for in house systems can be ported to the cloud without changing or updating them.
In summary, Amazon CloudFormation and Oracle Orchestration are very similar. The components that you use to define a system are declared similarly. Amazon assumes that you are running on AWS and gives you short cuts and shorthand that allow you to create predefined components quickly and easily. Unfortunately this configuration does not translate to any other cloud provider or an in house solution. Oracle Orchestration is a little more nuts and bolts but is designed to help you create everything from scratch and build upon that foundation for system definitions. CloudFormation has a graphical user interface that generates json files for you based on dragging and dropping components onto a design palette. Oracle takes a slightly different approach and uses the Oracle Marketplace to automatically generate the json files. There is not a graphical design tool that allows you to drag and drop components, but there are tools to take a configuration that is in your data center and generate the parameter list that can be used to generate the json files for Orchestration. We are not saying that one is better than the other in this blog. We are mainly pointing out that the two tools have different target audiences and functionality. Unfortunately, you can't take one configuration and easily map it into the other. Hopefully someone at some point will take these files and create a translator.
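As a footnote, if you want to drive the CloudFormation side from a script rather than the console, a minimal sketch with the boto3 library might look like the following. The stack name, template file name, region, and key pair name are hypothetical, and your AWS credentials are assumed to already be configured.

import boto3

# Read the WordPress template discussed above from a local file.
with open("wordpress-template.json") as f:
    template_body = f.read()

cloudformation = boto3.client("cloudformation", region_name="us-east-1")

# Create the stack and pass in values for the template parameters.
response = cloudformation.create_stack(
    StackName="wordpress-demo",
    TemplateBody=template_body,
    Parameters=[
        {"ParameterKey": "KeyName", "ParameterValue": "mykey"},
        {"ParameterKey": "WordPressUser", "ParameterValue": "admin"},
        {"ParameterKey": "WebServerPort", "ParameterValue": "8888"},
    ],
)
print(response["StackId"])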


Iaas

Orchestration 2.0 - creating an instance

Today we will continue our evaluation of Oracle Orchestration by looking at how to define an instance. Yesterday we looked at creating a WordPress instance from an Oracle Marketplace image. We started by going down the structure of a storage element. Today we are going to continue with our Bitnami WordPress instance provisioning by looking at the bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance.json file which describes the compute instance. The default install from Bitnami creates this file. In our example we are going to create an instance called WordPress_4_5_3 rather than the default and change the default storage from 10 GB to 60 GB as the Marketplace suggests is the minimum. bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance.jsonIf we look at the orchestration file we see the definition{ "relationships" : [ ], "account" : "/Compute-metcsgse00028/default", "description" : "", "schedule" : { "start_time" : "2016-07-21T19:46:32Z", "stop_time" : null }, "oplans" : [ { "obj_type" : "launchplan", "ha_policy" : "active", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance", "objects" : [ { "instances" : [ { "networking" : { "eth0" : { "seclists" : [ "/Compute-metcsgse00028/default/default" ], "nat" : "ippool:/oracle/public/ippool" } }, "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012/b531af9b-e075-4a51-a867-449fd948d374", "storage_attachments" : [ { "volume" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "index" : 1 } ], "boot_order" : [ 1 ], "hostname" : "a2d1e4.compute-metcsgse00028.oraclecloud.internal.", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012", "shape" : "oc3", "attributes" : { "userdata" : { }, "nimbula_orchestration" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance" }, "imagelist" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "sshkeys" : [ "/Compute-metcsgse00028/cloud.admin/2016" ], "tags" : [ ] } ] } ] } ], "user" : "/Compute-metcsgse00028/cloud.admin", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance"}If we break down this file it can be abbreviated as { "relationships" : [ ], "account" : "/Compute-metcsgse00028/default", "description" : "", "schedule" : { … }, "oplans" : [ … ], "user" : "/Compute-metcsgse00028/cloud.admin", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage"}Note that this is exactly what it looked like for the storage definition. The differences happen in the oplans parameter field. For the instance definition we define the obj_type as launchplan. The launchplan parameters are detailed in the documentation. The required field for the launchplan object type is the instances parameter. All of the other fields are optional. The instances fields are defined in the documentation on instances. The shape parameter is the only required parameter with all other parameters being optional. 
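To see how small the required surface actually is, here is a minimal sketch in Python that builds just such a launch plan for the renamed WordPress_4_5_3 instance using only the oplan label, the instances list, and the required shape. The imagelist path is copied from the Bitnami file above and everything else is left to defaults, so treat it as an illustration rather than a deployable plan.

import json

# Minimal launch plan: only the required pieces (an instances list with a shape)
# plus the oplan label and ha_policy
minimal_oplan = {
    "obj_type": "launchplan",
    "ha_policy": "active",   # reboot the instance if it fails
    "label": "WordPress_4_5_3_instance",
    "objects": [{
        "instances": [{
            "shape": "oc3",   # the only required instance parameter
            "label": "WordPress_4_5_3",
            "imagelist": "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64"
        }]
    }]
}

print(json.dumps(minimal_oplan, indent=2))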
The oplans field for our instance looks like "obj_type" : "launchplan", "ha_policy" : "active", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance", "objects" : [ { "instances" : [ { "networking" : { "eth0" : { "seclists" : [ "/Compute-metcsgse00028/default/default" ], "nat" : "ippool:/oracle/public/ippool" } }, "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012/b531af9b-e075-4a51-a867-449fd948d374", "storage_attachments" : [ { "volume" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "index" : 1 } ], "boot_order" : [ 1 ], "hostname" : "a2d1e4.compute-metcsgse00028.oraclecloud.internal.", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012", "shape" : "oc3", "attributes" : { "userdata" : { }, "nimbula_orchestration" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance" }, "imagelist" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "sshkeys" : [ "/Compute-metcsgse00028/cloud.admin/2016" ], "tags" : [ ] } ] } ]In this definition the object type is launchplan. The ha_policy is active which means that the instance will be rebooted if it fails. The label is the name of the compute instance. The default installed by Bitnami is "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance" but our example would be WordPress_4_5_3_instance. The objects parameter holds the meat of the definition. We have the required field shape that defines our instance as an oc3 size instance. We give it a name as well as network connection using the default security list association. The default security list opens up port 22 that allows us to ssh into the Linux instance to configure the application. We have one storage volume attached that we looked at in the previous blog post. Note that we attach the volume by name and associate it with logical unit 1. We also define some optional attributes that give the boot loader a hint so that we can not only boot this image from a boot file but associate ssh keys to the instance so that we can log in once the system is up and running. The only additional information that we see in this file is the hostname used to address this system. In this example we use the hostname of a2d1e4.compute-metcsgse00028.oraclecloud.internal. This is the internal name that can be used through the DNS service provided by the Oracle Cloud services.bitnami-wordpress-4.5.3-0-linux-20160721144012_master.jsonThe third file that is defined for our instance is the master json file. The default file name created by Bitnami is bitnami-wordpress-4.5.3-0-linux-20160721144012_master.json for our instance. The key differences in this file is the oplan obj_type being orchestration and the relationships entry being non null. The obj_type of orchestration tells the Oracle Cloud service that this is the file that is used to describe the instance and how to load and boot it. The relationships parameter shows that there are two plans that we need to reference for booting. We require the "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance" file to describe the instance and this file depends upon the file "bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage". The storage file defines the disk that we will boot from and use to run our instance. If there were multiple disks we would have multiple dependencies listed. 
The full contents of this file look like{ "relationships" : [ { "to_oplan" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "oplan" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance", "type" : "depends" } ], "account" : "/Compute-metcsgse00028/default", "description" : "", "schedule" : { "start_time" : "2016-07-21T19:40:33Z", "stop_time" : null }, "oplans" : [ { "obj_type" : "orchestration", "ha_policy" : "", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "objects" : [ { "info" : { "errors" : { } }, "status" : "ready", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "status_timestamp" : "2016-07-21T19:45:16Z", "uri" : null } ] }, { "obj_type" : "orchestration", "ha_policy" : "", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance", "objects" : [ { "info" : { "errors" : { } }, "status" : "ready", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_instance", "status_timestamp" : "2016-07-21T19:47:16Z", "uri" : null } ] } ], "user" : "/Compute-metcsgse00028/cloud.admin", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_master"}These two files define the instance and the instance relationship with storage. When we look at the Orchestration part of the Compute Console we note that we can view, start, delete, resize, and download the orchestration. If the instance is running we can stop the instance. If we view the orchestration it allows us to look at the download the json code. The download allows us to download the json code to our desktop. The delete function for an instance has moved from the Instances tab to this tab. To delete an instance you need to stop the instance through the Orchestations tab then delete the components that make up the instance. Note that the Resize Instance only appears for a compute instance. This allows us to size up or size down an instance to a different shape. We can’t size up and down storage but need to edit the orchestration to make it bigger. Making it smaller is currently not supported. It is important to note that editing of the json files is currently not supported from this interface. If you want to edit a json file you need to download it, edit it on your desktop, then upload it with a different name. The upload is done using the Upload Orchestration button on the Orchestrations tab. Yesterday we looked at the storage json file that makes up a storage orchestration. Today we looked at the instance and master orchestration files that define an instance. All three of these files help us define an instance and how to start it. The example that we started with is a simple example that creates an Apache web server with PHP and MySQL on a Linux instance. We then layer WordPress on this configuration and define security ports to allow us to log in and manage the instance as well as see the http and https ports. Automation of instances can be done through REST apis once we have the orchestrations defined. We can use tools like Enterprise Manager to catch errors and exceptions thrown if an instance fails and initiate a restart function. We can also use Enterprise Manager to reconfigure or rescale services if utilization goes above a threshold over a period of time. Amazon uses CloudFormation to make this happen but it is specific for Amazon AWS services only. 
Oracle uses Enterprise Manager because these scripts and monitoring utilities can be used for on premise servers, virtual instances running in your data center, or compute instances running in the Oracle or other cloud service.
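To show what the REST automation mentioned above might look like, here is a sketch in Python using the requests library. The endpoint host, the /authenticate/ and /orchestration/ paths, and the ?action=START convention are assumptions, so verify them against the Orchestrations REST Endpoints documentation before relying on anything like this.

import json
import requests

# Placeholder endpoint and credentials -- substitute your own values
API = "https://api-z999.compute.us6.oraclecloud.com"
HEADERS = {"Content-Type": "application/oracle-compute-v3+json"}

session = requests.Session()

# Authenticate (assumed /authenticate/ path); the session keeps the returned cookie
session.post(API + "/authenticate/", headers=HEADERS,
             data=json.dumps({"user": "/Compute-metcsgse00028/cloud.admin",
                              "password": "not-shown"}))

# Upload an orchestration that was downloaded and edited on the desktop
with open("WordPress_4_5_3_master.json") as f:
    master = json.load(f)
session.post(API + "/orchestration/", headers=HEADERS, data=json.dumps(master))

# Start it (assumed ?action=START convention)
session.put(API + "/orchestration" + master["name"] + "?action=START", headers=HEADERS)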


Iaas

Orchestration 2.0 - creating a storage element

In this blog we will look at an Oracle Orchestration for a Bitnami WordPress instance. We provisioned the instance by going to the http://cloud.oracle.com/marketplace and provisioned a WordPress instance. The instance that we are going to install can be found at https://cloud.oracle.com/marketplace/en_US/listing/4980490?_afrLoop=9282045484046863&_afrWindowMode=0&_afrWindowId=87oc7qxu0_1 and is based on Oracle Linux 6.7. The minimum profile for this instance is an OC3 (1 OCPU, 7.5 GB RAM) and 60 GB of local disk. When we click on the Get App button for this application it takes us through the installation process for cloud compute services. The process that is kicked off when we click on the Get App button downloads a bootable image from the Marketplace and makes it available as an Image that we can create new instances from any time we want. The default size is 10 GB. We need to grow this installation to be 60 GB to allow MySQL and the WordPress application to operate properly.To create an instance we go to the compute console and create instance. We select private images to get to the WordPress bitnami instance that we downloaded to boot from. We enter the network security list, ssh keys, and name for the instance. The default disk size is 10 GB. We will keep this and review the configuration before launching. Once we click on Create the cloud console creates three orchestration files to initialize the WordPress instance. The first file that is created defines the storage. The file is called bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage.json if we accept all of the defaults. In our example it would be called WordPress_4_5_3_storage.json. This file describes the storage that is needed for booting the operating system and is of the format{ "relationships" : [ ], "account" : "/Compute-metcsgse00028/default", "description" : "", "schedule" : { "start_time" : "2016-07-21T19:40:35Z", "stop_time" : "2016-07-21T21:50:02Z" }, "oplans" : [ { "obj_type" : "storage/volume", "ha_policy" : "", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "objects" : [ { "managed" : true, "snapshot_id" : null, "snapshot_account" : null, "machineimage_name" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "status_timestamp" : "2016-07-21T19:44:51Z", "imagelist" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "writecache" : false, "size" : "10737418240", "storage_pool" : "/compute-us2-z12/cheis01nas100-v1_multipath/storagepool/iscsi/latency_1", "shared" : false, "status" : "Online", "description" : "", "tags" : [ ], "quota" : null, "properties" : [ "/oracle/public/storage/default" ], "account" : "/Compute-metcsgse00028/default", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "bootable" : true, "hypervisor" : null, "uri" : null, "imagelist_entry" : 1, "snapshot" : null } ] } ], "user" : "/Compute-metcsgse00028/cloud.admin", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage"}Let’s walk through this file and look at all the structures. 
Let’s break down the structure without all of the child components.{ "relationships" : [ ], "account" : "/Compute-metcsgse00028/default", "description" : "", "schedule" : { … }, "oplans" : [ … ], "user" : "/Compute-metcsgse00028/cloud.admin", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage"}Note that there are seven elements that define this object. The first is the relationship. In this file, this is the basis of a configuration and it does not have any dependencies on another object. The second is account. Account defines the owner of the object and also defines security associated with the object. We could have a specific account in our identity domain that can access this object or make it accessible to all users through the default label. The third element is the description. There is no description for this object. We could have added more information when we created this information when we provisioned the disk. This is an informative field and is not critical for creation. The fourth field is schedule. The schedule defines when the object was created and logs the start and stop times for the object. The fifth field is the oplans. Oplans defines the object. We obscured the definition at this point and will dive into that next. The sixth field is the user field. The user is who created the object and who owns the object. The final and seventh field is the name of the object. The name consists of the identity domain, the user that created it, and the name of the object. In this example the name is "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage”. In our example it would be "/Compute-metcsgse00028/cloud.admin/WordPress_4_5_3_storage”.If we dive into the oplans we note that the object is defined with{ "obj_type" : "storage/volume", "ha_policy" : "", "label" : "bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "objects" : [ … ]}The oplans consists of four elements and are defined in the documentation. All of these objects are required with the exception of ha_policy. The first parameter defined is the obj_type. This parameter can be defined asIp-reservationLaunchplanOrchestrationStorage/volumeSecapplicationSeciplist SeclistSecruleWe are defining this element as a storage/volume. We give it the name associated with the label parameter and the characteristics defined in the objects parameters. We have the option of defining the ha_policy as active or monitor. If we define it as active the object is restarted if it is deleted or fails. Active is only available if the obj_type is launchplan. We can set the ha_policy to monitor for obj_type of launchplan, storage/volume, or orchestration. If the object fails an error is thrown and can be thrown to a monitoring software package but the object is not automatically restarted or recreated. For all other objects, the ha_policy must be set to none or set to an empty field. For our example we would set the label to “WordPress_4_5_3_storage” rather than the default label generated by the bitnami installation. The objects field is defined in the documentation. We are going to dive into the storage volume object in this example. The fields that are required for storage is name, size, and properties. The optional fields are description, bootable, and tags. 
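Since only name, size, and properties are required, here is a minimal sketch in Python that builds a storage object for the 60 GB disk the Marketplace listing recommends, using the hypothetical WordPress_4_5_3 naming from this post. The size field is simply a byte count stored as a string.

import json

GIB = 1024 ** 3    # the 10 GB default shows up as "10737418240", i.e. 10 GiB

# Required fields only: name, size, and properties; bootable is optional
storage_object = {
    "name": "/Compute-metcsgse00028/cloud.admin/WordPress_4_5_3_storage",
    "size": str(60 * GIB),    # "64424509440" for the 60 GB boot disk
    "properties": ["/oracle/public/storage/default"],   # or a latency pool for high IOPS
    "bootable": True
}

print(json.dumps(storage_object, indent=2))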
For our example we define the objects parameter as "objects" : [ { "managed" : true, "snapshot_id" : null, "snapshot_account" : null, "machineimage_name" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "status_timestamp" : "2016-07-21T19:44:51Z", "imagelist" : "/Compute-metcsgse00028/marketplace01-user@oracleads.com/bitnami-wordpress-4.5.3-0-linux-oel-6.7-x86_64", "writecache" : false, "size" : "10737418240", "storage_pool" : "/compute-us2-z12/cheis01nas100-v1_multipath/storagepool/iscsi/latency_1", "shared" : false, "status" : "Online", "description" : "", "tags" : [ ], "quota" : null, "properties" : [ "/oracle/public/storage/default" ], "account" : "/Compute-metcsgse00028/default", "name" : "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage", "bootable" : true, "hypervisor" : null, "uri" : null, "imagelist_entry" : 1, "snapshot" : null } ] } ]The name of our object is "/Compute-metcsgse00028/cloud.admin/bitnami-wordpress-4.5.3-0-linux-_20160721144012_storage". Note that this consists of the instance domain, username that created the storage, and the label for the storage. In our example it would be "/Compute-metcsgse00028/cloud.admin/WordPress_4_5_3_storage". We define the size as "10737418240" which is in bytes. This correlates to a 10 GB disk for the operating system. Properties is defined as [ "/oracle/public/storage/default" ] which defines the storage as default storage. We could have selected latency rather than default if we required a low latency and high IOPS storage that is typically used for a database. The rest of the fields in this description are optional. The bootable tag defines if this is a bootable image and the status defines if the storage is active or in standby mode. Note that the storage_pool defines that this storage element is an iscsi logical unit number that is made available to the compute node rather than dedicated attached storage. All of these fields define what is required to create the storage that we are going to create to boot our operating system and run our WordPress application. We could just as easily have defined this as a 20 GB disk or 200 GB disk. It is important to note that you get 128 GB free with a compute instance and will have to pay $50/TB/month if you go over 128 GB of storage per instance. Up next, how to take this storage and associate it with an instance.


Iaas

Orchestration 2.0

In this blog entry we are going to look at how to automate installations and provisioning of compute instances. There are a variety of ways of making this happen with a wide variety of tools and methodologies that you could use. The foundation of all of these tools define how to configure systems that run in the cloud or in your data center. The configuration data includes compute resources, network resources, storage resources, user definitions, packages, processes, and services. These definition languages allow you to not only define one service but a collection of services. For example, if we want to define a WordPress blog, we can define a single server that contains the WordPress blog software, the web container that runs the blog, as well as typically the MySQL database to contain state and user information for the blog server. For a multi-site blog entry you typically want to split the database from the application server and allow you to create a high availability configuration with failover of the database files and multiple web containers on the front end to handle a large number of users and web requests.If we tried to describe this system with sentences it would be relatively complex and confusing. For example, we would like to run the web container on a single processor computer with 8 GB of memory and 20 GB of disk and have it connected to a single processor computer with 16 GB of memory and 40 GB of disk running a MySQL server. The web container should also have PHP installed and be able to answer request for http and secure http. The web container should run the WordPress package and connect to the database instance to store user and web page information. Both systems should run Oracle Linux 6.7 and be configured with all ports locked down other than web services and the ability to remotely log into the web service using a secure shell. We would also like to define what to do if the web container consumes over 80% of the processor assigned to it as well as the database instance as well as what to do if either of the services stops or fails. There are a variety of configuration management tools available to help define what is described in the above paragraph. Most of these tools use JSON to describe the configurations and start up and scaling procedures. The three that we will touch upon in this blog are Chef, Puppet, and Orchestration. PuppetPuppet is a public domain package produced by Puppet Labs and is available on Linux and Windows systems. More information can be found atLearning Puppet 4: A Guide to Configuration Management and AutomationPuppet 4 Essentials - Second EditionLearning PuppetMastering PuppetPuppet Cookbook - Third Editionhttps://puppet.com/https://github.com/puppetlabs/puppet/releaseshttp://www.youtube.com/user/PuppetLabsIncPuppet uses a declarative language to describe a system configuration. These files are called manifests and are typically stored in JSON format or something similar to it. With puppet resources are typically defined using the following format resource_type { ‘resource_title’:ensure:present or absent,attribute :one,attribute :two,attribute :n }ChefChef is a similar configuration tool first released in 2009 by a company called Chef to describe system configurations similar to Puppet which was released five years earlier. 
More information on Chef can be found in the following links.Learning Chef: A Guide to Configuration Management and AutomationMastering Chef ProvisioningChef Infrastructure Automation Cookbook - Second EditionChef EssentialsInfrastructure as Code: Managing Servers in the CloudCustomizing Chefhttp://www.chef.io/https://github.com/chef/chefChef uses “recipies” to describe application configurations and system configurations with a little more emphasis on the application configurations that Puppet. A typical Chef configuration consists of a set of files rather than putting everything into one json file. The typical folder structure, for example, looks like attributes default.rb files default file.txt recipies default.rb templates default file.erb metadata.rbThe rb files support the Ruby programming language syntax so attributes are defined slightly differently. default[‘Linux’][‘version’] = ‘6.7’ default[‘Linux’][package_name’] = ‘OEL_6_7’ default[‘Linux’][‘dir’] = ‘/home/oracle’The rb files can also have conditional control structures that allows you to do if then else or case selection to change configurations between installations. If, for example, you configure a web container different on Windows or Linux, you can collect all of these recipes in one location and update them as needed. You can also aggregate operating system configurations in one directory and application configurations in another file or directory and create dependencies between the configuration files.Oracle OrchestrationsOracle uses a third configuration management tool call Orchestrations. This is more puppet like than chef like in use and definition. Oracle Orchestrations and Amazon CloudFormation templates are very similar in nature and configuration since they both use JSON as the foundation. From the orchestration documentation, an orchestration defines the attributes and interdependencies of a collection of compute, networking, and storage resources in Oracle Compute Cloud Service. You can use orchestrations to automate the provisioning and lifecycle operations of an entire virtual compute topology. A sample orchestration file would look like { attribute: value, attribute: [ { attribute: subvalue, attribute: subvalue } ] }There are not any books on Oracle Orchestration but there are a few links that can help you understand the technology and this is the second major revision of this technology. The second version is designed more towards cloud automation of systems rather than individual servers. orchestration documentationBuilding Your First OrchestrationCreating Oracle Compute Cloud Service Oracle Linux Instances Using an OrchestrationOrchestrations REST EndpointsOrchestration TemplatesAmazon CloudFormationAmazon CloudFormation uses a similar template and configuration utility. The key differences are what attributes are required and what attributes are optional. More information on CloudFormation can be found in the following linksAWS CloudFormation User GuideDevelop, Deploy, and Manage for Scale with AWS Elastic Beanstalk and AWS CloudFormationAWS CloudFormation - Amazon Web ServicesCreating an Amazon EC2 Instance for AWS CodeDeploy (AWS CloudFormation Template)AWS CloudFormation Masterclass - YouTubeIntroduction to AWS CloudFormation - YouTubeAzure RunbooksMicrosoft Azure takes a radically different approach and uses Runbooks that require Windows PowerShell or PowerShell Workflow to define configurations and parameters for configurations. 
These tools are Azure specific and operate differently than they do for on premise systems. We will not look at this system during our evaluation because it is too specific to one vendor. In summary, there are a variety of ways to create definitions of what a service is in the cloud. There is not a single tool that performs a configuration both on premise and in the cloud. You can configure tools that will do this but they typically don’t work for all cloud vendors. Each cloud vendor has its own implementation, and to get a comparison of what works and what does not, do a Google search for “cloud orchestration tools comparison” to get a review of public domain and commercial tools that solve this problem. These tools are all relatively new and in their infancy. There is not a single tool that dominates the market and should be adopted over the others. In later blogs we will dive into the Oracle Orchestration definition and look at how it compares for building a system like WordPress on a single server and as a highly available cluster on multiple servers. These same principles can be applied to building an E-Business Suite, PeopleSoft, or JD Edwards configuration based not only on IaaS but PaaS.


Iaas

What's new in IaaS

After taking the month of July off (don't ask) we are going to restart our discussion in August with a focus on infrastructure. The best place to start is to look at what has changed in June and July, and the easiest way to do that is the what's new page in the documentation. If we look at this page we see the following new features:

A new command line interface for launching and controlling compute instances
A new way to back up instances with snapshots
Shared cloud storage via NFSv4 was made generally available
Security Lists were changed slightly
Updating private machine images was updated
Orchestrations were updated and changed
Resizing instances has been updated
Backing up storage volumes with snapshots was introduced
Compute Instance High Availability was updated with Monitor as an option
You can select different domains inside your instance for high availability and to reduce latency from the compute console
You can use GET /instanceconsole/name, along with updates to the monitoring api, to help troubleshoot booting issues
End Points for VPN and Compute were updated to reflect changes in instances

Over the next few weeks we will dive into some of these new changes and look at how to capitalize on these features. We will compare and contrast these features with features available from Amazon and Azure as usual. We will start with Orchestrations since this is probably the biggest impact item for provisioning services as well as scaling and controlling instances. The goal of August is to dive deeper into the infrastructure and look at how you can leverage it not only in the Oracle Cloud but also with the Oracle Cloud Machine if you get one installed in your data center. All of the features and functionality detailed in the IaaS updates are made available in the Oracle Cloud Machine a month or two after they are released in the public cloud.


PaaS

database option - Spatial and Graphics

Today we are going to focus on the Spatial and Graphics option of the Oracle Database. Most business information has a location component, such as customer addresses, sales territories and physical assets. Businesses can take advantage of their geographic information by incorporating location analysis and intelligence into their information systems. The geospatial data features of Oracle Spatial and Graph option support complex geographic information systems (GIS) applications, enterprise applications and location services applications. Oracle Spatial and Graph option extends the spatial query andanalysis features included in every edition of Oracle Database with the Oracle Locator feature, and provides a robust foundation for applications that require advanced spatial analysis and processing in the Oracle Database. It supports all major spatial data types and models, addressing challenging business-critical requirements from various industries, including transportation, utilities, energy, public sector, defense and commercial location intelligence.The Spatial home page is a good starting point to learn more about the technology. Books that cover this topic are Pro Oracle Spatial for Oracle Database 11g Applying and Extending Oracle Spatial Case Study and Map Implementation with Spatial Database: With Oracle Map Builder and Mapviewer Pro Oracle Spatial for Oracle Database 11gNote that most of these books are three years old or older. Spatial has not changed much between 11g and 12c so the older books are still relevant. The key to the Spatial component is being able to define objects using geospatial tags. To achieve this, Oracle extended the database with the SDO_GEOMETRY data type. This is used just like an INTEGER or CHAR declaration for a variable but it contains a latitude and longitude element to define where something is located. Some sample code that we can lift from the Pro Oracle Spatial book looks likeSQL> CREATE TABLE us_restaurants_new( id NUMBER, poi_name VARCHAR2(32), location SDO_GEOMETRY -- New column to store locations);This creates a table that defines an entry that helps us find where the restaurant is located. We can populate this entry withSQL> INSERT INTO us_restaurants_new VALUES( 1, 'PIZZA HUT', SDO_GEOMETRY ( 2001, -- SDO_GTYPE attribute: "2" in 2001 specifies dimensionality is 2. NULL, -- other fields are set to NULL. SDO_POINT_TYPE -- Specifies the coordinates of the point ( -87, -- first ordinate, i.e., value in longitude dimension 38, -- second ordinate, i.e., value in latitude dimension NULL -- third ordinate, if any ), NULL, NULL ));This inserts and entry for restaurant number 1, labeled PIZZA_HUT, and the location is defined by a point located at -87, 38. Note that these are relative locations defined in relation to a map. We use the SDO_GTYPE to define what type of mapping that we are using and how we are describing the location for this store.The key benefit to this is that we can define restaurants and things like interstates. We can query the database by asking for any reference that is half a mile from the interstate. This is done with the following querySQL> SELECT poi_nameFROM ( SELECT poi_name, SDO_GEOM.SDO_DISTANCE(P.location, I.geom, 0.5) distance FROM us_interstates I, us_restaurants P WHERE I.interstate = 'I795' ORDER BY distance )WHERE ROWNUM
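To show how this would be called from application code, here is a sketch in Python using the cx_Oracle driver. The connection string is a placeholder and the ROWNUM cutoff of five rows is an assumption, since the limit was truncated in the query above; the rest of the query follows the book example.

import cx_Oracle

# Placeholder connection string -- point it at a database with the sample tables
conn = cx_Oracle.connect("spatial_user/password@//dbhost:1521/ORCL")
cur = conn.cursor()

# Restaurants ordered by distance from interstate I795, first five rows only
cur.execute("""
    SELECT poi_name
      FROM (SELECT poi_name,
                   SDO_GEOM.SDO_DISTANCE(P.location, I.geom, 0.5) distance
              FROM us_interstates I, us_restaurants P
             WHERE I.interstate = 'I795'
             ORDER BY distance)
     WHERE ROWNUM <= 5""")

for (name,) in cur:
    print(name)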


Life in general

Safari Books Diversion

Today I am going to step back and look at something relatively simple but very powerful. One thing that my company provides for employees is a subscription to technology books online. I tend to reference these books in my blog entries mainly because I find them to be good reference material. When I write a blog entry I try to first address the topic from the mindset of a sales rep. I address the simple questions around what is the technology, why is it relevant, and how much will it cost me. If possible the discussion also blends in a compare and contrast with another solution from another vendor. The second post is a technology how to targeted at the pre sales engineer or consultant. To dive deeper typically I use books, whitepapers, hands on tutorials, and demos to pull material from. Safari Books OnLine is my virtual library. Years ago I would go to Barnes and Noble, Fry's, or Bookstop and purchase a book. Safari Pricing starts at $400/year for individuals or teams and is flexible for corporations. If you break this down this means that you need to read about 8-10 books a year to break even. If you read fewer than that, purchase them from Amazon. If you read more than that or just want to read a chapter or two, subscribe to Safari. Two of the features that I like about this interface is the search engine and the index engine. With the search engine, it looks inside of books and allows you to sort by relevance, date, and allows you to search inside a search. For example, if I do a search for jumpstart I get 3070 references. If I add solaris to the search I get 101 results. Note on the left there are three books written in 2016 and two books written in 2015. We can narrow our search and look for recent books that talk about jumpstart technology provided with Solaris. True, this might not be a relevant topic to you but it is an example of how to find a difficult to find topic in your virtual library. We can add this search index to our favorites by clicking on the Add to Favorites button and selecting a topic list to add to. In this example we add a JumpStart book from 2001 to our Solaris book list. We can look at more relevant publications and find something related to Solaris 11.2. We see the relevant information in the search index and when we click on the book it takes us to the relevant chapter. Note the highlighted text from our search. I find this is a good way of researching a topic that I need to learn more about or finding tutorials or examples of how to do something.One of the nice things about search indexes or lists is that you can share this list with other people and look at other peoples lists. This is done by looking at your Favorites and Folders you can look at the topics that interest you with the books you have saved on that effective shelf.One of the nice things is that you can look at shelves of other users. If you click on Shared List and search for your shelf title, you get a list of other users shelves. In this example we searched for Solaris and got five shelves that other users are maintaining. We can subscribe to these shelves and add it to our favorites. This is done by clicking on the Following "+" sign. It adds the shelf to your Favorites list on the left. Note that we are following the "Solaris stuff" folder. We can also add this as an RSS feed to our mail reader and get updates when the shelf is updated. 
We can then copy the RSS feed html and add it to our news reader or Thunderbird email interface. If we add this to our Thunderbird reader we get an email interface showing updates and new books added to the shelf. We don't need to go check the list on a regular basis but can look at the newsfeed section of our mail browser. I hope this simple diversion was a good break from our dive into DBaaS and PaaS. Being able to do more than just a simple Google search is typically required to find examples and tutorials. Books historically have been a good place to find this, and having access to not only my virtual bookshelf but other people's bookshelves, where they sort and index things, is a good thing. The $400 cost might be a bit prohibitive but the freedom is a good thing. Given that my company provides this subscription at no cost to me, I will continue to use this and read technology books on an airplane in offline mode and search as I am creating blog entries.


PaaS

database option - Data Guard part 2

Normally the part two of the option pack has been a hands on tutorial or examples liberally lifted from Oracle by Example, OpenWorld hands on labs, or the GSE Demo environment. I even went to an internal repository site and the database product manager site and all of the tutorials were for an existing database or labs on a virtual machine. None of these were easy to replicate and all of them assumed a virtual machine with a pre installed instance. None of the examples were form 2015 or 2016. If anyone knows of a hands on tutorial that uses the public cloud as at least one half of the example, let me know. There is a really good DR to the Cloud whitepaper that talks about how to setup Data Guard to the cloud but it is more of a discussion than a hands on tutorial. I typically steal screen shots and scripts from demo.oracle.com but the examples that exist in the GSE demo pool use Enterprise Manager 10g, servers inside of Oracle running an early version of 11g, or require a very large virtual image download. The closest thing that I could find as a hands on tutorial is Oracle by Example - Creating a Physical Standby. For this blog we will go through this tutorial and follow along with the Oracle Public Cloud as much as possible.Step One is to create an 11g database. We could do this on 12c but the tutorial uses 11g. If anyone wants to create a 12c tutorial, contact me and we can work on a workshop together. We might even be able to get it into the hands on labs at OpenWorld or Collaborate next year. Rather than going through all of the steps to create an 11g instance I suggest that you look at the May 4th blog entry - Database as a Service. Select 11g and High Performance Edition. We will call this database instance PRIM rather than ORCL. Your final creation screen should look likeWe want to create a second database instance. We will call this one ORCL and select High Performance Edition and 11g. The name does not matter as long as it is different from the first one. I am actually cheating on the second one and using a database instance that I created weeks ago. It is important to note while we are waiting on the database to finish that we can repeat this in Amazon but need to use EC2 and S3. We can also do this in Azure but in Azure Compute. We will need to provide a perpetual license along with Advanced Security and potentially Compression if we want to compress the change logs when we transmit them across the internet. It is also important to remember that there will be an outbound charge when going from one EC2 or Azure Compute instance to the other. If we assume that we have a 1 TB database and it changes 10% per day, we will ship 100 GB daily or being conservative and saying that we only get updates during the week and not the weekend we would expect 2 TB of outbound charges a month. Our cost for this EC2 service comes in at $320/month. If we use our calculations from our Database Options Blog post we see that the perpetual license amortized over 4 years is $2620/month. This brings the cost of the database and only Advanced Security to $2940. If we amortize this over 3 years the price jumps to $3,813/month. When we compare this to the Oracle High Performance Edition at $4K/month it is comparable but with High Performance Edition we also get eight other features like partitioning, compression, diagnostics, tuning, and others. Note in the calculator that the bulk of the processor cost is outbound data transfer. 
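The arithmetic behind that comparison is simple enough to sketch in a few lines of Python. The numbers are the rough assumptions used in this post (1 TB database, 10% daily change on weekdays, $320/month for EC2 and transfer, $2,620/month for the amortized license), not a formal quote.

# Rough monthly cost of running the standby in EC2 with a perpetual license
db_size_gb = 1024          # 1 TB primary database
daily_change = 0.10        # 10% of the database changes per weekday
weekdays_per_month = 20

outbound_gb = db_size_gb * daily_change * weekdays_per_month   # roughly 2 TB shipped out
ec2_and_transfer = 320     # assumed EC2 instance plus outbound transfer, $/month
license_4yr = 2620         # perpetual license plus Advanced Security amortized over 4 years

print(f"outbound per month: {outbound_gb:.0f} GB")
print(f"total (4 year amortization): ${ec2_and_transfer + license_4yr}/month")
# compare with High Performance Edition at roughly $4,000/month, which also bundles
# partitioning, compression, diagnostics, tuning, and other options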
It would be cheaper to run this with un-metered compute services in the Oracle cloud at $75/month. If we follow the instructions in the DR to Oracle Cloud whitepaper we see that the steps are:

Subscribe to Oracle Database Cloud Service
Create an Oracle instance
Configure Network
Encrypt Primary Database (Optional)
Instantiate Data Guard Standby
Perform Data Guard health check
Enable Runtime Monitoring
Enable Redo Transport Compression (Optional)

So far we have done steps one and two. When the database creation has finished we perform step 3 by going into the compute console and opening up port 1521, the dblistener service. We do this by going to the compute service and looking for our database instance name. In our example we hover over the service and find the dblistener service for prs11gPRIM. We select update and enable the port. Given that these are demo accounts we really can't whitelist the ip addresses and can only open the port to the public internet or leave it closed. We do this for both the primary and the standby database. Once we have this configured we need to look at the ip addresses for prs11gPRIM and prs11gHP. With these ip addresses we can ssh into the compute instances and create a directory for the standby log files. We could create these files in the /u02 partition with the data or the /u03 partition with the backups, but I suggest that you put them in the /u04 partition with the archive and redo logs. Once we have created these directories we can follow along with the Oracle By Example Physical Data Guard example starting at step 3. The network configuration is shown on page 12 of the DR to Oracle Cloud whitepaper. We can also follow along with this using prs11gPRIM as the primary and prs11gHP as the standby. Unfortunately, after step 5 the instructions stop showing commands and screen shots to finish the configuration. We are forced to go back to the OBE tutorial, modify the scripts that they give, and execute the configuration with the new names. Again, I am going to ask if anyone has a full tutorial on this using cloud services. It seems like every example goes half way and I am not going to finish it in this blog. It would be nice to see a 12c example and see how a pluggable database automatically replicates to the standby when it is plugged in. This would be a good exercise and workshop to run. My guess is this would be a half day workshop and could all be done in the cloud.


PaaS

database option - RMAN

Technically, database backup is not an option with database cloud services; it is bundled into the service as it is with on premise systems. Previously, we talked about backup and restore through the database cloud console. Unfortunately, before we can talk about Data Guard and how to set it up we need to dive a little deeper into RMAN. The first step in setting up Data Guard is to replicate data between two database instances. The recommended way of doing this is with RMAN. You can do this with a backup and recover option or a duplicate option. We will look primarily at the duplicate option. The topic of RMAN is a complex and challenging subject to tackle. There are many configuration options and ways to set up backups and data sets as well as many ways to recover rows, tables, or instances. Some books on RMAN include

Oracle Database 12c Oracle RMAN Backup & Recovery
RMAN for Absolute Beginners
Oracle RMAN 11g Backup and Recovery (Oracle Press)
RMAN Recipes for Oracle Database 11g: A Problem-Solution Approach (Expert's Voice in Oracle)
RMAN Recipes for Oracle Database 12c: A Problem-Solution Approach (Expert's Voice in Oracle)
Oracle RMAN Database Duplication
Oracle Backup and Recovery: Expert secrets for using RMAN and Data Pump (Oracle In-Focus) (Volume 42)
Oracle Backup and Recovery: All about Oracle Backup and Recovery
Oracle RMAN Pocket Reference

Fortunately, to set up Data Guard, we don't need to read all of this material but just need to know the basics. Unfortunately, we can't just click a button to make Data Guard work and automatically set up the relationships, replicate the data, and start log shipping. The key command that we need to get the backup from the primary to the standby is the RMAN command. We can execute this from the primary or the standby because RMAN provides a mechanism to remotely call the other system, assuming that port 1521 is open between the two database instances.

$ rman target user1/password1@system1 auxiliary user2/password2@system2

In this example user1 on system1 is going to back up the instance that it connects to by default and replicate it to system2 using the user2 credentials. This command can be executed on either system because we are specifically stating what the source is with the name target and what the standby is with the name auxiliary. Once we connect we can then execute

RMAN> duplicate target database for standby from active database;

What this will do is replicate the database on system1 and push it to system2. The command will also set up a barrier point in time so that changes to system1 are shipped from this point forward when you enable Data Guard.
According to Oracle Data Guard 11gR2 Administration Beginner's Guide (Chapter 2) the output of this command should look something like

Starting Duplicate Db at 26-JUL-12
using target database control file instead of recovery catalog
allocated channel: ORA_AUX_DISK_1
channel ORA_AUX_DISK_1: SID=19 device type=DISK
...
contents of Memory Script:
{
   backup as copy current controlfile for standby auxiliary format '/u02/app/oracle/oradata/orcl/control01.ctl';
   restore clone controlfile to '/u02/app/oracle/flash_recovery_area/orcl/control02.ctl' from '/u02/app/oracle/oradata/orcl/control01.ctl';
}
executing Memory Script
...
sql statement: alter database mount standby database
...
Starting backup at 26-JUL-12
using channel ORA_DISK_1
channel ORA_DISK_1: starting datafile copy
input datafile file number=00001 name=/u01/app/oracle/oradata/orcl/system01.dbf
output file name=/u02/app/oracle/oradata/orcl/system01.dbf tag=TAG20120726T160751
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:01:04
channel ORA_DISK_1: starting datafile copy
...
sql statement: alter system archive log current
contents of Memory Script:
{
   switch clone datafile all;
}
executing Memory Script
datafile 1 switched to datafile copy
input datafile copy RECID=2 STAMP=789667774 file name=/u02/app/oracle/oradata/orcl/system01.dbf
...
Finished Duplicate Db at 26-JUL-12

In this example we copied the system01.dbf file from system1 across the network connection and wrote it to /u02/app/oracle/oradata/orcl/system01.dbf on system2. Let's take a step back and talk about RMAN a little bit to understand what is going on here. If we look at Oracle RMAN Pocket Reference it details some of the benefits of using RMAN over file system copy or disk cloning to back up a database. These include

Incremental backups that only copy data blocks that have changed since the last backup.
Tablespaces are not put in backup mode, thus there is no extra redo log generation during online backups.
Detection of corrupt blocks during backups.
Parallelization of I/O operations.
Automatic logging of all backup and recovery operations.
Built-in reporting and listing commands.

I would add

Compression of data as it is written
Encryption of data as it is written
Tiered storage of backups to disk and a secondary target (cheaper disk, cloud, tape, etc)

When RMAN executes it creates a recovery catalog database, which is basically a table in the sys area that records the schema within the catalog database and the tables (and supporting objects) within the schema that contain data pertaining to RMAN backup and recovery operations performed on the target. It also stores details about the physical structure of the target database, a log of backup operations performed on the target database’s datafiles, control files, and archived redo log files, as well as stored scripts containing frequently used sequences of RMAN commands. When we execute a backup command we create a backup set that is written to the recovery catalog. The backup set is given a tag that we can reference and restore from. If we do daily incrementals we might want to use a part of the date to create the tag. We can restore to a specific point or date in time from our incremental backups. If we are worried about usernames and passwords being passed in via the command line or embedded in scripts we could store the password in the database with the orapwd command. This creates a username/password pair and stores it where RMAN can easily pull it from the database.
We do need to give the rmanadmin user rights to execute as SYSDBA but this is easily done with a grant command. Once we do this we can drop the username and password from the rman command and only pass in the username@system parameter. The key reason that you might want to do this is that invoking the command from the command line with the password exposes the password through the ps command, which can be executed by any user. Embedding the password with the orapwd command helps hide it. The nice thing about RMAN is that you can back up and restore parts of a database rather than all of it. You can execute

RMAN> backup tablespace system, user;
RMAN> backup '/u01/app/oracle/oradata/ORCL/system01.dbf';
RMAN> backup incremental level 4 cumulative database skip readonly;

which will back up the user and system tablespaces, back up the system01.dbf file and all of the tables that it includes, and back up the data that has changed since the last level 4 backup, using previous lower level backups to aggregate changes into the current backup. Note that these three commands do significantly different things. We can look at what we have backed up using

RMAN> list backups;

to see our backup sets and where they are stored. When we looked at the database backup cloud service we went through a backup and recovery. If we had done a list backups after this backup we would have noticed that the data written to SBT_TAPE was really written to cloud storage and potentially to local disk. We can then point our standby system to this backup set and restore into our database instance. This is done by importing the catalog or registering the target when we do the backup. The registration is done with a command like

$ rman target / catalog rman_901/rman_901@rman_catalog

where we are backing up a local database signified by the "/" and adding the host rman_catalog with username rman_901 and password rman_901. My recommendation is to look at chapter 12 of Oracle Database 12c Oracle RMAN Backup & Recovery because it details how to use the duplicate option for RMAN. This is key to setting up Data Guard because it replicates data from a primary system onto a standby system just prior to starting the Data Guard services. The command could be as simple as

RMAN> duplicate target database to standby1;

This will duplicate your existing instance from your on premise system to a cloud or other on premise instance identified by the label standby1. This typically correlates to an ip address of a secondary system and could be a short name like this or a fully qualified domain name. We could get more complex with something like

RMAN> duplicate target database to standby1 pfile=/u02/app/oracle/oradata/ORCL/init.ora log_file_name_convert=('primary','standby');

This will do the same thing that the previous command did but read the init.ora file for the ORCL instance and convert anything in /u02/app/oracle/oradata/ORCL/primary on our existing system to /u02/app/oracle/oradata/ORCL/standby on our target standby1 system. This is an easy way to replicate data from a PDB called primary to a PDB called standby prior to setting up a Data Guard relationship. The steps recommended to create and configure an RMAN copy are:

On your standby server, build your auxiliary database directory structures (aka your target directory)
On your primary server, make a copy of the target init.ora file so that it can be moved to the standby server
Move the target init.ora file to the auxiliary site with scp or other software to copy files
Start or restart the standby instance in NOMOUNT mode
Configure the listener.ora at the standby site
Configure the tnsnames.ora file at the primary site
Create a password file at the standby server
Move the FRA files from primary to standby
From the primary system, run your duplicate command within RMAN

You can add parameters to allow for parallel copies of the data. You probably should not compress or encrypt the data since we will be pulling it from the backup and writing it into a database. We could potentially compress the data but it will not compress the data on the target system, only compress it for transmission across the internet or local network. In summary, we needed to dive a little deeper into RMAN than we did before. RMAN is needed to duplicate data from our primary to the target prior to log shipping. There are some complexities associated with RMAN that we exposed, and the steps needed to get a secondary site ready with RMAN are not trivial and need an experienced operating system admin and DBA to get this working. One of the new features of provisioning a cloud database service is a checkbox to create a Data Guard replica in another data center. Installing a 12.2.2 database instance is also rumored to include a clone to cloud with Data Guard checkbox. As you install a new on premise database or in cloud database these complex steps are done behind the scenes for you as you would expect from a platform as a service model. Amazon claims to do this with site to site replication and restarting the database in another zone if something happens, but this solution requires a reconnection from your application server and forces your users to reauthenticate and reissue commands in flight. Using Data Guard allows your application server to connect to your primary and standby databases. If the primary fails or times out, the application server automatically connects to the standby for completion of the request. All of this is dependent upon RMAN working and replicating data between two live databases so that log shipping can assume that both servers are in a known state with consistent data on both systems.
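For completeness, the duplicate step itself can be scripted once the listener, tnsnames, password file, and NOMOUNT standby are in place. Here is a minimal sketch in Python that feeds an RMAN script to the rman binary with subprocess; the connect strings and TNS aliases are placeholders, not values from a working configuration.

import subprocess

# Placeholder credentials and TNS aliases for the primary (target) and standby (auxiliary)
rman_script = """
connect target sys/password1@primary
connect auxiliary sys/password2@standby1
duplicate target database for standby from active database;
exit;
"""

# Feed the script to RMAN on stdin; assumes the Oracle client is on the PATH
result = subprocess.run(["rman"], input=rman_script, text=True, capture_output=True)
print(result.stdout)
if result.returncode != 0:
    print(result.stderr)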


PaaS

database option - Data Guard

To steal liberally from Larry Carpenter's book on Data Guard, Data Guard is a product of more than 15 years of continuous development. We can trace the roots of today’s Data Guard as far back as Oracle7 in the early 1990s. Media recovery was used to apply archived redo logs to a remote standby database, but none of the automation that exists today was present in the product.Today we are going to look at the material on Data Guard and discuss the differences between Data Guard, Active Data Guard, and Golden Gate. We are going to look at what it takes to replicate from an on premise system to the cloud and from the cloud to an on premise system. It is important to know that you can also synchronize between two cloud instances but we will not cover this today. If we look at the books that cover this topic they include Oracle Data Guard 11g Handbook (Oracle Press) Oracle Data Guard 11gR2 Administration Beginner's Guide Oracle Database 11g Release 2 High Availability: Maximize Your Availability with Grid Infrastructure, RAC and Data Guard Oracle Dataguard: Standby Database Failover Handbook (Oracle In-Focus series) Oracle Database 10g High Availability with RAC, Flashback, and Data Guard (Osborne ORACLE Press Series) Oracle Database 12c The Complete Reference (Oracle Press)Note that there are not any 12c specific books written on Data Guard. This is primarily due to the technology not changing significantly between the 11g and 12c releases. The key new release in 12c is far sync support. We will cover that more later. There are also books written on Active Data Guard and Golden Gate as well Oracle GoldenGate 11g Handbook Oracle GoldenGate 12c Implementer's Guide Zero Downtime Database Upgrade & Active Active Replication Using Oracle GoldenGate 11g Release 2 Pro Oracle GoldenGate for the DBA Oracle Goldengate 11g Complete Cookbook Expert Oracle GoldenGate (Expert's Voice in Oracle)If we take a step back and look at high availability, Data Guard is used to provide this functionality between systems. Oracle Data Guard provides the management, monitoring, and automation software to create and maintain one or more standby databases to protect Oracle data from failures, disasters, human error, and data corruptions while providing high availability for mission critical applications. Data Guard is included with Oracle Database Enterprise Edition and in the cloud the High Performance Edition. Oracle Active Data Guard is an option for Oracle Database Enterprise Edition and included in the Extreme Performance Edition in the cloud.The home page for Data Guard provides links to white papersTechnical OverviewData Guard to CloudCompare and contrast to disk mirroring12c Data Guard Best Practice - OpenWorld 2015Setting up RACOne and Data GuardThere are also a significant number of blogs covering high availability and Data Guard.Maximum Availability BlogDatabase InsiderRaheel's BlogThe Oracle InstructorPavan's DBA BlogMahir Quluzade's BlogEmere Baransel's BlogMy recommendation would be to attend the Oracle Education Class or follow one of the two tutorials that cover Basic Data Guard Features and Active Data Guard Features. In both of these tutorials you learn how to use command line features to configure and setup an active - standby relationship between two databases. Larry Carpenter has done a really good job of detailing what is needed to setup and configure two database instances with these tutorials. 
The labs are a bit long (50+ pages) but cover the material very well and work with on premise systems or cloud systems if you want to play.

The key concepts around Data Guard are the mechanisms for replication and how logs are shipped between systems. The basic foundation of Data Guard centers around replication of changes. When an insert or update is made to a table, this change is captured by the log writer and replicated to the standby system. If the replication mechanism is physical replication, the data blocks changed are copied to the standby system. If the replication mechanism is logical replication, the sql command is copied to the standby system and executed. Note that the select or read statements are not recorded and copied, only the commands that write to storage or update information in the database. By capturing the changes and shipping them to the standby system we can keep the two systems in synchronization. If a client is trying to execute a select statement on the primary database and the primary fails or goes offline, the select statement can be redirected to the standby for the answer. This results in seconds of delay rather than minutes to hours as is done with disk replication or recovery from a backup. How the replication is communicated to the standby system is also configurable. You can configure a write and release mechanism or a wait for commit mechanism. With the write and release mechanism, the logs are copied to the standby system and the primary system continues operation. With the wait for commit mechanism, the primary stalls until the standby system commits the updates.

Significant improvements were made in 11g with the log writer service (LNS) and the redo apply service (RNS). The LNS has the ability to delay the shipping of the logs in the asynchronous update mode and can compress the logs. The RNS knows how the LNS is configured and can decompress the logs and apply them as was done before. This delay allows the LNS to look for network congestion and ship the logs when the network is not so overloaded. The compression allows the packet size to be smaller to reduce contention on the network and make the replication more efficient. It is important to note that you can have a single LNS writing to multiple RNS targets to allow for replication not in a one to one configuration but in a one to many configuration. It is also important to note that this technology is different from table cloning or data masking and redaction that we talked about earlier. The assumption is that there is a master copy of the data on the target system and we only ship changes between the systems when an update occurs on the primary.

The key difference between Data Guard and Active Data Guard is the state of the target database. With Data Guard, the database can not have any active sessions other than the RNS agent. You can not open the database for read only to do backups or analytics. Having active sessions blocks the RNS agent from committing the changes into the database. Active Data Guard solves this problem. The RNS agent understands that there are active connections and can communicate changes to the active sessions if they are reading data from updated areas. A typical SQL connection uses buffering to minimize reads from the disk. Reads are done from an in memory buffer to speed up requests. The problem with reading data on a standby system is invalidation of these buffers. With Data Guard, there is no mechanism to invalidate buffers of sessions on other connections.
With Active Data Guard, these mechanisms exist and updates are not only written to the disk but the cache for the other connections is updated.

Golden Gate is a more generic case of Active Data Guard. One of the limitations of Data Guard is that you must have the same chip set, operating system, and database version for replication. Translations are not done when changes are shipped from primary to standby. You can't, for example, replicate from a Sparc server to an X86 server running the same version of the Oracle database. One uses little endian while the other uses big endian to store the bits on disk. Physical replication between these two systems would require a byte translation of every change. Data Guard does not support this but Golden Gate does. Golden Gate allows you to not only ship changes from one database instance to a different chip architecture but to a different chip architecture on a different operating system running a different database. Golden Gate was originally created to replicate between database engines so that you could collect data with SQL Server and replicate the data to an Oracle database or MySQL database, letting you do analytics on a different database engine than your data collection engine. With Golden Gate there is a concept similar to the LNS and RNS but the agents are more intelligent and promote the data type to a master view that can be translated into the target type. When we define an integer it might mean 32 bits on one system but 64 bits on another system. Golden Gate is configured to pad from 32 to 64 bits and truncate from 64 to 32 bits appropriately based on your use cases and configurations.

To replicate between two systems we basically need an open port from the primary system to the standby system to ship the logs. We also need a landing area to drop the change logs so that the RNS can pick up the changes and apply them. This prevents Amazon RDS from enabling Data Guard, Active Data Guard, or Golden Gate since you do not have file system access. To run Data Guard in Amazon or Azure you need to deploy the Oracle database on a compute or IaaS instance and purchase the perpetual license with all of the options associated with the configuration. The beautiful thing about Data Guard is that it uses the standard port 1521 to communicate between the servers. There are special commands developed to configure and set up Data Guard that bridge between the two systems. As data is transmitted it is done over port 1521 and redirected to the RNS agent. We can either open up a network port in the cloud or create an ssh tunnel to communicate to our standby in the cloud. The communication works in both directions so we can flip which is primary and which is standby with a command or a push of a button in Enterprise Manager.

The important conversation to have about data protection is not necessarily whether we have a copy of the data somewhere else. We can do that with RMAN backups or file backups to replicate our data in a safe and secure location. The important conversation to have is how long we can survive without access to the data. If an outage will cost us thousands per minute, we need to look at more than file replication and go with parallel database availability. Data Guard provides this mechanism to keep an active database in another location (on premise or in the cloud) and provides not only a disaster recovery solution but a way of offloading services from our primary production system.
We can break the replication for a few hours and stop the redo apply on the standby while we do a backup. The logs will continue to be shipped, just not applied. When the backup is finished we grind through the logs and apply the changes to the standby. We have a window of vulnerability but we have this while we are running backups on our primary system as well. We can now offload the backups to our standby system and let the primary continue to run as needed without interruption. In effect what this does is take all of the small changes that happen throughout the day and ship them to a secondary system so there is a trickle effect on performance. If we do an incremental backup at night we basically block the system while we ship all these changes all at once.

In summary, Data Guard is included with the High Performance Edition of the database and is a free part of any on premise Enterprise Edition database. Active Data Guard is included with the Extreme Performance Edition of the database and can be matched to synchronize an on premise or in cloud database that is also licensed to run Active Data Guard. There is a ton of reference material available on how Data Guard, Active Data Guard, and Golden Gate work. There are numerous tutorials and examples on how to configure and set up the service. It is important to know that you can use the cloud for this replication and a DR to the Cloud whitepaper is available detailing how to do this.
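To make the transport and apply discussion a little more concrete, here is a minimal sketch of the kind of commands involved. The TNS alias and DB_UNIQUE_NAME of orcl_stby are made up for illustration, and compressing redo in flight requires the Advanced Compression option; this is not a full Data Guard setup, just the two pieces discussed above.

-- On the primary: ship redo asynchronously (write and release) with compression
ALTER SYSTEM SET log_archive_dest_2='SERVICE=orcl_stby ASYNC COMPRESSION=ENABLE VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=orcl_stby' SCOPE=BOTH;
-- Use SYNC instead of ASYNC if you want the wait for commit behavior

-- On the standby: pause redo apply while a backup runs (redo is still received, just not applied)
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
-- When the backup finishes, resume applying the queued redo
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;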


PaaS

database option - multi tenant part 2

Yesterday we looked at the concept behind multi-tenant and talked about the economics and operational benefits of using the option. Today we are going to look at examples and technical details. Fortunately, this is a hot topic and there are a ton of interviews, tutorials, and videos showing you how to do this. The best place to start is Oracle Technology Network - multitenant. This site lists seven offerings from Oracle Education, six online tutorials, and four video demos. Unfortunately, most books are a little light on this topic and bury it in a chapter near the end. The most recent books cover this topic directly:

- Keeping Up With Oracle Database 12c Multitenant - Book One
- Oracle Database 12c Release 2 Multitenant
- Building Database Clouds in Oracle 12c

Two of these books are pre-order and the third is only a few months old. The other books talk about it as an abstract term with little or no examples. Safari Books does not have many that cover this subject because the topic is so new and few books have been published on the topic. The Oracle Base Blog has a series of postings about multitenant and does a really good job of showing diagrams and sample code. There is a significant amount of information at this site (24 posts) looking at the subject in depth. I normally provide a variety of links to other bloggers but I think that this work is good enough to deserve top billing by itself.

Internal to Oracle the GSE Demo pages have a few demos relating to multi-tenant:

- PaaS - Data Management (Solutions Demo) has a hands on tutorial in the cloud
- DB12c Multi-Tenant Workshop by William Summerhill is on retriever.us.oracle.com
- Oracle Database 12c multitenant and in-memory workshop using OPC by Ramulu Posham on retriever.us.oracle.com

For the rest of this blog I am going to go through the workshop by Ramulu Posham because it is the most complete and does everything 100% in the cloud. We could do this on the Oracle Public Cloud using DBaaS, or a database installed in IaaS on Oracle, Amazon, or Azure. We can not do this on Amazon RDS because they disable multi-tenant and prohibit it from working.

The schedule for the workshop is

- 9:00 - 9:15 intro
- 9:15 - 10:15 cloud intro
- 10:30 - 2:00 multi-tenant workshop

The workshop consists of creating two pluggable database instances in the cloud and looking at pluggable creation, cloning, and snap cloning. The assumption is that you have a public cloud account and can create two 12c databases in the cloud with less than 100 GB of disk space. You can do this on two 1 OCPU / 7.5 GB instances but you require High Performance or Extreme Performance Edition to get multi-tenant to work. The only software that we will need for our Windows 2012 IaaS instance is SwingBench, which helps put a load on the database and allows us to look at utilization of resources for a container database and our pluggable instances.

The flow of the workshop is shown in the following slide. We are going to create a database with data in the container and another database and put both instances in a pluggable database on one instance. Some of the more interesting slides from the presentation are shown below. The file location slide helped me understand where resources get allocated. The redo logs, for example, are part of the container database and not each pluggable. You set up Data Guard for all pluggables by configuring the container and replication happens automatically. The discussion on cloning a database is interesting because you don't need to copy all of the data.
You only copy the header information and reference the same data between the original and the clone. Changes are tracked with file links as they are updated on both sides. The managing slide helped me understand that there is still a DBA for each pluggable as well as a master DBA for the container. Seeing that in a picture helped me understand it better. There are also multiple slides on resource management and shares. I pulled a representative slide as well as the summary benefits slide. This is what is covered in the first hour prior to the hands on labs starting. To start the lab, we create a 12c High Performance instance called salessvc$GC where $GC is a substitute for each lab participant. We will use 01 as our example so we will create salessvc01. Note that we call the database instance salessvc01, the ORACLE_SID salesc01, and the pluggable container sales01. We can not have the ORACLE_SID and the pluggable instance the same because it will confuse the listener so those names must be different. The creation takes between 30 minutes and an hour on our demo accounts. While we are waiting we will create a second instance with the name cmrsvc01 with similar parameters using the SID of crms01 and a pluggable container of crm01.Once we have the ip address of the two instances we can create an ssh tunnel for ports 443, 5500, 80, and 1521. This is done by creating an ssh tunnel in our putty client. The presentation material for the labs go through with very good screen shots for all of these steps. We have covered all of this before and are just summarizing the steps rather than detailing each step. Refer to previous blog entries or the workshop notes on how to do this.The sales instance looks like the screen shots below. We configure the ports and look at the directory structure in the /u02/app/oracle/oradata directory to verify that the sales01 pluggable database was created under the container database salesc01.Once we have the database created and ports configured we download and launch SwingBench to load data into the database and drive loads to test response time and sql performance. We need to download SwingBench and Java 8 to execute the code properly. Once we download SwingBench we unzip it and install it with a java command.The only true trick in the install is that we must execute lsnrctl status on the database to figure out what the listener is looking for in the connection string. We do this then type in localhost:1521 and the connection string to populate the database with SwingBench.We repeat this process for the cmrc01 instance, repeat the SwingBench install, and unplug the soe database from the crmc01 pluggable database to the salessvc01 database service and plug it in as a pluggable. The big issue here is having to unplug and copy the xml file. It requires uploading the private key and allowing ssh between the two instances. Once this is done the SwingBench is run against both instances to see if performance improves or decreases with two pluggables on the same instance. The instructions do a good job of walking you through all of this. Overall, this is a good workshop to go through. It describes how to create pluggable containers. It describes how to unplug and clone PDBs. It is a good hands on introduction and even brings in performance and a sample program to generate a synthetic load.
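The workshop drives the tunnels through putty; if you prefer a command line ssh client, the equivalent looks roughly like the sketch below. The key path and IP address are placeholders for your own instance, and forwarding local ports below 1024 may require admin rights on your desktop.

# Open tunnels for the ports the workshop uses (443, 5500, 80, 1521)
ssh -i ~/.ssh/cloud_key opc@<instance_ip> \
    -L 443:localhost:443 -L 5500:localhost:5500 \
    -L 80:localhost:80 -L 1521:localhost:1521

# On the database host, confirm the service name the listener expects
# before building the SwingBench connection string
lsnrctl status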


PaaS

database option - multi tenant

Before we dive into what multi-tenant databases are, let's take a step back and define a few terms. With an on premise system we can have a computer loaded with a database or a series of databases. Historically the way that this was done was by booting the hardware with an operating system and loading the database onto the operating system. We load the OS onto the root file system or "/" in Unix/Solaris/Linux. We create a /u01 directory to hold the ORACLE_HOME or binaries for the database. Traditionally we load the data into /u02 or keep everything in /u01. Best practices have shown us that splitting the database installation into four parts is probably a good idea. Keeping everything in the root partition is not a good idea because you can fill up the partition with database files and lock up the operating system at the same time. We can put the binaries into /u01 and do a RAID-5 or RAID-10 stripe for these binaries. We can then put all of our data into /u02 and map the /u02 file system to flash or high speed disk to improve performance since this has high read and write performance requirements. We can RAID-5 or RAID-10 this data to ensure that we don't lose data, or use a more advanced striping technology provided by a hardware disk vendor. We then put our backups into /u03 and do a simple mirror for this partition. We can go with a lower performing disk to save money on the installation and only keep data for a few days/weeks/months then delete it as we get multiple copies of this data. We might replicate it to another data center or copy the data to tape and put it into cold storage for compliance requirements as well as disaster recovery fallbacks. If we are going to replicate the data to another data center we will create a /u04 area for change logs and redo logs that will be shipped to our secondary system and applied to the second system to reduce recovery time. Backups give us recovery to the last backup. A live system running Data Guard or Active Data Guard gets us back to within a few seconds or a transaction or two rather than hours or days back.

The biggest problem with this solution is that purchasing a single system to run a single database is costly and difficult to manage. We might be running at 10% processor utilization the majority of time but run at 90% utilization for a few hours a week or few days a month. The system is idle most of the time and we are paying for the high water mark rather than the average usage. Many administrators overload a system with databases that have different peak usage times and run multiple database instances on the same box. If, for example, our accounting system peaks on the 25th through the 30th and our sales system peaks on the 5th through the 10th, we can run these two systems on the same box, resource limit each instance during the peak periods, and let them run at 20% the rest of the month. This is typically done by installing two ORACLE_HOMEs in the /u01 directory. The accounting system goes into /u01/app/oracle/production/12.1.0/accounting and the sales system goes into /u01/app/oracle/production/12.1.0/sales. Both share the /u02 file system as well and put their data into /u02/app/oracle/oradata/12.1.0/accounting and /u02/app/oracle/oradata/12.1.0/sales. Backups are done to two different locations and the replication and redo logs are similarly replicated to different locations. Having multiple ORACLE_HOMEs has been a way of solving this problem historically for years.
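As a rough illustration of what juggling two homes looks like day to day, the shell sketch below switches between them; the SID names acct and sales are made up for this sketch, and in practice you would normally use the oraenv script Oracle ships to reset the environment cleanly.

# Point the environment at the accounting home
export ORACLE_HOME=/u01/app/oracle/production/12.1.0/accounting
export ORACLE_SID=acct
export PATH=$ORACLE_HOME/bin:$PATH
sqlplus / as sysdba        # connects to the accounting instance

# Repoint the environment at the sales home
export ORACLE_HOME=/u01/app/oracle/production/12.1.0/sales
export ORACLE_SID=sales
export PATH=$ORACLE_HOME/bin:$PATH
sqlplus / as sysdba        # the same command now connects to the sales instance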
The key drawback is that patching can get troublesome if specific options are used or installed. If, for example, both use ASM (automated storage management) you can't patch one database without patching ASM for both. This makes patch testing difficult on production systems because suddenly sales and accounting are tied together and upgrades have to be done at the same time. Virtualization introduced a solution to this by allowing you to install different operating systems on the same computer and sublicense the software based on the virtual processors assigned to the application. You suddenly are able to separate the storage interfaces and operating system patches and treat these two systems as two separate systems running on the same box. Unfortunately, the way that the Oracle database is licensed has caused problems and tension with customers. The software does not contain a license key or hardware limit and will run on what is available. Virtualization engines like VMWare and HyperV allow you to soft partition the hardware and dynamically grow with demand. This is both good and bad. It is good because it makes it simpler to respond to increase workloads. It is bad because licensing is suddenly flexible and Oracle treats the maximum number of cores in the cluster as the high water mark that needs to be licensed. This is called soft partitioning. Operating systems like Solaris and AIX have hard partitions and virtualization engines like OracleVM and ZEN provide hard partitions. Customers have traditionally solved this by running an Oracle instance on a single socket or dual socket system to limit the core count. This typically means that the most critical data is running on the oldest and slowest hardware to limit price. Alternatively they run the database on a full blade and license all cores in this blade. This typically causes a system to be overlicensed and underutilized. The admin might limit the core count to 8 cores but there could be 32 cores in the blade and all 32 cores must be licensed. Using a virtualization engine to limit the resources between database instances is not necessarily practical and not fine enough resolution. Going with multiple ORACLE_HOME locations has been a growing trend since you have to license all of the cores based on current licensing policies. Another big problem with the multiple ORACLE_HOME or multiple operating system approach is that you have multiple systems to manage and patch. If we use the 32 core system to run four instances of application databases we have four patches to make for the virtualization engine, the operating systems, and the databases. An optimum solution would be to run one operating system on all 32 cores and spread the four databases with one ORACLE_HOME across each and resource limit each instance so that they don't become a noisy neighbor for the other three. We can then use resource manager to assign shares to each instance and limit the processor, memory, and network bandwidth based on rules so that noisy neighbors don't stop us from getting our job done. We get our shares and can be the noisy neighbor if no one else is using resources.With the 12c instance of the database, Oracle introduced an option called multi-tenant. Let's think of a company like SalesForce.com. They don't spin up a new instance for each company that they do business with. They don't install a new ORACLE_HOME for each company. They don't spin up a new operating system and install a new database instance for each company. 
This would not make economic sense. A five person company would have to spend about $3K/month with SalesForce to cover just the cost of the database license. On the flip side, custom code must be written to isolate user from company A from reading customer contact information from company B. A much simpler way would be to spin up a pluggable database for company A and another for company B. No custom code is required since the records for the two companies are stored in different directories and potentially different disk locations. If we go back and look at our partitioning blog entry we notice that we have our data stored in /u02/app/oracle/oradata/ORCL/PDB1. The ORCL directory is the location of our container database. This contains all of the configuration information for our database. We define our listener at this location. We create our RMAN backup scripts here. We define our security and do auditing at this level. Note that we have a PDB1 subdirectory under this. This is our pluggable database for company A. We would have a PDB2 for company B and the system01.dbf file in that directory is different from the system01.dbf file located in the PDB1 directory. This allows us to create unique users in both directories and not have a global name issue. With SalesForce all usernames must be unique because users are stored in a master database and must be unique. I can not, for example, create a user called backupadmin that allows users to log in to company A and backup the data set if there is a user defined by that same name for any other company world wide. This creates script issues and problems. We can't create a single backup script that works across all companies and must create a unique user and script for each company. The main concept behind the multi-tenant option is to allow you to run more databases on the same box and reduce the amount of work required to support them. By putting common tasks like backup and restore at the container level, all pluggables on this system are backed up in a central location but separated by the pluggable container so that there is no data mingling. Data can be replicated quickly and easily without having to resort to backup and restore onto a new instance. The system global area (SGA) is common for the container database. Each pluggable container gets their own personal global area (PGA) that manages I/O buffers, compiled sql statements, and cached data. Note that we have one redo log and undo log area. As changes are made they are copied to a secondary system. We don't have to configure Data Guard for each pluggable instance but for the container database. When we plug a instance into a container it inherits the properties of the container. If we had a container configured to be RAC enabled, all pluggables in the database instance would be RAC enabled. We can use the resource manager in the container database to limit the shares that each pluggable instance gets and reduce the noisy neighbor overlap that happens on a virtual machine configuration. We also reduce the patching, backup, and overall maintenance required to administer the database instance.To create a pluggable instance we need to make sure that we have requested the High Performance or Extreme Performance Edition of the database. The Standard Edition and Enterprise Edition do not support multi-tenant. It is important to note that to get this same feature on Amazon you can not use RDS because they prohibit you from using this option. 
You must use IaaS and go with Amazon EC2 to get this feature to work. Microsoft Azure does not offer the Oracle database at the platform level so your only option is Azure Compute. The pluggable creation is simple and can be done from the command line through sqlplus. The 12c Database Documentation details this process.

CREATE PLUGGABLE DATABASE salespdb ADMIN USER salesadm IDENTIFIED BY password;

or

CREATE PLUGGABLE DATABASE salespdb ADMIN USER salesadm IDENTIFIED BY password ROLES=(DBA);

or, more complex,

CREATE PLUGGABLE DATABASE salespdb ADMIN USER salesadm IDENTIFIED BY password
  STORAGE (MAXSIZE 2G MAX_SHARED_TEMP_SIZE 100M)
  DEFAULT TABLESPACE sales
    DATAFILE '/disk1/oracle/dbs/salespdb/sales01.dbf' SIZE 250M AUTOEXTEND ON
  PATH_PREFIX = '/disk1/oracle/dbs/salespdb/'
  FILE_NAME_CONVERT = ('/disk1/oracle/dbs/pdbseed/', '/disk1/oracle/dbs/salespdb/');

Note that we can make the creation simple or define all of the options and file locations. In the last example we create the pluggable instance by cloning the existing pdbseed. In our example this would be located in /u02/app/oracle/oradata/ORCL. We would pull from the pdbseed directory and push into the salespdb directory. All three examples would do this but the third details all options and configurations.

When we create the instance from the sqlplus command line it picks a directory name for the file system on its own, so we might want to use the more complex configuration. When we executed this from the command line we got a long string of numbers for the directory name of our new pluggable instance called salespdb. We could do the same thing through SQL Developer and have it guide us through the renaming steps. It prompts us for the new file name showing where the seed is coming from. We could just as easily have cloned the salespdb and used it as our foundation rather than creating one from the pdbseed. We right click on the container database header and it prompts us to create, clone, or unplug a pluggable. If we select create we see the following sequence.

One thing that we did not talk about was economics. If you wanted to run multi-tenant on premise you need to purchase a database license at $47.5K per two processors and the multi-tenant option at $23K per two processors as well. This comes in at $60.5K for the license and $13,310 per year for support. Using our four year cost of ownership this comes in at $2,495 per month for the database license. The High Performance Edition comes in at $4K per month. Along with this you get about $5K in additional features like diagnostics, tuning, partitioning, compression, and a few other features that we have not covered yet. If you are going to run these options on Amazon or Azure you will need to budget the $2.5K for the database license and more if you want the other features on top of the processor and storage costs for those cloud services. You should also budget the outgoing data charges that you do not have to pay for with the non-metered database service in the Oracle Cloud. Going with the multi-tenant option is cheaper than running the database on two servers and easier than running two ORACLE_HOME instances on the same machine. Going with the High Performance Edition gets you all of these options and offloads things like scale up, backup, initial configuration, and restart of services if a process fails.

In summary, multi-tenant is a good way of overloading services on a single server.
The resource management features of the container allow us to dynamically change the allocation to a pluggable database and give more resources to instances that need it and limit noisy neighbors. With the High Performance edition and Extreme Performance Edition we get multi-tenant as a foundation for the service. Our primary interface to create a pluggable instance is either SQL Developer, Enterprise Manager, or sqlplus. We can easily clone an existing instance for a dev/test replica or export an instance and plug it into another system. We will look at this more in depth tomorrow.
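As a hedged sketch of the cloning and unplugging mentioned above, the commands below reuse the salespdb name from the documentation example and a made-up clone name of salesdev; in 12.1 the source PDB has to be opened read only before it can be cloned, and the paths are placeholders.

-- Clone an existing PDB for a dev/test replica
ALTER PLUGGABLE DATABASE salespdb CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE salespdb OPEN READ ONLY;
CREATE PLUGGABLE DATABASE salesdev FROM salespdb
  FILE_NAME_CONVERT = ('/salespdb/', '/salesdev/');
ALTER PLUGGABLE DATABASE salesdev OPEN;

-- Unplug a PDB so its XML manifest and data files can be plugged into another container
ALTER PLUGGABLE DATABASE salespdb CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE salespdb UNPLUG INTO '/u02/app/oracle/oradata/salespdb.xml';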


PaaS

database options - review and looking forward

For the past two weeks we have been looking at database as a service (DBaaS) offered as platform as a service (PaaS) on the Oracle Public Cloud. We started this journey on the 2nd of this month by looking at the differences between Standard Edition, Enterprise Edition, High Performance Edition, and Extreme Performance Edition. To quickly review, Standard Edition is the basic database with table encryption as the only option. This is a full feature database without the ability to replicate data with any tool other than copying files and RMAN backup. You can't do things like transportable tablespaces, streams, Data Guard, or any other replication technologies to make a system more highly available. Enterprise Edition is a more full featured database that allows for data replication from the installation and comes with Advanced Security (TDE) as a basis for the installation. This edition does not come with Data Guard but does have the option for transportable tablespaces, external references, and ways of replicating data manually from the database that Standard Edition does not contain. We then looked at the High Performance Edition which comes with

- Transparent Data Encryption
- Diagnostics
- Tuning
- Partitioning
- Advanced Compression
- Advanced Security
- Data Guard
- Label Security
- Multitenant
- Audit Vault
- Database Vault
- Real Application Testing
- OLAP
- Spatial and Graphics

We then looked at Extreme Performance Edition that contains all of the above plus

- Active Data Guard
- In Memory
- Real Application Clusters (RAC)
- RAC One

We then went into a simple description of each option and prepared for the following blogs that will go into more detail and code samples of not only what the options are but look at how to use them and try to tie them back to business benefits. Part of our description was a financial analysis of running a database in infrastructure as a service (IaaS) vs PaaS and the time and efficiency benefits that we get from PaaS over IaaS.

We wrapped up the week on the 3rd with a blog detailing what it takes to get a Windows desktop prepared to use a database in the cloud. The list of software is relatively generic and is not unique to Windows. We could just as easily have selected MacOSX or Linux but selected a Windows 2012 Server running in the Oracle Public Compute Cloud as IaaS. We did this primarily so that we would have a teaching platform that can be saved with a snapshot, reloaded for hands on classes, and accessible from a customer site to demonstrate cloud services. The software that we loaded on the Windows platform includes

To access cloud storage
- Mozilla Firefox
- RestClient extension for Firefox
- Google Chrome
- Postman extension for Chrome
- CloudBerry for OpenStack (Windows only right now)

To access files in our instance
- Putty
- Filezilla
- Cygwin (only needed on Windows)

To access our database instance
- SQL Developer
- Microsoft Visual C++ libraries (only needed on Windows)

We installed this on the Oracle Public Cloud because we have free accounts on this platform and we can keep them persistent. It would normally cost $150/month to keep this instance active if we purchased it as IaaS. We could just as easily have done this on Amazon EC2 or Microsoft Azure at a similar cost. We provisioned 30 GB of disk for the operating system and 10 GB for the binaries. We requested a 2 vCPU instance with 30 GB of RAM. If we were doing this on Amazon or Azure we probably would have gone for a smaller memory footprint but this is the base configuration for IaaS and Windows with 2 vCPUs in the Oracle Public Cloud.
The idea is that a class of 15-30 people can log into this system with different user names and do minimal configuration to get started on workshops. We can refresh the users but not refresh the system for classes the following day or week. To provision this instance we went to the Oracle Cloud Marketplace to get a pre-configured Windows 2012 Server instance. We then downloaded the list of software and installed it on the desktop.

On the 6th we dove into database partitioning to figure out that we can reduce storage costs and improve performance by fitting active data into memory rather than years or months of data that we typically throw away with a select statement. We talked about using partitioning to tier storage on premise and how this makes sense in the cloud but does not have as much impact as it does on premise. We talked about different partitioning strategies and how it can be beneficial to use tools like Enterprise Manager and the partition advisor to look at how you are accessing the data and how you might partition it to improve performance. On the 7th we looked at code samples for partitioning and talked about table extents and file system storage. We talked about the drawbacks to Amazon RDS and how not having file system access, having to use custom system calls, and not having sys access to the database causes potential issues with partitioning. We walked through a range partition example where we segmented the data into dates and stored the different dates into different tablespaces.

On the 8th we focused on database compression. This in conjunction with partitioning allows us to take older data that we typically don't access and compress it for query or historical storage. We talked about the different compression methodologies:

- using historic access patterns (heat map and ADO options)
- using row compression (by analyzing update and insert operations as they occur)
- file compression (duplicate file links and file compression of LOBs, BLOBs, and CLOBs)
- backup data compression
- Data Guard compression of redo logs before transmission
- index compression
- network transmission compression of results to client systems
- hybrid columnar compression (Exadata and ZFS only)
- storage snapshot optimization (ZFS only)

We did not really dive into the code for compression but referred to a variety of books and blogs that have good code samples. We did look at the compression advisor and talked about how to use it to estimate how your mileage could potentially vary. On the 9th we dove into an Oracle by Example tutorial on compression and followed the example using DBaaS. The examples that we followed were for an 11g instance but could have been done in a 12c instance if we had the demo tables installed on the 12c instance.

On the 10th we focused on database tuning options and dove into how to use SQL Tuning Advisor. In the example that we used we referenced an external table that was not part of the select statement which caused an unnecessary table index and full table scan. The example we used was again for 11g to utilize the sample database that we have installed but could just as easily have worked with 12c. On the 13th we dove a little deeper into tuning with a focus on Enterprise Manager and the Performance Advisor utilities. We followed the Oracle by Example Tuning Tutorial to diagnose a performance problem with a sql statement. On the 14th we looked at transparent data encryption (TDE) and how to enable and disable table encryption in the cloud.
We talked about the risks of not encrypting by default and tried to draw lessons from Target and how failure to protect credit card data with encryption led to job losses across the company. On the 15th we looked at the backup and restore utilities in the cloud and how they differ from traditional RMAN utilities. You can use RMAN just like you do today and replicate your backup and restore as you do today but there are automation tools that monitor RMAN and kick off alternate processes if the backup fails. There are also tools to help restore for those unfamiliar with RMAN and the depths and details of this powerful package.

Today we are reviewing what we have done in the last week and a half and are looking forward to the next week or two. We are going to finish out the database options. On Monday we are going to dive into multi tenant and talk about pluggable databases. This discussion will probably spill over into Tuesday with an overview happening on Monday and code samples and demos on Tuesday. We will need to use a 12c database since this is a new feature that is specific to 12c only. We might split our code samples into using SQL Developer to clone and manage PDBs on Tuesday and cover the same functionality with Enterprise Manager on Wednesday. Following this discussion we will do a high level discussion on Data Guard and look at the change log and log replication strategies that can be used for physical and logical replication. The following days we will look at code samples and configurations from the command line, Enterprise Manager, and SQL Developer. We will look at what it will take to set up a primary on premise and a standby in the cloud. We will also look at what is required to have both in the cloud and what it takes to flip primary and standby to emulate a failure or maintenance action then flip the two back.

Once we cover Data Guard we will be in a position to talk about Real Application Testing. In essence Data Guard copies all of the writes that happen on the primary and replays them on the standby. Real Application Testing records the reads as well and replays the reads and writes to help measure performance differences between configurations. This is good for code change testing, patch testing, configuration change testing, and other compare/contrast changes to your production system in a safe environment. Once we finish the high availability and data replication options we will dive into the OLAP and Spatial options. OLAP reorganizes the data for data warehouse analysis and Spatial allows you to run geographical select statements like show me all crimes that happened within a mile of this address. Both are optimizations on select statements to help optimize usage of the database in specific instances.

We will wrap up our coverage by looking at Audit Vault and Database Vault. Both of these options are additional levels of security that not only help us protect data but restrict and track access to data. Many financial and healthcare institutions require interfaces like this to show separation of duty as well as traceability to see who accessed what when. Once we finish the High Performance Edition options we will dive into the Extreme Performance Edition options looking at Active Data Guard, In Memory, RAC and RAC One. Going through all of the options will probably take us through the month of June.
We will probably look at the other PaaS options listed in cloud.oracle.com starting some time in July and see how they relate or differ from the DBaaS services that we are currently covering.


PaaS

database option - backup and restore

Backup and recovery abilities are arguably the most critical skills required of a database administrator. Recovery Manager (RMAN) is Oracle’s standard backup and recovery tool; every Oracle DBA should be familiar with utilizing RMAN. Some DBAs use alternate tools since RMAN is an Oracle specific tool to backup data in a database. Alternatives include Veritas Backup Exec, Commvault Simpana, Legato Networker, EMC and NetApp tools, and other packages. I am not going to list books and software packages in this blog. When I did a search on Safari Books for rman I got 9 books published in 2016, 16 in 2015, and 20 in 2014. There are also a ton of blogs so I suggest that you go with your favorite blog and search for rman or backup. There are hundreds of different ways to backup a database and restore the database as well as optimize how much space the database takes up in cold storage. The important things to consider when you look at backup and recovery are

- full or incremental backups
- backing up to disk, tape, or cloud
- replicating data to another site with disk mirroring, Data Guard, or Golden Gate
- hot backup or cold backup along with middleware and file system backup synchronization
- recovery point objective and recovery time objective
- compression, deduplication, and encryption of data at rest
- backup windows and restore windows

It is important to note that when you purchase DBaaS, independent of any edition, you get backup done for you based on the options you select at creation. When you create a database you can opt for no backup, local backup, or full backups. The no backup option can be used for development and test instances. We might just want a sandbox to play in and don't care about keeping this data so that we can restore. If we lose the data for any reason we can recreate it from our production system. When you select local backups you get incremental backups daily at 2am and a full backup Sunday morning at 2am. This gives you a seven day window for recovery so that you can restore data back to your last full backup with daily incrementals. These backups go to the /u03 file system on the instance that you create. Nothing is copied off the instance that you create so a complete failure of the system could result in potential data loss. The full backup option does an incremental to /u03 daily and a full backup Sunday morning at 2am to /u03 as well. Prior to the full backup, the backup file is copied to the Cloud Object Storage Container that you created prior to creating the database. When you create the database you specify the days that you want to retain backups. If, for example, you ask for a 90 day backup you get 13 full backups copied to the object storage. If you have a Java Cloud Service connected to this database, the Java configuration and war files are also copied to this same area. The backup is automatically done for you and can be reconfigured and done manually using the cloud specific commands. Refer to the
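Under the covers the automated service drives RMAN, so it helps to keep the equivalent manual commands in mind. A generic sketch of the weekly full plus daily incremental cycle described above might look like the following RMAN command file; the file name is made up and this is not the exact script the cloud tooling runs.

# weekly_backup.rman - run with: rman target / @weekly_backup.rman
# Keep enough backups to cover the seven day recovery window
CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 7 DAYS;
# Sunday 2am style level 0 (full) backup
BACKUP INCREMENTAL LEVEL 0 DATABASE PLUS ARCHIVELOG;
# daily level 1 incrementals the rest of the week
BACKUP INCREMENTAL LEVEL 1 DATABASE PLUS ARCHIVELOG;
# purge anything that falls outside the retention window
DELETE NOPROMPT OBSOLETE;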


PaaS

database option - diagnostics

Keeping a database tuned is a full time job. Automating some of these tasks helps a DBA support more databases and reduce the time required to generate the data. Automatic report generation is a good way of getting these reports. One of the questions I constantly get asked is what is included with PaaS or DBaaS and what is done for me. With DBaaS, database tuning and diagnostics are not part of the services provided. The DBA still needs to look for processes that are holding locks. The DBA still needs to look for runaway sql statements. The DBA still needs to look for alternate execution plans and sql tuning to make the database run faster. The Diagnostics Pack is a key tool to help with this. The services that are included with DBaaS include database and operating system patching, making sure that the database is restarted if it stops (unless you issue a shutdown command), and automated backups that you can tweak and tune in frequency and amount of data stored. Tools like ADDM, ASH, and AWR are still required by the DBA and can be accessed from the command line through sqlplus, using SQL Developer, or Enterprise Manager.

In 10g, many diagnostic tools like ASH and AWR were embedded into the database. In 11g they were automated to collect the data into a central location. In 12c reports were automated so that DBAs did not need to schedule the jobs and generate reports late at night to look at in the morning. Many of these features started in Enterprise Manager but got migrated into the database. There was some controversy with the 10g release because it did impact performance compared to the 9i release but that seems to have gone away with the 11g and 12c releases. An overall architecture of the performance collection can be seen in the diagram below. The key features of the Diagnostics Pack for the database include

- Active Session History (ASH)
- Automated Workload Repository (AWR)
- Automatic Database Diagnostic Monitor (ADDM)
- Enterprise Manager performance reporting
- SQL Developer performance reporting

More information on all of these topics can be found in a variety of locations. Most of the information in this blog can be found at

- OCA Oracle Database 12c Installation and Administration Exam Guide
- Oracle Database Problem Solving and Troubleshooting Handbook
- Oracle SQL Developer
- Expert Oracle Enterprise Manager 12c
- Oracle Database 12c The Complete Reference
- Diagnostic Pack presentation
- Oracle Tips by Burleson Consulting Blog

Unfortunately, most of the blog posts are centered around 10g and OEM 10g since ADDM and AWR were new with that release.

ASH

ASH statistics are enhanced to provide row-level activity information for each captured SQL statement. This information can be used to identify which part of the SQL execution contributed most significantly to the SQL elapsed time. The ASH views, accessible to DBAs, provide historical information about the active sessions in the database.

AWR

The Automated Workload Repository (AWR) reports provide baseline statistics for the database and show the database performance during specified intervals. A baseline contains performance data from a specific time period that is preserved for comparison with other similar workload periods when performance problems occur. Several types of baselines are available in Oracle Database: fixed baselines, moving window baselines, and baseline templates.

ADDM

DBAs and developers can use ADDM to analyze the database and generate recommendations for performance tuning.
In Oracle 11g, ADDM was enhanced to perform analysis on database clusters at various levels of granularity (such as database cluster, database instance, or specific targets) over any specified time period. You can access ADDM through SQL Developer or Enterprise Manager. To access these functions you must first enable the Diagnostics Pack which allows you access to the reports. You can manually run the ADDM report from the command line with

@?/rdbms/admin/addmrpt.sql

If you look at SQL Developer and go to the DBA navigation link and expand the database for Performance you can see the AWR and ADDM reports. Expanding on these links shows you the various reports. For the ADDM, for example, you can quickly see if there is a recommendation or not and drill down into the recommendation. If we click on one of the findings with a Yes in the recommendation column we can look at the report and the recommendations that it has. For the example we found it had two suggestions for tuning. Below are samples of this report and the two recommendations. We can look at similar information from Enterprise Manager by navigating to the Performance page and selecting the report that we want.

Typical AWR report output usually contains an incredible amount of information about an Oracle database’s application workload behavior. When a database instance is suffering from a gc buffer busy wait event during the time period chosen for the AWR report, however, that event will usually surface as one of the Top 5 Timed Events. With AWR you can create a baseline and look at spot performance or variation from a baseline. Page 21 of the Diagnostic Pack presentation does a good job of describing how to read and understand an AWR report. The blog Bash DBA does a good job of walking through an AWR report and looking for issues and problems in a system.

In summary, we are not going to dive deep into AWR and ADDM diagnostics. Knowing how to do this differentiates a highly paid DBA from a junior DBA. There are classes through Oracle Education - 12c Performance Management and Tuning - and other vendors that teach you how to understand this option, as well as the books we mentioned above and certification exams to help show that you know this material. It is important to note that all of these tools work with platform as a service (Oracle and Amazon RDS) as well as infrastructure as a service and on-premise installations. The key difference is that the diagnostic and tuning tools are bundled with the High Performance and Extreme Performance Editions. For IaaS and on-premise you need to purchase a perpetual license that we discussed a few blogs ago.
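If you want to drive the same data collection from sqlplus rather than the screens, a minimal sketch looks like this; the snapshot ids and the baseline name are placeholders for illustration.

-- Take an AWR snapshot on demand (snapshots are also taken hourly by default)
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;
  -- Preserve a known-good workload period as a fixed baseline for later comparison
  DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(start_snap_id => 100,
                                           end_snap_id   => 101,
                                           baseline_name => 'payroll_ok');
END;
/
-- Generate the AWR and ASH reports interactively
@?/rdbms/admin/awrrpt.sql
@?/rdbms/admin/ashrpt.sql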


PaaS

database options - advanced security

Advanced Security and Transparent Data Encryption (TDE) stops would-be attackers from bypassing the database and reading sensitive information from storage by enforcing data-at-rest encryption in the database layer. Earlier when we talked about partitioning and compression we talked about tablespace files and how to manage them. If these files are stored in clear text, anyone who can access our file system could theoretically read the dbf file and do a clear text search. If, for example, we have a credit card number "1234 5678 9012 3456", we could find this string with a strings, grep, or od statement (octal dump to show text data). If we don't know the credit card number we can just do a dump on the data and see the values stored in the database. With enough trial and error we can figure out what the structure of the database might be and correlate names to credit card information.

By default, all data stored in the cloud has TDE enabled. This is done in the $ORACLE_HOME/network/admin/sqlnet.ora file. If you look at this file, you will see that the ENCRYPTION_WALLET_LOCATION is defined as well as the SQLNET.ENCRYPTION_TYPES_SERVER. When the database is created, a wallet is generated based on your ssh keys allowing you to access your data. In the database that we created we have the wallet file location in /u01/app/oracle/admin/ORCL/tde_wallet. There is a file called cwallet.sso that is enabled anytime someone connects to the database through port 1521 or through the sqlplus command. If we rename this file to something else, we can connect to the database and create clear text files. Note that it is not recommended that you do this but we are doing this to highlight first the differences between IaaS and PaaS as well as the need for data encryption in the cloud. With a database installation on top of compute as is done with EC2, Azure Compute, and Oracle Compute, we have to enable TDE, configure the wallet, and configure the cwallet.sso. With PaaS, this is all done for you and you can manage the keystore and rotate key access.

Note that it is not recommended that you execute these commands. They will desecure your database and allow for clear text storage and transmission of data across the internet. We are doing this as an example.

cat $ORACLE_HOME/network/admin/sqlnet.ora
cd /u01/app/oracle/admin/ORCL/tde_wallet
ls

This should allow you to see the cwallet.sso file which enables automatic wallet connection upon login. If we want to change encryption we can first look at the parameter encrypt_new_tablespaces and see that it is set to CLOUD_ONLY which encrypts everything. We want to change this to DDL which says that we only encrypt if we tell it to. We first want to create a tablespace and a banking user with encryption turned on. This is done with

sqlplus / as sysdba
alter session set container=PDB1;
drop tablespace banking including contents and datafiles;
create tablespace banking datafile '/u02/app/oracle/oradata/ORCL/PDB1/banking.dbf' size 20m;

We then create a banking user as well as a paas_dba user.
One is used to create an encrypted table and the other is used to create a clear text table.

drop user banking;
create user banking identified by "PaasBundle_1" default tablespace banking;
grant create session to banking;
grant unlimited tablespace to banking;
drop user paas_dba;
create user paas_dba identified by "PaasBundle_1";
grant create session to paas_dba;
grant dba to paas_dba;

We now connect to PDB1 as the banking user and create a table in our encrypted tablespace.

connect banking/PaasBundle_1@PDB1
create table banking_customers (first_name varchar(20), last_name varchar(20), ccn varchar(20)) tablespace banking;
insert into banking_customers values('Mike','Anderson','5421-5424-1451-5340');
insert into banking_customers values('Jon','Hewell','5325-8942-5653-0031');
commit;
alter system flush buffer_cache;
select ccn from banking_customers;

This should create a table with two entries. The table will be encrypted and stored in the banking.dbf file. If we do a string search of this file we will not find the credit card number starting with 5421. Now that we have an encrypted table created we need to reconnect to the database and disable encryption on new tables. To do this we change the parameter from CLOUD_ONLY to DDL as sysdba.

sqlplus / as sysdba
show parameter encrypt_new_tablespaces
alter system set encrypt_new_tablespaces=DDL SCOPE=BOTH;

We can now create a new tablespace and the contents are only encrypted if we pass in the encrypt clause with the create statement. The tablespace will not be encrypted by default. We do this operation as paas_dba who has dba rights to create a tablespace and table.

connect paas_dba/PaasBundle_1@PDB1
drop tablespace bankingCLEAR including contents and datafiles;
create tablespace bankingCLEAR datafile '/u02/app/oracle/oradata/ORCL/PDB1/bankingCLEAR.dbf' size 20m;
create table banking_Clearcustomers tablespace bankingCLEAR as select * from banking_customers;
select ccn from banking_Clearcustomers;

We should get back two entries from both select statements and see two credit card numbers. We can then exit sqlplus and look at the banking.dbf and bankingCLEAR.dbf files to see if we can lift credit cards. By executing

strings bankingCLEAR.dbf | grep 5421
strings banking.dbf | grep 5421

we see that we get the credit card number from the bankingCLEAR.dbf file. The data is stored in clear text. It is important to remember that all data should be encrypted in motion and at rest. We need to change the parameter back to CLOUD_ONLY for encrypt_new_tablespaces moving forward. By default we connect to the data using the encryption wallet. We can disable this as well as turning off default encryption.

In summary, we have looked at what it takes to encrypt data at rest in the cloud. By default it is turned on and we don't have to do anything with platform as a service. If we are using infrastructure as a service we need to purchase the Advanced Security option, turn on encrypting tablespaces by changing a parameter, enable the wallet for our login, and install keys for the wallet. Platform as a service provides levels typically above and beyond what most customers have in their data center. The recent credit card loss that happened at Target a few years ago happened because they owned the Advanced Security option for the database but did not turn on the feature. An outside consultant (working on something non-IT related) got access to a shared storage and pulled the dbf files onto a USB drive.
They were able to get thousands of credit cards from the data center, costing the CIO and IT staff their jobs. Today we learned how to turn off TDE in the cloud to illustrate what it takes to turn it on in your data center, and we looked at the simplicity of pulling data from a dbf file if we know the data we are looking for. We could just as easily have looked for number patterns and people's names and correlated the two as valid credit card numbers. I would like to thank the GSE team at Oracle for doing 90% of the work on this blog. I liberally hijacked the demo scripts and most of the content from demo.oracle.com, PaaS - Data Management (Solutions Demo) by Brijesh Karmakar. The GSE team creates demo environments and scripts for specific products. This demo is an Oracle internal demo and can be requested from your pre-sales consultant in your area. I took the demo code from the Database Vault and Encryption_demo_script written on 08-Jun-2016. I have to admit that looking for demo scripts on demo.oracle.com and searching for TDE was very opportune given that I wrote this blog on the 9th and it was published on the 8th.
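If you want to put things back the way they were and confirm encryption is on without renaming wallet files, the commands below are a minimal sketch run as sysdba, assuming a 12c multitenant PaaS instance like the one above; the keystore password "WalletPassword_1" and the backup tag are made-up placeholders.

-- confirm the wallet is open and see which tablespaces are encrypted
select wrl_type, status from v$encryption_wallet;
select tablespace_name, encrypted from dba_tablespaces;
-- put new tablespace encryption back to the cloud default
alter system set encrypt_new_tablespaces=CLOUD_ONLY scope=both;
-- rotate the TDE master key; the password and tag below are placeholders
administer key management set key identified by "WalletPassword_1" with backup using 'rotate_after_demo';

Rotating the master key periodically is the sort of keystore management that PaaS leaves in your hands even though the initial wallet setup is done for you.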

Advanced Security and Transparent Data Encryption (TDE) stops would-be attackers from bypassing the database and reading sensitive information from storage by enforcing data-at-rest encryption in the...

PaaS

database option - tuning part 2

In our last entry we looked at the results of a sql tuning advisor. We used SQL Developer to execute our code and create a tuning advisory for the code that we executed. We could have gone through Enterprise Manager and done the same thing, but against historic data rather than live data. In this blog entry we will analyze the same results using the Enterprise Manager Express that comes with database as a service in the Oracle Cloud. To connect to this service we need to first open up the network ports to enable connection to port 1158. This is done through the database console, or we could tunnel port 1158 over ssh to our database target ip address. Once we have access to port 1158 we can connect to Enterprise Manager Express by going to the ip address of our server, in this instance 129.152.134.189 which we got from the database console, and connecting to https://129.152.134.189:1158/em. Note that we might get a security exception since the certificate is self-signed. We need to add an exception and connect to this service. When we are prompted for a login we connect as sys with sysdba rights. Note that we can not do this on Amazon RDS since we do not have sys access to the database in that service.

When we click on the Performance link at the top of the screen we can look at the overall performance of our system and drill down into reports to get more information. If we scroll down to the bottom we see a link called Advisor Central. We can follow this link and look at all of the tuning advisors that have been run and examine the results. We can select a previously run tuning advisor and look at the recommendations that we generated from SQL Developer. When we dive into the report we get a slightly different style of report. Note that the SQL profile recommends a 19.8% savings if we change the profile. If we click on the Recommendations and expand the information as a table rather than a graph, we see that the pk_dept reference takes up a good amount of time, and since it is not referenced, getting rid of it will speed up the select statements. If we click on the compare explain plans link we can see how much of a speedup we will get if we implement the new plan. What I don't see from this is the recommendation to drop the dept d reference that we got from SQL Developer. Note that the recommendation does state "Consider removing the disconnected table or view from this statement or add a join condition which refers to it" but does not specifically recommend removing dept d from the select statement as is done in SQL Developer. If we wanted to expand upon use of the tuning advisor we could follow along with the Oracle by Example Tuning Tutorial and look at how to initiate tuning advisories through Enterprise Manager. The first thing done in this tutorial is to initiate nightly tuning tasks by going into the server and enabling the Automated Maintenance Tasks. You first click the Configure button, then click the Configure Automatic SQL Tuning button. After you change the Automatic Configuration of SQL Profiles to yes and click Apply, you have a checkbox window to select the dates to run the tuning tasks.
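The same automatic SQL tuning setup can be scripted instead of clicked through the console. Below is a minimal sketch run as sys; passing NULL for the window name simply means all maintenance windows, and the parameter values mirror the tutorial defaults rather than anything mandated by the product.

-- enable the Automatic SQL Tuning client in all maintenance windows
exec dbms_auto_task_admin.enable(client_name => 'sql tuning advisor', operation => NULL, window_name => NULL);
-- let the task automatically accept SQL profiles, matching the EM checkbox
exec dbms_sqltune.set_tuning_task_parameter('SYS_AUTO_SQL_TUNING_TASK', 'ACCEPT_SQL_PROFILES', 'TRUE');
-- verify what is enabled
select client_name, status from dba_autotask_client;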
Once we have this defined we can execute the following code as sys

set echo on
drop user ast cascade;
create user ast identified by ast;
grant dba to ast;
alter system flush shared_pool;
--
-- Turn off AUTOTASK
--
alter system set "_enable_automatic_maintenance"=0;
--
-- Clear out old executions of auto-sqltune
--
exec dbms_sqltune.reset_tuning_task('SYS_AUTO_SQL_TUNING_TASK');
--
-- Drop any profiles on AST queries
--
declare
  cursor prof_names is
    select name from dba_sql_profiles where sql_text like '%AST%';
begin
  for prof_rec in prof_names loop
    dbms_sqltune.drop_sql_profile(prof_rec.name);
  end loop;
end;
/

This creates a user ast. I recommend changing the password to something more complex. We can then run a series of malformed select statements to generate some synthetic load to report upon and correct. Note that we do this as the ast user and not sys.

set echo off
select /*+ USE_NL(s c) FULL(s) FULL(c) AST */ c.cust_id, sum(s.quantity_sold)
  from sh.sales s, sh.customers c
 where s.cust_id = c.cust_id
   and c.cust_id < 2
 group by c.cust_id;
The tutorial script repeats this same hinted select statement nearly one hundred times so that the Automatic SQL Tuning task has enough of a workload to analyze; the statement is shown once above rather than listing every repetition. Once we have the workload created we can kick off the sql tuning advisor with some different code.
set echo on
exec dbms_workload_repository.create_snapshot;
variable window varchar2(20);
begin
  select upper(to_char(sysdate,'fmday'))||'_WINDOW' into :window from dual;
end;
/
print window;
--
-- Open the corresponding maintenance window, but with other clients disabled
--
alter system set "_enable_automatic_maintenance"=1
/
exec dbms_auto_task_admin.disable('auto optimizer stats collection', null, :window);
exec dbms_auto_task_admin.disable('auto space advisor', null, :window);
exec dbms_scheduler.open_window(:window, null, true);
--
-- Close the maintenance window when sqltune is done
--
exec dbms_lock.sleep(60);
declare
  running number;
begin
  loop
    select count(*) into running
      from dba_advisor_executions
     where task_name = 'SYS_AUTO_SQL_TUNING_TASK'
       and status = 'EXECUTING';
    if (running = 0) then
      exit;
    end if;
    dbms_lock.sleep(60);
  end loop;
  dbms_scheduler.close_window(:window);
end;
/
alter system set "_enable_automatic_maintenance"=0
/
--
-- Re-enable the other clients so they look like they are enabled in EM.
-- They will still be disabled because we have set the underscore parameter.
--
exec dbms_auto_task_admin.enable('auto optimizer stats collection', null, :window);
exec dbms_auto_task_admin.enable('auto space advisor', null, :window);

Note that this executes with some errors but still generates a good sql tuning advisor report. If we look back at Advisor Central we can dive into the report and see what happened. We can get an 8 second speedup by reformatting the sql select statements. This might or might not be worth tuning based on how many times we execute the code.

In summary, we have alternate ways of looking at sql tuning as well as a way of looking at historic data. We turned on automatic tuning reports, which does consume more resources, but if we have extra cpu cycles we can benefit from the reports. The Enterprise Manager Express that comes with database as a service is a very powerful tool. It is not a centralized utility like a central Enterprise Manager, but it can be used to automate and record reports for a single database. This service is only installed with platform as a service and must be manually added and configured if you install your own database on top of infrastructure as a service. Having this common management interface is a huge benefit to DBAs who are asked to manage and maintain instances in the cloud. The Enterprise Manager used in the cloud is the exact same version that is used for an on-premise system. If you choose to install and configure a central Enterprise Manager server you can attach to instances in your data center as well as instances in the cloud. The only requirement is that you have file level access and sys/root access to install the agent.
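Rather than reading the results in Advisor Central, you can also pull the same automatic tuning report straight from sqlplus. This is a small sketch run as sys; the formatting settings are only there to keep the returned CLOB readable.

set long 1000000 longchunksize 1000 pagesize 0 linesize 200
-- text report of the most recent SYS_AUTO_SQL_TUNING_TASK execution
select dbms_sqltune.report_auto_tuning_task(type => 'TEXT', level => 'TYPICAL', section => 'ALL')
  from dual;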

In our last entry we looked at the results of a sql tuning advisor. We used SQL Developer to execute our code and create a tuning advisory for the code that we executed. We could have gone...

PaaS

database option - tuning

Today we are going to look at using the diagnostics and tuning packs that come with the High Performance and Extreme Performance Editions of the database. We do not get these options with the Standard Edition or Enterprise Edition, and if we want to use the tuning option on Amazon RDS, EC2, Oracle IaaS, or Microsoft Azure Compute we must bring along a license for the options. Diagnostics are licensed at $150 per named user or $7,500 per processor. This correlates to $294 per processor per month. Options like the SQL Tuning Advisor and Automatic SQL Tuning are part of the Tuning pack option. Tuning pack is $100 per named user or $5,000 per processor. This comes in at $196 per processor per month if we use the four year amortization that we talked about last week. There are three ways to look at the SQL Tuning Advisor. We can use Enterprise Manager in a central site and analyze historic data from days, weeks, and months back. Unfortunately, we can not use this in conjunction with Amazon RDS. We can use Enterprise Manager Express, which is part of the database and gives you three hours of history of database performance. Again, we can not use this in conjunction with Amazon RDS. These features are disabled and turned off as part of the Amazon installation. We can use SQL Developer to connect to the database on all platforms. This allows us to pull down real time diagnostics and look at live database performance. We will go through an Oracle by Example SQL Tuning Advisor Tutorial that details how to enable and use the tuning advisor packs. The database version that we will be using is the 11g version of the database. These same steps should work with 12c because the features have not changed and SQL Developer knows what to do between the two versions of the database and presents a common user interface to do SQL Tuning.

The first step that we have to do is find out the ip address of our 11g database. We do this by going to the database console and looking at our instance detail. We then create a connection to the database with SQL Developer. This is done first as the sys user as sysdba connecting to the ORCL instance at the ip address of the database. We can verify that we are connected to a High Performance Edition by issuing a select statement against the v$version view.

select * from v$version;

Before we can execute step 8 in the Tuning Advisor Tutorial we must enable the user scott and set a password for the account. To do this we expand the Other Users selection at the bottom left of the screen, find the user scott, and enable the account while setting the password. We can now connect to the 11g instance and give user scott permission to attach to the sql resources with the commands

grant advisor to scott;
grant administer sql tuning set to scott;

We then clear the existing statistics to make sure we are not looking at old artifacts but at what we are going to execute. This is done by executing

exec DBMS_STATS.DELETE_SCHEMA_STATS ('scott');

At this point we switch over to the user scott and execute a select statement

select sum(e.sal), avg(e.sal), count(1), e.deptno from dept d, emp e group by e.deptno order by e.deptno;

We can launch the SQL Tuning Advisor from the icon at the top of the screen. This opens a new tab next to the resulting output from the select statement. The output from the tuning advisor has four parts. We can look at the statistics that were gathered, look at suggested indexes, the sql profile, and restructuring statement recommendations.
The index output did not say anything but the other three had recommendations. The restructuring statement suggests that we remove the dept d definition since we really are not using it in the select statement. We then execute the following modified command

select sum(e.sal), avg(e.sal), count(1), e.deptno from emp e group by e.deptno order by e.deptno;

When we rerun the command without the dept d in the select statement we get a clean output from the SQL Advisor.

In summary, we can use Enterprise Manager, Enterprise Manager Express, or SQL Developer to run the tuning advisor. We walked through a simple example of how to do this with SQL Developer on a single select statement. We walked through SQL Developer because it works on all cloud platforms and the Enterprise Manager solutions do not work well with Amazon RDS. With these tools we can dive into SQL performance issues, tune the database, and optimize the cloud system to utilize fewer resources and cost us less money. If we can reduce the processor count by a couple of processors, that more than pays for the incremental cost of the High Performance Edition over the Enterprise Edition.
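If you are on a platform where neither Enterprise Manager flavor is available, the same advisor can also be driven from sqlplus with the DBMS_SQLTUNE package. The sketch below is hedged: the task name scott_tune_1 and the 60 second time limit are arbitrary choices, and the statement is the intentionally sloppy query from the tutorial.

declare
  l_task varchar2(64);
begin
  -- create and run a tuning task for the problem query
  l_task := dbms_sqltune.create_tuning_task(
              sql_text   => 'select sum(e.sal), avg(e.sal), count(1), e.deptno from dept d, emp e group by e.deptno order by e.deptno',
              user_name  => 'SCOTT',
              scope      => dbms_sqltune.scope_comprehensive,
              time_limit => 60,
              task_name  => 'scott_tune_1');
  dbms_sqltune.execute_tuning_task(task_name => 'scott_tune_1');
end;
/
-- read the findings and recommendations
set long 1000000 pagesize 0
select dbms_sqltune.report_tuning_task('scott_tune_1') from dual;

The report produced here contains the same statistics, index, profile, and restructuring sections that SQL Developer renders graphically.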

Today we are going to look at using the diagnostics and tuning package that comes with the High Performance and Extreme Performance Editions of the database. We do not get these options with the...

PaaS

database option - compression part 2

Yesterday we looked at the different compression options that are available for a database. Today we are going to walk through an example. The example comes from Oracle by Example - Compression. This is a hands on tutorial that has you execute code in an 11g database. Note that you must create this database as a High Performance or Extreme Performance database. If you create a Standard Edition or Enterprise Edition the execution will fail with an option not available error, as we saw with partitioning a couple of days ago. To start, we create an 11g database in the Oracle Public Cloud. We create the instance, wait an hour or so, change the network configuration to open port 1521 to the public, and connect using sys as sysdba to the instance. We are going to use SQL Developer in our Windows 2012 instance to make the connection. To get the connection information, we go to the database console and get the ip address of the instance. We then go to our SQL Developer tool and add this database connection. We can use ssh tunneling or open port 1521 to the world to make the connection. The first step that we are told to do is to execute the setup.sql file available via the tutorial. We are not going to execute this program but will do everything by hand through SQL Developer. The purpose of this script is to enable the user sh, set a password, and grant privileges to the user. We can do this from SQL Developer. The code that it recommends using is

connect / as sysdba
set echo on
alter user sh identified by sh account unlock;
grant create tablespace to sh;
grant drop tablespace to sh;

First, we don't want to use such a simple password. We change this and set it to something a little more secure. We select the database instance, in our example it is prs11gHP, where we are connected as the sys user. We select Other Users..., the user sh, and edit the entry. When the screen comes up to edit the user, we enable the account, set the password, grant create tablespace and drop tablespace rights to the user, and apply. This effectively executes the script shown above.

At this point, we have a user that can create and drop tables. We now want to load the create_sales_tbls.sql code from the tutorial. The create script first drops the existing tables. This might generate an error because the table does not exist. This error is not significant and won't stop everything from executing. We then create a non-compressed and a compressed table by selecting from the demo sales table that exists if you installed the demo schema during your install.

drop table sales_nocompress purge
/
drop table sales_compress purge
/
set echo on
set timing on
create table sales_nocompress
as select * from sales
/
create table sales_compress compress for all operations
as select * from sales where 1=0
/
select count(*)
from sales_compress
/

Note that the two create statements create tables with the same structure. What we see is that the creation of the first table takes just over 4 seconds because we pull in the sales table information. The second creation does not take as long because the data is in cache and the where clause fails for all rows. When we do the select, the count should be zero based on the where clause. We then do an insert into the compressed table to create a table of the same size. This is done by executing

@oltp_insert
set timing off
select count(*) from sales_compress
/
select count(*) from sales_nocompress
/

This executes the oltp_insert.sql code then compares the counts of the two tables to make sure they contain the same number of records.
The code that is executed in the insert script is

set timing on
declare
  commit_after integer := 0;
  loop_variable integer;
  cursor c_sales is
    select prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold
      from sales;
begin
  for r_sales in c_sales loop
    if commit_after = 0 then
      loop_variable := 0;
      commit_after := round(dbms_random.value(1,1));
    end if;
    insert into sales_compress
      (prod_id, cust_id, time_id, channel_id, promo_id, quantity_sold, amount_sold)
    values
      (r_sales.prod_id, r_sales.cust_id, r_sales.time_id, r_sales.channel_id,
       r_sales.promo_id, r_sales.quantity_sold, r_sales.amount_sold);
    if loop_variable = commit_after then
      commit;
      commit_after := 0;
    end if;
    loop_variable := loop_variable + 1;
  end loop;
end;
/

We are not going to go through this code, but it does load the same number of entries as the uncompressed table. The values that are inserted are pulled from the sales table and inserted into the compressed table. Note that the table uses OLTP compression because we created it with the compress for all operations clause. We can execute the examine_storage.sql script to see that the compressed table takes up about half the storage of the uncompressed table. We can also see that the table is enabled for OLTP compression by looking at the parameters of the table with a select statement. We can also look at the select time differences by reading all of the data from the compressed and uncompressed tables. Note that the compressed table takes about 3/4 of the time that the uncompressed table takes.

In summary, we were able to create an 11g database, create a compressed and a non-compressed table, and look at the relative size and timing when retrieving data from the tables. We can experiment with this data and grow the table size to see if we still get the same improvements as the table gets larger. We can try different compression algorithms to see if they affect performance or compression ratios. We have done all of this in a database as a service public cloud instance. The only tools that we needed were a SQL Developer connection and an Oracle Cloud account. We could have done this with Amazon RDS as well as EC2 and Microsoft Azure Compute. The key difference is that this experiment took about two hours to execute and we only consumed about $15 to learn and play with compression on 11g (or 12c) given that a low memory option for the database is only $6.720 per OCPU per hour. With the pay as you go option we burn less than $15 and turn off the service. We could have uploaded our own data sets into the database instance and played with the compression advisor in a sandbox without affecting our production environment. If we were using database backup as a service we could have restored a single table from our backup and played with the compression variations and compression advisor.
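If you do not have the examine_storage.sql script handy, a couple of dictionary queries give the same comparison. This is a small sketch run as the sh user; the sizes you see will depend on your sample data.

-- compare the space used by the compressed and uncompressed copies
select segment_name, round(bytes/1024/1024,1) as size_mb
  from user_segments
 where segment_name in ('SALES_COMPRESS','SALES_NOCOMPRESS');
-- confirm which table has compression enabled and what kind
select table_name, compression, compress_for
  from user_tables
 where table_name in ('SALES_COMPRESS','SALES_NOCOMPRESS');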

Yesterday we looked at the different compression options that are available for a database. Today we are going to walk through an example. The example comes from Oracle by Example - Compression. This...

PaaS

database options - compression

A natural follow on to database partitioning is database compression. With partitioning we wanted to split everything into buckets based on how frequently it is used and minimize the more used stuff so that it would fit into memory. The older stuff that we don't access that frequently can be put on slower and lower cost storage. In this blog we are going to look at different techniques to use the cheaper storage even more. Since we don't access this data very frequently, and most of the time when we access it we only need to read it and not write to it, we should be able to take advantage of common data and compress the information to consume less storage. If, for example, we are storing census data and we want to store city and state information, we can take advantage of not having Punxsutawney, Pennsylvania stored 5900 times based on the current population. If we stored a copy of this roughly 6000 times it would take up 6000 times 12 bytes for the city and 6000 times 12 bytes for the state. We would also store 15767 as the zip code roughly 6000 times, consuming 6000 times 9 bytes. If we could create a secondary table that contains Punxsutawney, Pennsylvania 15767 and correlate it to the hexadecimal value 2e, we could store 2e for the city, state, and zip code, thus consuming one byte each rather than 12, 12, and 9 bytes. We effectively save 180,000 bytes by doing a replacement value rather than storing the long strings multiple times. This is effectively the way that hybrid columnar compression works.

Compression can be done at a variety of levels and locations. Disk vendors for years have touted compression in place on storage to consume less space. Compression has been used in a variety of industries. Audio compression, for example, takes recorded audio, undersamples the changes in volume and pitch, and records only 8,000 samples per second since the ear can not really hear changes faster than that. These changes are then compressed and stored in an mp3 or avi format. Programs know how to take the mp3 format, rebuild the 8k sample, and drive a speaker to estimate the sound that was originally created. Some people can hear the differences and still want to listen to music recorded on reel to reel tape or vinyl because the fidelity is better than CD-ROM or DVD. Videos do the same thing by compressing a large number of bits on a screen and breaking it into squares on the screen. Only the squares that are changing are transmitted rather than sending all of the data across the whole screen, and the blocks that did not change are redisplayed rather than being retransmitted thirty times a second. This allows for video distribution of movies and video recordings across the internet and storage on a DVD rather than recording all of the data all of the time. Generically compressing data for a database can be complex; if done properly it works well. It can also be done very poorly and cause performance problems and issues when reading back the data. Let's take the census data that we talked about earlier. If we store the data as bytes it will consume 198K of space on the disk. If we use the compression ratio that we talked about we will consume roughly 20K of data. This gives us a 10x compression ratio and saves us a significant amount of space on the disk.
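To make the replacement-value idea concrete, here is a rough sketch in SQL with made-up table and column names; real hybrid columnar compression does this inside the storage format rather than in your schema, but the space math works the same way.

-- one row holds the ~33 bytes of repeated text
create table city_lookup (
  city_code number(3) primary key,   -- small surrogate, playing the role of the 2e in the example
  city      varchar2(32),
  state     varchar2(16),
  zip_code  varchar2(16)
);
insert into city_lookup values (46, 'Punxsutawney', 'Pennsylvania', '15767');

-- each of the ~6000 census rows now stores a tiny code instead of city, state, and zip
create table census_person (
  person_id number(8),
  last_name varchar2(32),
  city_code number(3) references city_lookup
);

-- the original wide view of the data is still available through a join
select p.last_name, l.city, l.state, l.zip_code
  from census_person p
  join city_lookup l on l.city_code = p.city_code;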
If the disk sub-system does this compression for us, we write 198K of data to the disk and it consumes 20K of storage on the spindles, but when we read it back it has to be rehydrated and we transfer 198K back to the processor and consume 198K of memory to hold the rehydrated data. If the database knew the compression algorithm and compressed the data initially in memory, it would only transmit 20K to the disk, store 20K on the spindles, read 20K back from the disk, and consume 20K of memory to hold the data. This might not seem significant, but if we are reading the data across a 2.5 G/second SCSI connection it takes 80ms to read the data rather than 8ms. This 72ms difference can be significant if we have to repeat this a few thousand times. It can also be significant if we have a 1 GigE network connection rather than a direct attached disk. The transfer time jumps to 200ms by moving the data from an attached disk to an nfs or smb mounted disk. We see performance problems like this with database backups to third party storage solutions like Data Domain. If you take a database backup and copy it to a Data Domain solution, you get the 10x compression and the backup takes roughly an hour. You have to estimate that it will take seven to eight times as long to rehydrate the data, so a restore will take 7-8 hours to recover your database. The recommended solution is to use compression inside the database rather than third party compression solutions that are designed to compress backups, home directories, and email attachments.

Oracle offers the Advanced Compression option for information stored in the database. If you look at the 12c Advanced Compression Data Sheet you will notice that there are a variety of options available for compression. You can compress
using historic access patterns (heat map and ADO options)
using row compression (by analyzing update and insert operations as they occur)
file compression (duplicate file links and file compression of LOBs, BLOBs, and CLOBs)
backup data compression
Data Guard compression of redo logs before transmission
index compression
network transmission compression of results to client systems
hybrid columnar compression (Exadata and ZFS only)
storage snapshot optimization (ZFS only)

Heat Map Compression

At the segment level, Heat Map tracks the timestamps of the most recent modification and query of each table and partition in the database. At the block level, Heat Map tracks the most recent modification timestamp. These timestamps are used by Automatic Data Optimization to define compression and storage policies which will be automatically maintained throughout the lifecycle of the data. Heat Map skips internal operations done for system tasks -- automatically excluding Stats Gathering, DDLs, Table Redefinitions and similar operations. In addition, Heat Map can be disabled at the session level, allowing DBAs to exclude manual maintenance, avoiding pollution of Heat Map data. With the data collected by Heat Map, Oracle Database can automatically compress each partition of a table independently based on Heat Map data, implementing compression tiering. This compression tiering can use all forms of Oracle table compression, including Advanced Row Compression and all levels of Hybrid Columnar Compression (HCC) if the underlying storage supports HCC.
Oracle Database can also compress individual database blocks with Advanced Row Compression based on Heat Map data.

Row Compression

A segment-level ADO policy is created to automatically compress the entire table after there have been no modifications for at least 30 days, using Advanced Row Compression:

ALTER TABLE employee ILM ADD POLICY ROW STORE COMPRESS ADVANCED SEGMENT AFTER 30 DAYS OF NO MODIFICATION;

In this next example, a row-level ADO policy is created to automatically compress blocks in the table, after no rows in the block have been modified for at least 3 days, using Advanced Row Compression:

ALTER TABLE employee ILM ADD POLICY ROW STORE COMPRESS ADVANCED ROW AFTER 3 DAYS OF NO MODIFICATION;

In addition to Smart Compression, other ADO policy actions can include data movement to other storage tiers, including lower cost storage tiers or storage tiers with other compression capabilities such as Hybrid Columnar Compression (HCC). HCC requires the use of Oracle Storage -- Exadata, Pillar Axiom or Sun ZFS Storage Appliance (ZFSSA). In this example, a tablespace-level ADO policy automatically moves the table to a different tablespace when the tablespace currently containing the object meets a pre-defined tablespace fullness threshold:

ALTER TABLE employee ILM ADD POLICY tier to ilmtbs;

Another option when moving a segment to another tablespace is to set the target tablespace to READ ONLY after the object is moved. This is useful for historical data during database backups, since subsequent full database backups will skip READ ONLY tablespaces. Advanced Row Compression uses a unique compression algorithm specifically designed to work with OLTP applications. The algorithm works by eliminating duplicate values within a database block, even across multiple columns. Compressed blocks contain a structure called a symbol table that maintains compression metadata. When a block is compressed, duplicate values are eliminated by first adding a single copy of the duplicate value to the symbol table. Each duplicate value is then replaced by a short reference to the appropriate entry in the symbol table.

File Compression

Consider an email application where 10 users receive an email with the same 1MB attachment. Without Advanced LOB Deduplication, the system would store one copy of the file for each of the 10 users -- requiring 10MB of storage. If the email application in our example uses Advanced LOB Deduplication, it will store the 1MB attachment just once. That's a 90% savings in storage requirements. In addition to the storage savings, Advanced LOB Deduplication also increases application performance. Specifically, write and copy operations are much more efficient since only references to the SecureFiles data are written. Further, read operations may improve if duplicate SecureFiles data already exists in the buffer cache.

Backup data compression

RMAN makes a block-by-block backup of the database data, also known as a "physical" backup, which can be used to perform database, tablespace or block level recovery. Data Pump is used to perform a "logical" backup by offloading data from one or more tables into a flat file. Due to RMAN's tight integration with Oracle Database, backup data is compressed before it is written to disk or tape and doesn't need to be uncompressed before recovery -- providing an enormous reduction in storage costs and a potentially large reduction in backup and restore times. There are three levels of RMAN Compression: LOW, MEDIUM, and HIGH.
The amount of storage savings increases from LOW to HIGH, while potentially consuming more CPU resources. Data Pump compression is an inline operation, so the reduced dump file size means a significant savings in disk space. Unlike operating system or file system compression utilities, Data Pump compression is fully inline on the import side as well, so there is no need to decompress a dump file before importing it. The compressed dump file sets are automatically decompressed during import without any additional steps by the Database Administrator.

Data Guard redo log compression

Data Guard Redo Transport Services are used to transfer this redo data to the standby site(s). With Advanced Compression, redo data may be transmitted in a compressed format to reduce network bandwidth consumption and in some cases reduce transmission time of redo data. Redo data can be transmitted in a compressed format when the Oracle Data Guard configuration uses either synchronous redo transport (SYNC) or asynchronous redo transport (ASYNC).

Index Compression

Advanced Index Compression is a new form of index block compression. Creating an index using Advanced Index Compression reduces the size of all supported unique and non-unique indexes -- while still providing efficient access to the indexes. Advanced Index Compression works well on all supported indexes, including those indexes that are not good candidates (indexes with no duplicate values, or few duplicate values, for a given number of leading columns of the index) for the existing index Prefix Compression feature.

Network Compression

Advanced Network Compression, also referred to as SQL Network Data Compression, can be used to compress the network data to be transmitted at the sending side and then uncompress it at the receiving side to reduce the network traffic. Advanced Network Compression reduces the size of the session data unit (SDU) transmitted over a data connection. Reducing the size of data reduces the time required to transmit the SDU. Advanced Network Compression not only makes SQL query responses faster but also saves bandwidth. On narrow bandwidth connections with a faster CPU, it could significantly improve performance. The compression is transparent to client applications.

We won't cover the last two options since they don't apply to database services in the cloud unless you purchase the Exadata as a Service option. There is a Compression Estimation Tool to help you estimate the benefits of compression. A sample of this looking at 100 TB of database data shows a significant cost savings in the millions of dollars. There is also a Compression Advisor that can be downloaded and installed in your database to look at your tables and estimate how much storage you can save based on your data and your usage patterns. You can watch a four minute marketing video on the tool and how to use it. I recommend Tyler Mouth's blog entry on customizing the output of the compression advisor to be a little more user friendly. I would also look at Mike Haas's blog on compression and the DBAORA blog that provides a good overview of 11g compression. Mike Messin's blog is a good blog on installing and executing the compression advisor.

In summary, compression can be used with a variety of mechanisms based on your usage patterns and objectives. This option is not one size fits all and requires a DBA with knowledge of the usage patterns and familiarity with the data and applications.
Letting a non-DBA decide on the compression mechanism can lead to poor performance, missed recovery time objectives, increased network traffic, and higher processor utilization than necessary. The Database 12c Compression Documentation details how to create tables that are compressed, how to see whether tables are compressed, and how to update tables for compression. Compression is a mechanism that can directly reduce your storage costs by consuming significantly less storage to store the same data. In the cloud this correlates directly to storage cost savings. You get compression as an option with the High Performance Edition and Extreme Performance Edition but not with the Standard Edition or Enterprise Edition versions of the database.
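As a starting point for experimenting, the sketch below creates an advanced-row-compressed table and attaches a heat-map-driven ADO policy along the lines of the data sheet examples above. It assumes a 12c High Performance or Extreme Performance instance; the orders_history table and the 90 day window are made up for illustration.

-- Heat Map must be on before ADO policies will act on anything
alter system set heat_map = on scope=both;

-- create the table already using Advanced Row Compression
create table orders_history (
  order_id   number(8),
  order_date date,
  amount     number(10,2)
) row store compress advanced;

-- compress the whole segment after 90 days with no modifications
alter table orders_history ilm add policy
  row store compress advanced segment after 90 days of no modification;

-- confirm the policy was registered
select policy_name, enabled from user_ilmpolicies;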

A natural follow on to database partitioning is database compression. With partitioning we wanted to split everything into buckets based on how frequently it is used and minimize the more used stuff...

PaaS

database option - partitioning part 2

Yesterday we looked at partitioning. Today we are going to continue this evaluation but actually execute code rather than talk in abstracts. If we want to create a partition, this is easily done by appending partitioning clauses to a table create. It is important to remember that this option costs money when done on-premise and is typically done either to improve performance by having a smaller table to bring into memory or to split storage so that higher speed disk can be assigned to more relevant data and lower speed, lower cost disk can be assigned to data we typically don't need to read regularly. If we are looking at using partitioning in the cloud, tiering storage is not an option. We get one disk, one type of disk, and can't assign higher speed storage to a disk partition with PaaS or DBaaS. We pay $50/TB/month to attach a disk to a compute engine and that stores our data. Tables are stored in either the USERS tablespace or the SYSTEM tablespace by default, based on who creates them. To quickly review, we have tables that contain our data. This data is stored in a tablespace. The tablespace might contain multiple tables or parts of tables if partitioning is used. We can assign tablespaces to different directories and typically do with on-premise systems. This allows us to put data that we need fast access to in flash memory and historic data that we might read once a year in lower cost network storage, and not have to back up the historic data on a daily basis. With DBaaS we get a /u02 directory that contains the oradata folder. All tablespaces are created in this area by default. Theoretically we could mount an nfs file share if we ran the storage cloud appliance on a compute instance and pay $30/TB/month for this storage. We would have to install the nfs client on our database instance, install OSCSA on a compute instance and share the nfs directory, create a cloud storage container to hold our historic tablespaces, and point our historic partitions to our nfs mounted directories. We are not going to do this in this blog but it is an interesting thought on how to reduce the cost of storage as well as expand the amount of data that you can support with a DBaaS instance.

Let's create a few tablespaces and a partitioned table to see how it works. Most of these examples are liberally hijacked from other blogs and tutorials on the internet.
Oracle Expert Blog on partitioning
All things Oracle blog on partitioning
Oracle Tips blog on interval partitioning
We need to note that the DBaaS that we provisioned needs to be High Performance Edition or Extreme Performance Edition. This option does not work with Standard Edition or Enterprise Edition and will fail when you try to create the table. We begin by creating a few tablespaces as well as a partitioned table that stores data into these tablespaces. It is important to note that we can easily do this because consuming storage only happens when we insert data, not when we create a table. We can play with creation all we want at very little cost. First, let's look at our layout using SQL Developer. If we connect to our database as the sys user we can see that by default we have the following tablespaces defined in our PDB1 pluggable container. The same is true for an 11g instance or container database. We are going to look at the pluggable database because it is easy to make sure that what we are creating is for this instance and not someone else playing with the system.
If we add our database instance to the DBA view in SQL Developer we notice that Tablespaces appears as one of the line entries under our database. We can click on this and look at the tablespaces and files associated with them in our instance. To see the file allocation and which file system each tablespace is allocated in, we need to scroll across the screen to see the information on the right. We are going to create a few tablespaces, then create a table and allocate partitions into these tablespaces. Note that these commands might not work on Amazon RDS because you need system level access to the database to create a tablespace and assign the file name. If we let the system use the default oradata area the create works fine. If we want to create the tablespace in /nfs/historic_oradata then the create will fail and is not allowed with RDS. Let's look at a simple example

CREATE TABLESPACE T1;
CREATE TABLESPACE T2;
CREATE TABLESPACE T3;
CREATE TABLESPACE T4;

CREATE TABLE credential_evaluations
( eval_id VARCHAR2(16) primary key,
  grad_id VARCHAR2(12),
  grad_date DATE,
  degree_granted VARCHAR2(12),
  degree_major VARCHAR2(64),
  school_id VARCHAR2(32),
  final_gpa NUMBER(4,2))
PARTITION BY RANGE (grad_date)
( PARTITION grad_date_70s
    VALUES LESS THAN (TO_DATE('01-JAN-1980','DD-MON-YYYY')) TABLESPACE T1,
  PARTITION grad_date_80s
    VALUES LESS THAN (TO_DATE('01-JAN-1990','DD-MON-YYYY')) TABLESPACE T2,
  PARTITION grad_date_90s
    VALUES LESS THAN (TO_DATE('01-JAN-2000','DD-MON-YYYY')) TABLESPACE T3,
  PARTITION grad_date_00s
    VALUES LESS THAN (TO_DATE('01-JAN-2010','DD-MON-YYYY')) TABLESPACE T4 )
ENABLE ROW MOVEMENT;

The create tablespace T1 is needed prior to creating the partition that stores data in tablespace T1, or the create table command will fail. We have to have the tablespace created before we allocate a partition into it. After we create the tablespaces, we can look at the tablespace allocation with SQL Developer by going to the DBA view and looking at PDB1, Tablespaces. Note that the file /u02/app/oracle/oradata/ORCL/339C06AF452F1EB6E0531635C40AD41B/datafile/o1_mf_t1_co5fjnr3_.dbf was created for us. If we change our tablespace create commands to

CREATE TABLESPACE T1 datafile '/u02/app/oracle/oradata/ORCL/PDB1/t1.dbf' size 2G;
CREATE TABLESPACE T2 datafile '/u02/app/oracle/oradata/ORCL/PDB1/t2.dbf' size 2G;
CREATE TABLESPACE T3 datafile '/u02/app/oracle/oradata/ORCL/PDB1/t3.dbf' size 2G;
CREATE TABLESPACE T4 datafile '/u02/app/oracle/oradata/ORCL/PDB1/t4.dbf' size 2G;

we drop the files into the directory that we want and have control over the file name and location. It is important to note that this will fail on Amazon RDS because we do not have access to the file system and can't specify the filename or location. When we execute this command it takes significantly longer than our first execution because the system creates a 2 GB file before creating our tablespace and table. We would typically want to add other options like how to grow our partitions, limits on the size, and other dynamic commands. We are primarily concerned with where the file is created and not post maintenance at this point. We need to make sure that we are running on High Performance Edition or Extreme Performance Edition because Standard Edition and Enterprise Edition fail during the create table command.

In summary, we looked a little deeper at partitioning by looking at the create tablespace command and where it creates the files in the file system.
We also looked at how we can control the file naming and location with the create statement options. We briefly touched on two of the advantages that partitioning brings, speed and cost, and talked about how to reduce cost by using an nfs share to store more data than a typical DBaaS provides as well as using $30/TB/month storage rather than $50/TB/month storage in the cloud. Hopefully this code example will allow you to play with partitioning and speed up select statements using the High Performance Edition of DBaaS.
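The post above notes that we would typically add options controlling how a tablespace grows and how large it can get. A minimal sketch of what that could look like is below; the 512M increment and 8G cap are illustrative values, not recommendations.

CREATE TABLESPACE T1
  DATAFILE '/u02/app/oracle/oradata/ORCL/PDB1/t1.dbf' SIZE 2G
  AUTOEXTEND ON NEXT 512M MAXSIZE 8G
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE
  SEGMENT SPACE MANAGEMENT AUTO;

This lets the datafile start at 2 GB and grow in 512 MB chunks until it hits 8 GB rather than failing the first insert that no longer fits.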


PaaS

database option - partitioning

Database partitioning has been around since the 8i release of the database, over ten years ago. The initial features of partitioning were manual processes that allowed you to split data by range, like dates, or by sequences, like zip codes. Tablespaces were able to be split into multiple files and indexes applied to each file. If a select statement were executed with a where clause that met the partition boundary, a full table scan was not necessary. Splitting the data into different tablespaces allows us not only to read just the relevant data into memory but also to split our database into storage tiers. We can keep the most used data on high speed disk and historic data on slower, lower cost storage. Not only can we use lower cost storage but we can compress the data that is not changing and take up less space. We keep our frequently used data on high speed disk (or in memory if we are lucky) and our older data on lower cost storage. This is only available with partitioning and is the reason why many customers purchase it as an option. The return on the software investment significantly reduces the cost of our database storage. We can use flash drives for the current quarter/month data, 10K rpm drives for last quarter/month data, 7.5K rpm drives for the rest of the year's data, and nfs mounts for data greater than a year old. The cost savings on storage more than pays for the cost of partitioning. Unfortunately, this does not carry over into cloud services since you really don't get tiered storage behind a database when you consume DBaaS or PaaS. We need to focus on improving performance by fitting partitions and subpartitions into the available memory to speed up select statements.

Some places to learn more about partitioning include the Oracle Database Problem Solving and Troubleshooting Handbook, Oracle Database 11g New Features (Oracle Press), Oracle Database 12c The Complete Reference (Oracle Press), Oracle Database 12c New Features, the Partitioning home page, the 12c Partitioning Tutorial, the 12c new features overview of partitioning, the Oracle Education class on 12c partitioning, a youtube video on partitioning, and a series of youtube videos on partitioning.

Before we go down the rabbit hole and dive deep into partitioning, let's review how a select statement works and how data is stored. Say for example we have a database that contains addresses for customers. The table contains an id number, a first name, last name, address, phone number, city, state, zip code, credit card number, credit card expiration, and email address. We have a second table for our on-line catalog that contains part numbers, a title, a description, and a file link for photos. We have a third table for our orders and it contains a customer id number, a part number, an order quantity, and an order date. We would create our tables with the following commands:

create table customers
( customer_id            number(8),
  first_name             varchar2(32),
  last_name              varchar2(32),
  address                varchar2(64),
  phone_number           varchar2(10),
  city                   varchar2(32),
  state                  varchar2(16),
  zip_code               varchar2(16),
  credit_card_number     varchar2(16),
  credit_card_expiration varchar2(8),
  email_address          varchar2(64) );

create table catalog
( part_number number(8),
  title       varchar2(32),
  description varchar2(128),
  part_image  blob );

create table order_entry
( order_number  number(8),
  customer_id   number(8),
  part_number   number(8),
  part_quantity number(8),
  order_date    date );

If we have ten million items in our catalog we potentially consume 128 + 32 + 8 + 16 bytes times 10,000,000 for the relational columns alone, and the part images stored in the blob column can easily push the catalog table into the terabyte range.
If we have two million orders we add another steadily growing chunk of storage for the order_entry table. When we create a table we have the option of defining not only the storage type that we want the table to reside in but also how and where to store the data associated with it. By default all tables that we create as a user are stored in the SYSTEM tablespace. All three of these tables will be stored in the DATA area under the SYSTEM tablespace since we did not specify a storage area or tablespace to hold them. For the database that we created in previous blog entries using Oracle DBaaS, these files are stored in /u02. We can dive down into /u02/app/oracle/oradata/ORCL/PDB1 and see that there is a system01.dbf file. This correlates to the SYSTEM tablespace in the PDB1 pluggable database. As tables are added, they are added to the system01.dbf file. If we are in the container database ORCL the file is /u02/app/oracle/oradata/ORCL/system01.dbf.

To help with database performance, indexes are created on tables so that a reference to a table knows where in the system01.dbf file the customers and catalog tables are located. We can also create an index on a table ourselves. This index is also stored in the system01.dbf file so that we can look up common queries as they are executed. For example, if we are looking for all orders that happened in February we can select this data quicker with an index by presorting all of the data related to order_date. The index allows us to directly access the table entries in the system01.dbf file by creating an index link to the entries. This index is also stored in the system01.dbf file and updated when we enter new data into the order_entry table. Hopefully our indexes are small enough to stay resident in memory and we don't need to go to storage to reload and reindex them. Partitioning helps keep indexes smaller as well, and unused indexes can be aged out to disk to free up memory. If we never look at data that is two years old, we don't need to keep an index on our two year old data in memory but can pull it in from disk when needed.

To reduce the access time and select time we can pre-sort the data in a different way. We can partition the data and store the table information in different files. Rather than storing everything in system01.dbf, we can store February order data in february.dbf. When an insert into the table is done it goes into the january.dbf, february.dbf, or march.dbf file rather than a single system01.dbf file. When we transition into April an april.dbf file is created and the january.dbf data is moved into the q1_2016.dbf file. The key advantage to this is that when we perform a select statement and look for data in March and April, we only look in the march.dbf and april.dbf files. The rest of the data is not loaded because we know that the data is not in the other table extents. This reduces the amount of data that is loaded into memory and reduces the amount of disk operations that are performed for every select statement. If everything were stored in the system01.dbf file, we would need to load all two million orders just to find the one or two hundred that happened in April. We basically read and then throw away almost all of the data because it does not match our request. True, the index would help, but the index requires additional writes to disk when an insert happens. With partitioning enabled for the order_date column, all order entries are stored pre-sorted by date in the different table extents.
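To make the pruning benefit concrete, here is a minimal sketch using the order_entry table from above; it assumes order_entry was created with a monthly range partition on order_date as just described, and the local index at the end is optional.

-- only the March and April partitions are scanned for this predicate
select order_number, customer_id, part_quantity
  from order_entry
 where order_date >= TO_DATE('01-Mar-2016','DD-MON-YYYY')
   and order_date <  TO_DATE('01-May-2016','DD-MON-YYYY');

-- a local index is partitioned the same way as the table,
-- so each index segment stays as small as its partition
create index order_entry_date_idx on order_entry(order_date) local;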
Since 11g, interval partitioning automatically creates new partitions as they are needed. As we cross from February to March, the march partition is created and all writes corresponding to March orders go to the new partition. There are a variety of partition types that you can use to divide data:

Range partitioning - typically based on date columns, months, quarters, or a range of numeric or character values. You can specify a value less than or value greater than when defining the partition. The value that you compare can be relative or specific to a current date or number.

List partitioning - this describes discrete values and assigns them to their own tablespace. We might split our catalog into plumbing products, lawn and garden products, or appliances. This helps searches into our catalog for a specific item. Note when you do a search at HomeDepot or Amazon you have the option of searching within a department. They are using list partitions on the back end.

Hash partitioning - this is good if you don't have a specific range that will split the data. If you want to sort by address for example, it is difficult to list all of the addresses or sort them into a range. A hash partition lets you split your data into a set number of partitions (16, for example) and the database will make a best effort to spread the data evenly across the number of partitions you define.

Composite partitioning - this is a combination of two of the types described above. Composite partitioning is accomplished with the subpartition clause where we first sort by one method then sub-sort by another. We could use a list-list or a list-range. We can use two of any of the above to help manage a large tablespace in smaller chunks.

Reference partitioning - this allows you to partition data based on referential constraints. If, for example, you want to create a constraint in a table creation and sort on that constraint, you can do this with partition by reference. If we create a table and add a constraint that an order_id must be tied to a customer_id in our customers table, we can partition by this constraint, which effectively splits the orders table the same way the customers table is partitioned even though the customers partition key is not defined in the orders table.

Virtual column-based partitioning - virtual column partitioning allows us to split a table based on part of a column value. If, for example, we index our parts in our catalog by sequence numbers with 1-3 representing plumbing and 4-6 representing appliances, we can partition based on the first number in our part_id and effectively split the catalog based on departments without having to define the department as a column in our table. We just need to make sure that all part numbers that are inserted into our catalog follow our numbering convention and not put a gas range into the catalog starting with a 2 as the part number.

If we change the customers table described above and append a partition by range statement with the command

create table customers ( .... )
partition by range (state);

we divide the table into potentially fifty different partitions. As a new customer is added, they are added to their state's partition. Inserts happen quicker, selects happen quicker, and backups happen quicker, unless all of our customers are located in one state.

If we group our customers into regions and want to store data not in fifty states but in three regions we could do this with a list partition. Note that we can name each partition and assign it a tablespace when we define the partition.

create table customers ( .... )
partition by list (state)
( partition part1 values ('Texas', 'Louisiana', 'Oklahoma', 'Arkansas') tablespace tola_ts,
  partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts,
  partition category_other values (default) );

In this example we use the tola_ts and pac_ts tablespaces, which have to exist before we create the table, plus the default tablespace for everything else. We split eight of the states into two named buckets and store all remaining customers in the catch-all partition. This makes reporting simpler and optimizes select statements looking for customers in or around Texas or along the Pacific Ocean. Note that we could also subpartition this data to separate the big cities from the rural areas:

create table customers ( .... )
partition by list (state)
subpartition by list (city)
( partition part1 values ('Texas') tablespace texas_ts
    ( subpartition texas_big_cities values ('Houston', 'Dallas', 'San Antonio', 'Austin', 'Fort Worth', 'El Paso') tablespace big_texas_ts,
      subpartition texas_small_cities values (default) tablespace small_texas_ts ),
  partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts,
  partition category_other values (default) );

This spreads the table across four storage areas: one for Texas big cities, one for Texas small cities, one for the Pacific states, and the default tablespace for all other states.

Database 12c added a few new commands to help manage and maintain partitions. We can now alter partitions and add, truncate, drop, split, and merge them. The split and merge operations are very valuable because they allow us to update ranges. If, for example, we paid a consultant two years ago to define a partition by range and they went out a few years with the following

create table sales ( .... )
partition by range (salesdate)
( partition part_2015 values less than (TO_DATE('01-Jan-2016', 'DD-MON-YYYY')),
  partition part_2016 values less than (TO_DATE('01-Jan-2017', 'DD-MON-YYYY')) )
ENABLE ROW MOVEMENT;

but we now want to store data by quarter rather than by year, we can split the yearly partition into quarters. We have to split rather than add here because the new quarterly boundaries fall below the existing high bound of part_2016:

alter table sales split partition part_2016
  at (TO_DATE('01-Apr-2016', 'DD-MON-YYYY'))
  into ( partition p_q1_2016, partition part_2016 );

alter table sales split partition part_2016
  at (TO_DATE('01-Jul-2016', 'DD-MON-YYYY'))
  into ( partition p_q2_2016, partition part_2016 );

Repeating the split for the remaining quarters slides quarterly partitions in and allows us to handle a larger volume than was originally planned for. If at the end of the year we want to aggregate everything back into a year rather than quarters we can do this with a merge command:

alter table sales
  merge partitions p_q1_2016, p_q2_2016, p_q3_2016, p_q4_2016
  into partition part_2016;

Fortunately, Enterprise Manager has a partition advisor that looks at the history of your select statements and suggests how you should divide your tables into partitions. It notices that you do a lot of selects by state or by zip code and recommends partitioning by list or by hash based on your usage patterns. This was a new feature added with Enterprise Manager 11 and has gotten more robust and reliable with 13c. We should see a significant speed up if we get the right combination of partitions and indexes and could potentially take a select statement from 45 seconds to sub-second as shown in the Enterprise Manager screen shots below.

In summary, partitioning is very powerful. It helps you split up your larger tables so that they fit into the memory that you have allocated. The return on investment is difficult to calculate because the cost of partitioning versus the cost of memory and the resulting speed up for queries is hard to measure. Enterprise Manager has tools to help you with this analysis but it is difficult to put into future dollars and what-if analysis.
It would be nice if you could say that splitting your table into partitions would reduce your buffer cache and allow you to shrink your SGA size by 25%. The tools are not quite there. They do tell you that you can reduce your select times by partitioning the data and predict relatively accurately how much faster a select statement will be with partitioning based on your current hardware configuration. All of these functions should work on Amazon RDS with the exception of manipulating a tablespace. This requires a different command syntax since manipulation of a tablespace requires system access. Typically the command would be alter database default tablespace users2 but with Amazon RDS you have to execute exec rdsadmin.rdsadmin_util.alter_default_tablespace('users2') instead. Given that this is not done very often, it is up to you to decide how and where you deploy your large table database.
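The interval partitioning mentioned above removes the need to pre-create date partitions at all. A minimal sketch, reusing the order_entry columns from this post; the database creates a new monthly partition automatically the first time a row lands beyond the current high bound.

create table order_entry
( order_number  number(8),
  customer_id   number(8),
  part_number   number(8),
  part_quantity number(8),
  order_date    date )
partition by range (order_date)
interval (numtoyminterval(1,'MONTH'))
( partition p_history values less than (TO_DATE('01-Jan-2016','DD-MON-YYYY')) );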


PaaS

preparing a desktop for PaaS

Before we can start looking at the different options of a database, we need to get a desktop ready to do database development. It sounds a little strange that we need to download software to get access to a cloud database. We could do everything from the command line but it is much simpler and easier if we can do this using desktop tools. The software that we are going to download and install is:

To access cloud storage
- Mozilla Firefox
- RestClient extension for Firefox
- Google Chrome
- Postman extension for Chrome
- CloudBerry for OpenStack

To access files in our instance
- Putty
- Filezilla
- Cygwin

To access our database instance
- SQL Developer
- Microsoft Visual C++ libraries

The eventual target that we are looking to get to is a Windows server in the cloud with all of these tools installed that we can connect to over remote desktop. To do this we need to go to the Oracle Cloud Marketplace and look for the Windows 2012 Server instance. What we want to do is provision a Windows instance and use it as a remote desktop for connecting to the cloud. We could do this on our own desktop but the benefit of using a Windows Server is that we can create more users and use this instance for a hands on workshop. We don't need to have anyone load any virtual machines, fight the differences between Mac and Windows, or wait for the binaries to download and install. We can do most of this on a virtual machine in the cloud and just add and delete users for workshops. To provision the Windows server, we go to the cloud marketplace, select Infrastructure, and Compute. We can then search for Windows and get a bootable image to use as our foundation.

Once we agree to the legal terms we can select an instance to provision this into. The way it works is that we copy a bootable image into a cloud instance. We can then create compute instances from this bootable image and customize it to our liking. Once we agree to the terms, the marketplace connects to the Oracle Cloud and uses your authentication credentials to connect to the instance. From this it gets a list of instances associated with this account and checks to see if you have agreed to the terms of marketplace use for this instance by setting your profile settings for the instance. Once the bootable image is ready, a splash screen is presented stating that you are ready to provision a compute instance.

The screen dumps you into a compute creation wizard that walks you through the compute provisioning. Rather than going through that interface we decided to start from scratch and log into the instance and provision a compute engine from scratch. We first select the boot image from our private images, select the shape to boot, define the instance name, configure ssh connectivity, and set the Administrator password (not shown). Once we get the confirmation screen it takes a few minutes to create the boot disk then boot the compute instance on this newly formatted disk.

We can check the progress by looking at the storage and compute instance. When everything is done we should see a public ip address for our instance. If we don't see our instance it is either still building or we should see an error in the history. Unfortunately, the history is hidden and a failed provisioning is not shown unless you look for it by expanding the history.

Before we can connect to our instance with remote desktop, we need to define a security list to allow for rdp, associate this rule with our instance, and define the security rule for rdp and associate it with the security list and instance.
Once we have rdp enabled to our instance, we look up the public ip address and connect as the Administrator user with the password that we passed in with a json header at the bottom of the creation screen (not shown). When we log in we see the server tools splash screen pop up.

We want to create a secondary user and give this user admin rights as well as rights to remote desktop connect to the server. We might want to add more users, not as admins but with remote desktop rights, for hands on labs. We can add and delete users using this method and it refreshes the workshop for the next class.

At this point we can create a staging directory and install the software that we listed above. The only product that causes a problem with the install is SQL Developer because it requires a Microsoft package that is not installed by default. Once we download that library, all of the packages that we downloaded are ready to install. I did not go through customization of the desktop or downloading the public and private keys used for the workshop. These are obvious steps using filezilla from a shared network storage on a server in the cloud. We downloaded Firefox and Chrome primarily because Internet Explorer does not give us a good REST client and we need a way to create and list storage containers. We could have skipped this installation and done everything through CloudBerry, and we can do everything similarly on a Mac (no need for putty or cygwin). With Firefox you need to install the RestClient extension and Chrome requires the Postman extension.

In summary, we created a Windows 2012 Server compute instance in Oracle Compute IaaS. We added a new user as a backup to our Administrator user. We enabled remote desktop and configured a Mac to connect to this service remotely. We then downloaded a set of binaries to our Windows desktop to allow us to manage and manipulate storage containers and database instances. We also downloaded some utilities to help us use command line tools to access our database and customize our instances. We technically could do all of this with a Windows desktop, Internet Explorer, and SQL Developer. We went to the extra steps so that we can do the same from a Mac or Windows desktop using the same tools.


PaaS

database options

Before we dive into features and functions of database as a service, we need to look at the options that you have with the Oracle Database. We have discussed the differences between Standard Edition and Enterprise Edition but we really have not talked about the database options. When we select a database in the Oracle Cloud we are given the choice of Enterprise Edition, High Performance Edition, and Extreme Performance Edition. Today we are going to dive into the different editions and talk about the options that you get with each one. It is important to note that all of these options are extra cost options that are licensed on a per processor or per named user basis. If you go with Amazon RDS, EC2, or Azure Compute you need to purchase these options to match your processor deployment. One of the standard slides that I use to explain the differences in the editions is shown below.

The options are cumulative when you look at them. The Enterprise Edition, for example, comes with Transparent Data Encryption (TDE). TDE is also included in the High Performance and Extreme Performance Editions. We are going to pull the pricing for all of these options from the Technology Price List. Below is a list of the options.

Enterprise Edition
- Transparent Data Encryption

High Performance Edition
- Diagnostics
- Tuning
- Partitioning
- Advanced Compression
- Advanced Security
- Data Guard
- Label Security
- Multitenant
- Audit Vault
- Database Vault
- Real Application Testing
- OLAP
- Spatial and Graphics

Extreme Performance Edition
- Active Data Guard
- In Memory
- Real Application Clusters (RAC)
- RAC One Node

Transparent Data Encryption

TDE is a subset of the Advanced Security option. TDE stops would-be attackers from bypassing the database and reading sensitive information from storage by enforcing data-at-rest encryption in the database layer. Data is stored encrypted in the datafiles on disk and is transparently decrypted as it is read into the database. The Oracle Wallet is needed to read the data back and perform operations on it. The Advanced Security and Security Inside Out blogs dive deeper into TDE features, functions, and tutorials. There is also a Community Security Discussion Forum. The Advanced Security option is priced at $300 per named user or $15,000 per processor. If we assume a four year amortization the cost of this option is $587.50 per month per processor. The database license is $1,860 per month per processor. This says that a dual core system on Amazon EC2, RDS, or Azure Compute running the Oracle database will cost you the cost of the server plus $2,448 per month. If we go with a t2.large on Amazon EC2 (2 vCPUs and 8 GB of RAM) and 128 GB of disk our charge is $128 per month. If we bump this up to an r3.large (2 vCPUs, 15 GB of RAM) the price goes up to $173 per month. The total will be $2,620 per month, which compares to Enterprise Edition at $3,000 per month per processor for PaaS/DBaaS. We could also run this in Oracle IaaS Compute at $150 per month (2 vCPUs, 30 GB of RAM) to compare apples to apples. It is strongly recommended that any data that you put in the cloud be encrypted. Security is good in the cloud but encryption of data in storage is much better. When you replicate data or back up data it is copied in the format that it is stored in. If your data is clear text, your backups could be clear text, thus exposing you to potential loss of data. Encrypting the data at rest in storage is a baseline for running a database in the cloud.
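As a concrete illustration of what TDE looks like to a DBA, here is a minimal sketch of an encrypted tablespace and an encrypted column. It assumes the keystore (wallet) is already created and open, which should already be the case on an Oracle DBaaS instance; the tablespace, table, and column names are just placeholders.

-- encrypt everything placed in this tablespace at rest
CREATE TABLESPACE secure_data
  DATAFILE SIZE 1G
  ENCRYPTION USING 'AES256'
  DEFAULT STORAGE (ENCRYPT);

-- or encrypt a single sensitive column in an ordinary tablespace
CREATE TABLE payment_info
( customer_id        number(8),
  credit_card_number varchar2(16) ENCRYPT USING 'AES256' );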
Diagnostics

Diagnostics is a subset of the Database Management Packs that allows you to look into the database and figure out things like lock contention, what is holding up a wait queue, and what resources are being consumed by processes inside the database. Historic views into the automated workload repository (AWR) reports are available with this option. Without it you can get spot reports but not historical views and comparative analytics on AWR information. Some of the tools are free, like the compression advisor and partitioning advisor, while others are part of the diagnostics pack. Diagnostics is licensed at $150 per named user or $7,500 per processor. This correlates to $294 per processor per month. Unfortunately, you can't purchase Enterprise Edition DBaaS and add this; you need to go with IaaS Compute and add this to a bring your own database license. The only way to get this feature in DBaaS is to go with the High Performance Edition. The binary that is installed on the cloud service specifically labels the database as Enterprise Edition, High Performance Edition, or Extreme Performance Edition. All of the features listed from here on down are prohibited from running on the Enterprise Edition when provisioned into the Oracle DBaaS. If you just want the Diagnostics Pack on Enterprise Edition it does not make economic sense to purchase High Performance Edition at $4,000 per month per processor when you can do this on IaaS at $2,914 (the $2,620 from above plus $294).

Tuning

Tuning is also a subset of the Database Management Packs that allows you to look into sql queries, table layouts, and overall performance issues. Options like the SQL Tuning Advisor and Automatic SQL Tuning are part of this pack. The Tuning pack is $100 per named user or $5,000 per processor. This comes in at $196 per processor per month if purchased separately. A Tuning Whitepaper details some of the features and functions of the tuning pack if you want to learn more.

Partitioning

Partitioning is a way of improving the performance of your database and backups by splitting how data is stored and read. Partitioning is powerful functionality that allows tables, indexes, and index-organized tables to be subdivided into smaller pieces, enabling these database objects to be managed and accessed at a finer level of granularity. Oracle provides a comprehensive range of partitioning schemes to address every business requirement. The key improvement is to reduce the amount of data that you are reading into memory on a query. For example, if you are looking for financial summary data for the last quarter, issuing a query into eight years of financial data should not need to read in 32 quarters of data but only the data from the last quarter. If we partition the data on a monthly basis we only read in the three monthly partitions for that quarter rather than everything. Partitioning also allows us to compress older data to consume less storage while at rest. When we back up the database we don't need to copy the older partitions that don't change, only the partitions that have been updated since our last backup. Partitioning is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. The three most purchased database options are Diagnostics, Tuning, and Partitioning. The combined cost of these three options is $940 per processor per month. When we compare the $4,000 per processor per month of DBaaS to IaaS with these three options we are at rough parity.
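If you are trying to work out which of these packs you actually exercise today, the database keeps its own score. A minimal sketch of the kind of query to run, assuming you have select access to the DBA views; the exact feature names vary a little between releases, which is why a LIKE filter is used here.

select name, version, detected_usages, currently_used
  from dba_feature_usage_statistics
 where lower(name) like '%partition%'
    or lower(name) like '%tuning%'
    or lower(name) like '%diagnostic%'
 order by name;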
Advanced Compression

Advanced Compression is a feature that allows you to compress data at rest (and in memory) so that it consumes fewer resources. Oracle Advanced Compression provides a comprehensive set of compression capabilities to help improve performance and reduce storage costs. It allows organizations to reduce their overall database storage footprint by enabling compression for all types of data: relational (table), unstructured (file), network, Data Guard redo, and backup data. Cost comparisons for this feature come down to storage costs. Advanced Compression is licensed at $230 per named user or $11,500 per processor. This comes in at $450 per processor per month. Typical compression ratios are 3x to 10x. This means that 1 TB of data will take up roughly 333 GB at 3x or 100 GB at 10x. Lower compression levels are recommended for data that changes lightly and high compression for data that will not change. The penalty for compression comes in when you update data that is compressed. The data must be uncompressed, the new data inserted, and the result recompressed.

Advanced Security


PaaS

Oracle Database 12c SQL by Jason Price

Given that we have a database in Amazon RDS and Oracle PaaS we can go through some books from Oracle Press and see if anything breaks running through a book. Let's start with something simple: Oracle Database 12c SQL by Jason Price, published by Oracle Press. This is an introductory book that goes through the basic data types and sql commands, with an introduction to XML at the end of the book. The material should be relatively straightforward and not have any issues or problems executing the sample code. The sample code can be downloaded from Oracle Press Books by searching for the book title and downloading the Chapter 1 sample code. This gives us a way to load a table with data and execute code against the table. We will use SQL Developer to execute the code from d:\workshops\sql books\SQL and see what works and what does not work.

To get started, we need to use our Amazon RDS instance and the SQL Developer that we installed yesterday. We connect with the user oracle to port 1521 after opening up the port to anyone. From this connection we can execute sql code in the main part of the SQL Developer window and load the sample code to execute. We can test the connection with the following command:

select sysdate from dual;

We can then follow along with the book and create the user store, load the schema into the database, and look at the examples throughout the book.

Everything worked on Amazon RDS. We were able to create users, grant them audit functionality, execute XML code, and generally do everything listed in the book. The audit did not report back as expected but this could have been a user error. According to the Amazon RDS documentation auditing should work. We might not have had something set properly to report back the right information.

In summary, Amazon RDS is a good platform to learn how to program 12c SQL and the various user level commands. If you go through a book like Oracle Database 12c SQL everything should work. This or the Oracle PaaS equivalent makes an excellent sandbox that you can use for a day or two and turn off, minimizing your cost of experimenting.
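For reference, the setup the book walks you through boils down to creating the store user and then running the downloaded schema script as that user. A rough sketch of that first step is below; the password is a placeholder and the grants are simply what worked for us, not the book's exact script.

-- run as the master/admin user
create user store identified by "Store#2016";
grant connect, resource to store;
alter user store quota unlimited on users;
-- then connect as store and run the schema script that came
-- with the Chapter 1 sample code download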


PaaS

resizing Amazon RDS Oracle EE

Yesterday we looked at connecting to DBaaS with the Oracle platform as a service option. We wanted to extend the database table size because we expect to grow the tablespace beyond current storage capacity. We are going to do the same thing today for Amazon RDS running an Oracle Enterprise Edition 12c instance. To start our journey, we need to create a database instance in Amazon. This is done by going to the Amazon AWS Console and clicking on RDS to launch a database instance. We then click on the Launch DB Instance button and click on the Oracle tab. We select Enterprise Edition then Dev/Test for our example. We accept the defaults, define our database with ORACLE_SID=pri, a username of oracle, no multi-AZ replication, and select the processor and memory size. While the database is creating we need to change the default network configuration. By default port 1521 is open but only to an ip address range. We need to open this up to everyone so that we can connect to the database from our desktop instance. We are using a Windows 2012 instance in the Oracle IaaS cloud so the default mapping back to the desktop we used to create the database does not work. Note that since we do not have permission to connect via ssh, connecting with a tunnel is not an option. The only security options that we have are ip white listing, a vpn, or opening up port 1521 to the world. This is done by going into the security groups definition on the detail page of the instance. We change the default inbound rule from an ip address to anywhere. We could alternatively have defined a security group that lists the ip addresses of our IaaS instance as well as our desktop prior to creation of this database to keep security a little tighter.

Once the database is finished creating we can connect to it. We get the connection string (DNS address) and open up SQL Developer. We create a new database connection using the sid of pri, username of oracle, and port 1521. Once we connect we can define the DBA view to allow us to manage parts of the database since we do not have access using Enterprise Manager. It looks like the tablespace will autoextend into the available space, so all we should have to do is extend the /rdsdata partition. This is done by modifying the RDS instance from the console. We change the storage from the 20 GB that we created to 40 GB, turn on advanced monitoring (not necessary for this exercise), and check the apply immediately button. This reconfigures the database and extends the storage. Note the resize command that happens for us. This is a sysdba level command that is executed on our behalf since we do not have sys rights outside the console. We can look at the new instance and see that the size has grown and we have more space to expand into. We can see the changes from the monitoring console.

In summary, we are able to easily scale storage in an Amazon RDS instance even though we do not have sys access to the system. We do need to use the AWS Console to make this happen and can not do this through Enterprise Manager because we can't add the agent to the instance. It is important to note that some options are available from the console and some are available through altered command line calls that give you elevated admin privileges without giving you system access. Look at the new command structures and decide if forking your admin tools just to run on RDS is something worth doing or too much effort.
These changes effectively lock you into running your Oracle database on Amazon RDS. For example, to change the default tablespace in Oracle you would typically type

alter database default tablespace users2;

but with Amazon RDS you would need to type

exec rdsadmin.rdsadmin_util.alter_default_tablespace('users2');

Is this a show stopper? It could be or it might be trivial. It could stop some pre-packaged applications from working in Amazon RDS and force you to go with EC2 and S3. If you do go that route, the storage upgrade example that we just went through becomes a manual process with the involvement of an operating system administrator, thus incurring additional cost and time. Again, this blog is not about A is better than B. This blog is about showing things that are hidden and helping you decide which is better for you. Our recommendation is to play with Amazon RDS and Oracle PaaS and see which fits your needs best.
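One quick check worth doing before you grow the /rdsdata volume is confirming that the datafiles really are set to autoextend. A minimal sketch of that query, which should work on RDS since the master user gets select access to the DBA views:

select file_name, tablespace_name, autoextensible,
       round(bytes/1024/1024)    as size_mb,
       round(maxbytes/1024/1024) as max_mb
  from dba_data_files
 order by tablespace_name;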


PaaS

resizing database as a service with Oracle

Historically, what happens to a database months after deployment has always been an issue and a problem. If we go out and purchase a computer and disk storage, then deploy a database onto the server, we have to guess at sizing. If we oversize the hardware and storage, we waste budget. If we undersize the hardware and storage, we have to purchase a new computer or new storage, get an operating system expert to reconfigure everything on the new server, and get a database administrator to reconfigure the database installation to run on the new server or new storage. For example, if we purchased a 1 TB disk drive and allocated it all to /u02, the database had a ton of space to grow into. We put the DATA area there and put the RECO area into /u03. Our database service suddenly grows wildly, we have a record number of transactions, we increase the offerings in our product catalog, and our tablespace grows to over 800 GB. Disk performance starts to suffer and we want to grow our 1 TB to 2 TB. To do this we have to shut down our database, shut down the operating system, attach the new disk, format and mount it as /u05, copy the data from /u02 to /u05, remount /u05 as /u02, and reboot the system. We could have backed up the database from /u02 and reformatted /u02 and /u05 as a logical volume to allow us to dynamically grow the disk, letting us purchase a 1 TB drive for /u05 rather than a 2 TB drive and reduce our cost. We successfully grew our tablespace by purchasing more hardware and involving an operating system admin and our database administrator. We were only down for a day or half a day while we copied all of our data and modified the disk layout.

Disk vendors attacked this problem early by offering network or fiber attached storage rather than direct attached storage. They allow you to add disks dynamically, keeping you from having to take the server down to attach new drives. You can attach your disk as a logical unit number and add spindles as desired. This now requires you to get a storage admin involved to update your storage layout and grow your logical unit space from 1 TB to 2 TB. You then need to get your operating system admin to grow the file system that is on your /u02 logical unit mount to allow your database admin to grow the tablespace beyond the 1 TB boundary. Yes, this solves the problem of having to bring down the server, touch the hardware, and add new cables and spindles to the computer. It allows data centers to be remote and configurations to be done dynamically with remote management tools. It also addresses the issue of disk failures much more easily and quickly by pushing the problem to the storage admin to monitor and fix single disk issues. It solves a problem but there are better ways today to address this issue.

With infrastructure as a service we hide these issues by treating storage in the cloud as dynamic storage. With Amazon we can provision our database in EC2 and storage in S3. If we need more storage, we allocate more storage to our bucket and grow the file system in EC2. The database admin then needs to go in and grow the tablespace to fill the new storage area. We got rid of the need for a storage admin, reduced our storage cost, and eliminated a step in our process. We still need an operating system admin to grow the file system and a database admin to grow the tablespace. The same is true if we use Azure Compute or Oracle IaaS. Let's go through how to attach and grow storage on a generic compute instance. We have a CentOS image running in IaaS on the Oracle Cloud.
We can see that the instance has 9 GB allocated to it as the root operating system. We would like to add a 20 GB disk then grow that disk to 40 GB as a second test. At first we notice that our instance is provisioned and we see the 9 GB disk labeled CentOS7 allocated to our instance as /dev/xvdb. We then create a root partition /dev/xvdb1, provision an operating system onto it using the xfs file system, and mount it as the root filesystem. To add a 20 GB disk, we go into the Compute management screen and create a new storage volume. This is easy because we just create a new volume and allocate 20 GB to it. Given that this disk is relatively small, we don't have to wait long and can then attach it to our CentOS7 instance by clicking on the hamburger menu to the right of our new 20 GB disk and selecting the CentOS7 instance. It is important to note that we did not need to reboot the instance; the disk simply appears as /dev/xvdc. We can then partition the disk with fdisk, create a file system with mkfs, and mount the disk by creating a new /u02 mount point and mounting /dev/xvdc1 on /u02.

The real exercise here is to grow this 20 GB mounted disk to 40 GB. We can go into the Volume storage and update the storage to a larger size. This is simple and does not require a reboot or much work. We go to the Storage console, update the disk, grow it to 40 GB, and go back to the operating system and notice that our 20 GB disk is now 40 GB. We can create a new partition /dev/xvdc2 and allocate it to our storage.

Note that we selected poorly when we made our file system selection. We chose to lay out an ext3 file system onto our /dev/xvdc1 partition. We can't grow the ext3 filesystem; we should have selected ext4. We did this on purpose to prove a point. The file system selection is critical and if you make the wrong choice there is no turning back. The only way to correct this is to get a backup of our /u02 mount and restore it onto a newly formatted ext4 partition. We also made a second wrong choice of laying the file system directly on the raw partition. We really should have created a logical volume from this disk and put the file system on the logical volume. That would allow us to take our new /dev/xvdc2, turn it into a new physical volume, add it to the volume group, and grow the ext4 file system. Again, we did this on purpose to prove a point. You need to plan on expansion when you first lay out a system. To solve this problem we need to unmount the /u02 disk, delete the /dev/xvdc1 and /dev/xvdc2 partitions, create a physical volume with the logical volume manager, create a volume group and logical volume, and lay an ext4 file system onto this new volume. We then restore our data from the backup and can grow the volume much more easily in the future. We are not going to go through these steps because the point of the exercise is to show you that it is much easier with platform as a service, not how to do it on infrastructure as a service.

If we look at a database as a service disk layout we notice that we have /dev/xvdc1 as /u01, which holds the ORACLE_HOME; /dev/mapper/dataVolGroup-lvol0 as /u02, which holds the tablespace area for the database; /dev/mapper/fraVolGroup-lvol0, which holds the fast recovery area (where RMAN dumps backups); and /dev/mapper/redoVolGroup-lvol0, which holds the redo log area (the transaction logs that DataGuard ships). The file systems are logical volumes and are created by default for us.
The file systems are ext4, which can be seen by looking at the /etc/fstab file. If we need to grow the /u02 partition we can do this by using the scale up option for the database. We can add 20 GB and extend the data partition or the fra partition. We also have the option of attaching the storage as /u05 and manually growing partitions as desired. It is important to note that scaling up the database does require a reboot and restart of the database. When we try to scale up this database instance we get a warning that there is a Java service that depends upon the database and it must be stopped before we can add the storage we want.

In summary, we can use IaaS to host a database. It does get rid of the need for a storage administrator. It does not get rid of the need for an operating system administrator. We still have to know the file system and operating system commands. If we use PaaS to host a database, we can add storage as a database administrator and do not need to mess with logical volume or file system commands. We can grow the file system and add table extents quickly and easily. If we undersize our storage, correcting the mistake is much easier than it was years ago. We don't need to overpurchase storage anymore because we can allocate it on demand and pay for the storage as we use it. We can easily remove one of the headaches that has been an issue for years: we no longer need to triple our storage estimates and can instead go with realistic estimates and control our budget more easily.
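Once the scale up has added space under /u02, the remaining work really is just DBA-level SQL. A minimal sketch of the two usual moves, reusing the t1 tablespace and file names from the earlier partitioning example as placeholders:

-- let an existing datafile grow into the new space
alter database datafile '/u02/app/oracle/oradata/ORCL/PDB1/t1.dbf' resize 4G;

-- or add a second datafile to the tablespace instead of resizing
alter tablespace t1
  add datafile '/u02/app/oracle/oradata/ORCL/PDB1/t1_02.dbf' size 2G
  autoextend on next 512M maxsize unlimited;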


PaaS

SQL Developer connection to DBaaS

Today we are going to connect to our database using SQL Developer. We could connect using sqlplus with a remote command but instead we are going to use a graphical tool to connect to our database in the cloud. It is important to note that this is the same tool that is used to connect to our on premise database. We can execute sql commands, look at the status of the database, clone pluggable databases from one service to another, and generally manipulate and manage the database with command line features or wizards. SQL Developer is a free integrated development environment that simplifies the development and management of Oracle Database in both traditional and cloud deployments. SQL Developer offers complete end-to-end development of your PL/SQL applications, a worksheet for running queries and scripts, a DBA console for managing the database, a reports interface, a complete data modeling solution, and a migration platform for moving your 3rd party databases to Oracle. There are a few books that have been written about this product:

- Oracle SQL Developer
- Oracle SQL Developer Data Modeler for Database Design Mastery

as well as blogs:

- Jeff Smith's SQL Developer Blog
- Kris Rice's Blog
- Barry McGillin's Blog
- the OTN SQL Developer Community area

I suggest looking at the following:

- Getting Started
- On-line Demos
- On-line Tutorials

We are not going to dive deep into SQL Developer but rather introduce a couple of concepts for monitoring our database in the cloud. We are running version 4.1.3 on a Windows desktop. We actually are cheating a little bit and running it on a Windows 2012 Server that is provisioned into IaaS in the Oracle Cloud. It makes a good scratch space for demos and development hands on labs. When we connect we can connect to the public ip address of our database on port 1521 or we can create an ssh tunnel and connect to localhost on port 1521. We will first connect via an ssh tunnel.

To start, we need to log into our database service and figure out what the ip address is for the system we provisioned. For our system we notice that the ip address is 129.152.150.120. We are going to first connect with an ssh tunnel through putty. We launch putty, enter the ip address and the ssh keys, and open up port 1521 as a tunnel. We open a connection and all connections to port 1521 on localhost will be forwarded to our cloud service at the ip address specified. Note that this solution works if we have one database that we are connecting to. If we have two database instances in the cloud we will need to map a different port number on localhost to port 1521 or open up the ports to the internet, which we will talk about later. We need to keep this shell active and open but we can iconify the window.

In SQL Developer we can now create a new connection to our database. This is done by clicking on the green plus sign in the top right of the screen. This opens a dialog window to define the connection to the database. We will call this connection prs12cHP, which is the name of our service in the cloud. We are going to connect as sys so we need to select the advanced connection option to connect as sysdba. It is important to note that you can not do this with Amazon RDS if you provision an Oracle database in the Amazon PaaS. Amazon does not allow you to log in as sys or system and does not give you sysdba privileges. If you want sysdba access you will need to deploy Oracle into Amazon EC2 to get it.
Once we define our connection to localhost, port 1521, sys as sysdba, and a SID of ORCL we can test our interface and accept the connection once it is successful. Note that we can execute commands in the right window and look at things like what version of the database we are running. In this example we are running the High Performance Edition so we can use the diag and tuning extensions from SQL Developer.

There is a new DBA feature in the latest release of SQL Developer. We can launch a navigation menu to add our cloud database by going to the View ... DBA option at the top of the screen. This gives us another green plus sign so that we can add the database and expose typical management views and functions. Two things that are of note here are a simple exposure to pluggable databases as well as a clone option associated with this exposure. We can do other things like look at backup jobs, look at tablespace allocation and location, and look at users that are authorized and active. This is not a replacement for Enterprise Manager because it is looking at immediate and not historic data.

Now that we have connected through a tunnel, let's look at another option. We can open up port 1521 on the database service and connect straight to the ip address. This method is not recommended because it opens up your database to all ip addresses on the internet if you are using a demo or evaluation account. You can whitelist ip addresses, use a vpn, or limit by subnet the systems that it answers. This is done through the compute service management interface under the networking tab. We need to enable the dblistener rule for our database service. Once we do this we can connect SQL Developer to the database using the ip address of the database service. We might need to do this if we are connecting to multiple cloud servers and don't want to create a tunnel for each of them.

In summary, we have connected to our database service using SQL Developer. This is the same tool that we use to connect to databases in our data center. We can connect the same way that we normally do via an ip address or use a tunnel to keep the server in the cloud a little more secure. We noted the differences between the Amazon RDS and Oracle DBaaS options and provided a workaround with EC2 or Azure Compute as an alternative. It is important to remember the differences between PaaS features and IaaS features when it comes time to calculate the cost of services. PaaS gives you expanded features like automated backup and size up/down, which we will look at next week.
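A quick way to confirm from the SQL Developer worksheet which edition and options you actually landed on is to query the data dictionary. A minimal sketch; v$option only reports separately licensed database features, while the management packs are governed by the control_management_pack_access parameter.

select banner from v$version;

select parameter, value
  from v$option
 where parameter in ('Partitioning', 'Advanced Compression', 'Real Application Clusters');

select value
  from v$parameter
 where name = 'control_management_pack_access';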


PaaS

Using Enterprise Manager to manage cloud services

Yesterday we talked about the virtues of Enterprise Manager. To be honest, the particular monitoring tool is not important, but the fact that you have one is. One of the virtues that VMWare touts of VSphere is that you can manage instances on your server as well as instances in VCloud. This is something worthy of playing with. Having the same tool for your on premise instances also manage your instances in the cloud has power. Unfortunately, VCloud only allows you to allocate virtual machines and the storage associated with them, so you only have an IaaS option of compute. You can't allocate just storage. You can't deploy a database server unless you have a database deployed that you want to clone. You need to start with an operating system and build from there. There are benefits of PaaS and SaaS that you will never see in the VCloud implementation.

Oracle Enterprise Manager provides the same universal management interface for on premise and in cloud services. Amazon falls short on this. First, they don't have on premise instances so the tools that they have don't monitor anything in your data center, only in their cloud. Microsoft has monitoring tools with plugins for looking at Azure services. It is important to note that you need a gateway server in the Azure cloud to aggregate the data, ship the telemetry data back, and report it in the monitoring tool. There is a good blog detailing the cost of IaaS monitoring in Azure. The blog points out that the outbound data transfer for monitoring can cost up to $17/month/server so this is not something that comes for free.

Today we are going to look at using Enterprise Manager as a management tool for on premise systems, the Oracle Public Cloud, Amazon AWS, and Microsoft Azure. We are going to cheat a little and use a VirtualBox instance of Enterprise Manager 13c. We are not going to go through the installation process. The books and blogs that we referenced yesterday detail how to do this. Fortunately, a VirtualBox instance is available from edelivery.oracle.com. We are not going to use this instance but are going to use an instance for demo purposes only available internal to Oracle. The key difference between the two systems is that the edelivery instance is 21 GB in size for download and expands to provide an OEM 13c instance for testing, while the internal system (retriever.us.oracle.com) has a 12c and an 11g database installed and is 39.5 GB (expanding to almost 90 GB when uncompressed). Given the size of the instance I really can't provide external access to it. You can recreate this by downloading the edelivery system, installing an 11g database instance, installing a 12c database instance, and configuring OEM to include data from those instances to replicate the screen shots that we are including. If we look at the details on the VirtualBox instance we notice that we need at least 2 cores and 10 GB of memory to run this instance. The system is unusable at 8 GB of RAM. We really should bump this up to 12 GB of RAM but given that it is for demo purposes and for training it is ok if it runs a little slow. If we were running this in production it is recommended to grow this to 4 cores and 16 GB of memory, and also recommended that you not use a downloaded VirtualBox instance for production but install from scratch.

The key thing that we are going to do is walk through what it takes to add a monitoring agent onto the service that we are trying to monitor and manage.
If we look at the architecture of Enterprise Manager we notice that there are three key components: the Oracle Management Repository (OMR), the Oracle Management Service (OMS), and the Oracle Management Agent (OMA). The OMR is basically a database that keeps a history of all telemetry actions as well as reports and analytics for the systems being monitored. The OMS is the heart of Enterprise Manager and runs on a WebLogic server. The code is written in Java and presents the primary user interface to the administrators as well as being the gateway between the OMR and the agents, or OMAs. The agents are installed on the target systems and collect operating system data, database data, weblogic data, and all other log data to ship back to the OMR for analysis by the users.

It is important to note at this point that most PaaS and SaaS providers do not allow you to install an Enterprise Manager agent or any other management agent on their instances. They want to manage the services for you and force you to use their tools to manage their instance. SalesForce, for example, only gives you access to your customer relationship data. You can export your contact lists to a csv file to back up your data but you can't correlate the contact list to the documents that you have shared with these users. Amazon RDS does not provide file system access, system access to the database, or access to the operating system so that you can install the management agent. You must use their tools to monitor services provided on their sites. Unfortunately, this inhibits you from looking at important things like workload repository reports or sql tuning guides to see if something is running slow or waiting on a lock. Your only choice is to deploy the desired PaaS or SaaS as a manual or bundled install on IaaS, forcing you to manually manage things like backups and patching on your own.

The first thing that we need to do in Enterprise Manager is to log in and click on the Setup button on the top right. We need to define named credentials since we are going to connect to the cloud service using public and private ssh keys. We follow the Security pull down to Named Credentials. We click on the Create icon in the top left and add credentials with public and private keys. If we don't have an ssh key to access the service we can generate one using ssh-keygen, which generates a public and private key, and upload the public key using the SSH Access pull down in the hamburger menu. Once we upload the ssh key we can use ssh -i keyname.ppk opc@ip_address for our database server. We will use this keyname.ppk to connect with Enterprise Manager and have all telemetry traffic transferred via the ssh protocol.

Once we have valid credentials in the cloud account we can create the ssh access through Enterprise Manager. To do this we go to Setup at the top right, then Security, then Named Credentials. We then click on the Create button in the middle left to start entering data about the credentials. The name in the screen shot below failed because it begins with a number, so we switched it to ssh2017 since 2017ssh failed the naming convention. We are trying to use host access via ssh, which is done with pull down menu definitions. The system defaults to host access but we need to change the scope from host to global, which does not tie our credentials to one ip address. We upload our public and private key and associate this with the opc user since that user has sudo rights. We can verify the credentials by looking at the bottom of the list.
This should allow us to access our cloud host via ssh and deploy an agent to our cloud target. Note that we created two credentials because we had a step fail later. We created credentials for the opc user and for the oracle user. The opc credentials are called ssh2017 as shown in the screen shots. The oracle credentials are called oracle2017 and are not shown. The same steps are used; only the username and the name of the credentials change.

If we want to install the management agent onto our instance we need to know the ip address of the service that we are going to monitor as well as an account that can sudo to root or run elevated admin services. We go to the Enterprise Manager splash screen, log in, select the Setup button in the top right, and drill down to Add Target and Add Target Manually. This takes us to the Add Target screen where we can Install Agent on Host. To get rid of the warnings, we added our cloud target ip address to the /etc/hosts file and used a fully qualified and short name associated with the ip address. We probably did not add the right external dns name but it works with Enterprise Manager. When we add the host we use the fully qualified host name. We can find this by logging into the cloud target and looking at the /etc/hosts file on that server. This gives us the local ip address and a fully qualified host name. Once we have this we can enter a directory to upload the agent software to. We had to create an agent directory under the /u01/app/oracle directory. We select the oracle2017 credentials (the screen shots use ssh2017 but this generates an error later) we defined in the previous step and start uploading the agent software and configuring the host as a target. Note that we could have entered the ip address rather than adding it to /etc/hosts; we would simply have received a warning with the ip address. A short sketch of the host entry and directory preparation appears just below.

When we first tried this we got an error during the initialization phase that opc did not own the /u01/app/oracle directory, so we had to create an agent directory and change ownership. Fortunately, we could easily resubmit and enter a new directory without having to reenter all of the other information. The deployment takes a while because Enterprise Manager needs to upload the agent binaries, extract them, and install them. The process is updated with status so that you can see the progress and restart when errors happen. When we changed the ownership, the installation failed at a later step stating that opc did not have permission to add the agent to the inventory. We corrected this by installing as oracle and setting the /u01/app/oracle/agent directory to be owned by oracle. When we commit the ip address or host name as well as the ssh credentials, we can track progress as the management server deploys the agent. We get to a point where we note that the oracle user does not have ssh capabilities and we will need to run some steps manually from the opc account. At this point we should have an Enterprise Manager connection to a cloud host. To get this working from my VirtualBox behind my AT&T Uverse wireless router I first had to configure a route on my broadband connection and set the ip address of the Enterprise Manager VirtualBox image to a static ip address. This allows the cloud instance to talk back to the OMS and store data in the OMR. The next step is to discover the database instances. This is done by going through a guided discovery on the host that we just provisioned.
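For reference, a minimal sketch of the manual preparation described above; the ip address and host names are placeholders for the values found in the cloud target's /etc/hosts file, and the emctl path assumes the default agent_inst layout under the agent base directory:

# on the Enterprise Manager (OMS) host - let the console resolve the target by name
echo "<cloud_ip>  <fully_qualified_host_name> <short_host_name>" | sudo tee -a /etc/hosts

# on the cloud target, connected as opc - prepare the agent directory for the oracle user
sudo mkdir -p /u01/app/oracle/agent
sudo chown oracle /u01/app/oracle/agent

# on the cloud target once the deployment finishes - confirm the agent is talking to the OMS
sudo su - oracle
/u01/app/oracle/agent/agent_inst/bin/emctl status agent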
It took a few minutes to sync up with the OMS but we could verify this with the emctl status agent command on the target host. We add the target manually using the guided discovery and select database services to look for on the target. At this point we should have a database, listener, and host connected to our single pane of management glass. We should see a local database (em12c) and a cloud based database (prs12cHP). We can look at the host characteristics as well as dive into sql monitoring, database performance, and database management like backup and restore options or adding users to the repository. We could add a Java Cloud Service as well as link these two systems together and trace a web page request down to a sql read and look at what the longest latency component is. We can figure out if the network, java memory allocation, or database disk is causing the slowest response. We can also look at sql tuning recommendations to get suggestions on changing our sql code or execution plans using the AWR report and sql tuning utilities in Enterprise Manager.

In summary, we can connect to an on premise server as well as a cloud server. We can't connect to an Amazon RDS instance because we don't get file system level access to push a client to or a root user to change the agent permissions. We do get this with IaaS on Oracle, Compute servers on Azure, and EC2 on Amazon. We also get this with PaaS on Oracle and potentially even Force.com from SalesForce. No one gives you this ability with SaaS. It is assumed that you will take the SaaS solution as is and not need to look under the covers. Having a single pane of glass for monitoring and provisioning services is important. The tool should do more than tell you how full a disk is or how much of a cpu is loaded or available. It should dive into the application and let you look at where bottlenecks are and help troubleshoot issues. We could spend weeks diving into Enterprise Manager and the different management packs but we are on a journey to look at PaaS options from Amazon, Microsoft, and Oracle.


PaaS

Managing servers and instances in the cloud

Managing servers and instances has been an ongoing issue since the introduction of the first computer. Recently, with the advent of virtualization, the idea of a management console to control what processors are running what services and what storage is allocated to what operating system has gained popularity. Many people are familiar with VMWare VSphere where you get a view of processors. We get a view of a server and can see virtual images deployed on this server. We can see how well the resources (memory, cpu, and disk) are being utilized. We can allocate more or less resources since this is a dynamic allocation and make sure that we are not over allocating resources and wasting them or under allocating them and causing applications to run slower.

In this example we can see that we have two processors, 2 GB of memory, and just under 300 GB of disk on this computer. We have five virtual machines running on this computer and can dive into each operating system and look at what operating system is installed and how the limited resources are allocated and utilized. What we can't see is what applications are installed and how the applications are running. For example, is the Windows Home Server 2011 running an Apache Web Server and how many hits did the web server get in the past four days? Monitoring tools beg the question of what you are monitoring. If you are managing limited resources and making sure that you have not over or under allocated services, tools like VSphere are excellent tools. Unfortunately, you will need other tools to dive deeper. EMC, for example, has a storage manager that lets you look not only at a logical unit level but at a controller and disk level. It understands VMWare and lets you look at how disks are related to virtualization engines and how they are consuming resources.

Again, this is a very good tool to look at how well a disk is performing, how well data is laid out across spindles, and how well your data is being transmitted across the network between disk and server. We can see hot spots. We can see disks that are over and under utilized. We can manage a scarce resource and make sure that it is properly utilized.

When we talk about monitoring we need to shift our thought process. Yes, it is important to manage compute, memory, and storage resources but it is also important to realize that these resources are commodities. If we run low, we get more. If we use too much we are wasting resources. We should be able to automate allocation of resources and size up or size down resources without manual monitoring. What we are really interested in is how well our company is running. If we are a university we might be interested in the latency of delivering online video classes. We might be interested in how many classes are being added to a student schedule during registration. If we are a ticket retailer we might be interested in how many tickets were requested and paid for on a minute by minute basis. Note that we are not talking about how well a disk drive is allocated or if we have enough processors allocated to a virtual machine; we are talking in business terms. We are looking at tying revenue generating services back to computer resources and trying to figure out what is causing a problem. In the online video classroom example, we might have our processors allocated properly, storage tuned to the last IOP, and memory allocated to buffer data and reduce disk reads.
If we are on the same network as the athletic department, our basketball team makes it to the elite eight during March Madness, and the athletic department live streams the game on the same network as our classroom servers, our classes will be offline due to the demand to watch the basketball game. Tools from EMC and VMware will show that everything is working fine and life is good. Meanwhile the help desk is getting calls from students off campus who can't access their assignments during midterms and whose Thursday class is not available. What we need is a monitoring system that can look at systems and incorporate more than just processor and disk. What we need is a tool that can look at systems and services and not just resources. We would like to look at the video distribution system and be able to dive into the disk, network, or processor and see what the bottleneck is and fix it quickly.

Oracle released a tool years ago called Enterprise Manager. The tool started out as a database monitoring tool that allowed you to dive into sql calls and figure out why they were taking longer than necessary. With acquisitions of companies like BEA and Sun Microsystems the tool expanded to look at how Java was performing inside a WebLogic server and how the disk drives serving up requests for the database and WebLogic server were performing. Acquisitions of companies like JD Edwards and PeopleSoft drove the monitoring tools in the opposite direction, and screens showing how many purchase orders were being processed on an hourly basis were suddenly available. You could look at what was the bottleneck in closing your books for the end of month reconciliation. Was it a manual process waiting on a report to drop into a directory or was it a sql statement that was taking minutes rather than seconds to complete? You could start looking at a process like purchase orders and dive into a database to see if a table was reaching storage limits. You could also figure out that someone recently patched the database, which caused an index to miss a newly created column; searches now go against this column, so select statements are doing a full table scan rather than using an index to report answers quicker. Adding more storage in this case will be a waste of time. Yes, we are running out of storage on a table but the real issue is we need to re-index the database or execute a new sql execution plan. Below is a screen shot of how well a database is performing with links to look at all the sub-components of the database.

Books have been written on Enterprise Manager. We are not going to cover everything in this blog to make you an expert on the subject.

- Expert Oracle Enterprise Manager 12c
- Oracle Enterprise Manager Cloud Control 12c Deep Dive
- Oracle Enterprise Manager 12c Administration Cookbook
- Oracle Enterprise Manager 12c Command-Line Interface
- Managing IaaS and DBaaS Clouds with Oracle Enterprise Manager Cloud Control 12c
- Oracle Enterprise Manager Cloud Control 12c: Managing Data Center Chaos

There are also a number of blogs related to Enterprise Manager:

- Oracle OEM Blog
- Oracle Press Blog
- Gokhan Atil's Blog
- Rob Zoeteweij's Blog
- David Marco's Blog
- IOUG Enterprise Manager Blog

This is a partial list of blogs returned by a Google search. I am sure I missed a few. Note that the list of books and blogs is not a short list.
There are classes offered by Oracle University that you can take virtually or in a classroom (both cost money):

- Using Oracle Enterprise Manager Cloud Control 12c Ed 2
- Oracle Enterprise Manager Cloud Control 12c: Advanced Configuration Workshop
- Oracle Enterprise Manager Cloud Control 12c: Install & Upgrade
- Oracle Enterprise Manager Cloud Control 12c: Overview Bundle
- Oracle Enterprise Manager Cloud Control 12c: Management Bundle

The way that Oracle Enterprise Manager is paid for is simple. The base system is free and you pay for the options that you want to use. Unfortunately, the Technology Price Guide is not very clear as to what is and is not Enterprise Manager and what is an option on the database. For example, on page 7, most of the management packs are listed. If you want diagnostics for the database you will need to license your database at $7,500 per processor, not Enterprise Manager. You can license at $150 per named user, but the licensing metrics for your database need to match the licensing for your management pack. You could have a two processor license for production and a 25 named user license for development and testing, so you will need to blend these licenses into Enterprise Manager with the management packs. Diagnostics is specifically confusing because you enable or disable this feature in Enterprise Manager and not in the database. The telemetry data is being collected for the database but the reporting on the results of the analysis is not being done in the database. You could turn on the reporting in Enterprise Manager without involving the DBA, thus incurring an additional license fee that you had not paid for. There is no license key or email that is sent to Oracle saying that you enabled the license; it is a simple checkbox in Enterprise Manager that says turn on diagnostic reporting. In recent versions a warning screen pops up telling you that this is not a free feature. In OEM 10g the feature was turned on by default and you had to turn it off. This has changed in recent releases. If you try to turn this feature on when connecting to an Enterprise Edition in the Oracle Public Cloud you will get a feature not available message. You need to go with the High Performance or Extreme Performance edition of the database to get the diagnostics enabled.

There are also management packs for Oracle Applications and the pricing for these products can be found in the Oracle Applications Price List. You need to search for the word "packs" to find the price of the management packs in this list. You can get a list of all the management packs from the Oracle Tech Network page for Enterprise Manager.

It is important to note that the Enterprise Manager that runs in your data center monitoring your servers and Oracle hardware and software products is the same tool that you can use to monitor and manage PaaS and IaaS resources in the Oracle Public Cloud. You can connect to the instance in the cloud using ssh and read the telemetry from the cloud instance as if it were installed on one of your servers. You can use extensions to the latest version of Enterprise Manager, 13c, to clone a pluggable database instance from your on site installation to a cloud instance. You can also set up reporting and self service requests to have end users ask for a new service to be provisioned either on site or in the cloud. Below is a screen shot of how to do this for a database.
We could do something similar for a WebLogic server, an Apache Web server, a PeopleSoft instance for dev/test, or any layer of the Oracle stack. In summary, selection of a management tool is important. Tools are good to understand and properly use. At some point you need to step back and ask what questions you need answers to. Am I diving too deep trying to optimize something that is not worth deep analysis? Could I automate this and not have to monitor it at all? If I run out of processing power does it make sense to automatically scale up the number of processors? Should I scale out by spinning up more web servers? Do I need to re-architect my network topology to isolate disk traffic from client traffic? If I generate a report who will consume the results? Is the report for someone in IT? Purchasing? The process owner? Is it a technology or financial report? Products like Enterprise Manager allow you to generate all of these reports using different management extensions. My suggestion is to look at some of the introductory videos on the Oracle Tech Network to get an introduction to the problem that you are trying to solve, then figure out how much it will cost to measure what is important to you.


PaaS

database alternatives

One of the key questions that I get asked on a regular basis is to justify the cost of some product. Why not use freeware? Why not put things together and use free stuff? When I worked at Texas A&M and Rice University we first looked at public domain software. We heavily used the Apache web server, Tomcat, MySQL, Postgres, Linux, and BSD. These applications worked up to a point. Yes, you can spin up one Apache web server on one server. Yes, you can have one Apache web server listen on multiple IP addresses and host multiple web sites. The issue typically is not how many web servers you can handle but how many clients you can answer. Easily 90% of the web servers could handle the load that they saw on a regular basis. We spent 80% of our time on the 10% that could not handle the load. Not all of the web servers could handle the functionality. For example, a student registration system needs to keep a shopping cart of classes selected, and you need to level up to an Apache Tomcat server to persistently keep this data and database connections live. If you use a plain web server you need to store all transactions in the database: all of the classes selected and all of the fees associated with the class. Every interaction with the web server causes multiple connections with the database server. Doing this drives the number of processors needed by the database, thus driving up the cost of the hardware and software license.

If we use an application server that can handle caching of data, we can keep a list of available classes on the application server and only have to go back to the database server for transactions. When a student selects a class, it takes it out of inventory and puts it in their class schedule for the next year. The same is true for on-line shopping, purchasing tickets to a play or airline, or drafting a fantasy football team. Years ago ESPN ran a March Madness contest on-line. They presented your selections with an Apache web server and every team selection required an interaction with their database on the back end. The system operated miserably and it took hours to select all rounds to fill out your bracket. They updated the server with Javascript and a Tomcat server and allowed you to fill out all of round one in your browser. Once you finished the first round you submitted your selections and were presented with a round two based on your first round selections. They later put this on WebLogic and put all of the round selections in Java code on the WebLogic server. The single interaction with the database became submission of your complete bracket. They went from thousands of interactions with a database to a single interaction per submission.

We can have similar architecture discussions at the database layer as well. If I am looking at a simple table lookup, why pay for a robust database like Oracle 12c? Why not use something like Azure Table Storage Services and do a simple select statement from a file store? Why not put this in a free version of Oracle in APEX on the web and define a REST api to pull the data based on a simple or potentially more complex select statement? Again, 90% of the problems can be solved with simple solutions. Simple table lookups like translating a part name to a price can be done with Excel, MySQL, APEX, JSON processing, or REST apis. The difficulty comes up with the remaining 10%.
How do I correlate multiple tables together to figure out the price of an item based on cost of inventory, cost of shipping, electrical costs, compensation costs for contractors and sales people, and other factors that determine profitability and pricing? How do I do a shortest routing algorithm for a trucking system based on traffic, customer orders, inventory in a warehouse, the size of a truck, and the salary of the driver and loading dock personnel? For things like this you need a more complex database that can handle multiple table joins, spatial data, and pulling in road conditions and traffic patterns from external sources. Products like IBM DB2, Oracle Database, and Microsoft SQL Server can address some of these issues.

We also need to look at recovery and restoration time. When a Postgres server crashes, how long does it take to recover the database and get it back online? Can I fail over to a secondary parallel server because downtime is lost revenue or lost sales? If you go to Home Depot to order plumbing parts and their site goes down, how long does it take to go to the Ace or Lowes web site, order the same part, and have it delivered by the same delivery truck to your home or office? Keeping inventory, order entry, and web services up becomes more than just answering a query. It becomes a mission critical service that cannot go down for more than a few seconds. Services like Data Guard, Golden Gate, and Real Application Clusters are required to keep services up and active. MySQL, MongoDB, Amazon Aurora, and other new entry level database technologies can handle simple requests but take minutes or hours to recover information for a database. Failing over through storage to another site is typically not an answer in this case. It takes minutes or hours to recover and restart a moderate database of 20 TB or larger. First the data replication needs to finish, then the database needs to be booted at a secondary site, and it needs to maintain consistency in the data as it comes back up. The application server then needs to connect to the new service and recommit requests that came in during and since the system failure. As this is happening, customers are opening a new browser tab and going to your competition to find the same part on another site.

In summary, it takes more than just getting a bigger and faster application server or database. Moving the services to the cloud isn't necessarily the answer. You need to make sure that you move the two components together the majority of the time. Look at your application and ask where you spend most of your time. Is it tuning sql statements? Is it writing new queries to answer business questions? Is it optimizing your disk layout to get tables to the database faster? Take a step back and ask why the database is pounding the disk so hard. Can I cache this data in the database by adding a little more memory to the disk controller or database server? Can I cache the data at the application server by adding more memory there and keep from asking the database for the same information over and over again? In the next few days we are going to look at database options and database monitoring. We are going to look at some of these tools and refer back to the bigger picture. Yes, we can tune the storage to deliver all of the bits at the highest rate possible. Our question will not be how to do this but whether we should be doing this.
Would something like an Exadata or the in-memory option allow us to transfer less data across the storage network and get us answers faster? Would adding memory somewhere allow us to buffer more data and reduce the number of database requests, which reduces the amount of data needed from the disk?


PaaS

database management

Today we are going to look at managing an Oracle database. We are going to start with a 12c database that we created in the Oracle Public Cloud. We selected database as a service (as opposed to virtual image), monthly billing, 12c, and the Enterprise Edition High Performance edition. We accepted the defaults for the table size so that we can figure out how to extend the table size and selected no backups rather than starting RMAN for daily incrementals or cloud object storage for weekly full backups. We basically have four options for managing a database. If we have a small number of databases we might look at using the sqlplus sysdba command line access and grind through administration. We also have a database monitor that is installed by default with the database cloud service. We can dive into this database through the monitor and look at long running queries, tablespace sizes, and generic utilization. We can also connect with sql developer and look at the new DBA interfaces that were added in the latest release in early 2016. The fourth and final way of administering is to look at commercial management tools like Oracle Enterprise Manager (OEM) or other tools that aggregate multiple systems and servers and give you exposure beyond just the database. These commercial tools allow you to look at the layer that you are most interested in. You can get a PeopleSoft Management Pack for OEM that allows you to look at purchase order flow or payroll requests. You can get diagnostics and tuning packs for the application server and database that allow you to look at what part of the PeopleSoft implementation is taking the longest. Is it the network connection? Is it a poorly tuned Java Virtual Machine that is memory thrashing? Is it a sql statement that is waiting on a lock? Is it a storage spindle that is getting hammered from another application? Is it a runaway process on your database server that is consuming all of the resources? All of these questions can be answered with a monitoring tool if you not only know how to use it but also know what is available for free and what you need to purchase to get the richer and more valuable information.

To get to the database monitor we go to the cloud services console (which changed over the weekend so it looks a little different), click on database, click on Service Console, and click on the database name. If we click on the dbaas_monitor menu item in the hamburger menu to the right of the service name it might fail to connect the first time. It will take the ip address of the database and try to open https://ip address/dbaas_monitor. We first need to open up port 443 to be able to communicate to this service. To get to the network rules we need to go to the Compute Service console, click on the Network tab, and change the proper port number for our server prs12cHP. If we hover over the labels on the left we see what ports we are looking for. We are specifically interested in the https protocol. If we click on the hamburger menu next to this line item we can Update the security list, which pops up a new window. To enable this protocol we enable the service and click the Update button. Once we do this we can retry the dbaas_monitor web page (a quick command line check is sketched below). We should expect a security exception the first time and need to add an exception. We log in as dbaas_monitor with the password that we entered in the bottom left of the screen for the system passwords when we created the database.
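As a quick sanity check that the https rule took effect, something like the following from a terminal should now get a response; the ip address is a placeholder, and the -k flag skips certificate validation since the monitor presents a self-signed certificate:

curl -k -I https://<database_public_ip>/dbaas_monitor
# any https response, even a 401 asking for credentials, confirms that port 443 is open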
At this point we can look at cpu utilization, table space usage, whether the database is running, and all of the other monitoring capabilities. Below are the screen shots for the listener and the table sizes and storage by pluggable database. We can look a little deeper at things like alerts, wait times, and real time sql monitoring. These are all available through the command line, but providing a service like this allows junior database administrators to look at things quickly and easily. The biggest drawback to this system is that you get a short snapshot and not a long term historic archive of this data. If we use Enterprise Manager, which we will look at in a later blog, we collect the data from a central site into a local repository and can look back at months-old data rather than just live data or data from the past few hours.

In summary, if we use platform as a service, we get tooling and reporting integrated into the service rather than having to spin these up or look at everything from the command line as is done with infrastructure as a service. We get other features but we are diving into database monitoring this week. We briefly touched on database monitoring through what was historically called dbmonitor and is moving towards dbaas_monitor or a central Enterprise Manager pane of glass for database services in our data center and in the cloud. One of the key differentials between Oracle Database as a Service and Amazon RDS is database monitoring. We will look at database monitoring for Amazon RDS later this week and notice there are significant differences.


PaaS

Database in Microsoft Azure

Today we are going to look at what it takes to install Oracle Database Enterprise Edition 12c in Microsoft Azure. We had previously looked at deploying Application Express in Azure. The steps to deploy Enterprise Edition are almost the same. We start with the same process by logging into the portal, clicking on New, searching for Oracle, and looking for the enterprise edition of the database. In this example we are going to select Enterprise Edition 12c. The two links at the bottom take you to the licensing and privacy statements from the Oracle website. Note that the license is not included for this edition of the database and you need to adhere to the licensing restrictions of a perpetual license for a cloud deployment. If we refer back to our calculations for a perpetual license in AWS, amortizing the database license over four years brings the cost to $3,720/month for a four core server as recommended by Microsoft. Note that we can go with a smaller core count and smaller memory count, unlike with Amazon. AWS restricts us to a minimum core count for the Oracle database but Azure allows you to go below the suggested minimums to a system that is unusable. It is impossible to run the database on a single core with 1 GB of RAM but the option is presented to you.

From the previous screen, we click Create to start the deployment. We can only deploy into a Classic Virtual Machine instance. The first things that we need to define are the server name, the username to log in as, and the password or ssh keys for the username. We can also define a new storage group or pull from an existing storage group. For our test either works. When we look at the shapes suggested by Microsoft, a D12 Standard shape (4 cores and 28 GB) is the smallest configuration. This comes in at $290/month or roughly $10/day. This is a little more than we want to pay for a simple test system. We can get by with 2 cores and 3.75 GB for a simple experiment. We can do this at $89/month or roughly $3/day with an A2 Standard shape. We select the shape and click Select.

On the next screen we select the storage profile. The first option is Standard or Premium disk. If we select Premium SSD our shape gets resized to D2 Standard at a much higher per month charge. This gives us higher IOPS to storage, which might or might not be required for our deployment. If we default back to Standard to get the lower shape cost, we have the option of locally replicated data, replication between data centers, or read access in a second geo; the price goes from $2.40/100 GB/month to $4.80 to $6.10. We will go for the locally replicated data to minimize cost. We can define a new domain name for this account or accept the default. We can also define a virtual network for this instance, select the subnet, and choose dynamic or static ip address assignment. We are going to accept the defaults for the network. We do need to open port 1521 by adding an endpoint to this instance. If we scroll down on the network screen we can add a port by adding an endpoint. We might or might not want to open up this port. When we do this it opens up the port to the world. We can tunnel through ssh to access port 1521 (a sketch of this appears just below), but for demonstration purposes we are going to open up this port to the world and potentially look at white listing or ip address restrictions to limit access to this instance. We might also want to open port 1158 to see the enterprise manager console and port 80 for Application Express, which is also available in the enterprise edition of the database.
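For reference, a minimal sketch of the ssh tunnel alternative looks like this; the key file, admin username, and public ip address are placeholders for the values you defined when creating the virtual machine:

ssh -i azure_key -L 1521:localhost:1521 azureuser@<azure_public_ip>
# while the tunnel is open, a sql client on your desktop connects to localhost:1521
# and the traffic is forwarded over ssh to port 1521 on the Azure instance, so the
# 1521 endpoint does not need to be exposed to the world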
We do have the option of monitoring extensions to look at how things are performing. We are going to skip this option for our experiment but it is interesting to note that you do have additional options for monitoring. We are not going to explore the diagnostics storage or availability sets because they really don't apply to the database. They are more concerned with the operating system and do not extend into the database. At this point we are ready to launch the instance so we click Ok. We do get one final review before we provision the instance with the database installed. When we click Ok we get a message that the instance is deploying. We can look at more detail by clicking on the bell icon at the top and drilling down into the deployment detail.

It is important to note that the database binaries are installed but the database is not configured. There is no listener running. The ORACLE_SID has not been set. We need to run dbca to create a database instance. Other tutorials on installing an Oracle Database on Azure can be found at

- Linux install with DB download
- Oracle Database on Windows (binaries not available anymore)

To create a database at this point we need to run the dbca command. When I first tried to execute this command I got a strange error in that the system asked for a password then cleared the screen. This is a known issue relating to line wrap and XTERM configurations. It can be fixed by going into the putty settings and turning off line wrap. If we look at the command line needed to create a database with dbca we notice that we first need -silent to keep the system from using a default X-Window screen to walk you through the installation. We do not have the X-Window system enabled or the ports configured so we need to install the database from the command line. This is done with the -silent option. The second option is -createDatabase. This tells dbca to create a new database. We also need to define a template to use as the foundation. Fortunately we have pre-defined templates in the /u01/app/oracle/product/12.1.0/dbhome_1/assistants/dbca/templates directory. We will be using the General_Purpose.dbc template. We could use the Data_Warehouse.dbc template or create a new one with the New_Database.dbt template. We also need to define the ORACLE_SID and character set with the -gdbName, -sid, and -characterSet parameters. We finally wrap up the command options with -responseFile set to NO_VALUE. The entire command looks like

dbca -silent -createDatabase -templateName General_Purpose.dbc -gdbname orcl -sid orcl -responseFile NO_VALUE -characterSet AL32UTF8 -memoryPercentage 30 -emConfiguration LOCAL

This will create a database with ORACLE_SID set to orcl. We add a couple of other parameters to configure enterprise manager to be local rather than a central enterprise manager agent and to limit the memory that we will use to 30% of the memory on the system. The database creation agent will configure the database. This step will take 10-15 minutes to get to 100%. Some tutorials on how to use dbca in silent mode can be found at

- VitalSoftTech
- DBA Expert - Charles Kim
- orafaq.com - question and answer
- Pierre Forstmann's blog

There are really no videos on youtube showing an install. In our example we should have included the -pdbName option to create an initial pluggable database as part of our database installation (a sketch of that variant follows below). Once we see the 100%, the database is complete. We then need to set our ORACLE_SID, ORACLE_HOME, and PATH, and start the listener so that we can connect to the database.
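For reference, a hedged sketch of the container database variant mentioned above would look something like this; the pluggable database name and admin password are placeholders, and the extra flags are the standard dbca silent-mode options for creating a container database with one pluggable database:

dbca -silent -createDatabase -templateName General_Purpose.dbc \
  -gdbname orcl -sid orcl -responseFile NO_VALUE -characterSet AL32UTF8 \
  -memoryPercentage 30 -emConfiguration LOCAL \
  -createAsContainerDatabase true -numberOfPDBs 1 \
  -pdbName pdb1 -pdbAdminPassword <pdb_admin_password>
# same command as above plus the container and pluggable database options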
This is done with the commands

. oraenv
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1
export PATH=$PATH:$ORACLE_HOME/bin
lsnrctl start

From here we can look at the header information to verify that we installed a 12c Enterprise Edition and look at the location of the data files with the following commands

select * from v$version;
select con_id, name from v$datafile order by 1;

We can connect with SQL Developer because we opened up port 1521 (a sketch of a command line connection appears at the end of this post).

In summary, we can deploy Oracle Database 12c into the Microsoft Azure cloud. We get a partial install when we provision the database from the Marketplace. We still need to go through the dbca configuration as well as spinning up the listener and opening up the right ports for the database. The solution is not PaaS but database on IaaS. We cannot size up the database with a single command. We do not get patching or automated backup; in fact we have not even set up backups at this point. This is similar to the Amazon AWS installation in EC2 but falls short of the database as a service delivered as PaaS in the Oracle Public Cloud. Pricing has the same considerations as the Database on AWS EC2 discussion we had yesterday, with the only difference being the price for the compute and storage instance. We did not need to look at the online calculator because Microsoft does a very good job of presenting pricing options when you are configuring the instance. Again, we are not trying to say that one implementation is better or worse than the other but to provide information so that you can decide your tradeoffs when selecting one cloud vendor over another.
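As a final check before leaving the Azure example, a client connection from a desktop over the newly opened port would look something like the following sketch; the public ip address and password are placeholders, and the service name assumes the orcl database created above:

sqlplus system/<password>@//<azure_public_ip>:1521/orcl
# run the same two queries shown above to confirm the version and data file locations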


PaaS

Database in Amazon EC2

Today we are going to look at what it takes to get a 12c database instance up and running in Amazon EC2. Note that this is different from our previous posts on getting Standard Edition running on Amazon and running Enterprise Edition on Amazon RDS. We are going to take the traditional approach as if we were installing the database on a virtual image like VMWare, HyperV, or OracleVM. The approach is to take IaaS and layer the database upon it. There are a few options on how to create the database instance. We can load everything from scratch, we can load a pre-defined AMI, we can create a golden image and clone it, we can do a physical to virtual conversion and import the instance into the cloud, or we can create a Chef recipe and automate everything. In this blog we are going to skip loading everything from scratch because it is very cumbersome and time consuming. You basically would have to load the operating system, patch the operating system, create users and groups, download the binaries, unpack the binaries, manage the firewall, and manage the cloud port access rights. Each of these steps takes 5-30 minutes so the total time to get the install done would be 2-3 hours. Note that this is still much better than purchasing hardware, putting it in a data center, loading the operating system, and following all the same steps. We are also going to skip the golden image and cloning option since this is basically loading everything from scratch then cloning an instance. We will look at cloning a physical machine and importing it into the cloud in a later blog. In this blog we are going to look at selecting a pre-defined AMI and loading it.

One of the benefits of the Marketplace model is that you get a pre-defined and pre-configured installation of a software package. Oracle provides the bundle for Amazon in the form of an AMI. For these instances you need to own your own perpetual license. It is important to understand the licensing implications and how Oracle defines licensing for AWS. Authorized Cloud Environment instances with 4 or fewer virtual cores are counted as 1 socket, which is considered equivalent to a processor license. For Authorized Cloud Environment instances with more than 4 virtual cores, every 4 virtual cores used (rounded up to the closest multiple of 4) equate to a licensing requirement of 1 socket. This is true for the Standard Edition license. For the Enterprise Edition license the assumption is that the cloud processor is an x86 chip set, so a processor license is required for every 2 virtual cores. All of the other software like partitioning, diagnostics, tuning, compression, advanced security, etc. also needs to be licensed with the same metric.

If we look at the options for AMIs available we go to the console, click on EC2, and click on Launch Instance. When we search for Oracle we get a wide variety of products like Linux, SOA, and database. If we search for Oracle database we refine the search a little more but get other supplementary products that are not the database but relate to the database. If we search for Oracle database 12c we get six return values. We find two AMIs that look the same but the key difference is that one limits you to 16 cores and the other does not. We can select either one for our tests. If we search the Community AMIs we get back a variety of 11g and 10g installation options but no 12c options.
(Note that the first screen shot is the Standard Edition description; it should be the Enterprise Edition since two are listed.) We are going to use the Commercial Marketplace and select the first 12c database instance. This takes us to a screen that lets us select the processing shape. Note that the smaller instances are not allowed because you need a little memory and a single core does not run the database very well. This is one of the advantages over selecting an operating system ourselves and finding out that we selected too few cores or not enough memory. Our selections are broken down into general purpose, compute optimized, or storage optimized. The key difference is how many cores, how much memory, and dedicated vs generic IOPS to the disk. We could select an m3.xlarge or c3.xlarge and the only difference would be the amount of memory allocated. Network appears to be a little different with the c3.xlarge having less network throughput. We are going to select the m3.xlarge.

Looking at pricing we should be charged $0.351/hour for the EC2 instance, $0.125 per GB-month provisioned or $5/month for our 40 GB of disk, and $0.065 per provisioned IOP-month or $32.50/month. Our total cost of running this m3.xlarge instance will be $395.52/month or $13.18/day. We can compare this to a similarly configured Amazon RDS at $274.29/month. We need to take into account that we will need to purchase two processor licenses of the Enterprise Edition license at $47,500 per processor license. The cost of this license over four years will be $95,000 for the initial license plus 22%, or $20,900 per year, for support. Our four year cost of ownership will be $178,600. Amortizing this over four years brings this cost to $3,720/month. Our all in cost for the basic Enterprise Edition will be $4,116.35/month. If we want to compare this to the DBaaS cost that we covered earlier we also need to add the cost of Transparent Data Encryption so that we can encrypt data in the cloud. This module is included in the Advanced Security option, which is priced at $15,000 per processor license. The four year cost of ownership for this package is $56,400, bringing the additional cost to $1,175/month. We will be spending $5,291.35 for this service with Amazon. (The arithmetic behind these numbers is summarized below.)

If we want to compare this with PaaS we have the option of purchasing the same instance at $1,500/OCPU/month or $3,000/month, which is $2.52/OCPU/hour, for the Enterprise Edition on a Virtual Image. We only need two OCPUs because this provides us with two threads per virtual core where Amazon provides you with one thread per core. We are really looking for thread count and not virtual core count. Four virtual processors in Amazon is equivalent to two OCPUs, so our cost for a virtual image will be $1.5K/OCPU * 2 OCPUs. If we go with Database as a Service we are looking at $3,000/OCPU/month or $6,000/month, which is $5.04/OCPU/hour, for the Enterprise Edition as a service. What we need to rationalize is the extra $708/month for the PaaS service. Do we get enough benefit from having this as a service or do we spend more time and energy up front to pay less each month? If we are going to compare the High Performance edition against the Amazon EC2 edition we have to add in the options that we get with High Performance. There are 13 features that need to be licensed to make the comparison the same. Each of these options costs anywhere from $11,500 per processor to $23,000 per processor.
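To make the license arithmetic explicit, here is the breakdown behind the numbers above, using only the list prices already quoted:

Enterprise Edition: 4 virtual cores / 2 = 2 processor licenses x $47,500 = $95,000
Support: 22% x $95,000 = $20,900 per year x 4 years = $83,600
Four year total: $95,000 + $83,600 = $178,600, or $178,600 / 48 months = $3,720.83/month
Advanced Security: 2 x $15,000 = $30,000 plus 22% support x 4 years = $26,400, for a total of $56,400, or $1,175/month
All in: $395.52 (EC2 compute, storage, and IOPS) + $3,720.83 (license) = $4,116.35/month, and $5,291.35/month once Advanced Security is added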
We saw earlier that each option will add $1,175/month, so adding the three most popular options (partitioning, diagnostics, and tuning) will cost $3,525/month more. The High Performance edition will cost us $2,000/OCPU/month or $4K/month for the virtual image and $4,000/OCPU/month or $8K/month as a service. With High Performance we get ten more options bundled in at $8K/month compared to $8,816.35 with the AWS EC2 option. We also get all of the benefits of PaaS vs IaaS for this feature set.

Once we select our AMI and instance type, we have to configure the options. We can request a spot instance but this is highly discouraged for a database. If you get terminated because your instance is needed you could easily lose data unless you have DataGuard configured and set up for synchronous data commit. We can provision this instance into a virtual private network, which is different from the way it is done in the Oracle cloud. In the Oracle cloud you provision the service then configure the virtual instance. In Amazon EC2 it is done at the same time. You do have the option of provisioning the instance into one of five instance zones but all are located in US East. You can define the administration access roles with the IAM role option. You have to define these prior to provisioning the database. You can also define the operation of this instance from the console. You can stop or terminate the instance when it is shut down as well as prohibit someone from terminating the instance unless they have rights to do so. You can enable CloudWatch (at an additional charge of $7.50/month) to monitor this service and restart it if it fails. We can also add elastic block attachment so that our data can migrate from one instance to another at an additional cost.

We now have to consider the reserved IOPS for our instance when we look at the storage. By default we get 8 GB for the operating system, 50 GB for the data area with 500 provisioned IOPS, and 8 GB for log space. The cost of the reserved IOPS adds $38.75/month. If we were looking at every penny we would also have to look at outbound traffic from the database. If we read all of our 50 GB back it would increase the price of the service by a little over $3/month. Given that this is relatively insignificant we can ignore it, but it was worth looking at with the simple monthly calculator. Our next screen is the tags, which we will not use but which could be used to search if we have a large number of instances. The screen after that defines the open ports for this service. We want to add other ports like 1521 for the database, and 443 and 80 for Application Express. Ports 1158 and 22 were predefined for us to allow for enterprise manager and ssh access. At this point we are ready to launch our instance. We will have 50 GB of table space available and the database will be provisioned and ready for us upon completion.

Some things to note in the provisioning of this instance. We were never asked for an OID for the database. We were never asked for a password associated with the sys, system, or sysdba user account. We were never asked for a password to access the operating system instance. When we click on launch we are asked for an ssh key to access the instance once it is created. When you launch the instance you see a splash screen then a detail screen as the instance is created. You also get an email confirming that you are provisioning an instance from the marketplace. At this point I notice that I provisioned Standard Edition and not Enterprise Edition.
The experience is the same and nothing should change up to this point so we can continue with the SE AMI. Once the instance is created we can look at the instance information and attach to the service via putty or ssh. The ip address that we were assigned was 54.242.14.146. We load the private key and ip address into putty and connect. We first failed with oracle then got an error message with root. Once we connect with ec2-user we are asked if we want to create a database, enter the OID, and enter the sys, system, and dbsnmp passwords. The database creation takes a while (15-30 minutes according to the create script) and you get a percent complete notification as it progresses. At this point we have a database provisioned, the network configured, security through ssh keys to access the instance, and we should be ready to connect to our database with sql developer. In our example it took over an hour to create the database after taking only five minutes to provision the operating system instance. The process stalled at 50% complete and sat there for a very long time. I also had to copy the /home/ec2-user/.ssh/authorized_keys file into the /home/oracle/.ssh directory (after I created it) to allow the oracle user to log in. The ec2-user account has rights to execute as root so you can create this directory, copy the file, and change ownership of the .ssh directory and contents to oracle (a sketch of these commands follows at the end of this post). After you do this you can log in as oracle, the user that owns the processes and directories in /u01, and manage the database.

It is important to note that the database in EC2 provides more features and functions than the Amazon RDS version of the database. Yes, you get automated backup with RDS but it is basically a snapshot to another storage cloud instance. With the EC2 instance you get features like spatial, multi-tenant, and sys access to the database. You also get the option to use RMAN for backups to directories that you can read offsite. You can set up DataGuard and Enterprise Manager. The EC2 feature set is significantly more robust but requires more work to set up and operate.

In summary, we looked at what it takes to provision a database onto Amazon EC2 using a pre-defined AMI. We also looked at the cost of doing this and found out that we can minimally do this at roughly $5.3K/month. When we add features that are typically desired this price grows to $8.8K/month. We first compared this to running DBaaS in a virtual instance in the Oracle Public Cloud at $6K/month (with a $3K/month smaller footprint available) and DBaaS as a service at $8K/month (with a $4K/month smaller footprint available). We talked about the optional packs and packages that are added with the High Performance option and talked about the benefits of PaaS vs IaaS. We did not get into the patching, backup, and restart features provided with PaaS but did touch on them briefly when we went through our instance launch. We also compared this to the Amazon RDS instance in features and functions, which comes in at about a hundred dollars per month cheaper. The bulk of the cost is the database license and not the compute or storage configuration. It is important to note that the cost of the database perpetual license is still being paid whether you are running the service or not. With PaaS you do get the option of keeping the data active in cloud storage attached to a compute engine that is running, but you can turn off the database license on an hourly or monthly basis to save money if this fits your usage model of a database service.
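For reference, a minimal sketch of the key copy described above looks like this; it is run as ec2-user, which can execute commands as root, and the chmod steps are the standard ssh permission requirements:

sudo mkdir -p /home/oracle/.ssh
sudo cp /home/ec2-user/.ssh/authorized_keys /home/oracle/.ssh/authorized_keys
sudo chown -R oracle /home/oracle/.ssh
sudo chmod 700 /home/oracle/.ssh
sudo chmod 600 /home/oracle/.ssh/authorized_keys
# after this, ssh -i <private_key> oracle@54.242.14.146 connects directly as the oracle user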


Industry generic technologies

What's New in the Cloud

One thing that the last year has taught me is that things change quickly. One of the biggest challenges is to keep up with this change and figure out what is new and what is not. We are going to take a diversion today and look at changes in the Oracle Public Cloud, then get back to provisioning databases onto different cloud platforms tomorrow. This is important because it helps us define how to differentiate platform as a service from infrastructure as a service with software installed on it. Entries like scale up and scale down of services, DataGuard between two data centers for DBaaS, temporary bursting to larger instances, and various connectors and plug-ins for integration and SOA services are examples of PaaS advantages. Many of these features happen automatically or reduce hundreds of commands that need to be executed to configure a service or integration. Provisioning a database into an IaaS service comes with tradeoffs and sacrifices. It is important to know what added services you are getting when you select PaaS over IaaS. The list of new features helps us understand the added value of PaaS and how we can leverage it.

Let's start with infrastructure and see how things have changed. If you go to the Oracle Public Cloud Documentation you see tabs listing all of the services. For infrastructure this corresponds to compute, storage, and networking. If we click on infrastructure then compute, it takes us to the Compute Documentation. Note that there is a What's New page. At the time of writing this blog, the newest entry is April 2016. The key announcements in this entry include

April 2016
- Oracle Compute Cloud Service — Generally Available (GA) - it was previously controlled availability
- 1 OCPU subscription - previous minimum was 500 OCPUs
- Bursting - non-metered services can short term double the cores allocated and the additional services are billed like a metered service
- Oracle-provided Windows images - Windows 2012 R2
- Oracle-provided Solaris images - Solaris x86 11.3
- Cloning storage volumes using snapshots
- Cloning instances using snapshots
- Resizing storage volumes - storage can be resized while attached to an active instance
- Private Images page moved to a new tab on the web console
- Instance IP addresses now shown on the Instances page
- Improved image upload tool

March 2016
- Changes in the web console for creating storage volumes
- opc-init documentation - startup initialization scripts when a new image is booted

February 2016
- Oracle Network Cloud Service - VPN for Dedicated Compute
- Security IP list as the destination in a security rule created using the web console
- SSH key management actions moved to the Network tab of the web console
- Summary information displayed for each resource in the web console
- Simplified navigation and improved performance in the web console - Orchestration tab changed

There isn't a What's New for storage and networking because it is folded into the compute page. Note that there were a few storage entries (resize of an active instance and cloning storage volumes) and network entries (VPN, security IP list, SSH key management) in the compute page. For platform as a service, there is a What's New for DBaaS that details changes to the database as a service and schema as a service options.
May 2016
- Oracle Data Guard available - database creation and replication between data centers
- Backup and recovery available through the console - previously required ssh access
- Updated version of Oracle REST Data Services
- Oracle GlassFish Server removed - services now available through REST services

April 2016
- Configure a service instance's database as the replication database for Golden Gate

March 2016
- Add an SSH public key to a service instance - allows for multiple ssh keys to an instance
- Jan 2016 PSU integrated into base image for single-instance databases
- Jan 2016 bundle patch integrated into base image for Oracle RAC databases

February 2016
- Selectable database character set and national character set during instance creation
- Jan 2016 PSU available for patching

January 2016
- 2 TB (terabyte) storage volumes now supported
- Ability to create "temporary" storage volumes using Oracle Compute Cloud Service - storage can be short term added and removed as needed

In the Application Development area there is a What's New for the Application Container Service

May 2016
- New Command-Line Interface
- New utilities for JavaScript and Node packaging and dependency management
- New deployment configurations for Java-based applications that target Oracle Application Container Cloud Service
- A new Oracle Developer Cloud Service sample project

April 2016
- Node.js 0.10.x, 0.12.x
- Oracle Linux 6.6
- Oracle Java SE 7, 8

Developer Cloud Service

May 2016
- Deploy to Oracle Application Container Cloud Service instances
- Snippets support
- New Member dialog
- Home tab remembers your last opened child tab
- Upload artifacts to the project's Maven repository from the Code tab
- View the dependency information for Gradle builds
- The Code button in the Commits view displays files of the current path
- More pre-defined standard search queries added in the Merge Request tab
- Audit Log in the Job Details page
- Build is triggered on push to Git repository
- Deploy to Oracle Java Cloud Service using Oracle WebLogic REST APIs
- Lock a Git repository branch
- Restrict push and merge actions on a protected branch
- HipChat Webhook support

Java Cloud Service

May 2016
- Manage Oracle platform services from a command line interface (CLI)
- Create and manage access rules
- Create service instances that use database deployments with cloud-only backups
- Flexible usage changes to Oracle Java Cloud Service non-metered subscriptions - additional processors can be short term allocated and billed on a metered basis

April 2016
- Create WebLogic Server 12.2.1 service instances
- Provision service instances with a domain partition
- Create service instances that use Oracle Real Application Clusters (RAC) databases
- New patches are available: WebLogic Server, Java Developer Kit

March 2016
- Manage SSH access for service instances
- Add a second load balancer to a service instance

Mobile Cloud Service

May 2016
- Location Platform API
- Microsoft Azure Active Directory authentication
- Export and import artifacts across MCS instances
- OAuth and JWT token policies for REST connectors

April 2016
- Facebook credentials or their corporate single sign-on credentials
- JavaScript SDK has been re-tooled to specifically support browser-based mobile apps
- Cordova SDK supports hybrid development on the Cordova framework

For Content and Collaboration Services

Process Cloud Service

April 2016
- New Process Editor
- New Data Association editor
- Transformation editor
- Business Indicator metrics
- Business Analytics dashboards
- Outbound REST Connector editor
- Document-Initiated Process
- Web Service Message Protection
- Security Certificates
- New REST APIs
- Workspace Enhancements
- SSO and Authentication
- Web Form Snapshots
Snapshots Business Objects from JSON instanceFor the Integration Cloud ServiceIntegration Cloud ServiceApril 2016Orchestration support - BPEL Process integrationOracle Sales Cloud Adapter - REST APIs and interface catalogREST Adapter enhancementsSAP Adapter - inbound integration supportMicrosoft SQL Server Adapter - inbound integration supportFile Adapter - inbound integration supportJava Messaging Server Adapter - outbound integration supportDocuSign Adapter - outbound integration supportSuccessFactors Adapter - outbound integration supportServiceNow Adapter - outbound integration supportOracle Field Service Adapter - inbound and outbound integration supportAdapter PortalSearch improvementsMapper visual enhancementsExecution Agent (on-premises Oracle Integration Cloud Service)March 2016Adobe eSign Adapter - outbound integration supportFile Adapter - outbound integration support (support for 5 MB)Microsoft SQL Server Adapter - outbound integration supportFTP Adapter - secure FTP server supportSAP Adapter - TRFC, QRFC, and error document supportOracle Database adapter - inbound integration supportOracle Siebel Adapter - inbound integration supportSalesforce Adapter - custom WSDL supportREST Adapter - multidimensional, nested array support in JSON documentsScheduler - Delete files upon successful retrieval after an errorLarge payload support - 10 MBSOA Cloud ServiceMay 2016Oracle Enterprise Scheduler is now available as part of Oracle SOA Cloud ServiceThree new tutorialsMarch 2016Scale Oracle SOA Cloud Service NodesNon-Metered Subscriptions Oracle Managed File Service Oracle B2B For Business Analytics the changes areMarch 2016File size limit increased to 50MBVisualize data in Oracle ApplicationsUpdate data sources after uploadNew ways to present data visualizations; Donut charts, Tile views, Text boxesEnhancements to visualizations; Trends, Color management, Thumbnails, Sort data elements, Filter dataQuickly copy report columns with “Save Column As…”Build multiple data modelsUpload data from Excel spreadsheets and OTBI (Oracle Transactional Business Intelligence) data sourcesData Loader deprecatedIntegrate with multiple data sourcesWhitelist safe domainsIndex content and schedule crawlsDownload the public key for remote data connectivityUpdates to the REST APIIn summary, it is important to look at the new services and new announcements. Some of the changes are relatively small and of low impact. Other changes provide new features and functions that might change the way that you can leverage cloud services. These pages are updated monthly while the cloud services are typically updated every other week. It is recommended that you get into a routine schedule of checking the What's New links in the documentation. Unfortunately, there is not a single location to look at all of these updates. This blog is an attempt to aggregate the new features for Iaas and PaaS.


PaaS

Database as a Virtual Image

The question that we are going to dive into this week is what does it really mean to be platform as a service vs infrastructure as a service. Why not go to Amazon and spin up an EC2 instance or search for an Oracle provided AMI on Amazon or a Virtual Image on Azure? What benefit do I get from PaaS? To answer that we need to look at the key differences. Let's look at the two options when you provision a database with the Oracle DBaaS. When you provision a database you have the option of two service levels: Database Cloud Service and Database Cloud Service - Virtual Image. We looked at the provisioning of the cloud service. It provisions a database, creates the network rules, and spins up an instance for us. What happens when we select Virtual Image? The release and version screens are the same. We selected 12c for the release and High Performance for the version. Note that the questions are much simpler. We are not asked about how much storage. We are not asked for a SID or sys password. We are not asked about backup options. We are not given the option of DataGuard, RAC, or GoldenGate. We are only asked to name the instance, pick a compute shape, and provide an ssh public key. This seems much simpler and better. Unfortunately, this isn't true. What happens from here is that a Linux 6.6 instance is created and a tarball is dropped into a staging area. The database is not provisioned. The file system is not prepared. The network ports are not configured and enabled. True, the virtual image creation only takes a few minutes, but all we are doing is provisioning a Linux instance and copying a tarball into a directory. Details on the installation process can be found in the Database Cloud Installation - Virtual Image Documentation. If you look at the detailed information about a system that is being created with a virtual image and a system that is being created as a service there are vast differences. The first key difference is the amount of information displayed. Both instances have the same edition, Enterprise Edition - High Performance. Both will display this in the database as well as in the banner if asked what version the database is. The Service Level is different, with the virtual image displayed as part of the service level. This affects the billing. The virtual image is lower cost because less is done for you.

Database Cloud Service (per OCPU)
                               General Purpose          High-Memory
                               Per Month   Per Hour     Per Month   Per Hour
Standard Edition Service       $600        $1.008       $700        $1.176
Enterprise Edition Service     $3,000      $5.040       $3,100      $5.208
High Performance Service       $4,000      $6.720       $4,100      $6.888
Extreme Performance Service    $5,000      $8.401       $5,100      $8.569

Virtual Image (per OCPU)
                               General Purpose          High-Memory
                               Per Month   Per Hour     Per Month   Per Hour
Standard Edition Service       $400        $0.672       $500        $0.840
Enterprise Edition Service     $1,500      $2.520       $1,600      $2.688
High Performance Service       $2,000      $3.360       $2,100      $3.528
Extreme Performance Service    $3,000      $5.040       $3,100      $5.208

The only other information that we get from the management screen is that the instance consumes 30 GB rather than the 100 GB that the database service instance consumes. Note that the database service instance also has the container name and a connection string for connecting to the database. Both will eventually show an IP address and we should look into the operating system to see the differences. The menu to the right of the instance is also different. If we look at the virtual machine instance we only see ssh access, access rules, and deletion of the instance as options.
The ssh access option allows us to upload the public key or look at the existing public key that is used to access the instance. The access rules take us to a new screen that shows the security rules that have been defined for this instance, which is only ssh and nothing else. If we look at a database as a service instance, the menu is different and allows us to look at things like the DBaaS Monitor, APEX, and the Enterprise Manager monitor, as well as the ssh and access rules. Note that the database as a service instance has a lot more security rules defined, with most of them disabled. We can open up ports 80, 443, 4848, 1158, 5500, and 1521. We don't have to define these rules, just enable them if we are accessing them from a whitelist, an IP address range, or the public internet. Once we connect to both instances we can see that both are running the same kernel: Linux hostname 3.8.13-68.2.2.2.el6uek.x86_64 #2 SMP Fri Jun 19 16:29:40 PDT 2015 x86_64 x86_64 x86_64 GNU/Linux. We can see that the file system is different, with the /u01, /u02, /u03, and /u04 partitions not mounted in the screen shots below. If we look at the installation instructions we see that we have to create the /u01, /u02, /u03, and /u04 disks by hand. These are not created for us. We also need to create a logical volume as well as creating the storage services. Step one is to scale up the service by adding a disk. We need to grow the existing file system by first attaching a logical volume, then laying out and expanding the logical volume that we have. Note that we can exactly mirror our on-premise system at this point. If we put everything into a 1 TB /u01 partition and blend the log files and data files onto one disk (not really recommended) we can do this. To add the /u01 disk we need to scale up the service and add storage. Note that we can only add a raw disk and can not grow the data volume as we can with the database service. Note that this scale up does require a reboot of the service. We have the option of adding one logical unit or a full 1 TB disk and then partitioning it, or we can add the different volumes as different disks. The drawback of doing this is that attached storage is charged at $50/TB/month, so adding four disks that consume 20 GB each will cost $200/month because we are allocated a full 1 TB for each even though we only use 20 GB on each disk. We do not subdivide the disk when it is attached and are charged on a per TB basis and not a per GB basis. To save money it is recommended to allocate a full TB rather than a smaller amount. To improve performance and reliability it is recommended to allocate multiple disks and stripe data across multiple spindles and logical units. This can be done at the logical volume management part of disk management detailed in the documentation on provisioning the virtual image instance. We can look at the logical volume configuration with the lvm pvdisplay, lvm vgdisplay, and lvm lvdisplay commands. These let us look at the physical volume mapping to map physical volumes to logical unit numbers, look at logical volumes for mirroring and striping options, and look at volume group options which get mapped to the data, reco, and fra areas. Once our instance has rebooted we note that we added /dev/xvdc which is 21.5 GB in size. After we format this disk it partitions down to a 20 GB disk as we asked. If we add a second disk we will get /dev/xvdd and can map these two new disks into logical volumes that we can map to /u01 and /u02.
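To make the scale up steps above concrete, here is a minimal sketch of the logical volume work the documentation walks you through, assuming the new 20 GB disk shows up as /dev/xvdc and that we are building the /u02 data area; the volume and mount names are illustrative choices, not anything the installer dictates.

    # Sketch only: /dev/xvdc, the volume names, and the mount point are assumptions
    sudo pvcreate /dev/xvdc                       # mark the new disk as an LVM physical volume
    sudo vgcreate data_vg /dev/xvdc               # create a volume group on it
    sudo lvcreate -l 100%FREE -n data_lv data_vg  # one logical volume using all of the space
    sudo mkfs -t ext4 /dev/data_vg/data_lv        # lay down a file system
    sudo mkdir -p /u02
    sudo mount /dev/data_vg/data_lv /u02          # add an /etc/fstab entry to survive reboots

If you attach a second disk for performance, add both /dev/xvdc and /dev/xvdd to the volume group and pass -i 2 to lvcreate so the logical volume is striped across the two logical units.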
A nicer command for looking at the disk layout is lsblk, which does not require elevated root privileges to run. Once we go through the mapping of the /u01, /u02, /u03, and /u04 disks (the documentation only goes into single disks with no mirroring to mount /u01 and /u02) we can expand the binary bits located in /scratch/db. There are two files in this directory, db12102_bits.tar.gz and db12102_se2bits.tar.gz. These are the enterprise edition and standard edition versions of the database. We are not going to go through the full installation but look at some of the key differences between IaaS with a tarball (or EC2 with an AMI) and a DBaaS installation. The primary delta is that the database is fully configured and ready to run in about an hour with DBaaS. With IaaS we need to create and mount a file system, untar and install the database, configure network ports, define security rules, and write scripts to automatically start the database when the operating system restarts. We lose the menu items in the management page for the DBaaS Monitor, the Enterprise Manager monitor, and the Application Express interface. We also lose the patching options that appear in the DBaaS management screen. We lose the automated backups and the database instance and PDB creation that are done with DBaaS. In summary, the PaaS/DBaaS provisioning is not only a shortcut but also removes manual steps in configuring the service as well as in daily operations. We could have just as easily provisioned a compute service, attached storage, and downloaded the tarball that we want to use from edelivery.oracle.com. The key reasons that we don't want to do this are first pricing and second patching. If we provision a virtual image of database as a service the operating system is ready to accept the tarball and we don't need to install the odbc drivers and other kernel modules. We also get to lease the database on an hourly or monthly basis rather than purchasing a perpetual license to run on our compute instance. Up next, selecting a pre-configured AMI on Amazon and running it in AWS compared to a virtual image on the Oracle Public Cloud.


PaaS

DBaaS for real this time

We have danced around creating a database in the Oracle Public Cloud for almost a week now. We have talked about Schema as a Service, Exadata as a Service, licensing, and the different versions of DBaaS. Today, let's tackle what it takes to actually create a database. It is important to note that the accounts that we are using are metered services accounts. We don't have the option to run as a non-metered service and have to provision the services on an hourly or monthly basis. Unfortunately, we are not going to go through the step by step process of creating a database. There are plenty of other sites that do this well:

- Technetwork Oracle By Example DBaaS Quick Start
- Technetwork Oracle By Example - Creating an Instance
- Oracle Docs - Tutorial on creating a database
- And my personal favorite, Create DBaaS with REST api by Jean-Philippe Pinte

I personally like the Oracle by Example links. Most of the screen shots are out of date and look slightly different if you go through the steps now. For example, the Configure Backup and Recovery screen shots from the first link above show local backup as an option. This option has been removed from the menu. My guess is that a few months from now all of this will be removed and you will be asked for a container that will be automatically created for you rather than having to enter a container that was previously created, as is done now. The critical steps needed to follow these examples are:

1. Get a trial cloud account - instructions on how to do this
2. Log into your cloud account - Account documentation
3. Navigate to the Database Cloud Service console
4. Click the Create Instance button
5. Define the subscription type, billing type, software release, and software edition
6. Configure your instance with a name, description, ssh public key, compute shape, backup mechanism and location, storage size, sys password, SID and PDB name, and optional configurations (like potentially DataGuard, RAC, and GoldenGate)
7. Wait for the instance to be provisioned
8. Connect to the database via ssh using the ssh private key and putty/ssh
9. Optionally open up ports (port 1521 for client connections, port 80 for APEX)
10. Do something productive

The tutorials go through screen shots for all of these steps. You can also watch this on YouTube:

- 11 minute walk through
- 3 minute walk through
- 9 minute walk through
- Creating a RAC instance in Oracle Public Cloud

Things to watch out for when you create a database instance in the Oracle Public Cloud:

- If you configure a backup service on a demo system and increase the database size to anything of size, you will overflow the 500 GB of storage in about three weeks. Things will stop working when you try to create a service.
- All ports are locked down with the exception of ssh. You can use an ssh tunnel to securely connect to localhost:1521 if you tunnel this port. If you are using a demo account you can only open port 1521 to the world. Whitelisting and IP address lists are not supported in the demo accounts.
- Play with SQL Developer connections across the internet. It works just like it does on-premise. The DBA tool has good management interfaces that allow you to do simple administration tasks from the tool.
- Play with Enterprise Manager 13c. It is easy to connect to your database via ssh and add your cloud instance to the OEM console. You can manage it just like an on-premise database. Cloning a PDB to the cloud is trivial. Database backup to the cloud is trivial.
- Play with unplugging and replugging a PDB in 12c. You can clone and unplug from your on-premise system, copy the xml files to the cloud, and plug in the PDB to create a clone in the cloud.
- The longer you let a database run, the smaller your credit will get. If you are playing with a sandbox you can stop a database. This stops the charges for the database (at $3-$5/hour) and you will only get charged for the compute and storage (at $0.10/hour). If you leave a database running for 24 hours you burn through $72-$120 based on your edition selection. You will burn through about $3 in 24 hours if you turn off the database and restart it when you want to jump back into your sandbox. Your data will still be there. That is what you are paying $3 a day for.
- If you are using a demo system, you can extend your evaluation once or twice. There is a button at the top right allowing you to extend your evaluation period. Make sure you do this before time runs out. Once time runs out you need to request another account from another email address.
- If you are playing with an application, make sure that you spin up WebLogic or Tomcat in a Java or Compute instance in the same account. Running an application server on-premise and a database in the cloud will suffer from latency. You are shipping MB/GB across the wire with select statement returns. You are shipping KB/MB to paint part of a screen. It is better to put the latency between the browser and the app server than between the app server and the database server.
- Request an account on Amazon and Azure. The more you play with DBaaS in the Oracle environment the more you will appreciate it. Tasks like creating a RAC cluster are simple. Linking a Java Service to a Database Service is simple. Running a load balancer in front of a Java Service is easy. Play with the differences between IaaS with a database and PaaS DBaaS. There is a world of difference.
- If you run your demo long enough, look at the patch administration. It is worth looking at since this is a major differentiator between Oracle, Amazon, and Azure.

In summary, we didn't go through a tutorial on how to create a database as a service. At this point all of you should have looked at one or two tutorials, one or two videos, and one or two documentation pages. You should have a sample database to move forward with. It does not matter if it is Standard Edition, Enterprise Edition, High Performance, or Extreme Performance. You should have a simple database that we can start to play with. The whole exercise should have taken you about an hour to learn and play and an hour to wait for the service to run to completion. Connect via ssh and run sqlplus as the oracle user. Open up port 1521, download SQL Developer, and connect to your cloud instance. Explore, play, and have fun experimenting. That is the primary reason why we give you a full database account and not a quarter of an account that you can't really do much with.
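Since only ssh is open by default, an ssh tunnel is the quickest way to play with sqlplus or SQL Developer from your desktop without opening port 1521 to the world. A minimal sketch, assuming your private key is in ~/.ssh/cloud_key and the instance address is in $CLOUD_IP (both placeholders), with PDB1 as the sample pluggable name used in these posts:

    # Forward local port 1521 to the database listener on the cloud instance
    ssh -i ~/.ssh/cloud_key -N -L 1521:localhost:1521 opc@$CLOUD_IP &

    # Then connect as if the database were local; substitute the service name
    # and password that your instance reports in the console
    sqlplus system/YourPassword@//localhost:1521/PDB1

Point SQL Developer at localhost:1521 with the same service name and it rides the same tunnel.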


PaaS

technology behind DBaaS

Before we can analyze different use cases we need to first look at a couple of things that enable these use cases. The foundation for most of these use cases is data replication. We need to be able to replicate data from our on-premise database into a cloud database. The first issue is replicating the data and the second is access rights to the data and database that allow you to pull the data into your cloud database. Let's first look at how data is stored in a database. If you use a Linux operating system, this is typically done by splitting information into four categories: ORACLE_HOME, +DATA, +FRA, and +RECO. The binaries that represent the database and all of the database processes go into the ORACLE_HOME or ORACLE_BASE. In the cloud this is dropped into /u01. If you are using a non-RAC configuration the file system is managed with a logical volume manager (LVM), where you stripe multiple disks and mirror or triple mirror data to keep a single disk failure from bringing down your database or data. If you are using a RAC database this goes into ASM. ASM is a disk technology that manages replication and performance. There are a variety of books and websites written on these technologies.

LVM links
- Linux Kernel in a Nutshell
- Linux LVM Beginner's Guide
- OEL LVM

ASM links
- Database Cloud Storage: The Essential Guide to Oracle Automatic Storage Management
- Oracle ASM 12c Pocket Reference Guide: Database Cloud Storage
- Oracle Automatic Storage Management: Under-the-Hood & Practical Deployment Guide

The reason why we go into storage technologies is that we need to know how to manage how and where data is stored in our DBaaS. If we access everything with IaaS and roll out raw compute and storage, we need to know how to scale up storage if we run out of space. With DBaaS this is done with the scale up menu item. We can grow the file system by adding logical units to our instance and grow the space allocated for data storage or data logging. The second file system that we should focus on is the +DATA area. This is where data is stored and where all of our file extents and tables are located. For our Linux cloud database this is auto-provisioned into /u02. In our test system we create a 25 GB data area and get a 20 GB file system in the +DATA area. If we look at the /u02 file system we notice that there is one major directory, /u02/app/oracle/oradata. In oradata there is one directory associated with the ORACLE_SID. In our example we called it ORCL. In this directory we have the control01.dbf, sysaux01.dbf, system01.dbf, temp01.dbf, undotbs01.dbf, and users01.dbf files. These files are where data is stored for the ORCL SID. There is also a PDB1 directory in this file structure. This corresponds to the pluggable database that we called PDB1. The files in this directory correspond to the tables, system, and user information relating to this pluggable database. If we create a second pluggable database, a new directory is created and all of these files are created in that directory. The users01.dbf file (PDB1_users01.dbf in the PDB1 directory) defines all of the users and their access rights. The system01.dbf file defines the tables and system level structures. In a pluggable database the system01 file defines the structures for PDB1 and not the entire database. The temp01.dbf file holds temp data tables and scratch areas. The sysaux01.dbf file contains auxiliary system information, the control area structures, and management information. The undotbs01.dbf file is the undo area that lets us flash back and look at information that was stored in a table three days ago.
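You can see this layout both from the operating system and from inside the database. A small sketch, run as the oracle user on the instance (the ORCL and PDB1 names match the example above; your paths will follow your own SID):

    # From the OS: the container and pluggable data files
    ls -l /u02/app/oracle/oradata/ORCL /u02/app/oracle/oradata/ORCL/PDB1

    # From the database: map each file to its container
    sqlplus / as sysdba <<'EOF'
    column file_name format a60
    select con_id, file_name from cdb_data_files order by con_id, file_name;
    EOF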
Note that there is no undotbs01.dbf file in the pluggable directory because undo is handled at the global level and not at the pluggable layer. Backups are done for the SID and not for each PDB. Tuning of memory and system tunables is done at the SID layer as well. Now that we have looked at the files corresponding to tables and table extents, we can talk about data replication. If you follow the methodology of EMC and NetApp you should be able to replicate the dbf files between two file systems. Products like SnapMirror allow you to block copy any changes that happen to the file to another file system in another data center. This is difficult to do between an on-premise server and a cloud instance. The way that EMC and NetApp do this is in the controller layer. They log write changes to the disk, track what blocks get changed, and communicate the changes to the other controller on the target system. The target system takes these block changes, figures out what actual blocks they correspond to on its disk layout, and updates the blocks as needed. This does not work in a cloud storage instance. We deal at a file layer and not at a track and sector or block layer. The fundamental problem with this data replication mechanism is that you must restart or ingest the new file into the database. The database server does not do well if files change under it because it tends to cache information in memory, and indexes into data get broken if data is moved to another location. This type of replication is good if you have an hour or more for your recovery point objective. If you are looking at replication in minutes you will need to go with something like DataGuard, GoldenGate, or Active DataGuard. DataGuard works similarly to the block change recording but does so at the database layer and not at the file system/block layer. When an update or insert command is executed in the database, these changes are written to the /u04 directory. In our example the +REDO area is allocated 9.8 GB of disk. If we look at our /u04 structure we see that /u04/app/oracle/redo contains the redoXX.log files. With DataGuard we take these redo files, compress them, and transfer them to our target system. The target system takes the redo file, uncompresses it, and applies the changes to the database. You can structure the changes either as physical logging or logical logging. Physical logging translates everything in the database and records the block level changes. Logical logging takes the actual SQL statement and replicates it to the target system. The target system either applies the physical changes to its files or executes the SQL statement on the target database. Physical replication is used more than logical replication because logical has limitations on some statements. For example, blob or file operations can not be translated to the target system because you can't guarantee that the file structure is the same between the two systems. There are a variety of books available on DataGuard. It is also important to note that DataGuard is not available for Standard Edition and Enterprise Edition but only for High Performance Edition and Extreme Performance Edition.

- Oracle Data Guard 11g Handbook
- Oracle Dataguard: Standby Database Failover Handbook
- Creating a Physical Standby Documentation
- Creating a Logical Standby Documentation

GoldenGate is a similar process but there is an intermediary agent that takes the redo log, analyzes it, and translates it for the target system.
This allows us to take data from an Oracle database and replicate it to SQL Server. It also allows us to go in the other direction. SQL Server, for example, is typically used for SCADA or process control systems. The Oracle database is typically used for analytics and heavy duty number crunching on a much larger scale. If we want to look at how our process control systems are operating in relation to our budget, we will want to pull in the data from the process systems and look at how much we spend on each system. We can do this by either selecting data from the SQL Server or replicating the data into a table on the Oracle system. If we are doing complex join statements and pulling data in from multiple tables we would typically want to do this on one system rather than pulling the data across the network multiple times. GoldenGate allows us to pull the data into a local table and perform the complex select statements without having to suffer network latency beyond the initial copy. GoldenGate is a separate product that you must pay for either on-premise or in the cloud. If you are replicating between two Oracle databases you could use Active DataGuard to make this work, and this is available as part of the Extreme Performance Edition of the database. The /u03 area in our file system is where backups are placed. The file system for our sample system shows /u03/app/oracle/fast_recovery_area/ORCL. The ORCL is the ORACLE_SID of our installation. Note that there is no PDB1 area because all of the backup data is handled at the system layer and not at the pluggable layer. The tool used to back up the database is RMAN. There are a variety of books available to help with RMAN as well as an RMAN online tutorial.

- Oracle RMAN for Absolute Beginners
- Oracle Database 12c Oracle RMAN Backup & Recovery
- RMAN Recipes for Oracle Database 12c: A Problem-Solution Approach

It is important to note that RMAN requires system level access to the database. Amazon RDS does not allow you to replicate your data using RMAN but instead uses a volume snapshot and copies this to another zone. The impact of this is that, first, you can not get your data out of Amazon with a backup and you can not copy your changes and data from Amazon RDS to your on-premise system. The second impact is that you can't use Amazon RDS for DataGuard. You don't have sys access into the database, which is required to set up DataGuard, and you don't have access to a file system to drop the redo logs into. To make this available with Amazon you need to deploy the Oracle database into EC2 with S3 storage as the back end. The same is true with Azure. Everything is deployed into raw compute and you have to install the Oracle database on top of the operating system. This is more of an IaaS play and not a PaaS play. You lose patching of the OS and database, automated backups, and automatic restart of the database if something fails. You also need to lay out the file system on your own and select LVM or some other volume management approach to prevent data loss from a single disk corruption. All of this is done for you with PaaS and DBaaS. Oracle does offer a manual process to perform backups without having to dive deep into RMAN technology. If you are making a change to your instance and want a backup copy before you make the change, you can back up your instance manually and not have to wait for the automated backup. You can also change the timing if 2am does not work for your backup window and you need to move it to 4am instead.
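For reference, the kind of RMAN session that the automated backup runs on your behalf looks roughly like the following; a minimal sketch run as the oracle user, not the exact script or settings the service uses:

    # Sketch only: a basic full backup into the fast recovery area on /u03
    rman target / <<'EOF'
    CONFIGURE CONTROLFILE AUTOBACKUP ON;
    BACKUP DATABASE PLUS ARCHIVELOG;
    LIST BACKUP SUMMARY;
    EOF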
We started this conversation talking about growing a table because we ran out of space. With the Amazon and Azure solutions, this must be done manually. You have to attach a new logical unit, map it into the file system, grow the file system, and potentially reboot the operating system. With the Oracle DBaaS we have the option of growing the file system either as a new logical unit, grow the /u02 file system to handle more table spaces, or grow the /u03 file system to handle more backup space. Once we finish our scale up the /u03 file system is no longer 20 GB but 1020 GB in size. The PaaS management console allocates the storage, attaches the storage to the instance, grows the logical volume to fill the additional space, and grows the file system to handle the additional storage. It is important to note that we did not require root privileges to do any of these operations. The DBA or cloud admin can scale up the database and expand table resources. We did not need to involve an operating system administrator. We did not need to request an additional logical unit from the storage admin. We did not need to get a senior DBA to reconfigure the system. All of this can be done either by a junior DBA or an automated script to grow the file system if we run out of space. The only thing missing for the automated script is a monitoring tool to recognize that we are running into a limit. The Oracle Enterprise Manager (OEM) 12c and 13c can do this monitoring and kick off processes if thresholds are crossed. It is important to note that you can not use OEM with Amazon RDS because you don't have root, file system, or system access to the installation which is required to install the OEM agent. In summary, we looked at the file system structure that is required to replicate data between two instances. We talked about how many people use third party disk replication technologies to "snap mirror" between two disk installations and talked about how this does not work when replicating from an on-premise to a cloud instance. We talked about DataGuard and GoldenGate replication to allow us to replicate data to the cloud and to our data center. We looked at some of the advantages of using DBaaS rather than database on IaaS to grow the file system and backup the database. Operations like backup, growing the file system, and adding or removing processors temporarily can be done by a cloud admin or junior DBA. These features required multiple people to make this happen in the past. All of these technologies are needed when we start talking about use cases. Most of the use cases assume that the data and data structures that exist in your on-premise database also exist in the cloud and that you can replicate data to the cloud as well as back from the cloud. If you are going to run a disaster recovery instance in the cloud, you need to be able to copy your changes to the cloud, make the cloud a primary instance, and replicate the changes back to your data center once you bring your database back online. The same is true for development and testing. It is important to be able to attach to both your on-premise database and database provisioned in the cloud and look at the differences between the two configurations.
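The monitoring piece mentioned above does not have to be OEM; even a cron job that watches the mount points can flag when a scale up is needed. A simple sketch (the 85% threshold and the mount list are arbitrary choices):

    #!/bin/bash
    # Warn when the data or backup file systems pass 85% full; purely illustrative
    THRESHOLD=85
    for mount in /u02 /u03; do
      used=$(df -P "$mount" | awk 'NR==2 {gsub("%",""); print $5}')
      if [ "$used" -ge "$THRESHOLD" ]; then
        echo "$mount is ${used}% full - consider scaling up storage"
      fi
    done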


PaaS

DBaaS in Oracle Public Cloud

Before we dive deep into database as a service with Oracle we need to define some terms. We have thrown around concepts like Standard Edition, Enterprise Edition, High Performance Edition, and Extreme Performance Edition. We have talked about concepts like DataGuard, Real Application Clusters, Partitioning, and Compression. Today we will dive a little deeper into these so that we can focus on comparing them running in the Oracle Public Cloud as well as other cloud providers. First, let's tackle Standard Edition (SE) vs Enterprise Edition (EE). Not only is there SE, there is also SE One and SE2. SE2 is new with the 12c release of the database and is the same as SE and SE1 but with different processor and socket restrictions. The Oracle 12c documentation details the differences between the different versions. We will highlight the differences here. Note that you can still store data. The data types do not change between the versions of the database. A select statement that works in SE will work in SE2 and will work in EE. The first big difference between SE and EE is that SE is licensed on a per socket basis and EE is licensed on a per core basis. The base cost of an SE system is $600 per month per processor in the Oracle Public Cloud. Standard Edition is limited to 8 cores in the cloud. If you are purchasing a perpetual license the cost is $17,500 and it can run across two sockets, or single sockets on two systems. SE2 comes with a Real Application Clusters (RAC) license so that you can have a single instance running on two computers. The SE2 instance will also limit the database to 16 threads, so running on more cores will have no advantage. To learn more about the differences and limitations, I recommend reading Mike Dietrich's blog on SE2. The second big difference is that many of the optional features are not available with SE. For example, you can't use diagnostics and tuning to figure out if your SQL command is running at top efficiency. You can't use multi-tenant but you can provision a single pluggable database. This means that you can unplug and move the database to another database (and even another version like EE). The multi-tenant option allows you to have multiple pluggable databases and control them with a master SGA. This allows admins to back up and patch a group of databases all at once rather than having to patch each one individually. You can separate security and have different logins for the different databases but use a global system or sys account to manage and control all of the databases. Storage optimization features like compression and partitioning are not available in SE either. Data recovery features like DataGuard and FlashBack are not supported in SE. DataGuard is a feature that copies changes from one system through the change logs and applies them to the second system. FlashBack does something similar and allows you to query a database at a previous time and return the state of the database at that time. It uses the change log to reconstruct the database as it was at the time requested. Tools like RMAN backup and streams don't work in SE. Taking a copy of a database and copying it to another system is not allowed. The one exception is that RMAN works in the cloud instance but not with the perpetual on-premise version. Security features like Transparent Data Encryption, Label Security, Data Vault, and Audit Vault are not supported in SE. The single exception is Transparent Data Encryption, which is supported for SE in the public cloud to allow for encryption there.
All of these features are described here. With Enterprise Edition in the Oracle Public Cloud at $3K/OCPU/month or $5.04/OCPU/hour, the only option that we get is Transparent Data Encryption (TDE) bundled with the database. This allows us to encrypt all or part of a table. TDE encrypts data on the disk when it is written with a SQL insert or update command. Keys are used to encrypt this data and it can only be read by presenting the keys using the Oracle Wallet interface. More information on TDE can be found here. The Security Inside Out blog is also a good place to look for updates and references relating to TDE. This version of the database allows us to scale up to 16 processors and 4.6 TB of storage. If we are looking to back up this database, the largest size that we can have for storage is 2.3 TB. If our table requirements are greater than 2.3 TB or 4.6 TB you need to go to Exadata as a Service or purchase a perpetual license and run it on-premise. If we are looking to run this database in our data center we will need to purchase a perpetual license at $47.5K per processor license. If you are running on an IBM Power server you need to license each core as a processor. If you are running on x86 or Sparc servers you multiply the number of cores by 0.5 and can run two cores per processor license. TDE is part of the Advanced Security Option which lists for $15K per processor license. When calculating whether it is cheaper to run on-premise vs the public cloud you need to factor in both license requirements. The same is true if you decide to run EE in AWS EC2 or Azure Compute. Make sure to read the Cloud Licensing Requirements to understand the limits and the cost of running on EC2 or Azure Compute. Since all cloud providers use x86 processors the multiplication factor is 0.5 times the number of cores on the service. The High Performance Edition contains the EE features and TDE, as well as multi-tenant, partitioning, advanced compression, advanced security, real application testing, OLAP, DataGuard, and all of the database management packs. This is basically everything with the exception of Real Application Clusters (RAC), Active DataGuard, and the In-Memory option. High Performance comes in at $4K/OCPU/month or $6.72/OCPU/hour. If we wanted to bundle all of this together and run it in our data center we need to compare the database at $47.5K/processor license plus roughly $15K/processor per option (there are 12 of them). We can then calculate which is cheaper based on our accounting rules and amortization schedule. The key differential is that I can use this version on an hourly or monthly basis for less than a full year. For example, if we do patch testing once a quarter and allocate three weeks a quarter to test if the patch is good or bad, we only need 12 weeks a year to run the database. This basically costs us $12K/processor/year to test on a single processor and $24K on a dual processor. If we purchased the system it would cost us $47.5K in capital expenditure plus 22% annually for support. Paying this amount just to do patch testing does not make sense. With the three year cost of ownership, running this on-premise will cost us $78,850. If we use the metered services in the public cloud this will cost us $72K. The $6,850 difference does not seem like a lot, but with the public cloud service we won't need to pay for the hardware, storage, or operating system. We can provision the cloud service in an hour and replicate our on-site data to the cloud for the testing.
If we did this with a computer or virtual image on site it would take hours or days to provision a new computer, storage, and database and replicate the data. It is important to note here that you need to be careful with virtualization. You need to use software that allows for hard partitioning. Products like VMWare and HyperV are soft partitioning virtualization software. This means that you can grow the number of processors dynamically and are required to license the Oracle software for the potential high water mark or all of the cores in the cluster. If you are running on something like a Cisco UCS blade server that has a dual socket 16 core processor, you must license all 32 cores to run the database even though you might just create a 2 core virtual instance in this VMWare installation. It gets even worse if you cluster 8 blades into one cluster; then you must license all 256 cores. This gets a little expensive at $47.5K times 128 processor licenses. Products like OracleVM, Solaris Containers, and AIX LPARs solve this cost problem with hard partitions. The third Enterprise Edition tier is the Extreme Performance Edition of the database. This edition is $5K/OCPU/month or $8.401/OCPU/hour. This option comes with RAC, Active DataGuard, and In-Memory. RAC allows you to run across multiple compute instances and restart queries that might fail if one node fails. Active DataGuard allows you to have two databases replicating to each other and for both to be open and active at the same time. Regular or passive DataGuard allows you to replicate the data but not keep the target open and active. In-Memory allows you to store data not only in row format but in column format. When data is entered into the table it is stored on disk in row format. A copy is also placed in memory but stored in column format. This allows you to search faster given that you have already sorted the data in memory and can skip data that does not apply to your search. This is typically done with an index, but we can't always predict what questions the users are going to ask, and adding too many indexes slows down all operations. It is important to reiterate that we can take our perpetual license and run it in IaaS or generic compute. We can also effectively lease these licenses at a monthly or hourly rate. If you are running the database, you are consuming licenses. If you stop the database, you stop consuming the database license but continue to consume the storage and processor services. If you terminate the database you stop consuming the database, processor, and storage services because they are all deleted upon termination. In summary, there are four flavors of DBaaS: Standard Edition, Enterprise Edition, High Performance Edition, and Extreme Performance Edition. Standard Edition and Enterprise Edition are available from other cloud providers, but some require perpetual licenses and some do not. If you decide to run this service as PaaS or DBaaS in the Oracle Public Cloud you can pay hourly or monthly and start/stop these services if they are metered to help save money. All of these services come with partial management features offloaded and done by Oracle. Backups, patches, and restarts of services are done automatically for you. This allows you to focus more on how to apply the database service to provide business benefits rather than on the feeding and maintenance needed to keep the database operational.
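To make the patch-testing comparison above concrete, the arithmetic can be checked with a couple of shell expressions (the list prices and the 22% annual support figure are the ones quoted above; three months of testing a year on two processors over three years):

    # On-premise: perpetual license plus three years of 22% annual support
    echo $(( 47500 + 3 * 47500 * 22 / 100 ))   # 78850

    # Cloud: High Performance at $4K/OCPU/month, 3 months a year, 2 OCPUs, 3 years
    echo $(( 4000 * 3 * 2 * 3 ))               # 72000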
Up next, we will dive into use cases for database as a service and look at different configurations and pricing models to solve a real business problem.


PaaS

Exadata as a Service

For the last four days we have been focusing on Database as a Service in the cloud. We focused on Application Express, or Schema as a Service, in the last three days and looked at pricing and how to get APEX working in the Oracle Public Cloud, Amazon AWS, and Microsoft Azure. With the Oracle Public Cloud we have three options for database in the cloud at the platform as a service layer: Schema as a Service, Database as a Service, and Exadata as a Service. We could run this in compute as a service but have already discussed the benefits of offloading some of the database administration work with platform as a service (backup, patching, restarting services, etc). The question that we have not adequately addressed is how you choose between the three services offered by Oracle. We touched on one of the key questions, database size, when we talked about Schema as a Service. You can have a free database in the cloud if your database is smaller than 25 MB. It will cost you a little money, $175/month, if you have a database smaller than 5 GB. You can grow this to 50 GB and stay with the Schema as a Service. If your database is larger than 50 GB you need to look at Database as a Service or Exadata as a Service. You also need to look at these alternatives if you are running an application in a Java container and need to attach to the database through the standard port 1521, since Schema as a Service only supports http(s) connections to the database. If you can query the database with a REST api call, Schema as a Service is an option but is not necessarily tuned for performance. Products like WebLogic or Tomcat or other Java containers can buffer select statements in cache and not have to ask the same question over and over again from the database. For example, if we have census data and are interested in the number of people who live in Texas, we get back roughly 27 million rows of data from the query. If we want to drill down and look at how many people live in San Antonio, we get back 1.5 million rows. If our Java code were smart enough and our application server had enough buffer space, we would not need to read the 27 million rows back when we want to just look at the 1.5 million rows relating to San Antonio. The database can keep the data in memory as well and does not need to read the data back from disk to make the select statement to find the state or city rows that match the query. Let's take a step back and talk about how a database works. We create a table and put information in columns like first name, last name, street address, city, state, zip code, email address, and phone number. This allows us to contact each person either through snail mail, email, or phone. If we allocate 32 bytes for each field we have 8 fields and each row takes up 256 bytes to identify each person. If we store data for each person who lives in Texas we consume 27 million rows. Each row takes up 256 bytes. The whole table will fit into 6.9 GB of storage. This data is stored in a table extent or file that we save into the /u02/data directory. If we expand our database to store information about everyone who lives in the United States we need 319 million rows. This will expand our database to 81.7 GB. Note that we have crossed the boundary for Schema as a Service. We can't store this much information in a single table so we have to look at Database as a Service or Exadata as a Service. Yes, we can optimize our database by using less than 32 bytes per column. We can store zip codes in 16 bytes.
We can store phone numbers in 16 bytes. We can store state information in two bytes. We can also use compression in the database and not store the characters "San Antonio" in a 32 byte field but store it in an alternate table once and correlate it to the hexadecimal number 9c. We then store 9c in the city field, which tells us that the city name is stored in another table. This saves us 1.5 million times 31 bytes (one byte is still needed to store the 9c) or 46 MB of storage. If we can do this for everyone in Texas we shrink the storage by 840 MB. This is roughly 13% of what we had allocated for all of the information related to people who live in Texas. If we can do this for the city, state, and zip code fields we can reduce the storage required by 39%, or shrink the 81.7 GB to 49.8 GB. This is basically what is done with a technology called Hybrid Columnar Compression (HCC). You create a secondary table that correlates the 9c value to the character string "San Antonio". You only need to store the character string once and the city information shrinks from 32 bytes to 1 byte. When you read back the city name, the database or storage that does the compression returns the string to the application server or application. When you do a select statement the database looks for the columns that you are asking for in the table that you are selecting from and returns all of the data that matches the where clause. In our example we might use

select * from census where state = 'Texas';
select * from census where city = 'San Antonio';

We can restrict what we get back by not using the "*" value. We can get just the first_name, last_name, and phone number if that is all we are interested in. The select statement for San Antonio will return 1.5 million rows times 8 columns times 32 bytes, or 384 MB of data. A good application server will cache this 384 MB of data and if we issue the same select statement again in a few seconds or minutes we do not need to ask the database again. We issue a simple request to the database asking it if anything has changed since the last query. If we are running on a slow internet connection as we find in our homes we are typically running at 3 MB/second download speeds. Transferring all of this data will take us 128 seconds or about two minutes. Not reading the data a second time saves us two minutes. The way that the database finds which 384 MB to return to the application is done similarly. It looks at all of the 81.7 GB that stores the census data and compares the state name to 'Texas' or the hex value corresponding to the state name. If the compare matches, that row is put into a response buffer and transmitted to the application server. If someone comes back a few seconds later and requests the information correlating to the city name 'San Antonio', the 81.7 GB is read from disk again and the 384 MB is pulled out to return to the application server. A smart database will cache the Texas data, recognize that San Antonio is a subset of Texas, and not read the 81.7 GB a second time but pull the data from memory rather than disk. This can easily be done by partitioning the data in the database, storing the Texas data in one file or disk location and storing the data correlating to California in another file or disk location. Rather than reading back 81.7 GB to find Texas data we only need to read back 6.9 GB since it has been split out in storage. For a typical SCSI disk attached to a computer, we read data back at 2.5 GB/second.
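The timing numbers that follow come straight from dividing the data size by that disk throughput; a quick check of the arithmetic:

    # seconds to scan the data at 2.5 GB/second off a typical SCSI controller
    echo "scale=2; 81.7 / 2.5" | bc   # 32.68 - the full US table
    echo "scale=2; 6.9 / 2.5"  | bc   # 2.76  - just the Texas partition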
To read back all of the US data it takes us 33 seconds. If we read back all of the Texas data it takes us 2.76 seconds. We basically save 30 seconds by partitioning our data. If we read the Texas data first and the San Antonio data second with our select statements, we can cache the 6.9 GB in memory and not have to perform a second read from disk, saving us yet another 33 seconds (or 3 seconds with partitioned data). If we know that we will be asking for San Antonio data on a regular basis we set up an index or materialized view in the database so that we don't have to sort through the 6.9 GB of data but can access the relevant 384 MB directly the first time, reducing our disk access time to 0.15 seconds. It is important to note that we have done two simple things that reduced our access time from 33 seconds to 0.15 seconds. First, we partitioned the data and the way that we store it by splitting the data by state in the file system. Second, we created an index that helped us access the San Antonio data in the file associated with Texas without having to sort through all of the data. We effectively pre-sort the data and provide the database with an index. The cost of this is that every insert command to add a new person in San Antonio requires not only updating the Texas table but also updating the index associated with San Antonio. When we insert any data we must check whether the data goes into the Texas table and update the index at the same time, whether the information correlates to San Antonio or not, because the index might change if data is inserted or updated in the middle of the file associated with the Texas information.

Our original question was how do we choose between Schema as a Service, Database as a Service, and Exadata as a Service. The first metric that we used was table size. If our data is greater than 25 MB, we can't use the free APEX service. If our data is greater than 50 GB, we can't use the paid APEX or Schema as a Service. If we want to use features like compression or partitioning, we can't use the Schema as a Service either unless we have sys access to the database. We can create indexes for our data to speed requests but might or might not be able to set up compression or partitioning since these are typically features associated with the Enterprise Edition of the database. If we look at the storage limitations of the Database as a Service we can currently store 4.8 TB worth of data in the database. If we have more data than that we need to go to Exadata as a Service. The Exadata service comes in different flavors as well and allows you to store up to 42 TB with a quarter rack, 84 TB with a half rack, and 168 TB with a full rack. If you have a database larger than 168 TB, there are no solutions in the cloud that can store your data attached to an active database. You can back up your data to cloud storage but you can not have an active database attached to it. If we look a little deeper into the Exadata there are multiple advantages to going with Exadata as a Service. The first and most obvious is that you are suddenly working on dedicated hardware. In most cloud environments you share processors with other users as well as storage. You do not get dedicated bandwidth from processor to disk but must time share this with other users. If you provision a 16 core system, it will typically consume half of a 32 core system that has two sockets.
This means that you get a full socket but have to share the memory and disk bandwidth with the next person running on the same server. The data read from the disk is cached in the disk controller's cache and your reads are optimized until someone else reads data from the same controller and your cached data gets flushed to make room. Most cloud vendors go with commodity hardware for compute and storage, so they are not optimized for database but for general purpose compute. With Exadata as a Service you get hardware optimized for database and you get all of the processors in the quarter, half, or full rack. There is no competing for memory bandwidth or storage bandwidth. You are electrically isolated from someone in the other quarter or half rack through the Infiniband switch. Your data is isolated on spindles of your own. You get the full 40 GB/second to and from the disk. Reading the 81.7 GB takes 2.05 seconds compared to 32.68 seconds through a standard SCSI disk controller. The data is partitioned and stored automatically so that when we ask for the San Antonio data, we only read back the 384 MB and don't need to read back all of the data or deal with the index update delays when we write the data. The read scans all 81.7 GB and returns the results in 0.01 seconds. We effectively reduce the 33 seconds it took us previously and drop it to 10 ms. If you want to learn more about Exadata and how and why it makes queries run faster, I would recommend the following books

- Expert Oracle Exadata
- Oracle Exadata Survival Guide
- Oracle Exadata Recipes: A Problem-Solution Approach
- Achieving Extreme Performance with Oracle Exadata
- Oracle Exadata Expert's Handbook

or the following YouTube video channels

- Oracle Exadata Videos (126 videos)
- Oracle Exadata & Tutorials (28 videos)

or the following web sites

- Exadata X6-2 Server
- Exadata X6-8 Server

Exadata as a Service is a unique offering in the cloud. Amazon and Microsoft have nothing that compares to it. Neither company offers dedicated compute that is specifically designed to run a database in the cloud with dedicated disk and dedicated I/O channels. Oracle offers this service to users of the Enterprise Edition of the database, allowing them to replicate their on-premise data to the cloud, ingest the data into an Exadata in the cloud, and operate on the data and processes unchanged and unmodified in the cloud. You could take your financial data that runs on an 8 or 16 core system in your data center and replicate it to an Exadata in the cloud. Once you have the data there you can crunch on it with long running queries that would take hours on your in-house system. We worked with a telecommunications company years ago that was using an on-premise transportation management system and generated an inventory load list to put parts on their service trucks, work orders for the maintenance repair staff, and a driving list to route the drivers on the optimum path to cover the largest number of customers in a day. The on-premise system took 15-16 hours to generate all of this workload and was prone to errors and outages requiring the drivers to delay their routes or parts in inventory to be shipped overnight for loading onto the trucks in the morning. Running this load on an Exadata dropped the analytics to less than an hour. This allowed trucks to be rerouted mid-day to higher profit customers to handle high priority outages, as well as next day delivery of inventory between warehouses rather than rush orders.
Reducing the analytics from 15 hours to less than an hour allowed an expansion of services as well as a higher quality of service to their customer base.

Not all companies have daily issues like this; some look for this level of processing once a quarter or once or twice a year. Opening new retail outlets, calculating taxes due, or provisioning new services that were purchased as Christmas presents are three examples of predictable, periodic workloads where consuming a larger footprint in the cloud makes more sense than investing in resources that sit idle most of the year in your data center. Having the ability to lease these services on a monthly or annual basis allows for better utilization of resources in your data center, reduces the overall spend of the IT department, and expands the capabilities of business units to do things that they normally could not afford.

Exadata as a Service is offered in a non-metered configuration at $40K per month for a quarter rack (16 cores and 144 TB of disk), $140K per month for a half rack (56 cores and 288 TB of disk), or $280K per month for a full rack (112 cores and 576 TB of disk). The same service is offered on a metered basis for $80K for a quarter rack, $280K for a half rack, and $560K for a full rack (in the same configurations as the non-metered service). One of the things that we recommend is that you analyze the cost of this service. Is it cheaper to effectively lease a quarter rack at $80K for a month and get the results that you want, effectively lease a quarter rack at $480K for a year, or purchase the hardware, database license, RAC licenses, storage cell licenses, and other optional components to run this in your data center? We will not dive into this analysis because it truly varies based on use cases, the value of the use case to your company, and the cost of running one of these services in your data center. It is important to do this analysis to figure out which consumption model works for you.

In summary, Exadata as a Service is a unique service that no other cloud vendor offers. Having dedicated hardware to run your database is unique for cloud services. Having hardware that is optimized for long, complex queries is unique as well. Exadata is one of the most popular hardware solutions offered by Oracle and having it available on a monthly or annual basis allows customers to use the service at a much lower cost than purchasing a box or a larger box for their data center. Having Oracle manage and run the service frees up your company to focus on the business impact of the hardware and accelerated database rather than spending months administering the server and database. Tomorrow we will dive into Database as a Service and see how a generic database in the cloud has a variety of use cases and different cost entry points as well as features and functions.
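To make the consumption-model comparison above a little more concrete, here is a small back-of-the-envelope script using only the quarter rack list prices quoted in this post ($40K/month non-metered, $80K/month metered); the break-even it prints is only as good as those assumptions.

```bash
#!/bin/bash
# Back-of-the-envelope: quarter rack Exadata as a Service,
# using the list prices quoted above.
NON_METERED_MONTH=40000     # $/month, non-metered quarter rack
METERED_MONTH=80000         # $/month, metered quarter rack

echo "1 month metered:       \$${METERED_MONTH}"
echo "12 months non-metered: \$$(( NON_METERED_MONTH * 12 ))"

# How many months of metered use fit inside the annual non-metered commitment?
BREAK_EVEN=$(( NON_METERED_MONTH * 12 / METERED_MONTH ))
echo "Metered is cheaper if you need the rack for ${BREAK_EVEN} months or less per year"
```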


PaaS

APEX in Azure

Today we are going to look and see what it takes to get Schema as a Service running in the Microsoft Azure Cloud. Our last two entries looked at Schema as a Service, or Application Express, or APEX, running in the Oracle Public Cloud for free, $175, $900, or $2000/month, or running in the Amazon RDS Service for $50 - $2700/month. The idea behind Schema as a Service is that you are leasing a database instance on a compute server in the cloud. You get automated backup, http and https interfaces into the database, tools to load and unload data, interfaces to run ad-hoc queries or scripted queries, and REST apis to access your data. The pricing on the Oracle Public Cloud is based on how much storage you consume for your database. The cost on Amazon RDS is based on the compute power allocated to the database. Storage is charged separately and the more you consume, the more you get charged.

Azure uses the Amazon model for pricing in that it charges on a processor shape. Automated backup and the APEX interface/libraries are not included and require a separate step to install and configure the software on top of the Oracle database. The default operating system installation is Windows Server and you are billed monthly for this shape as well. Azure does not offer backup or any other part of a managed service but only provides a virtual image with the Oracle database pre-installed on a Windows Server in the cloud. The pricing for this software is

Shape | Cores | RAM | Disk | Standard Edition | Enterprise Edition | Shape price
A0 Basic or Standard | 0.25 | 0.75 GB | 19 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.02/hr or $15/mo
A1 Basic or Standard | 1 | 1.75 GB | 224 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.09/hr or $67/mo
A2 Basic or Standard | 2 | 3.5 GB | 489 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.18/hr or $134/mo
A5 Standard | 2 | 14 GB | 489 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.33/hr or $246/mo
A3 Basic or Standard | 4 | 7 GB | 999 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.36/hr or $268/mo
A6 Standard | 4 | 28 GB | 999 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo |
A4 Basic or Standard | 8 | 14 GB | 2,039 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $0.72/hr or $536/mo
A7 Standard | 8 | 56 GB | 2,039 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $1.32/hr or $982/mo
A8 Standard | 8 | 56 GB | 382 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo |
A10 Standard | 8 | 56 GB | 382 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo |
A9 Standard | 16 | 112 GB | 382 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo |
A11 Standard | 16 | 112 GB | 382 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo |
D1 Standard | 1 | 3.5 GB | 50 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.14/hr or $104/mo
D2 Standard | 2 | 7 GB | 100 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.28/hr or $208/mo
D11 Standard | 2 | 14 GB | 100 GB | $1.11/hr or $826/mo | $3.16/hr or $2,351/mo | $0.33/hr or $246/mo
D3 Standard | 4 | 14 GB | 200 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.56/hr or $417/mo
D12 Standard | 4 | 28 GB | 200 GB | $1.28/hr or $952/mo | $6.32/hr or $4,702/mo | $0.652/hr or $485/mo
D4 Standard | 8 | 28 GB | 400 GB | $2.55/hr or $1,897/mo | $12.63/hr or $9,397/mo | $1.12/hr or $833/mo
D14 Standard | 16 | 112 GB | 800 GB | $5.10/hr or $3,794/mo | $25.27/hr or $18,801/mo | $2.11/hr or $1,571/mo

Note that this is generic database pricing with compute shape pricing. It is not Schema as a Service pricing. We still need to layer APEX on top of the generic database installation. The cheapest option for deploying Schema as a Service works out to $841/month or $1.13/hr for Standard Edition, or $2,366/month or $3.18/hr for Enterprise Edition. We can basically stop here. Running an Oracle database on a single core with less than 1 GB of memory is unusable.
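If you want to check which Oracle database images your subscription can see before walking through the portal, something like the sketch below works; it assumes the current "az" CLI is installed and configured, which is my own assumption and not part of the walkthrough in this post.

```bash
#!/bin/bash
# List the Oracle-published images available to the subscription
# (the walkthrough below uses the Azure portal instead of the CLI).
az vm image list --publisher Oracle --all --output table

# Narrow the output to database offers; offer names change over time,
# so adjust the filter to whatever the command above reports.
az vm image list --publisher Oracle --all --output table | grep -i database
```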
Going up to the A1 Basic or Standard shape increases the memory to 1.75 GB, which might work for a single schema, and the 70 GB of disk will store enough tables for most use cases. Let's walk through creation of an Oracle database on Azure.

First we go to the Azure portal and search for the Oracle virtual machine by clicking on New. We are looking for either the Enterprise Edition or Standard Edition of the database. We select the Standard Edition and ask that it be provisioned. Once we select the product, we select the shape that we will run this instance on. The shape recommended is relatively large so we have to look at all shapes, and we select the A1 shape primarily for cost reasons. We should look at what we are trying to do and pick the right core count and memory footprint. We select the network and storage models for this instance. We go with the defaults in this example rather than adding another storage instance since we are just looking for a small footprint.

Some things to note here. First, we have the option of a username and password or ssh keys to connect to the operating system. Second, we are never presented with any information about the operating system. Based on the documentation that we read earlier, I assumed that this would be Windows Server. It turns out that it is actually Oracle Enterprise Linux 6.7. Third, we are never asked for information about the database. We are not asked about the SID, a password for sys, or ports to open up or use to connect to the database. It turns out that the database software is installed in the /u01 directory but no database is created. You still need to run dbca to create a database instance and start a listener. There are other OS options available to install the database on. We theoretically could have selected Windows 2012 but these options did not come up with our search.

It took a few minutes to start up the database virtual machine. We can look up the details on the instance to find the ip address to connect to and use putty or ssh to connect to the instance. When we log in we can look at the installation directory and notice that everything is installed in /u01. When we look at /etc/oratab we notice that nothing is configured or installed. We will need to run dbca to create a database instance. We will then need to download and install APEX to configure Schema as a Service.

In summary, we can install and run an Oracle database on Azure. The installation is lacking a bit. This is more Infrastructure as a Service with a binary pre-installed but not configured. This is not Database as a Service and is lacking when we compare it to Schema as a Service. To get Schema as a Service we need to download software, install it, change the network configuration, and update the virtual machine network (we did not show this). Microsoft has a good tutorial on the steps needed to create the database once the virtual machine is installed. You do get root access to the operating system. You do get sys access to the database. You do get file system access through the operating system. The documentation says that the service is on Windows but it got provisioned on Linux. We might have done something wrong looking back, or the documentation is wrong. The service is priced as a generic database starting at $800/month or more based on the shape you select. This installation is not DBaaS or PaaS. Backups are not automatically done for you. Patches are not configured or installed. You basically get an operating system with a binary installed.
The database is not configured and ports are not configured to allow you to connect across the internet.
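Since the Azure image leaves you with installed binaries but no database, here is a minimal sketch of what the follow-up steps could look like on the VM. The ORACLE_HOME path, SID, and passwords are placeholders of my own, and dbca flags can vary between database versions, so treat this as a starting point rather than the exact procedure.

```bash
#!/bin/bash
# Run as the oracle user on the Azure VM after the image is provisioned.
export ORACLE_HOME=/u01/app/oracle/product/12.1.0/dbhome_1   # assumed path
export PATH=$ORACLE_HOME/bin:$PATH

# Start a default listener on port 1521 (a default LISTENER needs no listener.ora).
lsnrctl start

# Create a database instance from the general purpose template.
dbca -silent -createDatabase \
     -templateName General_Purpose.dbc \
     -gdbName orcl -sid orcl \
     -sysPassword 'ChangeMe123' -systemPassword 'ChangeMe123' \
     -storageType FS -datafileDestination /u01/app/oracle/oradata

# Confirm the new instance registered with the listener.
lsnrctl status
```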


PaaS

Apex in Amazon AWS

Today we are going to look at running Schema as a Service using Amazon AWS as your IaaS foundation. Yesterday we looked at Schema as a Service using Oracle DBaaS. To quickly review, you can run a schema in the cloud using apex.oracle.com for tables up to 25 MB for free or cloud.oracle.com for tables of 5 GB, 20 GB, or 50 GB for a monthly fee. You do not get a login to the operating system, database, or file system but do everything through an html interface. You can query the database through an application that you write in the APEX interface or through REST api interfaces, both of which are accessible through http or https. Today we are looking at what it takes and how much it costs to do the same thing using Amazon AWS.

Amazon offers a variety of database options as a managed service. This is available through the Amazon RDS Console. If you go to the RDS Getting Started page you will see that you can provision
MySQL
Oracle
SQL Server
MariaDB
or an Amazon custom database, Aurora
We won't compare and contrast the different databases in this blog entry but will go into the Oracle RDS offerings and look at what you get that compares to the Schema as a Service offered by Oracle. The Oracle database offerings that you get from Amazon RDS are
Oracle Standard Edition One
Oracle Standard Edition Two
Oracle Standard Edition
Oracle Enterprise Edition
Note that we can launch any version in EC2 but we are trying to look for a platform as a service where the database is pre-configured and some management is done for us, like patching, backups, operating system maintenance, failure detection, and service restarting. You can launch the 11g or 12c version of the Oracle database but it is important to note that through RDS you do not get access to the operating system, file system, or sys/system user. There is an elevated user that lets you perform a limited list of functions, but not all options are available to this elevated user. Some features are also not enabled in the 12c instance:
In-Memory
Spatial
multi-tenant or pluggable databases
Real Application Clusters (RAC)
Data Guard / Active Data Guard
Connection to Enterprise Manager
Automated Storage Management
Database Vault
Java libraries
Locator services
Label Security
In the 11g instance the above list plus the following features are not supported:
Real Application Testing
Streams
XML DB
The following roles and privileges are not provided to users in Amazon RDS:
Alter database
Alter system
Create any directory
Drop any directory
Grant any privilege
Grant any role

Before we dive into the usage of Amazon RDS, let's talk pricing and licensing. The only option that you have for a license included with RDS is the Standard Edition One license type. To figure out the cost, we must look at the sizes that we can provision as well as the RDS cost calculator. To start this journey, we start at the AWS console, go to the RDS console, and select Oracle SE1 as the instance type. If we select the license-included License Model we get to look at the shapes that we can deploy as well as the versions of the database. We can use the cost calculator in conjunction to figure out the monthly cost of deploying this service. For our example we selected 11.2.0.4 v7, db.t2.micro (1 vCPU, 1 GB RAM), and 20 GB of storage. For this shape we find that the monthly cost will be $25.62. We selected the 11.2.0.4 version because this is the only 11g option available to us for the license-included SE1 selection. We could have selected 12.1.0.1 as an option.
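If you prefer the command line to the console for this step, a sketch like the one below can list which shapes and engine versions are orderable. It assumes the AWS CLI is configured and that the engine identifier for Standard Edition One is oracle-se1; engine names and available classes may differ by region and CLI version.

```bash
#!/bin/bash
# List orderable shapes and versions for Oracle SE One on RDS with the
# license-included model (engine names are an assumption, check your region).
aws rds describe-orderable-db-instance-options \
    --engine oracle-se1 \
    --license-model license-included \
    --query 'OrderableDBInstanceOptions[].[EngineVersion,DBInstanceClass]' \
    --output table
```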
If we select any other version we must bring our own license to run on AWS. It is important to look at the outbound transfer rate because this cost is sometimes significant. If we add 20 GB of outbound traffic the price increases to $26.07, which is not significant. This says that we can back up our entire database once a month offsite and not have to pay a significant amount to get our database off RDS. It is important to look at the shape options that we have for the different database versions. We should also look at the cost associated with them. For 11g we have
db.t2.micro (1 vCPU, 1 GB) - $25.62/month
db.t2.small (1 vCPU, 2 GB) - $51.24/month
db.t2.medium (2 vCPU, 4 GB) - $102.48/month
db.t2.large (2 vCPU, 8 GB) - $205.70/month
db.m4.large (2 vCPU, 8 GB) - $300.86/month
db.m4.xlarge (4 vCPU, 16 GB) - $602.44/month
db.m4.2xlarge (8 vCPU, 32 GB) - $1324.56/month
db.m4.4xlarge (16 vCPU, 64 GB) - $2649.11/month
db.m3.medium (1 vCPU, 3.75 GB) - $153.72/month
db.m3.large (2 vCPU, 7.5 GB) - $307.44/month
db.m3.xlarge (4 vCPU, 15 GB) - $614.88/month
db.m3.2xlarge (8 vCPU, 30 GB) - $1352.74/month
db.r3.large (2 vCPU, 15 GB) - $333.06/month
db.r3.xlarge (4 vCPU, 30 GB) - $666.12/month
db.r3.2xlarge (8 vCPU, 61 GB) - $1465.47/month
db.r3.4xlarge (16 vCPU, 122 GB) - $2930.93/month
db.m2.xlarge (2 vCPU, 17 GB) - $409.92/month
db.m2.2xlarge (4 vCPU, 34 GB) - $819.84/month
db.m2.4xlarge (8 vCPU, 68 GB) - $1803.65/month
db.m1.small (1 vCPU, 3.75 GB) - $84.18/month
db.m1.medium (2 vCPU, 7.5 GB) - $168.36/month
db.m1.large (4 vCPU, 15 GB) - $336.72/month
db.m1.xlarge (8 vCPU, 30 GB) - $673.44/month
For the 12c version we have
db.m1.small (1 vCPU, 3.75 GB) - $84.18/month
db.m3.medium (1 vCPU, 3.75 GB) - $153.72/month
db.m3.large (2 vCPU, 7.5 GB) - $307.44/month
db.m3.xlarge (4 vCPU, 15 GB) - $614.88/month
db.m3.2xlarge (8 vCPU, 30 GB) - $1352.74/month
db.m2.xlarge (2 vCPU, 17 GB) - $409.92/month
db.m2.2xlarge (4 vCPU, 34 GB) - $819.84/month
db.m2.4xlarge (8 vCPU, 68 GB) - $1803.65/month
db.m1.medium (2 vCPU, 7.5 GB) - $168.36/month
db.m1.large (4 vCPU, 15 GB) - $336.72/month
db.m1.xlarge (8 vCPU, 30 GB) - $673.44/month

If we want to create the database, we select the database version (11g), the processor size (smallest, just to be cheap for demo purposes), and storage. We define the SID, username, and password for the elevated user, and click Next. We then confirm the selections and backup schedule (scroll down to see them), and click on Launch. When we launch this, the system shows that the inputs were accepted and the database will be created. We can check on the status by going to the RDS console. It takes a few minutes to provision the database instance, 15 minutes in our test. When the creation is finished we see "available" rather than "creating" for the status.

Once the instance is created we can connect to the database using the Oracle Connection Instructions and connect using sqlplus installed on a local machine connecting to a remote database (the one we just created), using the aws connection tools to get status (aws rds describe-db-instances --headers), or connecting with SQL Developer to the ip address, port 1521, and user oracle with the password we specified. We chose to open up port 1521 to the internet during the install, which is not necessarily best practice.

Note that we have fallen short of Schema as a Service. We have database as a service at this point. We will need to layer Application Express on top of this to get Schema as a Service. We can install APEX 4.1.1 on the 11g instance that we just created by following the installation instructions.
Note that this is a four step process followed by spinning up an EC2 instance and installing optional software to run a listener, because the APEX listener is not supported on the RDS instance. We basically add $15-$20/month to spin up a minimal EC2 instance, install the listener software, and follow the nine step installation process to link the listener to the RDS instance. The installation and configuration steps are similar for 12c. We can provision a 12c instance of the database in RDS, spin up an EC2 instance for the listener, and configure the EC2 instance to point to the RDS instance. At this point we have the same experience that we have with Schema as a Service through the Oracle DBaaS option.

In summary, we can provision a database into Amazon RDS. If we want anything other than Standard Edition One, we need to bring our own license at $47.5K/two cores for Enterprise Edition or $17.5K for Standard Edition and maintain our annual support cost at 22%. If we want to use a license provided by Amazon we can, but we are limited to the Standard Edition One version and APEX 4.1.1 with 11.2.0.4 or APEX 4.2.6 with 12.1.0.1. Both of these APEX versions are a major version behind the 5.0 version offered by Oracle through DBaaS. On the positive side, we can get an SE One instance with APEX installed for about $50/month for 11g or $100/month for 12c, which is slightly cheaper than the $175/month through Oracle. The Oracle product is Enterprise Edition vs Standard Edition One on AWS RDS, and the Oracle version does come with APEX 5.0 as well as being configured for you upon provisioning, as opposed to having to perform 18 steps and spin up an EC2 instance to act as a listener. It really is difficult to compare the two as the same product, but if you are truly only interested in a simple query engine schema as a service in the cloud, RDS might be an option. If you read the Amazon literature, switching to Aurora is a better option, but that is a discussion for another day.
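As a rough sketch of the connection step described above, the commands below pull the endpoint of the new instance with the AWS CLI and then connect with a local sqlplus client. The instance identifier, user, and password are placeholders of my own, and the modern CLI uses --query/--output rather than the --headers flag mentioned in the post.

```bash
#!/bin/bash
# Look up the endpoint of the RDS instance we just created
# (instance identifier, user, and password below are placeholders).
ENDPOINT=$(aws rds describe-db-instances \
    --db-instance-identifier mydb \
    --query 'DBInstances[0].Endpoint.Address' --output text)

# Connect from a local sqlplus client; ORCL is the database name chosen at
# creation time, and 1521 is the port we opened in the security group.
sqlplus admin/MyPassword123@"${ENDPOINT}:1521/ORCL"
```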


PaaS

Application Express or Schema as a Service

Today we are going to dive headlong into Schema as a Service. This is an interesting option offered by Oracle. It is a unique service that is either free or can cost as much as $2K/month. Just a quick review, you get the Oracle database from an http interface for:
10 MB storage - free
25 MB storage - free
5 GB - $175/month
20 GB - $900/month
50 GB - $2000/month
When you consume this service you don't get access to an operating system. You don't get access to a file system. You don't get access to the database from the command line. All access to the database is done through an http or https interface. You can access the management console to load, back up, and query data in the database. You can load, back up, and update applications that talk to the data in the database. You can create a REST api that allows you to read and write data in your database as well as run queries against the database. You get single processor access with 7.5 GB of RAM running the 12c version of the Oracle database inside a pluggable container, isolated from other users sharing this processor with you.

Microsoft offers an Express Edition of SQL Server and a table storage service that allow you to do something similar. Amazon offers a lightweight database that does simple table lookups. The two key differences with the Oracle Schema as a Service are that all access is done through http or https, and that applications run inside the database rather than on a separate server as is done with the other similar cloud options. Some say this is an advantage, some say it is a disadvantage. We covered this topic from a different angle a month ago in converting excel to apex and printing from apex. Both of these blog entries talk about how to use Schema as a Service to solve a problem. Some good references on Schema as a Service can be found at
APEX Getting Started
APEX Overview
APEX Documentation
and a variety of books. I typically use a Safari Books subscription to get these books on demand and on my iPad for reading on an airplane.

When we login to access the database we are asked to present the schema that we created, a username, and a password. Note in this example we are either using the external free service apex.oracle.com or the Oracle corporate service for employees apex.oraclecorp.com. The two services are exactly the same. As an Oracle employee I do not have access to the public free service and am encouraged to use the internal service for employees. The user interface is the same but screen shots will bounce between the two as we document how to do things.

Once we login we see a main menu system that allows us to manage applications, manage tables in the database, do team development, and download and install customer applications. The Object Browser allows us to look at the data in the database. SQL Commands allows us to make queries into the database. SQL Scripts allows us to load and save sql commands to run against the database. Utilities allows us to load and unload data. The RESTful Services option allows us to define html interfaces into the database. If we look at the Object Browser, we can look at table definitions and data stored in tables. We can use the SQL Commands tab to execute select statements against a table. For example, if we want to look at part number B77077 we can select it from the pricelist by matching the column part_number. We should get back one entry since there is only one part for this part number.
If we search for part number B77473 we get back multiple entries for the same part number. This search returns six lines of data with more data in other columns than the previous select statement. SQL Scripts allows you to load scripts to execute from your laptop or desktop. You can take queries that have run against other servers and run them against this server.

Up to this point we have looked at how to write queries, run queries, and execute queries. We need to look at how to load data so that we have something to query against. This is done in the Utilities section of the Schema as a Service. Typically we start with a data source, either an XML source or an Excel spreadsheet. We will look first at taking an Excel spreadsheet like the one below and importing it into a table. Note that the spreadsheet is well defined and headers exist for the data. We do have some comments at the top that we need to delete so that the first row becomes the column names in our new table. Step one is to save the sheet as a comma separated value file. Step two is to edit this file and delete the comments and blank line. Step three is to upload the file into the Schema as a Service web site. At this point we have a data source loaded and ready to drop into a table. The user interface allows us to define the column types and tries to figure out if everything is character strings, numbers, or dates. The tool is good at the import but typically fails at character length and throws exceptions on specific rows. If this happens you can either manually enter the data or re-import the data into an existing table. Doing this can potentially cause duplication of data, so deleting the new table and re-importing into a new table might be a good thing. You have to play at this point to import your data and get the right column definitions to import all of your data. Alternatively we can import xml data into an existing table. This is the same process we use to back up our data by exporting it as xml.

At this point we have loaded data using a spreadsheet or csv file and an xml file. We can query the database by entering sql commands or loading sql scripts. We could load data from a sql script if we wanted, but larger amounts of data need to be imported with a file. Unfortunately, we can not take a table from an on-premise database or an rman backup and restore it into this database. We can't unplug a pdb and plug it into this instance. This service does have limitations, but for free or less than $200 per month, the service provides a lot of functionality.

To read and display our data, we must create an application. To do this we go into the application development interface of Schema as a Service. We select the Application Builder tab at the top and click on the Create icon. This takes us to a selection of what type of application to build. We are going to build a desktop application since it has the most function options. We could have easily selected the mobile option, which formats displays for a smaller screen format. We have to select a name for our application. In this example we are calling it Sample_DB since we are just going to query our database and display the contents of a table. We are going to select one page. Note that we can create multiple pages to display different information or views into a table. In our previous blog entry on translating excel to apex we created pages to display the cost of archive and the cost of storage.
In this example we are going to create one page and one page only. If we have shared components from other apps or want to create libraries that can be called from other apps, we have the option at this point to define that. We are not going to do this but just create a basic application to query a database. We can use a variety of authorization sources to protect our data. In this example we are going to allow anyone to read and write our table, so we select no authorization. We could use an external source or look at the user table in the database to authenticate users. For this example, we will leave everything open for our application. We get a final confirmation screen (not shown) and create the application. When the create is finished we see a screen that lets us either run the application or edit it.

If we click on the Home icon we can edit the page. This screen is a little daunting. There are too many choices and things that you can do. There are different areas, breadcrumbs for page to page navigation, and items that you can drop into a screen. In this example we are going to add a region by hovering the mouse over the Content Body and right clicking the mouse. This allows us to create a new region in the body of our page. Note that a new region is created and is highlighted in the editor. We are going to edit the content type. We have a wide variety of options. We could type in static text and this basically becomes a static web page. Note that we could create a graph or chart. We could create a classic report. We could create a form to submit data and query a table. We will use the interactive report because it allows us to enter sql for a query into our table. In this example we enter the select statement in the query box. We could pop this box into another window for full screen editing. For our example we are doing a simple select against our table with select * from pricelist. When we click the Run button at the top right we execute this code and display it in a new window. We can sort this data, and we can change the query and click Go. This is an interactive form to read data from our table. If we wanted to restrict the user from reading all of the data we would have selected a standard report rather than an interactive report.

The final part of our tutorial is creation of a REST api for our data. We would like to be able to go to a web page and display the data in the table. For example, if we want to look at the description of part number B77077 it would be nice to do it from a web page or a GET command at the command line. To do this we go to the SQL Workshop tab and click the RESTful Services icon at the top right. Again, this screen is a little daunting. We get a blank screen with a create button. Clicking the create button takes us to a screen where we need to enter information that might not be very familiar. The screen asks us to enter a name for our service, a template, and a resource handler. Looking at this for the first time, I was clueless as to what this meant. Fortunately, there is an example of how to enter this information if you scroll down and click on the Example button. If we look at the example we see that the service name is the header that we will hit from our web page. In our example we are going to create a cloud REST api where we expose the pricelist. In this example we call the service cloud. We call the resource template pricelist and allow the user to pass in a part_number to query.
In the resource handler we add a GET handler that does a select from the table. We could use the part number that is passed in, but for simplicity we ignore the part number and return all rows in the table. Once we click save, we have exposed our table to a web query with no authentication. Once we have created our REST service we can query the database from a web browser using the url of the apex server/pls/apex/(schema name)/pricelist/(part number). In this example we go to apex.oraclecorp.com/pls/apex/parterncloud/pricelist/B77077. It executes the select statement and returns all rows in the table in JSON format.

In summary, we are able to upload, query, and display database data using http and https protocols. We can upload data in xml or csv format. We can query the database using web based tools or REST interfaces. We can display data either by developing a web based program to display the data or by pulling the data from a REST interface and getting it in JSON format. This service is free if your database size is small. If we have a larger database we can pay for the service as well as host the application to read the data. We have the option to read the data from a REST interface and pull it into an application server at a different location. We did not look at uploading data with a PUT interface through the REST service, but we could have done this as well. Up next, how do we implement this same service in AWS or Azure.
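As a quick sanity check of the REST service, you can hit the same URL from the command line. The sketch below assumes curl is installed and simply reuses the schema, template, and part number from the walkthrough above; the jq call at the end is optional pretty-printing and my own addition.

```bash
#!/bin/bash
# Query the RESTful service we just defined; the URL pattern is
#   https://<apex server>/pls/apex/<schema>/<template>/<part number>
curl -s "https://apex.oraclecorp.com/pls/apex/parterncloud/pricelist/B77077"

# Optional: pretty-print the JSON response if jq is installed.
curl -s "https://apex.oraclecorp.com/pls/apex/parterncloud/pricelist/B77077" | jq .
```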


PaaS

Database as a Service

Today we are going to dive into Database as a Service offered by Oracle. This product is the same product offered by Oracle as a perpetual processor license or perpetual named user license for running database software in your data center. The key difference is that the database is provisioned onto a Linux server in the cloud, and rather than paying $47,500 for a processor license and 22% annually after that, you pay for the database services on an hourly or monthly basis. If you have a problem that needs only a few weeks, you pay for the service for a few weeks. If you have a problem that takes a very large number of processors but for a very short period of time, you can effectively lease the large number of processors in the cloud and purchase a much smaller number of processors in your data center.

Think of a student registration system. If you have 20K-30K students that need to log into a class registration system, you need to size this server for the peak number of students going through the system. In our example, we might need an 8 core system to handle the load during class registration. Outside the two or three weeks for registration, this system sits idle at less than 10% utilization because it is only used to record and report grades during the semester. Rather than paying $47.5K times 8 cores times 0.5 for an x86 or Sparc server ($190K), we only have to pay $47.5K times 2 cores times 0.5 for x86 or Sparc cores ($47.5K) and lease the additional processors in the cloud for a month at $3K/core/month ($24K). We effectively reduce the cost from $190K to $71.5K by using the cloud for the peak period. Even if we do this three times during the year the price is $119.5K, which is a cost savings of $70.5K. The second year we would be required to pay $41.8K in support cost for the larger server. By using the smaller server we drop the support cost to $10.5K. This effectively pays for leasing a third of the cloud resources by using a smaller server and bursting to the cloud for high peak utilization.

Now that we have looked at one of our use cases and the cost savings associated with using the cloud for peak utilization and reducing the cost of servers and software in our data center, let's dive into the pricing and configuration of Database as a Service (DBaaS) offered by Oracle in the public cloud services. If we click on the Platform -> Database menu and scroll down to the bottom we see that there are effectively three services that we can use in the public cloud.

The first is Database Schema as a Service. This allows you to access a database through a web interface and write programs to read and present data to the users. This is the traditional Application Express interface, or APEX interface, that was introduced with Oracle 9. This is a shared service where you are given a database instance that is shared with other users. The second service is Database as a Service. This is the 11g or 12c database installed on a Linux installation in the cloud. This is a full installation of the database with ssh access to the operating system and sqlplus access to the database from a client system. The third service is Exadata as a Service. This is the Oracle database on dedicated hardware that is optimized to run the Oracle database.

The Schema as a Service is also known as Application Express. If you have never played with apex.oracle.com, click on the link and register for a free account.
You can create an instance, a database schema, and store up to 10 MB or 25 MB of data for free. If you want to purchase a larger storage amount it is sold in 5 GB, 20 GB, or 50 GB increments. The 10 or 25 MB instance is free. The 5 GB instance is $175/month, the 20 GB is $900/month, and the 50 GB is $2,000/month. Tomorrow we will dive a little deeper into Schema as a Service. In summary, this is a database instance that can contain multiple tables and has an application development/application web front end allowing you to access the database. You can not attach with sqlplus. You can not attach with port 1521. You can not put a Java or PHP front end in front of your database and use it as a back end repository. You can expose database data through applications and REST api interfaces. This instance is shared on a single computer with other instances. You can have multiple instances on the same computer, and the login gives you access to your applications and your data in your instance.

The Database as a Service (DBaaS) is slightly different. With this you are getting a Linux instance that has been provisioned with a database. It is a fully deployed, fully provisioned database based on your selection criteria. There are many options when you provision DBaaS. Some of the options are virtual vs full instance, 11g vs 12c, and standard edition vs enterprise edition vs enterprise edition high performance vs enterprise edition extreme performance. You need to provide an expected data size, whether you plan on backing up the data, and a cloud object repository if you do. You need to provide ssh keys to login as oracle or opc/root to manage the database and operating system. You also need to pick a password for the sys/system user inside the database. Finally, you need to pick the processor and memory shape that will run the database. All of these options have a pricing impact. All of these options affect functionality. It is important to know what each of these options means.

Let's dive into some of these options. First, virtual vs full instance. If you pick a full instance you will get an Oracle Enterprise Linux installation that has the version of the database that you requested fully installed and operational. For standard installations the storage is managed with the logical volume manager and is provisioned across four file systems. The /u01 file system is the ORACLE_HOME. This is where the database binary is installed. The /u02 file system is the +DATA area. This is where table extents and table data are located. The /u03 file system is the +FRA area. This is where backups are dropped using the RMAN command, which runs automatically every night for incremental backups and at 2am on Sunday morning for a full backup. You can change the times and backup configurations with command line options. The /u04 area is the +RECO area. This is where change logs and other log files are dropped. If you are using Data Guard to replicate data to another database or from another database, this is where the change logs are found. If you pick a virtual instance you basically get a root file system running Oracle Enterprise Linux with a tar ball that contains the Oracle database. You can mount file systems as desired and install the database as you have it installed in your data center. This configuration is intended to mirror what you have on-premise to test patches and new features. If you put everything into /u01 then install everything that way.
If you put everything in the root file system, you have the freedom to do so even though this is not the recommended best practice. The question that you are not asked when you try to create a DBaaS instance is whether this service is metered or non-metered. This question is asked when you create your identity domain. If you request a metered service, you have the flexibility to select the shapes that you want and whether you are billed hourly or monthly. The rates are determined by the processor shape, amount of memory, and what database option you select (standard, enterprise, high performance, or extreme performance). More on that later. With the metered option you are free to stop the database (but not delete it) and retain your data. You suspend the consumption of the database license but not the compute and storage. This is a good way of saving a configuration for later testing and not getting charged for using it.

Think of it as having an Uber driver sit outside the store but not charge you to sit there. When you get back in the car the charge starts. A better analogy would be Cars2Go. You can reserve a car for a few hours and drive it from Houston to Austin. You park the car in the Cars2Go parking slot next to the convention center and don't pay for parking. You come out at the end of your conference, swipe your credit card, and drive the car back to Houston. You only get charged for the car when it is between parking lots. You don't get charged for it while it is parked in the reserved slot. You pay a monthly charge for the service (think of compute and storage) at a much lower rate. Think of a non-metered service as renting a car from a car rental place: you pay for the car that they give you and it is yours until you return it to the car rental place. Unlike Cars2Go, you can't stop paying for the car while you are at your convention. You have to pay for parking at the hotel or convention center. You can't decide half way into your trip that you really need a truck instead of a car, or a mini-van to hold more people, and change out cars. The rental company will end your current agreement and start a new one with the new vehicle. Non-metered services are similar. If you select an OC3M shape then you can't upgrade it to an OC5 to get more cores. You can't decide that you need to use diagnostics and tuning and upgrade from Enterprise Edition to Enterprise Edition High Performance. You get what you started with and have 12 months to consume the services reserved for you.

The choice of 11g or 12c is a relatively simple one. You get 11.2.0.4 running on Oracle Enterprise Linux 6.6 or you get 12.1.0.2 running on Oracle Enterprise Linux 6.6. This is one of those binary questions. You get 11g or 12c. It really does not affect any other question. It does affect features because 12c has more features available to it, but this choice is simple. Unfortunately, you can't select 11.2.0.3 or 10.whatever or 9.whatever. You get the latest running version of the database and have an option to upgrade to the next release when it is available or not upgrade. Upgrades and patches are applied after you approve them.

The next choice is the type of database. We will dive into this deeper in a couple of days. The basic choice is that you pick Standard Edition or Enterprise Edition. You have the option of picking just the base Enterprise Edition with encryption only, most of the options with the High Performance Option, or all of the options with the Extreme Performance Option.
The difference between High Performance and Extreme Performance is that Extreme Performance includes Active Data Guard, the In-Memory option, and Real Application Clusters. Again, we will dive into this deeper in a later blog entry. The final option is the configuration of the database. I wanted to include a screen shot here, but the main options that we look at are the CPU and memory shape, which dictates the database consumption cost, as well as the amount of storage for table space (/u02) and backup space (/u03 and /u04). There are additional charges above 128 GB for table storage and for backups. We will not go into the other options on this screen in this blog entry.

In summary, DBaaS is charged on a metered or un-metered basis. The un-metered is a lower cost option but less flexible. If you know exactly what you need and the time that it is needed, this is a better option. Costs are fixed. Expenses are predictable. If you don't know what you need, metered service might be better. It gives you the option of starting and stopping different processor counts, shutting off the database to save money, and selecting different options to test out different features. Look at the cost option and a blog that we will do in a few days analyzing the details on cost. Basically, the database can be mentally budgeted as $3K/OCPU/month for Enterprise Edition, $4K/OCPU/month for High Performance, and $5K/OCPU/month for Extreme Performance. Metered options typically cross over at 21 days. If you use metered service for more than 21 days your charges will exceed this amount. If you use it for less, it will cost less.

The Exadata as a Service is a special use case of Database as a Service. In this service you are getting a quarter, half, or full rack of hardware that is running the database. You get dedicated hardware that is tuned and optimized to run the Oracle database. Storage is dedicated to your compute nodes and no one else can use these components. You get 16, 56, or 112 processors dedicated to your database. You can add additional processors to get more database power. This service is available in a metered or non-metered option. All of the database options are available with this product. All of the processors are clustered into one database and you can run one or many instances of a database on this hardware. With the 12c option you get multi-tenant features so that you can run multiple instances and manage them with the same management tools, but give users full access to their instance and not to other instances running on the same database. See the price lists for Exadata cost for metered services and Exadata cost for non-metered services.

In summary, there are two options for database as a service. You can get a web based front end to a database and access all of your data through http and https calls, or you can get a full database running on a Linux server or Linux cluster that is dedicated to you. You can consume these services on an hourly, monthly, or yearly basis. You can decide on less expensive or more expensive options as well as how much processor, memory, and storage you want to allocate to these services. Tomorrow, we will dive a little deeper into APEX, or Schema as a Service, and look at how it compares to services offered by Amazon and Azure.
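If you want to verify the four file system layout described above on a full-instance DBaaS deployment, something like the sketch below works once your ssh key is in place; the IP address is a placeholder of my own, and the exact mount points could differ between service versions.

```bash
#!/bin/bash
# Check the four file systems on a full-instance DBaaS deployment.
# The IP address below is a placeholder for your instance's public IP.
DBAAS_IP=129.152.0.10

ssh opc@${DBAAS_IP} 'df -h /u01 /u02 /u03 /u04'

# Expected layout, as described above:
#   /u01 - ORACLE_HOME (database binaries)
#   /u02 - +DATA (table extents and data files)
#   /u03 - +FRA  (RMAN backup area)
#   /u04 - +RECO (redo/change logs, Data Guard logs)
```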


PaaS

Intro to PaaS

Today we are going to move up the stack. We will first focus on the Oracle solutions, talking about the different platform as a service offerings. It is important to spend a little time reviewing this layer because what one company calls PaaS, another calls SaaS. The best way to get started is to go to cloud.oracle.com and look at the pull downs at the top of the screen. We see Infrastructure, Platform, and Applications. When we pull down the Platform menu we see that there are different areas that we can dive into.

Data Management is the first area that we will review. This is basically a way to aggregate and look at data. We can store data in a database, store on-premise databases in the cloud, store data in NoSQL repositories, and do analytics on a variety of data with the Big Data Preparation and Big Data services. All of these involve pulling data into a repository of some type and performing queries against the repository. The key difference is the way that the data is stored, how we can ask questions, and the results that we get back. At this point we will not dive into any of these deeply, but at a later point we will dive deep into the database and database backup.

Application Development moves farther away from the technology of storing data and closer to how we present data to users. The Java platform, for example, allows us to do things like create a shopping cart or host more complex applications in a Java repository or container. The Mobile Cloud Service allows us to dive into existing applications and present a user interface to iPhones, Android phones, and tablets. The idea is to customize existing web and fat clients into a mobile format that can be consumed on mobile devices. The Messaging Cloud Service is a messaging protocol that allows for transactions in the cloud. If you are looking at connecting different cloud services together it allows you to serialize the communication between vendors for a true transactional experience. The Application Container Cloud is a lightweight Java container allowing you to upload and run Java applications but without access to the operating system. This is a shared multi-tenant version of a WebLogic server. The Developer Cloud Service is a DevOps integration for the Java and Database services. This service is an aggregation of public domain components used to develop microservices at the database or Java layer. The Application Builder Cloud Service is a cloud based REST api development interface allowing you to integrate with Application software in the Oracle Cloud as well as other clouds. The API Catalog is a way of publishing the REST apis that you have and exposing them to your customers.

The Content and Process Cloud Services are an aggregation of services that address group communications as well as business process flow. The Documents Cloud Service is a way of file sharing on the web. The Process Cloud Service is an extension that allows you to launch business processes (think Business Process Manager or BPM) in the cloud. The Sites Cloud Service is a web portal interface that takes documents and processes and aggregates them into a single cloud site, allowing you to take a wiki like presentation but put business processes into the presentation. The Social Network Cloud Service allows you to integrate social network services like Facebook and Twitter into your web presence.
It allows you to integrate these services as well as search these repositories for information relating to your company.

The Business Analytics part of Platform services provides data visualization and analytic tools as well as data aggregation utilities. The Business Intelligence component is the traditional BI package that allows users to create custom queries into your database. Big Data Preparation allows you to aggregate data from a variety of sources into a Big Data repository. Big Data Discovery allows you to look at your data in a variety of ways and generate reports based on your data and views of data. The Data Visualization Cloud Service allows you to view and analyze your data from different perspectives. This is similar to the BI and Big Data services but looks at data slightly differently. The Internet of Things Cloud Service allows you to aggregate monitoring and measuring devices into a repository.

The Cloud Integration part of Platform services covers the traditional data aggregation tools for other repositories. The Integration Cloud Service allows you to aggregate traditional SaaS vendors to unify fields like how a customer is defined or what data elements are incorporated into a purchase order. The SOA Cloud Service is an implementation of the Oracle SOA Suite in the cloud. The GoldenGate Cloud Service is an implementation of the Oracle GoldenGate software that allows you to take data from different databases and synchronize the different repositories independent of the database vendor. The Internet of Things Cloud Service is the same one listed in the Business Analytics section mentioned before.

The Cloud Management part of Platform services allows you to take the log files that you have inside your data center and analyze them for a variety of things. You can aggregate your log files into the Log Analytics Cloud Service to look for patterns, intrusion attempts, and problems or issues with services. The IT Analytics Cloud Service looks at log files for trends like disks filling up and processors being used or not used appropriately. The Application Performance Cloud Service looks at log files to understand how systems and applications are operating rather than how individual components are working.

In summary, we looked at an overview of the Platform as a Service offerings from Oracle. Unfortunately, the variety of topics is too great for one blog. We did a high level overview of these services. In upcoming blogs we will dive deeper into each of these services and look at not only what they are but how they work and how to provision them. We will also compare and contrast how these services compare to services offered by Amazon and Azure as we dive into each service.


Industry generic technologies

storage cloud appliance in the cloud

Last week we focused on getting infrastructure as a service up and running. I wanted to move up the stack and talk about platform as a service but unfortunately, I got distracted with yet another infrastructure problem. We were able to install the storage cloud appliance software in a virtual machine, but how do you install this in a compute cloud instance? This brings up two issues. First, how do you run a Linux 7 - 3.10 kernel in the Oracle Compute Cloud Service? Second, how do you connect and manage this service, both from an admin perspective and as a client from another compute engine in the cloud service?

Let's tackle the first problem. How do you spin up a Linux 7 - 3.10 kernel in the Oracle Compute Cloud Service? If we look at the compute instance creation we can see which images we can boot from. There is no Linux 7 - 3.10 kernel, so we need to download and import an image that we can boot from. Fortunately, Oracle has a good tutorial on importing a bootable image. If we follow these steps, we need to first download a CentOS 7 bootable image from cloud.centos.org. The cloud instance that we use is CentOS-7-x86_64-OracleCloud.raw.tar.gz. We first download this to a local directory then upload it to the compute cloud image area. This is done by going to the compute console and clicking on the "Images" tab at the top of the screen. We then upload the tar.gz file that is a bootable image. This allows us to create a new storage instance that we can boot from. The upload takes a few minutes and once it is complete we need to associate it with a bootable instance. This is done by clicking on the "Associate Image" button, where we basically enter a name to use for the operating system as well as a description. Note that the OS size is 9 GB which is really small.

We don't have a compute instance at this point. We either need to create a bootable storage element or a compute instance based on this image. We will go through the storage create first since this is the easiest way of getting started. We first have to change from the Image tab to the Storage tab. We click on Create Storage Volume and go through selection of the image, storage name, and size. We went with the default storage size rather than resizing the storage we are creating. At this point we should be able to create a compute instance based on this boot disk. We can clone the disk, boot from it, or mount it on another instance. We will boot from this instance once it is created. We do this by going to the Instance tab and clicking on Create Instance. It does take 5-10 minutes to create the storage instance and we need to wait till it is completed before creating a compute instance. For the instance creation we select the default network, the CentOS7 storage that we previously created, and the 2016 ssh keys that we uploaded, then review and launch the instance. After about 15 minutes, we have a compute instance based on our CentOS 7 image.

Up to this point, all we have done is create a bootable Linux 7 - 3.10 kernel. Once we have the kernel available we can focus on connecting and installing the cloud storage appliance software. This follows the making backup better blog post. There are a couple of things that are different. First, we connect as the user centos rather than oracle or opc. This is a function of the image that we downloaded and not a function of the compute cloud. Second, we need to create a second user that allows us to log in.
When we use the centos user and install the oscsa_install.sh script, we can't log in with our ssh keys for some reason. If we create a new user, then whatever stops us from logging in as the centos user does not stop us from logging in as oracle, for example. The third thing that we need to focus on is creating a tunnel from our local desktop to the cloud instance. This is done with ssh or putty. What we are looking for is routing the management port for the storage appliance. It is easier to create a tunnel than to change the management port and open the port through the cloud firewall.

From this point we execute the commands we described in the making backup better blog. We won't go through the screen shots on this since we have done this already. One thing is missing from the screenshots: you need to disable SELinux by editing /etc/sysconfig/selinux and rebooting. Make sure that you add a second user before rebooting, otherwise you will get locked out and the ssh keys won't work once this change is made. The additional steps that we need to do are create a user, copy the authorized_keys from an existing user into the .ssh directory, change the ownership, assign a password to the new user, and add the user to /etc/sudoers.
useradd oracle
mkdir ~oracle/.ssh
cp ~centos/.ssh/authorized_keys ~oracle/.ssh
chown -R oracle ~oracle
passwd oracle
vi /etc/sudoers
The second major step is to create an ssh tunnel to allow you to connect from your localhost into the cloud compute service. When you create the oscsa instance it starts up a management console using port 32769. To tunnel this port we use putty to connect. At this point we should be able to spin up other compute instances and mount this file system internally using the command
mount -t nfs -o vers=4,port=32770 e53479.compute-metcsgse00028.oraclecloud.internal:/ /local_mount_point
We might want to use the internal ip address rather than the external dns name. In our example this would be the private IP address of 10.196.89.62. We should be able to mount this file system and clone other instances to leverage the object storage in the cloud.

In summary, we did two things in this blog. First, we uploaded a new operating system that was not part of the list of operating systems presented by default. We selected a CentOS instance that conforms to the requirements of the cloud storage appliance. Second, we configured the cloud storage appliance software on a newly created Linux 7 - 3.10 kernel and created a putty tunnel so that we can manage the directories that we create to share. This gives us the ability to share the object storage as an nfs mount internal to all of our compute servers. It allows for things like spinning up web servers or other static servers all sharing the same home directory or static pages. We can use these same processes and procedures to pull data from the Marketplace and configure more complex installations like JD Edwards, PeopleSoft, or E-Business Suite. We can import a pre-defined image, spin up a compute instance based on that image, and provision higher level functionality onto infrastructure as a service. Up next, platform as a service explained.
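For those who use ssh rather than putty, here is a minimal sketch of the same tunnel; the public IP and key path are placeholders of my own, and it assumes the appliance management console is listening on port 32769 as described above.

```bash
#!/bin/bash
# Forward the appliance management port (32769) from the cloud instance
# to localhost; the public IP and key path below are placeholders.
INSTANCE_IP=129.152.0.10
ssh -i ~/.ssh/2016_key -L 32769:localhost:32769 oracle@${INSTANCE_IP}

# With the tunnel up, point a local browser at localhost:32769
# to reach the storage appliance management console.
```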


Industry generic technologies

private cloud vs public cloud

Today is our last day to talk about infrastructure as a service. We are moving up the stack into platform as a service after this. The higher up the stack we get, the more value it has to the business and end users. It is interesting to talk about storage and compute in the cloud but does this help us treat patients in a medical practice any better, find oil and gas faster, or deliver our manufactured product any cheaper? Not really, but not having them will negatively affect all of them. We need to make sure that these services are there so that we can perform higher functions without having to worry about triple redundant mirroring of a disk or load balancing compute servers to handle higher loads. One of the biggest complaints with cloud services is that there are perception problems of security, latency, and governance. Why would I put my data on a computer that I don't control? There is a noisy neighbor issue where I am renting part of a computer, part of a disk, part of a network. If someone wants to play heavy metal at the highest volume (remember our apartment example) while I am trying to file a monthly report or do some analytics, my resources will suffer and it will take me longer to finish my job due to something out of my control.

Many people have decided to go with private hosted cloud solutions like VCE, VBlock, Cisco UCS clusters, and other products that provide raw compute, raw storage, and "hyper-converged" infrastructure to solve the cloud problem. I can create a golden master on my VMWare server and provision a database to my configuration as many times as I want. Yes, I run into a licensing issue. Yes, it is a really big licensing issue running Oracle on VMWare, but Microsoft is not far behind with SQL Server licensing on a per core basis either. Let's look at the economics of putting together one of these private cloud servers.

It is important to dive into the licensing issue. The Oracle Database and WebLogic servers are licensed either on a processor or named user basis. Database licensing is detailed in a pdf on the Oracle site. The net of the license says that the database is licensed based on the core count of the processor running on the server. There is a multiplication factor (0.5 for x86) based on the chip type that factors into the license cost. A few years ago it was easy to do this calculation. If I have a dual core, dual socket system, this is a four core computer. The license price of the computer would be 4 cores x 0.5 (Intel x86 chip) x $47,500. The total price would be $95K. Suddenly the core count of computers went to 8, 16, or 32 cores per chip. A single system could easily have 64 cores on a single board. If you aggregate multiple boards as is done in a Cisco UCS system you can have 8 boards or 256 cores that you can use. There are very few applications that can take advantage of 256 cores so a virtualization engine was placed on top of these cores so that you could sub-divide the system into smaller chunks. If you have a 4 core database problem, you can allocate 4 cores to it. If you need 8 cores, allocate 8 cores. Products like VMWare and HyperV took advantage of this and grew rapidly. These virtualization packages added features like live migration, dynamic sizing, and bursting utilization. If you allocate 4 cores and the processor goes to 90%, two more cores will be made available for a short burst. The big question comes up as to how you now license on a per core basis.
If you can flex to more processors without rebooting or live migrate from a 2 core to a 24 core system, which do you license for? Unfortunately, Oracle took a different position from the rest of the industry. None of the Oracle products contain a license key. None of the products require that you go to a web site and get a token to allow you to run the software. The code is wide open and freely available to load and run on any system that you want. Unfortunately, companies don't do a good job of tracking utilization. If someone from the sales or purchasing department rolls out a golden master onto a new virtual machine, no one is really tracking that. People outside of IT can easily spin up more licenses. They can provision a database in a cloud service and assume that the company has enough licenses to cover their project. After a while, licensing gets out of control and a license review is done to see what is actually being used and how it is being used. Named user licenses are great but you have to have a ratio of users to cores to meet minimums. You can't, for example, buy a 5 user license and deploy it on a 64 core system. You have to maintain a typical ratio of 25 users to a core or 40 users to a core based on the product that you are using. You also need to make sure that you understand soft partitioning vs hard partitioning. Soft partitioning is the ability to flex or change the core count without having to reconfigure or reboot the system. A hard partition puts hard limits on the core count and does not allow you to exceed it. Products like OracleVM, Solaris, and AIX contain hard partition virtualization. Products like HyperV and VMWare contain soft partitions. With soft partitions, you need to pay for all of the cores in the cluster since in theory you can consume all of the cores. To be honest, most people don't understand this and get in trouble with license reviews.

When we talk about cloud services, licensing is also important to understand. Oracle published cloud license rules to detail limits and restrictions. The database is still licensed on a per core basis. The Linux operating system is licensed per server instance and is limited to 8 virtual cores. If you deploy the Oracle database or WebLogic server in AWS or Azure or any other cloud vendor, you have to own a perpetual license for the database using the formulas above. The license must correlate to the high water mark for the core count that you provision. If you provision a 4 core system, you need a 2 processor license. If you run the database for six months and shut it off, you still need to own the perpetual license. The only way to work around this is to purchase database as a service in the Oracle cloud. You can pay for the database license on an hourly or monthly basis with metered services or on an annual basis with non-metered services. This provides a great cost savings because if we only need a database for 6 months we only need to pay for 6 months x the number of cores x the database edition type. If, for example, we want just the Database Enterprise Edition, it is $3K/core/month. If we want 4 cores that is $12K per month. If we want 6 months then we get it for $72K. We can walk away from the license and not have to pay the 22% annual maintenance on the $95K. We save $23K on the license price the first year and roughly $20K in annual maintenance by only using the database in the cloud for six months. If we wanted to use the database for 9 months, it is cheaper to own the license and lease processor and storage.
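To make the arithmetic above concrete, here is a minimal bash sketch using the list prices quoted in this post. Treat every number as illustrative; actual prices, core factors, and licensing metrics change over time and this is not an official calculator.

#!/bin/bash
# License math from this post: 4 x86 cores, 0.5 core factor, $47,500 per processor
# license, 22% annual support, versus Database Enterprise Edition as a service at
# $3K/core/month for 6 months.
CORES=4
CORE_FACTOR_PCT=50          # 0.5 core factor for x86, expressed as a percentage
PROC_LICENSE=47500          # Enterprise Edition per-processor license
SUPPORT_PCT=22              # annual support as a percentage of the license price
DBAAS_PER_CORE_MONTH=3000   # Enterprise Edition service, per core per month
MONTHS=6

procs=$(( CORES * CORE_FACTOR_PCT / 100 ))
license=$(( procs * PROC_LICENSE ))
support=$(( license * SUPPORT_PCT / 100 ))
cloud=$(( CORES * DBAAS_PER_CORE_MONTH * MONTHS ))

echo "On-premise: ${procs} processor licenses = \$${license} plus \$${support}/year support"
echo "Cloud: ${CORES} cores for ${MONTHS} months = \$${cloud} with no perpetual license"

Running this prints the $95K license, roughly $21K per year of support, and the $72K of metered service used in the comparison above.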
If we go to the next higher level of database, Database High Performance Edition at $4K/core/month, it becomes cheaper to use the cloud service because it contains so many options that cost $15K/processor. Features like partitioning, compression, diagnostics, tuning, real application testing, and encryption are part of this cloud service. Suddenly the economics are in favor of hosting a database in the cloud rather than hosting it in a data center.

Let's go back to the Cisco UCS and network attached storage discussion. If we purchase a UCS 8 blade server with 32 cores per blade we are looking at $150K or higher for the server. We then want to attach a 100 TB disk array to it at about $300K (remember the $3K/TB). We will then have to pay $300K for VMWare. If we add 10% support for hardware and 20% for software we are at just over $1M for a base solution. With a three year amortization we are looking at about $330K per year just to have a compute and storage infrastructure. We have to have a VMWare admin who doles out processors and storage, loads operating systems, creates golden masters, and acts as a traffic cop to manage and allocate resources. We still need to pay for the Oracle database license which is licensed on a per core basis. Unfortunately, with VMWare we must license all of the cores in the cluster, so we either have to sub-divide the cluster into one blade and license all 32 cores or end up paying for all 256 cores. At roughly $25K/core that gets expensive quickly. Yes, you can run OracleVM or Solaris on one of the blades, subdivide the database into two cores, and only pay for two cores since they both support hard partitioning, but you would be amazed at how many people fight this solution. You now have two virtualization engines that you need to support with two different file formats. No one really wants two virtualization solutions just to solve a licensing issue.

Oracle has taken a radically different approach to this. Rather than purchasing hardware, storage, and a virtualization platform, run everything in the cloud and pay for it on a monthly basis. The biggest objection is that the cloud is in another city and security, latency, ... you get the picture. The new solution is to run this hardware in your data center with the Oracle Public Cloud Machine. The cost of this solution is roughly $260K/year with a three year commit. You get 200-plus cores and roughly 100 TB of storage to use as you want. You don't manage it with VSphere but with the same web page that you use to manage the public cloud services. If you want to script everything then you can manage it with REST APIs or perl/java/insert-your-language scripts. The key benefit to this is that you no longer need to worry about what virtualization engine you are using. You manage at the higher level and lease the database or WebLogic or SOA license on an hourly or monthly basis.

Next week we will move up the stack and look at database hosting. Today we talked about infrastructure choices and how they impact database license cost. Going with AWS or Azure still requires that you purchase the database license. Going with the Oracle public cloud or public cloud machine allows you to not own the database license but effectively lease it on an hourly or monthly basis. It might not be the right solution for 7x24x365 operation but it might be.
It really is the right solution for bursty needs like holiday peak periods, student registration systems, and development and testing, where you only need a large footprint for a few weeks and don't want to buy for your high water mark and run at 20% utilization the rest of the year.


Industry generic technologies

Archive Storage Services vs Archive Cloud Services

Yesterday we started talking about the cost comparison for storage in the cloud. We briefly touched on the cost of long term archive in the cloud. How much does it cost to backup data for long term archive and what is the best way to do this? Years ago the default way of doing this was to copy your data on disk to a tape unit and put the tape in a box. The box was then put in an environmentally controlled room to extend the lifetime of tape and a person was put on staff to pull the data off the shelf when the data was needed. The data might be a backup of data on disk or a secondary copy just in case the disk failed. Tape was typically used to provide separation of duties required by Sarbanes-Oxley to keep people who report on financial data separate from the financial data. It also allowed companies to take large volumes of data, like seismic data, and not keep it on spinning disks. The traces were reloaded when geophysicists wanted to look at the data. The first innovation in this technology was to provide a robot to load and unload tapes as a tape unit gets full or needs to be reloaded. Magazines were created that could hold eight tapes, and the robots had bar code readers so that they could seek to the right tape in the magazine, pull it out of the series of tapes, and insert it into the tape unit for reading or writing. Management software got more advanced and understood the bar code values and could sequence the whopping 800 GB of data that could be written to an LTO-4 tape. Again, technology gets updated and the industry moved to LTO-5 and LTO-6 tapes with significantly higher densities. A single LTO-6 could hold 2.5 TB per tape. Technology marches on and compression allows us to store 6 TB on these tapes.

If we go back to our 120 TB case that we talked about yesterday, this means that we will need 20 tapes (at $30-$45 for each tape) and $25K for a single tape drive unit. Most tape drive systems support 8 tapes per magazine so we are talking about something that will support three magazines. To support three magazines, we need a second shelf in our tape storage so the price goes up by about $20K. We are sitting at about $55K to backup our 120 TB and $5.5K in support annually for the hardware. We also need about $1K in tapes for each set of full and incremental backups that we want, which comes to about $20K for four months of retention before we recycle the tapes. These tapes are good for a dozen re-writes so every three years we will need to repurchase tapes. If we spread the cost of the tape unit, tape drives, and tapes across three years we are looking at $2K/month to backup our 120 TB. We also need to factor in $60/week for tape pickup and storage fees at a service like Iron Mountain and a couple of $250 charges to retrieve tapes and drive them back to our data center from cold storage in the event of a catastrophic failure. This bumps the cost to $2.2K/month which is significantly cheaper than the $10K/month for network storage in our data center or $3.6K/month for cloud storage services. Unfortunately, a tape unit requires someone to care for and feed it, and you will pay that person more than $600/month but not the $7.8K/month which you would with the cloud or disk solutions.

If you had a ton of data to archive you could purchase a tape silo that supported hundreds or thousands of magazines. Unfortunately, this expandability comes at a cost. The tape backup unit grew from an eighth of a rack to twenty full racks. There isn't much in between.
You can get an eighth of a rack solution, a full rack solution, or a twenty full rack solution. The larger solution comes in at hundreds of thousands of dollars rather than tens of thousands.

Enter cloud solutions. Amazon and Oracle offer tape solutions in the cloud. Both companies operate the twenty full rack solution but only charge a per terabyte charge to consumers. Amazon Glacier charges $7/TB/month to store data. Oracle charges $1/TB/month for the same service. Both companies charge for data restoration and outbound transfer of data. The Amazon Glacier cost of writing 120 TB and reading back 10% of it comes in at $2218/month. This is the same cost as having the tape unit on site. The key difference is that we can recover the data by requesting it from Amazon and get it back in less than four hours. There are no emergency recovery charges. There are no weekly pickup charges. We can expand the amount that we backup and the bulk of this cost is reading back the data ($1300). Storage is relatively cheap for our backups; we just need to plan on the cost of recovery and try to limit it since it is the bulk of the cost. We can drop this cost even more using the Oracle Archive Cloud Services. The price from Oracle is $1/TB/month but the recovery and transmission charges are about the same. The same archive service with Oracle is $1560/month with roughly $1300 being the charges for restoring and outbound transfer of the data. Unfortunately, Oracle does not offer an un-metered archive service so we have to guesstimate how much we are going to restore on a monthly basis.

Both services use REST APIs to write, restore, and read data. When a container (Oracle Archive) or vault (Amazon Glacier) is created, a PUT call is done to the endpoint of the service. The first step required by both is authentication to provide credentials to the service. Below we show the Oracle authentication and creation process through the REST API. The important part of this is the archive header extension. This differentiates whether the container is spinning disk or tape in the cloud. Amazon recommends using a Windows based tool like S3 Browser or CloudBerry, or using a language like Java, .NET, or Ruby with their published SDKs. CloudBerry works for the Oracle Archive as well. When you create a container you have the option of pulling down storage or archive as the container type. Both services allow you to encrypt and compress the data as it is written, with HTTP headers changing the characteristics and parameters of the container. Both services require you to issue a PUT request to write the data to tape. Below we show the Oracle REST API. For CloudBerry and the other GUI based tools, uploading is just a drag and drop from your local file system to the tape storage in the cloud.

Amazon details the readback procedure and job system that shows the status of the restore request. Oracle has a similarly defined retrieval policy as well as an archive tutorial. Both services offer a 4 hour window to allow for restoration. Below is an example of a restore request and checking on the status of the job spawned to load the tape and transfer the data for reading. The file is ready to read when the completedPercentage is 100. We can do the same thing with the S3 browser and Amazon Glacier. We need to request the restore, check the job status, then download the restored files. The files change color when they are ready to read.
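As a concrete illustration of the REST flow described above, here is a minimal curl sketch for the Oracle side. The endpoint format, identity domain, user, container name, and file name are placeholders, and the X-Storage-Class: Archive header reflects how the Swift-style Oracle Storage/Archive REST API is generally documented, so verify the details against the current documentation before relying on them.

# Authenticate: trade the cloud username/password for a temporary auth token.
curl -v -X GET \
     -H "X-Storage-User: Storage-myIdentityDomain:myUser" \
     -H "X-Storage-Pass: myPassword" \
     https://myIdentityDomain.storage.oraclecloud.com/auth/v1.0
# The response carries X-Auth-Token and X-Storage-Url headers reused below.

# Create an archive (tape-backed) container; the X-Storage-Class: Archive header
# is the "archive header extension" that distinguishes it from spinning disk.
curl -v -X PUT \
     -H "X-Auth-Token: AUTH_tk..." \
     -H "X-Storage-Class: Archive" \
     https://myIdentityDomain.storage.oraclecloud.com/v1/Storage-myIdentityDomain/myArchiveContainer

# Upload an object into the archive container with another PUT.
curl -v -T backup_piece_001.bkp \
     -H "X-Auth-Token: AUTH_tk..." \
     https://myIdentityDomain.storage.oraclecloud.com/v1/Storage-myIdentityDomain/myArchiveContainer/backup_piece_001.bkp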
In summary, we have looked at how to reduce the cost of archives and backups. We looked at using a secondary disk at our data center or another data center. We looked at using on site tape units. We looked at disk in the cloud. Today we looked at tape in the cloud. It is important to remember that no one of these solutions is the answer; a combination of any or all of them is needed. Daily and weekly backups should happen to a secondary disk locally. This data is most likely to be restored on a regular basis. Once you get a full backup or two under your belt, move the data to another site. It might be spinning disk, it might be tape, but something needs to be offsite in the event of a true catastrophic failure like a communication link going out (think Dell PowerVault and a thunderstorm) where you lose your primary LUN and the secondary LUN that contains your backups. The whole idea of offsite backups is not fast restores but primarily insurance and regulatory compliance. If someone needs to see the old data, it is there. You are betting that you won't need to read it back and the cloud vendors are counting on that. If you do read it back on a regular basis you might want to significantly increase your budget, pass the charges on to the people who want to read data back, or look for another solution. Tape storage in the cloud is a great way of archiving data for a long time at a low cost.


Industry generic technologies

metered vs un-metered vs dedicated services

One of the newest concepts that has been introduced for cloud services is the concept of un-metered or dedicated services. Before we dive into this subject, let's review what a cloud service really is. When you boil it all down, you are basically leasing computer resources on a computer that you don't own. You are taking a slice of a compute engine, slice of a disk drive, part of a network connection. You are renting space. Think of it as living in an apartment. Yes, this is a silly analogy but if you think about it, it makes sense. You can rent an efficiency, one, two, or three bedroom apartment. You can get parking with or without a roof over your car. You can get a storage closet or a garage to store stuff that you are not using but want to keep around. There are benefits to apartments. You don't have to cut the grass. You typically have access to a pool but don't need to maintain it. If the toilet backs up or the gas stops working you call the super and they come fix it. You still have to replace your own light bulbs that burn out. You still need to clean your own bathroom and kitchen and take out your trash on a regular basis. On the grander scale, you don't need to drop 10% down and get a mortgage to live there. Monthly rents are typically cheaper than paying down a mortgage. Your taxes are bundled into your rent cost. You basically show up, use the apartment and go on with your life. On the flip side, there are drawbacks to apartments. If your upstairs neighbor likes to play heavy metal at 2am or throw wild parties on the weekends it does make it hard to sleep. Someone might park in your parking spot so you need to park farther away from your front door. You can't pull into your garage to unload your groceries and have to potentially carry them in the rain across the parking lot and up the stairs to your third floor apartment. The super might decide that Tuesday they are going to repaint all bathrooms another color and you need to be out of the way for a day and put up with the smell even though you planned a dinner party the next night. It is difficult to grill on your balcony and you can't really sit out without sharing the space with all of your neighbors. The true downfall is that twenty years from now, you will still be renting your apartment (and the rent probably went up every other year) while your college buddies are celebrating a mortgage burning party and the only thing that they owe on a monthly basis is the taxes that the government takes annually. Yes, our analogy is silly. Yes, our analogy is relevant. It is easy to decide that you want another job in another city so you hire a mover, pack up all your stuff, and move to another apartment. This is where our analogy breaks down. Cloud vendors charge you for every piece of furniture that you take out of the building. They charge you to use the stairs or elevator. They charge you every time a moving van exits the building full of furniture and boxes of clothes. It is free to bring stuff in because it locks you into the apartment. Just don't try to take anything out. Remember that storage closet or garage that you got with your apartment, you can open the door and put stuff in for free but if you carry anything out (even if you just relocate it to your apartment) you get charged per item that you carry across the threshold. If you look at storage from any cloud vendor they offer a metered storage service. The same is true for compute services. 
You can lease a virtual processor and memory and grind on data all that you want. The catch is that when you want to transfer your files or report the results of your analysis to your desktop computer, you get charged per gigabyte transferred across the internet. Cost calculators help you estimate these costs but outbound data charges are a little hard to predict. Amazon, for example, has a calculator that you can use. The AWS pricing calculator allows you to look at the cost of all cloud services. Let's walk through the cost of Amazon Glacier. The price list says that you should pay $0.007/GB/month or $7/TB/month to keep things in cold storage. We will use 120 TB as our basis for analysis. We put this in as the amount to store and see that the cost of storing the data is $860/month. If we plan on reading back 10% of this data during the month the price goes up to $2217. The bulk of these charges are the outbound charges. The cost goes to $921 if we read the data to an EC2 instance and not all the way back to our data center or desktop computer.

To use our apartment analogy, you are paying $860 to get a storage garage. You pay $61 every time you take something out and move it to your apartment. You can put all you want into the storage area (as long as you don't exceed the space of the storage unit) but taking something out will cost you. If you put your recently retrieved item in your car or a truck and drive it out the gate you get a surcharge of $1300. It is important to remember that pulling more stuff out of your storage will cost you more. Putting this in terms of computer archives, you can store all of your emails, contracts, customer transactions, and patient records in long term storage. If your on-site storage fails for some reason or if you get a legal request to review five years of data, you can pull the data back from cold storage. It will cost you to pull the data back but it is still cheaper than keeping seven years or more of data on spinning disk in your data center (estimate $3K/TB plus 10% per year for spinning disk in your data center).

We can do the same calculation for cloud storage using S3. We can store 120 TB for roughly $3950/month. If we want to read back 10%, or 12 TB, of that data, it will cost us $5150, or $1200 additional. We can reduce the cost by using lower speed storage in the cloud. We put the S3 data into the infrequent access category to save money. This drops the cost to just over $3K which does save us about $2100/month. We agree to pay a lower cost to get higher latency and longer retrieval times. It is better than using tape in the cloud and we can save some money with this option. We can opt for reduced redundancy storage (aka non-mirrored and non-replicated data) but we risk data loss since we will only have a single copy in the cloud. This drops the cost to $4300 with the data retrieval but we have to weigh the cost vs data loss risks.

Let's not pick on Amazon. How does this compare to Azure? Unfortunately, we can't start with Microsoft tape in the cloud; they don't offer the service. We must start with blob storage in the Azure cloud. Microsoft has an Azure pricing calculator that you can use to perform the same calculations. The calculator and pricing are a little difficult to use when you first get into it. You basically need to put together the calculation a piece at a time. You need to factor in the cost of the storage and the cost of transferring the data from the cloud to your data center. This is done in two different pieces.
An example of what we are looking for can be seen below. We need to piece together the calculator. First we add the storage component, then the bandwidth component. There is a transaction component but this amount is trivial and we are going to ignore it for simplicity. If we look at the options for Azure storage, we can basically select blob storage in different zones. In the grand scheme of things, the cost is not significantly higher one way or another. The basic cost is about the same.

The third class of storage that we are going to look at is the Oracle Storage Cloud Service. We can look at the Oracle Storage Cloud Service as well as the Oracle Archive Cloud Service. The Archive service compares directly to the Amazon Glacier service except that it is $1/TB/month and suffers the same transfer charges for outbound data. The Oracle Storage Cloud Service is similar to the Amazon S3 and Azure Blob Storage Service but it is offered both as a metered service (as are S3 and Blobs) and as an un-metered service. Unfortunately, Oracle does not provide a cost calculator for general use. The Value Added Distributors are given a copy of the calculator but it is not generally available. The key difference with the Oracle storage services is that there are two significant flavors: metered and non-metered. The metered services are charged just like the Microsoft and Amazon services. You pay for what you use on a per GB basis and pay for outbound data transfer. An example of the pricing calculator is shown below. Note that we do need to have a good guesstimate of how much data we will transfer outbound across the internet. These charges are not incurred if you are reading the data to a compute engine in the cloud, unlike S3 which still incurs cost just for reading the data off the disk.

The most significant differential in storage offerings is the non-metered storage. Oracle offers storage in blocks that you reserve and allocate for 12 months. This is different from the metered storage in that metered can start with 10 TB and grow to 120 TB over the year. With the non-metered storage, you start with 120 TB and end with 120 TB. You can extend your contract and grow storage but you basically sign a new contract for more storage. You cannot shrink your storage and pay for less. The benefit of this is that you don't have to pay for outbound data transfer. You can read and write as much as you want and not get charged for transferring the data across the internet. A pricing calculator for this is simple: how much do you need and how long do you need it?

If we piece all of this together and look at a price comparison between the three service providers, the answer to which is cheapest comes down to "it depends". Oracle non-metered storage has a significant advantage if you are planning on reading back your data at high or unpredictable rates. Amazon S3 infrequent access is the cheapest if you don't plan on reading back your data and want it as an insurance policy only. I honestly would go with Glacier or Oracle Archive if this is the case since it is an order of magnitude cheaper. The chart below compares 120 TB of storage and the variable charge for reading back this data on a monthly basis. If you have 120 TB of storage and plan on reading back 120 TB on a monthly basis, the Oracle non-metered storage is significantly cheaper.
If you are only planning on reading back 12 to 24 TB per month the cost is about the same for all of the services.

In summary, one option is not clearly better than the other (except for high read rates) and this blog is intended to help you decide on what fits your needs best. Pricing calculators can help with the cost based on transfer rates. It is important to remember that storage transfer is a significant part of the calculation. It is also important to look at your usage model. We assumed that you started with 120 TB and ended with 120 TB for our analysis. If you start with 12 TB and grow to 120 TB, the pricing calculation will be a little different. Neither the Amazon nor Azure calculators will help you run this simulation and you will have to calculate everything on a month by month basis. It is also interesting to take 120 TB of on-premise storage and assume that each TB can be purchased at $3K/TB. If we assume 10% annual hardware maintenance and a three year amortization, the charge for on-premise storage is $1030/month which might be more or less than cloud based storage. Your results might vary.
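For a rough month-by-month comparison, here is a minimal bash sketch using the figures quoted in this and the previous post. The $33/TB S3 rate is derived from the $3950/month for 120 TB figure, the $100/TB outbound rate is derived from the "12 TB read back costs about $1200 extra" figure, and the $30/TB non-metered rate is a placeholder assumption since Oracle does not publish a general calculator; none of these are official prices.

#!/bin/bash
# Back-of-the-envelope monthly cost for 120 TB stored, reading back 12 TB.
TB_STORED=120
TB_READ=12                 # how much we expect to pull back each month
OUTBOUND_PER_TB=100        # blended outbound transfer estimate, $/TB

glacier=$(( TB_STORED * 7  + TB_READ * OUTBOUND_PER_TB ))   # $7/TB/month
s3=$((      TB_STORED * 33 + TB_READ * OUTBOUND_PER_TB ))   # ~$33/TB/month derived above
archive=$(( TB_STORED * 1  + TB_READ * OUTBOUND_PER_TB ))   # $1/TB/month
unmetered=$(( TB_STORED * 30 ))                             # assumed flat rate, no transfer fee

echo "Amazon Glacier:       \$${glacier}/month"
echo "Amazon S3 (standard): \$${s3}/month"
echo "Oracle Archive:       \$${archive}/month"
echo "Oracle non-metered:   \$${unmetered}/month (no outbound charges)"

Changing TB_READ is the quickest way to see why the non-metered option wins at high or unpredictable read rates while the archive tiers win when data is rarely read back.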


Industry generic technologies

making backup better

Yesterday we looked at backing up our production databases to cloud storage. One of the main motivations behind doing this was cost. We were able to reduce the cost of storage from $3K/TB capex plus $300/TB/year opex to $400/TB/year opex. This is a great solution but some customers complain that it is not generic enough and that latency to cloud storage is not that great. Today we are going to address both of these issues with the cloud storage appliance.

First, let's address both of the typical customer complaints. The database backup cloud service is just that. It backs up a database. It does it really well and it does it efficiently. You replace one of the backup library modules with one that translates writes of backup data to the cloud REST API rather than to a tape driver. The software works well with commercial products like Symantec or Legato and integrates well into that solution. Unfortunately, the critics are right. The database backup cloud service does that and only that. It backs up Oracle databases. It does not backup MySQL, SQL Server, DB2, or other databases. It is a single use tool. A very useful single use tool, but a single use tool. We need to figure out how to make it more generic and backup more than just databases. It would be nice if we could have it backup home directories, email servers, virtual machines, and other stuff that is used in the data center.

The second complaint is latency. If we are writing to an SSD or spinning disk attached to a server via high speed SCSI, iSCSI, or SAS, we should expect 10ms access time or less. If we are writing to a server half way across the country we might experience 80ms latency. This means that a typical read or write takes eight times longer when we read and write cloud storage. For some applications this is not an issue. For others this latency makes the system unusable. We need to figure out how to read and write at 10ms latency but leverage the expandability of the cloud storage and lower cost.

Enter stage left the Oracle Cloud Storage Appliance. The appliance is a software component that listens on the data center network using the NFS protocol and talks to the cloud services using the storage REST API. Local disks are used as a cache front end to store data that is written to and read from the network shares exposed by the appliance. These directories map directly to containers in the Oracle Storage Cloud Service and can be clear text or encrypted when stored. Data written from network servers is accepted and released quickly as it is written to local disk and slowly trickled to the cloud storage. As the cache fills up, data is aged and migrated from the cache storage into cloud storage. The metadata representing the directory structure and storage location is updated to show that the data is no longer stored locally but stored in the cloud. If a read occurs from the file system, the metadata helps the appliance locate where the data is stored and it is presented to the network client from the cache or pulled from the cloud storage and temporarily stored in the local cache as long as there is space. A block diagram of this architecture is shown below.

The concept of how to use this device is simple. We create a container in our cloud storage and we attach to it with the cloud storage appliance. This attachment is exposed via an nfs mount to clients on our corporate network and any client on that network can read or write files in the cloud storage.
Operations happen at local disk speed using the network security of the local network and group/owner rights in the directory structure. It looks, smells, and feels just like nfs storage that we would spend thousands of dollars per TB to own and operate. For the rest of this blog we are going to go through the installation steps on how to configure the appliance. The minimum requirements for the appliance are
Linux 7 (3.10 kernel or later)
Docker 1.6.1 or later
two dual core x86 CPUs
4 GB of RAM

We will be installing our configuration on a Windows desktop running VirtualBox. We will not go through the installation of Oracle Enterprise Linux 7 because we covered this a long time ago. We do need to configure the OS to have 4 GB of RAM and at least 2 virtual cores as shown in the images below. We also need to configure a network. We configure two networks. One is for the local desktop console and the other is for the public internet. We could configure a third interface to represent our storage network but for simplicity we only configure two.

We can boot our Linux 7 system and will need to select the 3.10 kernel. By default it will want to boot to the 3.8 kernel which will cause problems in how the appliance works. What we would like to do is remove the 3.8 kernel from our installation. This is done by removing the packages with the rpm -e command. We then update the grub.cfg file to list only the 3.10 kernels. Once we have removed the kernels, we update the grub loader and enable additional options for the yum update. The next step that we need to take is to install docker. This is done with the yum install command. Once we have the docker package installed, we need to make sure that we have the nfs-client and nfs-server installed and started.

It is important to note that the tar bundle is not generally available. It does require product manager approval to get a copy of the software for installation. The file that I got was labeled oscsa-1.0.5.tar.gz. I had to unzip and untar this file after loading it on my Linux VirtualBox instance. I did not do a screen capture of the download but did go through the installation process. We start the service with the oscsa command. When we start it, it brings up a management web page so that we can make the connection to the cloud storage service. To see this page we need to start firefox and connect to the page.

One of the things that we need to know is the end point of our storage. We can find this by looking at the management console for our cloud services. If we click on the storage cloud service details link we can find it. Once we have the end point we will need to enter this into the management console of the appliance as well as the cloud credentials. We can add encryption and a container name for our network share and start reading and writing. We can verify that everything is working from our desktop by mounting the nfs share or by using CloudBerry to examine our cloud storage containers. In this example we use CloudBerry just like we did when we looked at the generic Oracle Storage Cloud Services. We can examine the properties of the container and network share from the management console. We can look at activity and resources available for the caching.
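Pulling the install narrative above into one place, here is a hedged sketch of the commands involved. The kernel package names and versions, the oscsa invocation, and the bundle layout are assumptions based on the description in this post, so check the README that ships with the bundle before running anything.

# Remove the 3.8 UEK kernel so grub only offers the 3.10 kernel.
# List the installed kernels first; the package names below are examples only.
rpm -qa | grep kernel-uek
rpm -e kernel-uek-3.8.13-118.el7uek kernel-uek-firmware-3.8.13-118.el7uek
grub2-mkconfig -o /boot/grub2/grub.cfg

# Install docker and the NFS client/server pieces, then enable and start them.
yum install -y docker nfs-utils
systemctl enable docker nfs-server
systemctl start docker nfs-server

# Unpack the appliance bundle (obtained through your product manager) and start it.
tar xzvf oscsa-1.0.5.tar.gz
cd oscsa-1.0.5
./oscsa up        # assumed invocation; the post only says "the oscsa command"
# Point firefox at the management URL it reports, then enter the storage REST
# endpoint and cloud credentials from the storage cloud service details page.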
In summary, we looked at a solution to two problems with our database backup service. The first was that it is single purpose and the second was latency. By providing a network share to the data center we can not only backup our Oracle database but all of our databases by having the backup software write to the network share. We can backup other files like virtual machines, email boxes, and home directories. Disk latency operates at the speed of the local disk rather than the speed of the cloud storage. This software does not cost anything additional and can be installed on any virtual platform that supports Linux 7 with kernel 3.10 or greater. When we compare this to the Amazon Storage Gateway, which requires 2x the processing power and $125/month to operate, it looks significantly better. We did not compare it to the Azure solution because it is an iSCSI hardware solution and not easy to get a copy of for testing.


Industry generic technologies

backing up a database to the cloud

Up to this point we have talked about the building blocks of the cloud. Today we are going to look into the real economics of using some of the cloud services that we have been examining. We have looked at moving compute and storage to the cloud. Let's look at some of the reasons why someone would look at storage in the cloud. Storage is one of those funny things that everyone asks for. Think of uses for storage. You save emails that come in every day. If you host your email system in your corporation, you have to consider how many emails someone can keep. You have to consider how long you keep files associated with email. At Oracle we have just over 100,000 employees and limit everyone to 2GB for email. This means that we need 200 TB to store email. If we increase this to 20 GB this grows to 2 PB. At $3K/TB we are looking at $600K capex to handle email messages. If we grow this to 2 PB we are looking at $6M for storage. This is getting into real money. Associated with this storage is a 10% support cost ($60K opex annually) as well as part of a full time employee to replace defective disks, tune and feed the storage system, and allocate disks and partitions not only to our storage but to other projects, at a cost of $80K payroll annually. If we use a 4 year depreciation, our email boxes will cost us ($150K capex + $60K opex + $80K opex) $290K per year, or roughly $3/user annually, just to store the email. If we expand the email limits to 20 GB we grow almost everything by a factor of 10 as well, so the email boxes cost us roughly $22/user annually (we don't need 10x the storage admins). Pile on top of this home directories that people want to save attachments into and this number explodes. We typically do want to give everyone 20 GB for a home directory since this stores documents associated with operation of the company. We typically want people storing these documents on a network share and not on a disk on their laptop. Storing data on their laptop opens up security and data protection discussions as well as access to data if the laptop fails. Putting it on a shared home directory allows the company to backup the files as well as define protection mechanisms for sensitive data. We have b