Tuesday Feb 12, 2013

SnapManager for Oracle DB for ZFSSA is out and ready

A few weeks ago, Oracle announced the Oracle database SnapManager software for ZFSSA.

It is a license just like the Clone or the Replication license. It's just a one-time, yes-or-no, on-or-off license per controller. Better yet, you can go ahead and get the software and try it out for free for 30 days. Go check it out with the link below.

The Snap Management Utility combines the underlying snapshot, clone, and rollback capabilities of the Oracle ZFS Storage Appliance with standard host-side processing so all operations are consistent.

Downloading the Oracle Snap Management Utility for Oracle Database Software

A. Customers who purchased the license need to download the software from eDelivery (see instructions below).

B. Customers who wish to evaluate for 30 days prior to purchase may download from the same site. The license allows a 30-day evaluation period. Follow the instructions below.

Instructions to download software:

1. Go to eDelivery link: https://edelivery.oracle.com/EPD/Search/handle_go

2. Login

3. Accept Terms and Restrictions

4. In the “Media Pack Search” window:

a. Under Product Pack, select “Sun Products”

b. Under Platform, select “Generic”

c. Click “Go”

5. From the results, select the “Oracle Snap Management Utility for Oracle Database”

6. There are two files for download:

a. The “Oracle Snap Management Utility for Oracle Database, Client v 1.1.0” is required.

b. The “Sun ZFS Storage Software 2011.1.5.0” is the latest version of the ZFS Storage Appliance software, provided for customers who need to upgrade.

UPDATE 7-27-13: I just found out that if you buy the SMU license, you do NOT need to buy the Clone license. Cloning is included in SMU, so that's cool.

Monday Feb 11, 2013

Oracle IaaS now includes the ZFS Backup Appliance

Ok, so this is pretty cool. If you didn't know, Oracle has this great program called IaaS (Infrastructure as a Service). You can go check it out here: http://www.oracle.com/us/products/engineered-systems/iaas/overview/index.html

What this means is that someone who really wants an Oracle engineered system, such as an Exadata, but can't come up with the up-front cost, can use IaaS and put it in their datacenter for a low monthly fee. This can be really cool. Some people can now change their entire budget from CapEx to OpEx, save a bunch of up-front costs, and still get the hardware they need and want.

As of this week, the ZFSBA is now included in the IaaS offering. So one can get the ZFS Backup Appliance and use it to back up their engineered system (Exadata, Exalogic, or SuperCluster) over InfiniBand. They can also use it to make snaps and clones of that data for testing and development, as well as use it for general-purpose storage over 10Gig, 1Gig, or FC. A pretty sweet way to get the ZFS storage system into your site without the up-front costs. You can even get the ZFSBA in an IaaS deal all by itself, without the engineered system at all, just to get the ZFS storage.

Now, some of you may be asking, "What the heck is the ZFSBA and how is it different than the ZFSSA?"

I haven't talked about the ZFSBA, the ZFS Backup Appliance, before. I probably should have. You can get more info on it here: http://www.oracle.com/us/products/servers-storage/storage/nas/zfs-backup-appliance/overview/index.html
Here is the low-down. It's a 7420 cluster with drive trays, all pre-cabled and in a rack, ready to go. The 7420 has IB cards in place, and the whole system is a single line-item part number, making it easy for the sales team to add a ZFSSA to an engineered-system deal for backing up that system. There are two versions, one with high-capacity drives and the other with high-performance drives. Either version can take additional trays of either type later. Unlike the other engineered systems, the ZFSBA does allow one to use the extra space in the rack, which is nice.
Sun ZFS Storage 7420

So, if you want a 7420 cluster and a rack, is there a downside to always using the ZFSBA to order a 7420? Not many. Same price, and easier to order with fewer part numbers. You can still customize it and add more stuff. There is one downside: the ZFSBA uses the 32-core version of the 7420, not the 40-core version. Backing up an Exadata does not require more cores, so they went with the smaller of the two. If you need more power and more DRAM for faster workloads, however, you may want to build a 7420 ZFSSA the normal way.

If this doesn't make sense, please add a comment below or just email me.  


New trays- better pictures

Ok, here are some much better pictures of our two new trays. 
The DE2-24P is the 2u performance model, meaning that it holds 2.5" 10,000 RPM drives (and up to four LZ SSDs, of course). These are currently either 300GB or 900GB drives.

The DE2-24C is the 4u capacity model, which holds the larger 3.5" 7,200 RPM drives and LZ drives. These are currently 3TB drives.

One of these days, I really need to update my storage eye charts with these new trays. I just haven't had the time!!! 

Tuesday Jan 08, 2013

New code and new disk trays !!!

Hey everybody, happy new year and some great news for the ZFSSA...

The new 2u disk trays have come out early. I was not expecting them until later this quarter, but was surprised yesterday when Oracle announced them ready for sale. Sweet. So we now have a 4u capacity tray with 3TB drives (soon to be 4TB drives), and a 2u high-performance tray with either 300GB or 900GB 10K-speed drives. These new 900GB 10K-speed drives have the same IOPS as our current 600GB 15K-speed drives, since the form factor went from 3.5" to 2.5". So you can now have 24 drives in a 2u tray. Very cool. These new trays require OS 2011.1.5, and right now you can NOT mix them with the older DS2 trays. Being able to mix them will be supported later, however.

To go along with that, the new 2011.1.5 code has been released. You can download it right now on MOS. It fixes a ridiculous number of issues, as well as supporting these new 2u drive trays. You can read all about the new code here: https://updates.oracle.com/Orion/Services/download?type=readme&aru=15826899


**Update 1-18-13 - I need to correct myself, and I'm adding this note instead of changing what I wrote above and trying to hide that I messed up... Hey, it happens...
At first I was led to believe that the smaller platter made up for the slower rotational speed on the new 2.5" drives. This is not the case. It does help, but the 10K-speed drives do get slightly fewer IOPS and less throughput than the 3.5" 15K-speed drives. Not that this matters too much for us, since we pride ourselves on the fact that we drive performance with the ZFSSA via our cache, not our spindle speed, but it's important to point out. Now, the power savings and space savings are real, and very much worth the smaller form factor. Also, you do understand that Oracle does not have a whole lot to do with this? It's the way drive manufacturers are going. They just don't make 2.5" drives at 15K speed. So this is the way it is. Now, at some point sooner rather than later, we will also be putting out an all-SSD tray. So if you need fast IOPS on the spindles, we will have you covered there, too.

Monday Dec 03, 2012

My error with upgrading 4.0 to 4.2- What NOT to do...

Last week, I was helping a client upgrade from the 2011.1.4.0 code to the newest 2011.1.4.2 code. We downloaded the 4.2 update from MOS, uploaded and unpacked it on both controllers, and upgraded one of the controllers in the cluster with no issues at all. As this was a brand-new system with no networking or pools configured yet, there were no resources to fail back and forth between the controllers. Each controller had its own private management interface (igb0 and igb1), and that's it. So we took controller 1 as the passive controller and upgraded it first. It came back up with no issues and was now on the 4.2 code. Great. We then did a takeover on controller 1, making it the active head (although there were no resources for it to take), and proceeded to upgrade controller 2.

Upon upgrading the second controller, we ran the health check with no issues. We then ran the update, and it proceeded and rebooted normally. However, something strange then happened. It took longer than normal to come back up, and when it did, we got the "cluster controllers on different code" error message that one gets when the two controllers of a cluster are running different code. But we had just upgraded the second controller to 4.2, so they should have been the same, right???

Going into the Maintenance-->System screen of controller 2, we saw something very strange. The "current version" was still 4.0, and the 4.2 code was there, but in the "previous" state with the rollback icon, as if it were the OLDER code and not the newer one. I have never seen this happen before. I would have thought it was a bad 4.2 code file, but it had worked just fine on controller 1, so I don't think that was it. Other than the code not updating, there was nothing else going on with this system. It had no yellow lights, no errors in the Problems section, and no errors in any of the logs. It had been out of the box only a few hours, and didn't even have a storage pool yet.

So.... We deleted the 4.2 code, uploaded it from scratch, ran the health check, and ran the upgrade again. Once again, it seemed to go great, rebooted, and came back up with the same issue, booting to 4.0 instead of 4.2. See the picture below.... HERE IS WHERE I MADE A BIG MISTAKE....

I SHOULD have instantly called support and opened a Sev 2 ticket. They could have done a shared shell, gotten the right Fishworks engineer to look at the files and the code, determined what file was messed up, and fixed it. The system was up and working just fine; it was merely on an older code version, not really a huge problem at all.

Instead, I went ahead and clicked the "Rollback" icon, thinking that the system would roll back to the 4.2 code. Ouch... What happened was that the system said, "Fine, I will delete the 4.0 code and boot to your 4.2 code"... which was stupid on my part, because something was wrong with the 4.2 code file here, and the 4.0 was just fine.

So now the system could not boot at all, the 4.0 code was completely missing from it, and even a high-level Fishworks engineer could not help us. I had messed it up good. We could only get to the ILOM, and I had to re-image the system from scratch using a hard-to-get-and-use FishStick USB drive. These are tightly controlled and difficult to get, almost always handcuffed to an engineer who will drive out to re-image a system. This cost another day of my client's time.

So.... If you see a "previous version" of your system code which is actually a version higher than the current version... DO NOT ROLL IT BACK.... It did not upgrade for a very good reason.

In my case, after the system was re-imaged to a code level just three back, we tried the same 4.2 code update once again. It worked perfectly the first time, and the system is now great and stable. Lesson learned.

By the way, our buddy Ryan Matthews wanted to point out the best-practice and supported way of performing an upgrade of an active/active ZFSSA, where both controllers are doing some of the work. These steps would not have helped me with the above issue, but it's important to follow the correct procedure when doing an upgrade.

1) Upload software to both controllers and wait for it to unpack
2) On controller "A" navigate to configuration/cluster and click "takeover"
3) Wait for controller "B" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software.
4) Wait for controller "B" to apply the update and finish rebooting
5) Login to controller "B", navigate to configuration/cluster and click "takeover"
6) Wait for controller "A" to finish restarting, then login to it, navigate to maintenance/system, and roll forward to the new software.
7) Wait for controller "A" to apply the update and finish rebooting
8) Login to controller "B", navigate to configuration/cluster and click "failback"
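The eight steps above boil down to a fixed order of operations, which I find easier to sanity-check as a list. Here's a little Python dry-run sketch of that order. To be clear, this is just my illustration: the controller names are made up, nothing here talks to a real appliance, and the BUI paths in the step text are simply the ones from the procedure above.

```python
# Dry-run sketch of the rolling-upgrade order for an active/active cluster.
# Controller names ("zfssa-a", "zfssa-b") are hypothetical; the steps are
# only printed, never executed against any system.

def rolling_upgrade_plan(a: str, b: str) -> list[str]:
    """Return the ordered steps for upgrading controller b first, then a."""
    return [
        f"upload the update to {a} and {b} and wait for it to unpack",
        f"{a}: configuration/cluster -> takeover",      # b restarts passive
        f"{b}: maintenance/system -> apply new software",
        f"wait for {b} to finish rebooting on the new code",
        f"{b}: configuration/cluster -> takeover",      # a restarts passive
        f"{a}: maintenance/system -> apply new software",
        f"wait for {a} to finish rebooting on the new code",
        f"{b}: configuration/cluster -> failback",      # resources return to a
    ]

for step in rolling_upgrade_plan("zfssa-a", "zfssa-b"):
    print(step)
```

The point of writing it out this way is that the takeover always happens on the controller you are NOT about to upgrade, and the failback comes only at the very end.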

Thursday Nov 15, 2012

New code release today - 2011.1.4.2

Wow, two blog entries in the same day! When I wrote the large 'Quota' blog entry below, I did not realize there would be a micro-code update going out the same evening.

So here it is. Code 2011.1.4.2 has just been released. You can get the readme file for it here: https://wikis.oracle.com/display/FishWorks/ak-2011.

Download it, of course, through the MOS website.

It looks like it fixes a pretty nasty bug. Get it if you think it applies to you. Unless you have a great reason NOT to upgrade, I would strongly advise you to upgrade to 2011.1.4.2. Why? Because the readme file says they STRONGLY RECOMMEND YOU ALL UPGRADE TO THIS CODE IMMEDIATELY using LOTS OF CAPITAL LETTERS.

That's good enough for me. Be sure to run the health check like the readme tells you to. 

**Updated after I posted the above... What worries me is that 2011.1.5.0 was supposed to be out pretty soon, as in weeks. So if they put this 1.4.2 version out now, instead of just adding these three fixes to the 1.5.0 code, the fixes must be pretty important.

Quotas - Using quotas on ZFSSA shares and projects and users

So you don't want your users to fill up your entire storage pool with their MP3 files, right? Good idea to make some quotas. There are some good tips and tricks here, including a helpful workflow (a script) that will allow you to set a default quota for all of the users of a share at once.

Let's start with some basics. I made a project called "small" and inside it I made a share called "Share1". You can set quotas at the project level, which will affect all of the shares in it, or you can do it at the share level like I am here. Go to the share's General property page.

First, I'm using a Windows client, so I need to make sure I have my SMB mountpoint. Do you know this trick yet? Go to the Protocol page of the share. See the SMB section? It needs a resource name to make the UNC path for the SMB (Windows) users. You do NOT have to type this name in for every share you make! Do this at the project level. Before you make any shares, go to the Protocol properties of the project, and set the SMB resource name to "On". This special value will automatically make the SMB resource name of every share in the project the same as the share name. Note the UNC path name I got below. Since I did this at the project level, I didn't have to lift a finger for it to work on every share I make in this project. Simple.

So I have now mapped my Windows "Z:" drive to this Share1. I logged in as the user "Joe". Note that my computer shows my Z: drive as 34GB, which is the entire size of my Pool that this share is in. Right now, Joe could fill this drive up and it would fill up my pool. 

Now, go back to the General properties of Share1. In the "Space Usage" area, over on the right, click on the "Show All" text under the Users & Groups section. Sure enough, Joe and some other users are in here and have some data. Note this is also a handy window to use just to see how much space your users are using in any given share. 

Ok, Joe owes us money from lunch last week, so we want to give him a quota of 100MB. Type his name in the Users box. Notice how it now shows you how much data he's currently using. Go ahead and give him a 100M quota and hit the Apply button.

If I go back to "Show All", I can see that Joe now has a quota, and no one else does.

Sure enough, as soon as I refresh my screen back on Joe's client, he sees that his Z: drive is now only 100MB, and he's more than halfway full.

That was easy enough, but what if you wanted the whole share to have a quota, so that the share itself, no matter who uses it, can only grow to a certain size? That's even easier. Just use the Quota box on the left-hand side. Here, I put a quota of 300MB on the share.

So now I log off as Joe, and log in as Steve. Even though Steve does NOT have a quota, my Z: drive now shows as 300MB. This would affect anyone, INCLUDING the ROOT user, because you specified the quota on the SHARE, not on a person.

Note that back in the share, if you click the "Show All" text, the window does NOT show Steve, or anyone else, as having a quota of 300MB. Yet we do, because it's on the share itself, not on any user, so this panel does not see it.

Ok, here is where it gets FUN....

Let's say you do NOT want a quota on the SHARE, because you want SOME people, like root and yourself, to have FULL access to it and the ability to fill the whole thing up if you darn well feel like it. HOWEVER, you want to give the other users a quota. HOWEVER, you have, say, 200 users, and you do NOT feel like typing in each of their names and giving them each a quota, and they are not all members of an AD global group you could use or anything like that. Hmmmmmm....

No worries, mate. We have a handy-dandy script that can do this for us. Now, this script was written a few years back by Tim Graves, one of our ZFSSA engineers out of the UK. This is not my script. It is NOT supported by Oracle support in any way. It does work fine with the 2011.1.4 code as best as I can tell, but Oracle, and I, are NOT responsible for ANYTHING that you do with this script. Furthermore, I will NOT give you this script, so do not ask me for it. You need to get this from your local Oracle storage SC. I will give it to them. I want this only going to my fellow SCs, who can then work with you to have it and show you how it works. 

Here's what it does...
Once you add this workflow to the Maintenance-->Workflows section, you click it once to run it. Nothing seems to happen at this point, but something did. 

 Go back to any share or project. You will see that you now have four new, custom properties on the bottom.

Do NOT touch the bottom two properties, EVER. Only touch the top two. Here, I'm going to give my users a default quota of about 40MB each. The beauty of this script is that it will only affect users that do NOT already have any kind of personal quota. It will only change people who have no quota at all. It does not affect the root user.

After I hit Apply on the Share screen, nothing will happen until I go back and run the script again. The first time you run it, it creates the custom properties. The second and all subsequent times you run it, it checks the share for any users and applies your quota number to each of them, UNLESS they already have one set. Notice in the readout below how it did NOT apply to my Joe user, since Joe had a quota set.
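Conceptually, the workflow's second-run pass does something like the following. This is my own Python sketch of the logic, NOT the actual workflow (which is an appliance script I'm not publishing here), and the dict-of-users data model is invented purely for illustration.

```python
# Illustrative sketch of the default-quota pass: every user with no
# personal quota gets the default; root, and anyone whose quota is
# already set, are left alone. Quota values are in bytes; 0 means
# "no quota set". The user dict below is hypothetical.

def apply_default_quota(user_quotas: dict[str, int], default: int) -> dict[str, int]:
    """Return updated per-user quotas after one pass of the script."""
    updated = {}
    for user, quota in user_quotas.items():
        if user == "root" or quota != 0:
            updated[user] = quota        # already set (or root): untouched
        else:
            updated[user] = default      # no quota yet: gets the default
    return updated

users = {"joe": 100 * 2**20, "steve": 0, "anna": 0, "root": 0}
print(apply_default_quota(users, 40 * 2**20))
```

Re-running the pass is harmless: anyone it already touched now has a nonzero quota, so they're skipped the next time around, which matches the behavior described above.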

Sure enough, when I go back to the "Show All" in the share properties, all of the users who did not have a quota now have one of 39.1MB. Hmmm... I did my math wrong, didn't I?
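My guess at what went sideways with the math: the appliance reports sizes in binary units (MiB, i.e. 2^20 bytes), while it's easy to think in round decimal megabytes when typing the number in. A quick conversion sketch shows the effect; the 41,000,000-byte figure here is just my illustration of a value that lands near 39.1, not necessarily what I typed.

```python
# Decimal vs. binary size units: the same byte count reads differently
# depending on whether you divide by 10**6 (MB) or 2**20 (MiB).

def to_mib(nbytes: int) -> float:
    """Convert a byte count to binary mebibytes (MiB)."""
    return nbytes / 2**20

# A hypothetical entry of 41,000,000 bytes shows up as roughly 39.1 "MB"
# in a binary-units readout:
print(round(to_mib(41_000_000), 1))   # 39.1
```

So a nice round decimal number will almost always display as an odd-looking binary one.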

 That's OK, I'll just change the number of the Custom Default quota again. Here, I am adding a zero on the end.

 After I click Apply, and then run the script again, all of my users, except Joe, now have a quota of 391MB

You can customize a person at any time. Here, I took the Steve user and specifically gave him a quota of zero. Now when I run the script again, he is different from the rest, so he is no longer affected by the script. Under Show All, I see that Joe is at 100, and Steve has no quota at all. I can do this all day long. Yes, you will have to re-run the script every time new users get added. The script only applies the default quota to users that are present at the time the script is run. However, it would be a simple thing to schedule the script to run each night, or to make an alert that runs the script when certain events occur.

For you power users, if you ever want to delete these custom properties and remove the script completely, you will find the properties under the "Schema" section under Shares. You can remove them there. There's no need to, however; they don't hurt a thing if you just don't use them.

 I hope these tips have helped you out there. Quotas can be fun. 

Sunday Oct 28, 2012

Our winners- and some BBQ for everyone

Please also see "Allen's Grilling Channel" over to the right in my Bookmarks section...

Congrats to our two winners for the first two comments on my last entry. Steve from Australia and John Lemon. Steve won since he was the first person over the International Date Line to see the post I made so late after a workday on Friday. So not only does he get to live in a country with the 2nd most beautiful women in the world, but now he gets some cool Oracle Swag, too. (Yes, I live on the beach in southern California, so you can guess where 1st place is for that other contest…Now if Steve happens to live in Manly, we may actually have a tie going…)

OK, ok, for everyone else, you can be winners, too. How, you ask? I will make you the envy of every guy and gal in your neighborhood or campsite. What follows is the way to smoke the best ribs you or anyone you know have ever tasted. Follow my instructions and give it a try. People at your party/cookout/campsite will tell you that they're the best ribs they've ever had, and I will let you take all the credit. Yes, I fully realize this post is going to be longer than any post I've done yet. But let's get serious here. Smoking meat is much more important, agreed? :) In all honesty, this is a repeat of another blog I did, so I'm just copying and pasting.

Step 1. Get some ribs. I actually really like Costco's packs. They have both St. Louis and baby back. (They are the same ribs, but cut in half down the sides. St. Louis style is the 'front' of the ribs closest to the stomach, and 'baby back' is the part of the ribs where it connects to the backbone.) I like them both, so here you see I got one pack of each, about 4 racks to a pack. So these two packs at $25 each will feed about 16-20 of my guests. Around 3 bucks a person is a pretty good deal for the best ribs you'll ever have.

Step 2. Prep the ribs the night before you're going to smoke. You need to trim them to fit your smoker racks, and also take off the membrane and add your rub. Then cover and set in the fridge overnight. Here's how to take off the membrane, which must be removed because it will not break down with heat and smoke like the rest of the meat. Use a butter knife to work in a ways between the membrane and the white bone, just enough to make room for your finger. Try really hard not to poke through the membrane; you want to keep it whole.

See how my gloved fingers can now start to lift up and pull off the membrane? This is what you are trying to do. It's awesome when the whole thing comes off at once. This one is going great, maybe the best one I've ever done. Sometimes it falls apart and doesn't come off in one nice piece. I hate when that happens.

Now, add your rub and pat it down into the meat with your other hand. My rub is not secret. I got it from my mentor, a competitive BBQ chef who is currently ranked #1 in California and #3 in the nation on the BBQ circuit. He does full-day classes in southern California if anyone is interested in taking his class. Go to www.slapyodaddybbq.com to check him out. I tweaked his rub recipe a tad and made my own. It's one part Lawry's, one part sugar, one part Montreal Steak Seasoning, one part garlic powder, one-half part red chili powder, one-half part paprika, and then 1/20th part cayenne. You can adjust that last ingredient, or leave it out. Real cheap stuff you can get at Costco. This lets you make enough rub to last about a year or two. Don't make it all at once; make a shaker's worth and use it up before you make more. Place it all in a bowl, mix well, and then add to a shaker like you see here. You can get a shaker with medium-sized holes at any restaurant supply store or Smart & Final. The kind you see at pizza places for their red pepper flakes works best.
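Since the recipe is in "parts", it scales linearly to whatever batch size you want. Here's a tiny sketch that multiplies the ratios out; the half-cup base measure at the end is just an example, not part of the recipe.

```python
# Scale the rub recipe: ratios are in "parts"; multiply by whatever base
# measure (cups, tablespoons, grams) you want one part to be.

RUB_PARTS = {
    "Lawry's": 1.0,
    "sugar": 1.0,
    "Montreal Steak Seasoning": 1.0,
    "garlic powder": 1.0,
    "red chili powder": 0.5,
    "paprika": 0.5,
    "cayenne": 1 / 20,   # adjust to taste, or leave it out
}

def scale_rub(base: float) -> dict[str, float]:
    """Return ingredient amounts when one 'part' equals `base` units."""
    return {name: round(parts * base, 3) for name, parts in RUB_PARTS.items()}

# e.g., with one part = 0.5 cup:
for name, amount in scale_rub(0.5).items():
    print(f"{name}: {amount} cups")
```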

Now cover and place in fridge overnight.

Step 3. The next day. Ok, I’m ready to go. Get your stuff together. You will need your smoker, some good foil, a can of peach nectar, a bottle of Agave syrup, and a package of brown sugar. You will need this stuff later. I also use a clean spray bottle, and apple juice.

Step 4. Make your fire, or turn on your electric smoker. In this example I'm using my portable charcoal smoker. I got it for only $40 and then modified it to be useful. Once modified, these guys actually work very well. Trust me, your food DOES NOT KNOW how expensive your smoker is. Someone who tells you that you need to spend a bunch of money on a smoker is an idiot. I also have an electric smoker that stays in my backyard. It's cleaner and larger, so I can smoke more food. But this little $40 one works great for going camping. Here is what my fire bowl looks like. I leave a space in the middle open, and place cold charcoal and wood chunks in a circle going outwards. This way, when I dump the hot coals down the middle, they slowly burn outwards, hitting different wood chunks at different times, letting me go 4-5 hours without even having to touch the fire. For ribs, I use apple and pecan wood. Pecan works for anything. Apple or any fruit wood is excellent for pork.

So now I make my hot charcoal with a chimney only about half-full. I found a great use for that side-burner on my grill that I never use. It makes a fantastic chimney starter. You never use fluids of any kind, nor ever use that stupid charcoal that has lighter fluid built into it. Never, ever, ever.

Step 5. Smoke. Add your ribs to the racks and stack them up in your smoker. I have a digital thermometer on a probe that I use to keep track of the temp in the smoker; I just lay the probe on the top rack and shut the lid. With this cheap guy it's a little harder to maintain the right temperature of around 225 F, so I do have to keep my eye on it more than with my electric one, or a more expensive charcoal model with the cool gadgets that regulate the temp for you.

Every hour, spray apple juice all over your ribs using that spray bottle. After about 3 hours, you should have a very good crust (called the bark) on your ribs. Once you have the bark where you want it, carefully remove your ribs and place them in a tray. We are now ready for a very important step for building the flavor.

Get a large piece of foil and place one rib section on it. Splash some of the peach nectar on it, then a drizzle of the agave syrup. Then use your gloved hand to pack on some brown sugar. Do this on BOTH sides, and then wrap it up completely and TIGHT in the foil. Do this for each rib section, and then place all the wrapped sections back into the smoker for another 4 to 6 hours. This is where the meat gets tender and flavorful. The first three hours are only to make the smoke bark. You don't need smoke anymore once the ribs are wrapped; you only need to keep the heat around 225 for the next 4-6 hours. Obviously, you don't spray anymore. Just time and slow heat. Be patient. It's actually really hard to overdo it. You can let them go longer, and all that will happen is they will get even MORE tender!!! If you take them out too soon, they will be tough.

How do you know? Take out one package (use long tongs) and open it up. If you grab a bone with your tongs and it just falls apart and breaks away from the rest of the meat, you are done!!! Enjoy!!!

Step 6. Eat. It pulls apart like this when it’s done.

By the way, smoking tri-tip is way easier. Just rub it with the same rub, and put in your smoker for about 2.5 hours at 250 F. That’s it. Low-maintenance. It comes out like this, with a fantastic smoke ring and amazing flavor.

Thanks, and I will put up another good tip, about the ZFSSA, around the end of November.


Friday Oct 26, 2012

Replication - between pools in the same system

OK, I fully understand that it's been a LONG time since I've blogged with any tips or tricks on the ZFSSA, and I'm way behind. Hey, I just wrote TWO BLOGS ON THE SAME DAY!!! Make sure you keep scrolling down to see the next one too, or you may miss it. To celebrate, for the one or two of you out there who are still reading this, I've got something for you. The first TWO people who make any comment below, with your real name and email so I can contact you, will get some cool Oracle SWAG that I have to give away. Don't get excited, it's not an iPad, but it's pretty good stuff. Only the first two, so if you already see two comments below, then settle down.

Now, let's talk about Replication and Migration.  I have talked before about Shadow Migration here: https://blogs.oracle.com/7000tips/entry/shadow_migration
Shadow Migration lets one take a NFS or CIFS share in one pool on a system and migrate that data over to another pool in the same system. That's handy, but right now it's only for file systems like NFS and CIFS. It will not work for LUNs. LUN shadow migration is a roadmap item, however.

So.... What if you have a ZFSSA cluster with multiple pools, and you have a LUN in one pool, but later you decide it's best if it was in the other pool? No problem. Replication to the rescue. What's that? Replication is only for replicating data between two different systems? Who told you that? We've been able to replicate to the same system for a few code updates now. The instructions below will also work just fine if you're setting up replication between two different systems. After replication is complete, you can easily break the replication, change the new LUN into a primary LUN, and then delete the source LUN. Bam.

Step 1. Set up a target system. In our case, the target system is ourself, but you still have to set it up as if it were far away. Go to Configuration-->Services-->Remote Replication. Click the plus sign and set up the target, which is the ZFSSA you're on now.

Step 2. Now you can go to the LUN you want to replicate. Take note of which Pool and Project you're in. In my case, I have a LUN in Pool2 called LUNp2 that I wish to replicate to Pool1.

Step 3. In my case, I made a project called "Luns" which has LUNp2 inside it. I am going to replicate the project, which will automatically replicate all of the LUNs and/or filesystems inside it. Now, you can also replicate at the share level instead of the project level. That will replicate only that share, and not the other shares of the project. If someone tells you that replicating a share always replicates all the other shares in that project, don't listen to them.
Note below how I can choose not only the target (which is myself), but also which pool to replicate to. So I choose Pool1.

Step 4. I did not choose a schedule or pick the "Continuous" button, which means my replication will be manual only. I can now push the Manual Replicate button in my Actions list and watch it start. You will see both a barber-pole animation and an update in the status bar at the top of the screen showing that a replication event has begun. This also goes into the event log.

 Step 5. The status bar will also log an event when it's done.

Step 6. If you go back to Configuration-->Services-->Remote Replication, you will see your event.

Step 7. Done. To see your new replica, go to the other pool (Pool1 for me), and click the "Replica" area below the words "Filesystems | LUNs". Here, you will see any replicas that have come in from any of your sources. From here it's a simple matter to break the replication, which changes this to a "Local" LUN, and then delete the original LUN back in Pool2.
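The whole pool-to-pool move boils down to a short, fixed sequence. Here's an illustrative Python dry-run of that flow; the step strings mirror the BUI actions in the steps above, the pool and project names are the ones from this walkthrough, and nothing here talks to a real appliance.

```python
# Dry-run of the same-system replication trick used to move a LUN between
# pools: replicate the project, sever the link to promote the replica,
# then delete the source. Steps are printed, never executed.

def move_lun_between_pools(project: str, src_pool: str, dst_pool: str) -> list[str]:
    """Return the ordered steps for moving a project's LUNs between pools."""
    return [
        "add this appliance as its own remote replication target",
        f"configure replication of project '{project}' from {src_pool} to {dst_pool}",
        "run a manual replication update and wait for it to complete",
        f"sever the replication on the {dst_pool} replica (it becomes a local project)",
        f"delete the original project '{project}' from {src_pool}",
    ]

for step in move_lun_between_pools("Luns", "Pool2", "Pool1"):
    print(step)
```

Note that severing comes before deleting the source; if you delete the source first, there is nothing left to fall back on if the replica turns out to be incomplete.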

Ok, that's all for now, but I promise to give out more tricks sometime in November !!! There's very exciting stuff coming down the pipe for the ZFSSA. Both new hardware and new software features that I'm just drooling over. That's all I can say, but contact your local sales SC to get a NDA roadmap talk if you want to hear more.  

Happy Halloween,

New Write Flash SSDs and more disk trays

In case you haven't heard, the write-flash SSDs in the ZFSSA have been updated. Much faster now for the same price. Sweet.

The new write-flash SSDs have a new part number of 7105026, so make sure you order the right ones. It's important to note that you MUST be on code level 2011.1.4.0 or higher to use these.

They have increased in IOPS from 6,000 to 11,000, and increased throughput from 200MB/s to 350MB/s.  

 Also, you can now add six SAS HBAs (up from 4) to the 7420, allowing one to have three SAS channels with 12 disk trays each, for a new total of 36 disk trays. With 3TB drives, that's 2.5 Petabytes. Is that enough for you?
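The arithmetic behind that claim, as a quick sketch (raw capacity only, before any RAID or ZFS overhead, and assuming 24 drives per tray):

```python
# Raw capacity of a fully expanded 7420 (illustrative estimate only).
# Assumes 24 drives per disk tray; ignores RAID and ZFS overhead.
trays = 3 * 12            # three SAS channels with 12 disk trays each
drives_per_tray = 24
drive_tb = 3              # 3TB drives

raw_tb = trays * drives_per_tray * drive_tb
raw_pb = raw_tb / 1024    # binary conversion, TB -> PB

print(f"{raw_tb} TB raw, roughly {raw_pb:.1f} PB")
```

Usable capacity will of course be lower once you carve the pools into mirrors or RAIDZ.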

Make sure you add new cards to the correct slots. I've talked about this before, but here is the handy-dandy matrix again so you don't have to go find it. Remember the rules: You can have 6 of any one kind of card (like six 10GigE cards), except IB, which is still four max. You only really get 8 slots, since you have two SAS cards no matter what. If you want more than 12 disk trays, you need two more SAS cards, so think about expansion later, too. In fact, if you're going to have two different speeds of drives (in other words, you want to mix 15K-RPM and 7,200-RPM drives in the same system), I would highly recommend two different SAS channels. So I would want four SAS cards in that system, no matter how many trays you have. 

Thursday Aug 16, 2012

New 2011.1.4.0 code is ready

Ok, folks, the newest code for the ZFSSA is now out and available on My Oracle Support. It's pretty easy to find under the 'Patches and Updates' tab. Or you can look for it under its patch number:


So this is now version  2011.1.4.0, also called AK-2011. It has a VERY large list of fixes for various issues. You can find the readme file with all of the issues fixed in this release here:


**Update 8-27-12 ** IMPORTANT news for Exadata clients backing up to a ZFSSA over Infiniband: 
Earlier, I reported on this blog about a special IDR code release to fix a certain issue called "CR 7162888, IB infiniband interface stop communicating on both heads". It turns out that this is a duplicate issue of CR 7013410, which IS fixed in this public code release of 2011.1.4.0.
So, go ahead and use this public release, and do not worry about any special IDR unless you are working with support on some other issue. - Steve

Thursday Aug 09, 2012

Phone Home- just like E.T.

Hmmm, still no update, so they have changed the ETA from "July" to "It will be out when it's ready". I have not heard of a new ETA, so please don't ask.

In the meantime, there are plenty of you that do not have the Automated Service Request (ASR) feature turned on for your ZFSSA systems. This is better known as "Phone Home". It's not only extremely handy and free, but it could possibly save your job and your company lots of time and money. You really, really want to turn it on. The Phone Home feature on the ZFSSA does two things. It obviously creates a support ticket with Oracle support in the event of some failure on the ZFSSA. That's good. You will see an email that this happened, and can then go track it with your account on the MOS (My Oracle Support) website. If you're wondering what issues will force an ASR, then go back and read my blog entry from last year here: https://blogs.oracle.com/7000tips/entry/asr_automated_service_request_aka

You need to make sure your ZFSSA system can get to the internet and has access through your firewalls to the following sites and ports:
1. inv-cs.oracle.com  on port 443
2. asr-services.oracle.com  on port 443
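If you want to sanity-check those firewall rules from a host on the same network path, a small helper like this (my own sketch, not an Oracle tool) simply attempts a TCP connection:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run this from a machine that shares the ZFSSA's route to the internet:
for host in ("inv-cs.oracle.com", "asr-services.oracle.com"):
    print(host, can_reach(host, 443))
```

If either prints False, check your proxy and firewall rules before expecting ASR to work.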

The other thing it does is send heartbeats to the Oracle ASR phone home database. As a pre-sales engineer, I find this very handy for my clients. I'll show you a few screenshots below of what I can see in the phone home database for my clients. This lets me keep track of minor, major, and critical issues my customers are having with their ZFSSA systems. I can see how their storage pools are set up and whether they are becoming too full. This is something I like to track and keep my eye on for my customers. I will offer to create summary reports for them on a monthly basis to help them keep track of their systems, as many of them don't look at them very often or have too many other things to worry about. Your local Storage SC can also access this database and show you what he or she sees for your systems on Phone Home.

Here is where you set up Phone Home in the ZFSSA BUI:

Here is an example of what I can see in the Phone Home database. 

In this example, just lower down on the same screen, you can see the list of issues this system has had over the years, and a click will get you more detail. This system had some hard drives replaced in January, and one in May, but nothing of note since then. Many of the other 'Major' alerts you see below were actually just cluster peer takeovers done for testing and upgrades.

Friday Jul 27, 2012

Installing new ZFSSA systems

I had not realized how long it has been since my last blog entry. I've been so busy installing ZFSSAs that I've been a flake on my blog. I haven't hit anything Earth-shattering to report, and like most of you, I'm really waiting for the release of the new code, which I heard was supposed to be in July. Hey, we have two more days before we can say it's late, right?  :)

This week, I spent two days setting up and configuring a large 7420 cluster. The funny thing is, a day later, I spent only 30 minutes setting up two different 7420 single-head systems. It's really funny how easy it is to set up a single-head, single-tray system. 

Monday Jun 04, 2012

New Analytic settings for the new code

If you have upgraded to the new 2011.1.3.0 code, you may find some very useful settings for the Analytics. If you didn't already know, the analytic datasets have the potential to fill up your OS hard drives. The more datasets you use and create, the faster this can happen. Since they take a measurement every second, forever, some of these metrics can grow to multiple GB in size in a matter of weeks. The traditional 'fix' was that you had to go into Analytics -> Datasets about once a month and clean up the largest datasets. You did this by deleting them. Ouch. Now you lost all of that historical data that you might have wanted to check out many months from now. Or, you had to export each metric individually to a CSV file first. Not very easy or fun. You could also suspend a dataset, and have it not collect data at all. Well, that fixed the problem, didn't it? Of course, you now had no data to go look at. Hmmmm....

All of this is no longer a concern. Check out the new Settings tab under Analytics...

Now, I can tell the ZFSSA to keep every second of data for, say, 2 weeks, and then average those 60 seconds of each minute into a single 'minute' value. I can go even further and ask it to average those 60 minutes of data into a single 'hour' value. This allows me to effectively shrink my older datasets to 1/3600th of their original size! Very cool. I can now allow my datasets to go forever, and really never have to worry about them filling up my OS drives.
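That 1/3600 figure falls straight out of the averaging: 3,600 one-second samples collapse into a single hourly value. A sketch of the idea (my own illustration, not the appliance's actual on-disk format):

```python
def roll_up(samples, window):
    """Average consecutive fixed-size windows of samples into one value each."""
    return [sum(samples[i:i + window]) / window
            for i in range(0, len(samples) - len(samples) % window, window)]

# One hour of fake per-second data...
seconds = [float(i % 60) for i in range(3600)]
minutes = roll_up(seconds, 60)   # 3600 values -> 60 per-minute averages
hours = roll_up(minutes, 60)     # 60 values -> 1 per-hour average

print(len(seconds), len(minutes), len(hours))  # 3600 60 1
```

Keep in mind the averaging is lossy: a one-second spike inside an hour window mostly disappears, which is exactly the granularity trade-off discussed here.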

That's great going forward, but what about those huge datasets you already have? No problem. Another new feature in 2011.1.3.0 is the ability to shrink the older datasets in the same way. Check this out. I have here a dataset called "Disk: I/O ops per second" that is about 6.32MB on disk. (You need not worry so much about the "In Core" value, as that is in RAM, and it fluctuates all the time. Once you stop viewing a particular metric, you will see that shrink over time, so just relax.) 

When one clicks on the trash can icon to the right of the dataset, it used to delete the whole thing, and you would have to re-create it from scratch to get the data collecting again. Now, however, it gives you this prompt:

As you can see, this allows you to once again shrink the dataset by averaging the second data into minutes or hours.

Here is my new dataset size after I do this. So it shrank from 6.32MB down to 2.87MB, but I can still see my metrics going back to the time I began the dataset.

Now, you do understand that once you do this, as you look back in time to the minute or hour data metrics, that you are going to see much larger time values, right? You will need to decide what size of granularity you can live with, and for how long. Check this out.

Here is my Disk: Percent utilized from 5-21-2012 2:42 pm to 4:22 pm:

After I went through the delete process to change everything older than 1 week to "Minutes", the same date and time looks like this:

Just understand what this will do and how you want to use it. Right now, I'm thinking of keeping the last 6 weeks of data as "seconds", and then the last 3 months as "Minutes", and then "Hours" forever after that. I'll check back in six months and see how the sizes look.


Friday May 18, 2012

New code is out- Version 2011.1.3.0


The newest version of the ZFSSA code, 2011.1.3.0, is now out and available on MOS.

I will be writing more about one of its many new, useful features early next week. It's very cool, and has to do with how you can now change the size of your analytic datasets.


AK-2011 Release Notes


This minor release of the Sun ZFS Storage Appliance software contains significant bug fixes for all supported platforms. Please carefully review the list of CRs that have been addressed and all known issues prior to updating.

Among other issues, this release fixes some memory fragmentation issues (CRs 7092116 and 7105404), includes improvements to DTrace Analytics, and failover improvements to DNS, LDAP, and the SMB Domain Controller.

This release requires appliances to be running the 2010.Q3.2.1 micro release or higher prior to updating to this release. In addition, this release includes update health checks that are performed automatically when an update is started prior to the actual update from the prerequisite 2010.Q3.2.1 micro release or higher. If an update health check fails, it can cause an update to abort. The update health checks help ensure component issues that may impact an update are addressed. It is important to resolve all hardware component issues prior to performing an update.

Deferred Updates

When updating from a 2010.Q3 release to a 2011.1 release, the following deferred updates are available and may be reviewed in the Maintenance System BUI screen. See the "Maintenance:System:Updates#Deferred_Updates" section in the online help for important information on deferred updates before applying them.

1. RAIDZ/Mirror Deferred Update (Improved RAID performance)
This deferred update improves both latency and throughput on several important workloads. These improvements rely on a ZFS pool upgrade provided by this update. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 29.

2. Optional Child Directory Deferred Update (Improved snapshot performance)
This deferred update improves list retrieval performance and replication deletion performance by improving dataset rename speed. These improvements rely on a ZFS pool upgrade provided by this update. Before this update has been applied, the system will be able to retrieve lists and delete replications, but will do so using the old, much slower, recursive rename code. Applying this update is equivalent to upgrading the on-disk ZFS pool to version 31.

Supported Platforms

Issues Addressed

The following CRs have been fixed in this release:

4325892 performance decrease if 1st nameserver is down
4377911 RFE for improved DNS resolver failover performance
6822262 Windows Media Service/SQL server cannot connect to cifs share
6822586 SMB should try UDP first to get LDAP SRV record and retry with TCP if truncated UDP response
6941854 idmap_getwinnamebygid() and idmap_getwinnamebyuid() need to work for builtin names
6953716 Exception 'Native message: datasets is undefined' while append dataset to current worksheet
6973870 Specify retention time for Analytics data
6991949 panic will happen during some error injection stress
6996698 The SRAO may terminate irrelevant memory copy
6997450 gcpu_mca_process() doesn't return a right disp for poisoned error
7023548 replacement failed for faulted readzilla
7040757 smb_com_write_andx NULL pointer dereference panic in mbc_marshal_get_uio
7044065 Replay records within a dataset in parallel
7047976 zil replay assertion failure with full pool
7048780 I/O sent from iSCSI initiator embedded in VirtualBox completes with a status TASK SET FULL
7052406 NFS Server shouldn't take zero copy path on READ if no write chunk list provided
7052703 zl_replay_lock needs to be initialised and destroyed
7066080 s11n code generated by fcc should include strings.h
7066138 fcc must define _INT64_TYPE
7066170 configurable max- and min-units for akInputDuration
7066552 NFS Server Fails READS with NFS4ERR_INVAL when using krbi or krb5p
7071147 DC failover improvements
7071628 ldap_cachemgr exits after failure in profile refresh, only when using sasl/GSSAPI authentication
7071916 ztest/ds_3 missing log records: replayed X < committed Y
7074722 dataspan can be marked read-only even if it has dirty subspans
7080443 LDAP client failover doesn't work
7080790 ZIL: Assertion failed: zh->zh_replay_seq < *replayed_seq (0x1a < 0x1a)
7084762 ztest/ds_3 missing log records: replayed X < committed Y - Part 2
7089422 ldap client uses bindTimeLimit instead of searchTimeLimit when searching for entries
7090133 Large READs are broken with krb5i or krb5p with NFS Zero-Copy turned on
7090153 table-free akInputRadio
7090166 ak_dataspan_stashed needs a reality check
7091223 command to prune datasets
7092116 Extremely sluggish 7420 node due to heap fragmentation
7093687 LDAP client/ldap_cachemgr: long delays in failover to secondary Directory Server
7098553 deadlock when recursive zfs_inactive collides with zfs_unmount
7099848 Phone Home logs still refer to SUN support
7102888 akShow() can clobber CSS
7103620 akInputRadio consumers must be explicit in their use of subinputs as labels
7104363 Influx of snapshots can stall resilvering
7105404 appliance unavailable due to zio_arena fragmentation
7107750 SMB kernel door client times out too early on authentication requests
7108243 ldap_cachemgr spins on configuration error
7114579 Operations per second broken down by share reports "Datum not present" for most time periods
7114890 Ak Build tools should accommodate double-slashes in paths
7117823 RPC: Can't decode result after READ of zero bytes
7118230 Need to deliver CMOS images for Lynxplus SW 1.5
7121760 failure to post an alert causes a umem double-free
7122403 akCreateLabel(): helper function for creating LABEL elements
7122405 several Analytics CLI commands do not check for extra arguments
7122426 akParseDateTime() could be a bit more flexible
7123096 panic: LU is done with the task but LPORT is not done, itask ffffff9f59c540a0 itask_flags 3204
7125626 fmtopo shows duplicate target-path for both sims in a tray starting in 2010.q3.4 and 2011.1.1
7126842 NTLMSSP negotiation fails with 0xC00000BB (NT_STATUS_NOT_SUPPORTED)
7128218 uio_to_mblk() doesn't check for esballoca() failure
7129787 status could be uninitialized in netlogon_logon function
7130441 CPU is pegging out at 98%
7131965 SMB stops serving data (still running) Need to reset smb to fix issue
7133069 smbserver locks up on 7320 running 2011.1. Can't kill the service
7133619 Need to deliver CMOS image for SW 1.3 for Otoro
7133643 Need to deliver CMOS image for SW 1.2 for Otoro+
7142320 Enable DNS defer-on-fail functionality
7144155 idmap kernel module has lock contention calling zone_getspecific()
7144745 Online help Application Integration - MOS should be replaced with OTN
7145938 Add maximum cards for 7420 10GbE, FC, Quad GbE, and InfiniBand
7146346 Online Help: Document Sun ZFS Backup Appliance
7149992 Update doc to include 7320 DRAM
7152262 double-digit firefox version number throws off appliance version checks
7153789 ldapcachemgr lint warnings
7154895 Remove CMOS images for Lynx+ SW 1.5
7155512 lint warning in usr/src/cmd/ldapcachemgr/cachemgr_change.c
7158091 BUI Alert Banner and Wait Dialog are not functioning correctly
7158094 Still not ready for 9005
7158519 dataset class authorization doesn't work as expected
7158522 pruning an unsaved dataset does nothing but looks like it working continuously
7160553 NMI does not panic appliance platforms using apix
7161060 system hang due to physical memory exhaustion seen when major shift in workload
7165883 arc data shrinks continuously after arc grew to reach its steady state

Monday May 14, 2012

a break for something more important---

So we should have some news later this week on a minor code release with some helpful features in it. Can't say more until it comes out, but watch my blog this week.

In the meantime....  I have always been the grill-master at our camps with friends and family. My boys and I camp about 25-30 times a year. As much as I enjoy grilling, I was woefully behind in my smoking/BBQ skills. The difference is that grilling is cooking fast over high heat (think burgers, steak, and most seafood), while real BBQ involves smoke and slow-cooking over hours. Smoking is better for ribs, chicken, brisket and tri-tip. So I went to a real BBQ day-long class, got a small beginner's smoker, and now I'm smoking meat a lot more. Here's a pic of my last tri-tip in the smoker. Homemade rub and sauce cost just pennies compared to store-bought, and the meat is cheap at Costco. This may have been the best tri-tip I've ever made. Great smoke ring and flavor in only 1.5 hours. I was trying to tie this into the ZFSSA, but I just can't, so I stopped trying and am just showing off my new BBQ skills. Ha ha. Enjoy.

Thursday May 03, 2012

Analytics & Threshold Alerts

Alerts are great for not only letting you know when there's some kind of hardware event, but they can also be pro-active and let you know there's a bottleneck coming BEFORE it happens. Check these out. There are two kinds of Alerts in the ZFSSA. When you go to Configuration-->Alerts, you first see the plus sign by the "Alert Actions" section. These are pretty self-explanatory and not what I'm talking about today. Click on the "Threshold Alerts", and then click the plus sign by those.

This is what I'm talking about. The default one that comes up, "CPU: Percent Utilization", is a good one to start with. I don't mind if my CPUs go to 100% utilized for a short time. After all, we bought them to be used, right? If they go over 90% for over 10 minutes, however, something is up, and maybe we have workloads on this machine it was not designed for, or we don't have enough CPUs in the system and need more. So we can set up an alert that will keep an eye on this for us and send us an email if this were to occur. Now I don't have to keep watching it all the time. For an even better example, keep reading...
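Conceptually, a threshold alert is just a sliding-window rule over the samples. A minimal sketch of that logic (my own illustration, not how the appliance actually implements it):

```python
def sustained_breach(samples, threshold=90.0, window=10):
    """True if the last `window` samples are all above `threshold`.

    `samples` is a list of per-minute CPU utilization percentages;
    a True result is the condition that would fire the email alert.
    """
    return len(samples) >= window and all(s > threshold for s in samples[-window:])

cpu = [40.0] * 50 + [95.0] * 10          # quiet, then 10 minutes pegged above 90%
print(sustained_breach(cpu))             # alert fires
print(sustained_breach([100.0] * 5))     # a short spike does not
```

The point of the window is exactly what the post describes: brief bursts to 100% are fine; only sustained saturation is worth an email.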

What if you want to keep your eyes on whether your Readzillas or Logzillas are being over-utilized? In other words, do you have enough of them? Perhaps you only have 2 Logzillas, and you think you may be better off with 4, but how do you prove it? No problem. Here in Threshold Alerts, click on the Threshold drop-down box, and choose your "Disk: Percent Utilization for Disk: Jxxxxx 013" choice, which is my Logzilla drive in the Jxxxxx tray.

Wait. What's that? You don't have a choice in your drop-down for the Threshold item you are looking for, such as an individual disk?
Well, we will have to fix that.

Leave Alerts for now, and join me over in Analytics. Start with a worksheet with "Disk: Percent utilization broken down by Disk" chart. You do have this, as it's already one of your built-in datasets.

Now, expand it so you can see all of your disks, and find one of your Readzilla or Logzilla drives. (Hint: It will NOT be disk 13 like my example here. Logzillas are always in the 20, 21, 22, or 23 slots of a disk tray. Go to your Configuration-->Hardware screens and you can easily find out which drives are which for your system).

Now, click on that drive to highlight it, like this: 

 Click on the Drill Button, and choose to drill down on that drive as a raw statistic. You will now have a whole new data chart, just for that one drive.

 Don't go away yet. You now need to save that chart as a new dataset, which will keep it in your ZFSSA analytic metrics forever. Well, until you delete it.
Click on the "Save" button, the second to last button on that chart. It looks like a circle with white dots on it (it's supposed to look like a reel-to-reel tape spindle).

Now go to your "Analytics-->Datasets", and you will see a new dataset in there for it. 

 Go back to your Threshold Alerts, and you will now be able to make an alert that will tell you if this specific drive goes over 90% for more than 10 minutes. If this happens a lot, you probably need more Readzillas or Logzillas.

I hope you like these Alerts. They may take some time to set up at first, but in the long run you may thank yourself. It might not be a bad idea to send the email alerts to a mail distribution list, instead of a single person who may be on vacation when the alert is hit.  Enjoy. 

Thursday Apr 19, 2012

Route Table Stuff

Let's talk about your Routing Table.

I have never installed a ZFSSA, ever, without having to edit this table. If you believe that you do not need to edit your routing table, then you are wrong.
:)  Ok, maybe not. Maybe you only have your ZFSSA connected to one network with only a few systems on it. I guess it's possible. Even in my simulator, however, I had to edit the routing table so I could use it no matter how my laptop was connected: at home over a VPN, at work, or on public WiFi. So I'm going to bet a nice dinner that you, or someone, should be checking this out.

First things first. I'm going to assume you have a cluster. I try really hard to only sell clusters, but yes, I know there are plenty of single-nodes out there too. Single-node people can skip these first two paragraphs. It's very important in your cluster to have a 1GigE management interface to each of the two controllers. You really want to be able to manage each controller, even when one of them is down, right? So best practice is to use the 'igb0' port for controller 1 management and to use the 'igb1' port for controller 2 management. It's important to make these ports 'Private' in the cluster configuration screen, so they do NOT fail over to the other controller when a cluster takeover takes place for whatever reason. Igb0 and igb1 are two of the four built-in 1GigE ports. You can still use igb2 and igb3 for data, either alone or as an aggregate, and don't make them private, so they DO fail over in a cluster takeover event. Now go to your remote workstation, which may be on a different subnet, and you should be able to ping and connect to Controller 1 using igb0.
Now, back to the routing table. You have probably noticed that you cannot ping or connect to the other controller, and you think something is wrong. Not to worry, everything is fine. You just need to tell your routing table, which is shared between the heads, how to talk to that other port, igb1. You see, you have a default route set up already for port igb0; that's why it works. Your new, private igb1, however, does not know how to speak back to the remote system you are now using to manage via the BUI from a different subnet. So, make a new default route for igb1 and point it to the default gateway, which is the router it needs to use in order to cross subnets. See the picture below. Note how I have a default route for "ZFS1-MGMT" for port igb0. This shows a green light because I'm currently on ZFS1, and it sees this port just fine. I also have a default route for "ZFS2-MGMT" from port igb1. This route has a blue light, showing it as inactive. That's because this controller, ZFS1, has nothing plugged into its igb1 port. That's perfect. Hit "Apply". Now count to 10. Now from your remote host, go ahead and ping or connect to Controller 2, and it works!!! This is because your controllers share a routing table, and when you added that igb1 route, it propagated over to the other controller, where igb1 is plugged in, and that route has a green light over there and it works fine. You will see from Controller 2's point of view that igb1 has a green light and igb0 has a blue light.  (continued below the picture)

Now it's time to set up any static routes you may need. If you have different subnets for your 1GigE management and your IB or 10GigE data (a very good idea), then you will need to make these. It's important to have routes for this, as you do not want data coming in over the 10GigE pipe but then returning over the 1GigE pipe, right? That will happen if this is not set up correctly. Make your routes as the picture example shows, with a 10Gig aggregate here we called "Front-end-IP". Any traffic coming in from subnet 172.20.69 will use this pipe.

Lastly, check your multi-homing model button up top. I like 'Adaptive'. "Loose" is the default, and it makes it so your packets can traverse your routes even though they may go over the wrong route, so it seems like your system is working. This can very well be an illusion. Your ping may work, but it may be coming from the wrong interface, as "Loose" basically means the ZFSSA just doesn't care or enforce any rules. "Strict", on the other hand, is great if you want total enforcement. If you are very good with your routes, and are positive you have it right, and want to ensure that a packet never goes the wrong way, even if that means dropping the packet, then use Strict. I'm using Adaptive here, which is a happy medium. From the help file: The "Adaptive" choice will prefer routes with a gateway address on the same subnet as the packet's source IP address: 1) An IP packet will be accepted on an IP interface so long as its destination IP address is up on the appliance. 2) An IP packet will be transmitted over the IP interface tied to the route that most specifically matches an IP packet's destination address. If multiple routes are equally specific, prefer routes that have a gateway address on the same subnet as the packet's source address. If no eligible routes exist, drop the packet.
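The "Adaptive" behavior quoted from the help file can be approximated in a few lines with Python's ipaddress module (my own rough sketch, not the appliance's code):

```python
import ipaddress

def pick_route(routes, dst, src_iface):
    """Adaptive-style route choice (sketch).

    `routes` is a list of (destination_network, gateway) pairs, and
    `src_iface` is the source address with its prefix, e.g. '172.20.69.10/24'.
    Most specific prefix wins; among equally specific routes, prefer a
    gateway on the source's subnet. Returning None means 'drop the packet'.
    """
    dst = ipaddress.ip_address(dst)
    src = ipaddress.ip_interface(src_iface)
    matches = [(net, gw) for net, gw in routes if dst in net]
    if not matches:
        return None                      # no eligible route: drop
    best = max(net.prefixlen for net, _ in matches)
    candidates = [(net, gw) for net, gw in matches if net.prefixlen == best]
    for net, gw in candidates:
        if ipaddress.ip_address(gw) in src.network:
            return gw                    # gateway on the same subnet as source
    return candidates[0][1]

routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),       # default via mgmt
    (ipaddress.ip_network("172.20.69.0/24"), "172.20.69.1"),  # 10GigE data route
]
print(pick_route(routes, "172.20.69.55", "172.20.69.10/24"))  # 172.20.69.1
```

This is why the static data routes matter: with them in place, traffic for the 172.20.69 subnet picks the 10GigE gateway instead of falling through to the management default route.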

Update 4/23/12- My colleague, Darius (https://blogs.oracle.com/si/), rightfully wanted me to point out how important it was to setup a static route for replication. You do not want replication to go over a private management port by mistake, as this will cause it to fail when one controller or the other goes down for maintenance.

I hope this helps. Routing can be fun. 

Saturday Apr 14, 2012

New SPC2 benchmark- The 7420 KILLS it !!!

This is pretty sweet. The new SPC2 benchmark came out last week, and the 7420 not only came in 2nd of ALL speed scores, but came in #1 for price per MBPS.

Check out this table. The 7420 score of 10,704 makes it really fast, but that's not the best part. The price one would have to pay in order to beat it is ridiculous. You can go see for yourself at http://www.storageperformance.org/results/benchmark_results_spc2
The only system on the whole page that beat it was over twice the price per MBPS. Very sweet for Oracle.
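The price-per-MBPS metric itself is just the total tested system price divided by the SPC-2 throughput score. With the 7420's 10,704 MBPS and a purely hypothetical price (the real figure is in the SPC-2 full disclosure report):

```python
def price_per_mbps(total_price_usd: float, mbps: float) -> float:
    """Total tested system price divided by the SPC-2 MBPS score."""
    return total_price_usd / mbps

# Hypothetical price for illustration only; see the SPC-2 report for real numbers.
print(f"${price_per_mbps(400_000, 10_704):.2f}/MBPS")
```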

So let's see, the 7420 is the fastest per $.
The 7420 is the cheapest per MBPS.
The 7420 has incredible, built-in features, management services, analytics, and protocols. It's extremely stable and as a cluster has no single point of failure. It won the Storage Magazine award for best NAS system this year.

So how long will it be before it's the number 1 NAS system in the market? What are the biggest hurdles still stopping the widespread adoption of the ZFSSA? From what I see, it's three things: 1. Administrators' comfort level with older legacy systems. 2. Politics. 3. Past issues with Oracle Support.  

I see all of these issues crop up regularly. Number 1 just takes time and education. Number 3 takes time with our new, better, and growing support team. Many of them came from Oracle, and there were growing pains when they went from a straight software model to having to also support hardware. Number 2 is tricky, but it's the job of the sales teams to break through the internal politics and help their clients see the value in Oracle hardware systems. Benchmarks like this will help.

Thursday Apr 12, 2012

Hybrid Columnar Compression

You heard me in the past talk about the HCC feature for Oracle databases. Hybrid Columnar Compression is a fantastic, built-in, free feature of Oracle 11gR2. One used to need an Exadata to make use of it. However, last October, Oracle opened it up and now allows it to work on ANY Oracle DB server running 11gR2, as long as the storage behind it is a ZFSSA over dNFS, or an Axiom over FC.

If you're not sure why this is so cool or what HCC can do for your Oracle database, please check out this presentation. In it, Art will explain HCC, show you what it does, and give you a great idea why it's such a game-changer for those holding lots of historical DB data.

Did I mention it's free? Click here:


Monday Apr 02, 2012

New ZFSSA code release - April 2012

A new version of the ZFSSA code was released over the weekend.

In case you have missed a few, we are now on code 2011.1.2.1. This minor update is very important for our friends with the older SAS1 cards on the older 7x10 systems. This 2.1 minor release was made specifically for them, and fixes the issue that their SAS1 card had with the last major release. They can now go ahead and upgrade straight from the 2010.Q3.2.1 code directly to 2011.1.2.1.

If you are on a 7x20 series, and already running 2011.1.2.0, there is no real reason why you need to upgrade to 1.2.1, as it's really only the Pandora SAS1 HBA fix. If you are not already on 1.2.0, then go ahead and upgrade all the way to 2011.1.2.1.

I hope everyone out there is having a good April so far. For my next blog, the plan is to work off the Analytic tips I did last week and expand on which Analytics you want to really keep your eyes on, and also how to setup alerts to watch them for you.

You can read more and keep up on your releases here: https://wikis.oracle.com/display/FishWorks/Software+Updates



Wednesday Mar 28, 2012

Fun tips with Analytics

If you read this blog, I am assuming you are at least familiar with the Analytic functions in the ZFSSA. They are basically amazing, very powerful and deep.

However, you may not be aware of some great, hidden functions inside the Analytic screen.

Once you open a metric, the toolbar looks like this:

Now, I’m not going over every tool, as we have done that before, and you can hover your mouse over them and they will tell you what they do. But…. Check this out.
Open a metric (CPU Percent Utilization works fine), and click on the “Hour” button, which is the 2nd clock icon. That’s easy, you are now looking at the last hour of data. Now, hold down your ‘Shift’ key, and click it again. Now you are looking at 2 hours of data. Hold down Shift and click it again, and you are looking at 3 hours of data. Are you catching on yet?
You can do this with not only the ‘Hour’ button, but also with the ‘Minute’, ‘Day’, ‘Week’, and the ‘Month’ buttons. Very cool. It also works with the ‘Show Minimum’ and ‘Show Maximum’ buttons, allowing you to go to the next iteration of either of those.

One last button you can Shift-click is the handy ‘Drill’ button. This button usually drills down on one specific aspect of your metric. If you Shift-click it, it will display a “Rainbow Highlight” of the current metric. This works best if this metric has many ‘Range Average’ items in the left-hand window. Give it a shot.

Also, one will sometimes click on a certain second of data in the graph, like this:

 In this case, I clicked 4:57 and 21 seconds, and the 'Range Average' on the left went away, and was replaced by the time stamp. At this point, it seems to some people that you are now stuck, and cannot get back to an average for the whole chart. However, you can actually click on the actual time stamp of "4:57:21" right above the chart. Even though your mouse does not change into the typical pointing-finger cursor that most links show, you can click it, and it will change your range back to the full metric.

Another trick you may like is saving a certain view or look of a group of graphs. Most of you know you can save a worksheet, but did you know you can Sync them, Pause them, and then Save the worksheet? This saves the paused state, allowing you to view it forever, exactly the way you see it now.

Heatmaps. Heatmaps are cool, and look like this: 

Some metrics use them and some don't. If you have one, and wish to zoom it vertically, try this. Open a heatmap metric like my example above (I believe every metric that deals with latency will show as a heatmap). Select one or two of the ranges on the left. Click the "Change Outlier Elimination" button. Click it again and check out what it does. 

Enjoy. Perhaps my next blog entry will be the best Analytic metrics to keep your eyes on, and how you can use the Alerts feature to watch them for you.


Wednesday Mar 21, 2012

Using all Ten IO slots on a 7420

So I had the opportunity recently to actually use up all ten slots in a clustered 7420 system. That's 20 slots in total, or 22 if you count the two Clustron cards. I thought it was interesting enough to share here. This is at one of my clients here in southern California.

You can see the picture below. We have four SAS HBAs instead of the usual two. This is because we wanted to split up the back-end traffic for different workloads. We have one set of disk trays coming off two of the SAS cards for nothing but Exadata backups. Then we have a different set of disk trays coming off the other two SAS cards for non-Exadata workloads, such as regular user file storage.
We have two InfiniBand cards, which allow us to do a full mesh directly into the back of the nearby production Exadata, specifically for fast backups and restores over IB. You can see a third IB card here, which will be connected to a non-production Exadata for slower backups and restores from it.
The 10Gig card is for client connectivity, allowing other, non-Exadata Oracle databases to make use of the many snapshots and clones that can now be created using the RMAN copies from the original production database coming off the Exadata. This allows a good number of test and development Oracle databases to use these clones without affecting the performance of the Exadata at all.
We also have a couple FC HBAs, both for NDMP backups to an Oracle/StorageTek tape library and also for FC clients to come in and use some storage on the 7420.

 Now, if you are adding more cards to your 7420, be aware of which cards you can place in which slots. See the bottom graphic just below the photo. 
Note that the slots are numbered 0-4 for the first five cards, then the "C" slot, which holds the dedicated cluster card (called the Clustron), and then another five slots numbered 5-9.

Some rules for the slots:

  • Slots 1 & 8 are automatically populated with the two default SAS cards. The only other slots you can add SAS cards to are 2 & 7.
  • Slots 0 and 9 can only hold FC cards. Nothing else. So if you have four SAS cards, you are down to only four more slots for your 10Gig and IB cards. Be sure not to waste one of those four general-purpose slots on an FC card, which can go into 0 or 9 instead.
  • If at all possible, slots should be populated in this order: 9, 0, 7, 2, 6, 3, 5, 4
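If you like to double-check these rules before racking cards, they're simple enough to encode. Here's a little Python sketch of the slot rules exactly as I described them above (the function and names are mine, just for illustration; the appliance doesn't ship anything like this):

```python
# Sketch of the 7420 slot rules described in this post. Purely
# illustrative; slot numbers and card names follow the post, not any
# official Oracle tool.

SAS_ONLY_DEFAULT = {1, 8}   # always hold the two default SAS HBAs
SAS_EXPANSION    = {2, 7}   # the only other slots that accept SAS HBAs
FC_ONLY          = {0, 9}   # FC cards only; nothing else fits here
FILL_ORDER       = [9, 0, 7, 2, 6, 3, 5, 4]  # preferred population order

def slot_ok(slot, card):
    """Return True if 'card' ('SAS', 'FC', 'IB', '10GbE') may go in 'slot'."""
    if slot in SAS_ONLY_DEFAULT:
        return card == "SAS"
    if slot in FC_ONLY:
        return card == "FC"
    if card == "SAS":
        return slot in SAS_EXPANSION
    # General-purpose slots take IB, 10GbE, or even FC (though FC
    # belongs in 0 or 9 so you don't waste these slots).
    return True

print(slot_ok(3, "SAS"))   # False - SAS only fits in 1, 8, 2, or 7
print(slot_ok(0, "IB"))    # False - 0 and 9 are FC-only
print(slot_ok(4, "10GbE")) # True
```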

Monday Mar 12, 2012

Good papers and links for the ZFSSA

So I have a pretty good collection of links and papers for the ZFSSA, and instead of giving them out one at a time when asked, I thought it might be easier to do it this way. Many of the links from my old blog last May no longer work, so here is an updated list of some good spots to check out.

These are for ZFS in general, not the ZFSSA, but they give good insight into how ZFS functions:

Tuesday Mar 06, 2012

New 7420 hardware released today

Some great new upgrades to the 7420 were announced and released today. You can now get 10-core CPUs in your 7420, allowing you to have 40 cores in each controller. Even better, you can now also go to a huge 1TB of DRAM for your L1ARC in each controller, using the new 16GB DRAM modules.

So your choices for the new 7420 hardware are 4 x 8-core or 4 x 10-core models. Oracle is no longer going to sell the 2-CPU models, and is also going to stop selling the 6-core CPUs, both as of May 31st. Also, you can now order 8GB or 16GB modules, meaning the minimum amount of memory is now 128GB, and it can go up to 1TB in each controller. No more 64GB, as the 4GB module has also been phased out (starting today, actually).
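To spell out the math on those new memory options: assuming 64 DIMM slots per controller (16 per CPU across the four sockets, which is my assumption, not something off a spec sheet), the numbers line up like this:

```python
# Quick sanity math on the new 7420 memory options. The 64-DIMM-slot
# count (16 per CPU x 4 CPUs) is my assumption, not from a spec sheet.
DIMM_SLOTS = 64

def capacity_gb(module_gb, populated_slots):
    """Total DRAM per controller for a given module size and population."""
    assert populated_slots <= DIMM_SLOTS
    return module_gb * populated_slots

print(capacity_gb(16, DIMM_SLOTS))  # full 16GB population -> 1024 GB, the new 1TB max
print(capacity_gb(8, 16))           # minimum config -> 128 GB
```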

Now, before you get upset that you can no longer get the 2-CPU model, be aware that there was also a price drop, so the 4 x 8-core CPU model is a tad LESS than the old 2 x 8-core CPU model. So stop complaining.

It's the DRAM that I'm most excited about. I don't have a single ZFSSA client that I know of that has a CPU bottleneck. So the extra cores are great, but not amazing. What I really like is that my L1ARC can now be a whole 1TB. That's crazy, and will be able to drive some fantastic workloads. I can now place your whole, say 800GB, database entirely in DRAM cache, and not even have to go to the L2ARC on SSDs in order to hit 99% of your reads. That's sweet. 

Friday Feb 24, 2012

New ZFSSA code release today

The first minor release of the 2011.1.1 major release for the ZFSSA came out yesterday.

You can get the code via MOS, under the "Patches and updates" tab. Just click the "Product or Family (advanced)" link, then type "ZFS" in the search window, and it takes you right to it. Or search on its patch ID, which is 13772123.

Along with some other fixes, the most important piece of this update is the RPC flow control fix, which will greatly help those using the ZFSSA to backup an Exadata over Infiniband. 

If you're not already on the 2011.1.1 major release, I urge you to update as soon as you can. You can jump right to this new 2011.1.1.1 code as long as you are already on 2010.Q3.2.1 or higher. You don't need to go to 2011.1.1 first; just jump straight to 2011.1.1.1.
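The release strings mix two naming schemes (2010.Q3.x versus 2011.1.x), but the "can I jump straight there?" rule is just a numeric comparison once you strip out the 'Q'. Here's a little Python sketch of that rule (the parsing convention is mine, not an Oracle tool):

```python
# Sketch of the "can I jump straight to 2011.1.1.1?" rule from this post:
# you qualify if you're already on 2010.Q3.2.1 or higher. The parsing is
# my own convention for these release strings, nothing official.

def parse(release):
    """'2010.Q3.2.1' -> (2010, 3, 2, 1); '2011.1.1.1' -> (2011, 1, 1, 1)."""
    return tuple(int(part.lstrip("Q")) for part in release.split("."))

MINIMUM = parse("2010.Q3.2.1")

def can_jump_to_2011_1_1_1(current):
    return parse(current) >= MINIMUM

print(can_jump_to_2011_1_1_1("2010.Q3.4.2"))  # True - jump directly
print(can_jump_to_2011_1_1_1("2010.Q3.1.0"))  # False - update first
```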

If you are using your ZFSSA to backup an Exadata, I urge you to get on 2011.1.1.1 ASAP, even if it means staying late and scheduling special time to do it.

It's also important to note that if you have a much older ZFSSA (one of the 7x10 models using the older SAS1 HBAs, not the SAS2 HBAs), you should NOT upgrade to the 2011.1 code. The latest code that supports the SAS1 systems is 2010.Q3.4.2.

 **Update 2-26-12:  I noted a few folks saying the link was down, however that may have been a burp in the system, as I just went into MOS and was able to get 2011.1.1.1 just fine. So delete your cookies and try again. - Steve

Thursday Feb 23, 2012

Great new 7320 benchmark

A great new benchmark has been posted on SPEC for our mid-class 7320. You can find it in the full results list linked below.


What's cool about this benchmark is the fact this is not only our middle-sized box, but it used only 136 drives to reach this rather high 134,140 NFS Ops/sec number. If you look at the other systems tested here, you will notice that they must use MANY more drives (at presumably a much higher cost) in order to meet or beat those IOPS.

Check these out here... http://www.spec.org/sfs2008/results/sfs2008nfs.html

For example, a FAS6080 should be far faster than our smaller 7320, right? But it only scored 120,011, even though it used 324 disks. The Isilon S200, with 14 nodes and 679 drives, only scored 115,911. I would hate to find out what that system's street price is. I'm pretty sure it's higher than our 7320 with 136 drives. Now, of course, all of these benchmark numbers are unrealistic to most people, as they are done in perfect conditions, with each manufacturer's engineers tuning and tweaking the system the best they can, right? True, but if that's the case, and the other folks tuned and configured those boxes just like we did, it still seems like a fair fight to me, and our results are head and shoulders above the rest on a cost-per-IOPS basis. I don't see anything on this site that touches our IOPS with the same number of drives and, presumably, the same price range. Please point out if I missed anything here; I might be wrong.
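Since my argument here is really about efficiency per drive, here's the arithmetic spelled out, using the published SPECsfs2008 numbers quoted above:

```python
# Ops/sec per drive for the SPECsfs2008 results discussed in this post.
results = {
    "Oracle ZFSSA 7320": (134_140, 136),
    "NetApp FAS6080":    (120_011, 324),
    "Isilon S200":       (115_911, 679),
}

for name, (ops, drives) in results.items():
    print(f"{name}: {ops / drives:.0f} ops/sec per drive")
# The 7320 lands near 986 ops/sec per drive; the FAS6080 around 370,
# and the S200 around 171.
```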

I really love the ones that go so far overboard on this site... Check out the 140 node Isilon. Let's see... Wow, it's over one million IOPS!!!! That's impressive, until you see it's using 3,360 disk drives. That's funny. PLEASE let me know if you have a 140 node Isilon up and running. I'd love to see it. I'd also love to know what it costs.

Tuesday Feb 07, 2012

Tip- Setting up a new cluster

I haven’t given out a real tip for a while now, but this issue popped up on me last week, so I thought I would pass it along. I had a horrible time setting up a new 7320 cluster, for the sole reason that I screwed it up by not doing it in the right order. This caused my install, which should have taken 1 hour, to take over 3 hours to complete.

So let me tell you what I did wrong, and then I'll tell you the way I should have done it.

Out of the box, my client's two new 7320 controller heads were one software revision behind, at 2010.Q3.4.2, so I wanted to upgrade them to the newest version, 2011.1.1.1. So far, so good, right? Well, here was my mistake. I configured controller A via the serial interface, gave it IP addresses, went into the BUI, and did the upgrade to 2011.1.1.1. No problem. Now I wanted to bring the other one up and do the same thing. However, I knew that controller B in a cluster must be in the initial, factory-reset state in order to be joined to a cluster. You can't configure it first; if you do, you must factory-reset it before it can join a cluster. So I brought controller B up, but did not configure it, and went to controller A to start the cluster setup process. Big mistake. The process starts, but because the two controllers are on two different software versions, the cluster process cannot continue. This hosed me (that's southern California slang for "messed me up"), because controller B had now started the cluster setup process, and the serial connection just showed it hung in a "configuring cluster" state. Rebooting it did not help, as it was still in the "configuring cluster" state once it came back up.

So.... now I have two choices. I can downgrade controller A back to 2010.Q3.4.2, or I can factory-reset controller B, bring it up as a single controller, upgrade it to 2011.1.1.1, factory-reset it again, and then finally add it to the cluster via controller A's cluster setup process. I opt for the second choice, as I do not want to downgrade controller A, which is working just fine. Remember, controller B is currently hosed, messed up, or wanked, depending on how you want to say it.
It's stuck. So to get it back to a state I can work with, I need to do the trick I talked about way back in this blog on May 31, 2011 (http://blogs.oracle.com/7000tips/entry/how_to_reset_passwords_on). I had to use the GRUB menu, use the -c trick on the kernel line, and reset the machine, erasing all configuration on it. Now I could bring it up as a single controller, upgrade it, factory-reset it, and then have it join the cluster. That all worked fine; it just took me two hours to do it all.

Here's what I should have done.

Bring up controller A, configure it, and log into the BUI. Now bring up controller B. Do NOT configure it in any way. Using controller A, set up clustering in the cluster menu.

Once the two controllers are clustered and all is well, NOW go ahead and upgrade controller A to the latest code. Once it reboots, go ahead and upgrade controller B. Everything's fine. You see, if the cluster has already been made, it's perfectly fine to upgrade one controller at a time. The software lets you do that. The software does NOT let you setup a NEW cluster if the controllers are not on the same software level. 
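Boiled down, the rule that bit me is simple: a NEW cluster can only be formed when both controllers run identical software, while an EXISTING cluster happily accepts a one-controller-at-a-time rolling upgrade. Here it is as a toy Python check (entirely my own summary of the behavior, not an appliance API):

```python
# The ordering rule from this post, boiled down to one check. This is my
# own illustration, not anything the appliance exposes: new cluster setup
# only proceeds when both controllers run identical software.

def cluster_setup_allowed(version_a, version_b):
    """A NEW cluster requires matching software on both controllers."""
    return version_a == version_b

# What I did (and shouldn't have): upgrade A first, then try to cluster.
print(cluster_setup_allowed("2011.1.1.1", "2010.Q3.4.2"))  # False - hung setup
# What I should have done: cluster first, at matching versions...
print(cluster_setup_allowed("2010.Q3.4.2", "2010.Q3.4.2"))  # True
# ...and only then do the rolling upgrade, one controller at a time.
```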

So that is the cluster setup safety tip of the day, kids. Have fun. 

Tuesday Jan 31, 2012

New Power Calculator is up

The Oracle Power Calculator for the new 3TB, 600GB, and 300GB drive versions of the ZFSSA is now up and running.


From the calculator page, you can click on the "Power Calculators" link on top to go back out to the main screen, where you will find power calculators for all of Oracle's hardware.

Friday Jan 20, 2012

New Storage Magazine awards for NAS... Check this out...

Well, it's hard to be quiet about this. Storage Magazine just came out with the January 2012 issue, showing Oracle Storage doing quite well (#1) with the Oracle ZFSSA 7420 and 7320 family. Check out pages 37-43 of this month's Storage Magazine.

Storage Magazine: http://docs.media.bitpipe.com/io_10x/io_103104/item_494970/StoragemagOnlineJan2012final2.pdf (pages 37-43)



This blog is a way for Steve to send out his tips, ideas, links, and general sarcasm. Almost all related to the Oracle 7000, code named ZFSSA, or Amber Road, or Open Storage, or Unified Storage. You are welcome to contact Steve.Tunstall@Oracle.com with any comments or questions

