Saturday Apr 14, 2012

New SPC2 benchmark- The 7420 KILLS it !!!

This is pretty sweet. The new SPC2 benchmark results came out last week, and the 7420 not only came in 2nd of ALL the speed scores, but also came in #1 for price per MBPS.

Check out this table. The 7420 score of 10,704 MBPS makes it really fast, but that's not the best part. The price one would have to pay in order to beat it is ridiculous. You can go see for yourself at http://www.storageperformance.org/results/benchmark_results_spc2
The only system on the whole page that beat it came in at over twice the price per MBPS. Very sweet for Oracle.
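
For what it's worth, the price-performance metric is nothing more than total tested price divided by throughput. Here is a tiny Python sketch of that math; the price below is a made-up placeholder, NOT the actual submission price, so only the formula itself is meaningful:

throughput_mbps = 10704            # the 7420's SPC-2 throughput from the result above
total_price_usd = 400000.0         # hypothetical placeholder price, not the real figure
price_per_mbps = total_price_usd / throughput_mbps
print("Price-performance: $%.2f per MBPS" % price_per_mbps)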

So let's see, the 7420 is the fastest per $.
The 7420 is the cheapest per MBPS.
The 7420 has incredible, built-in features, management services, analytics, and protocols. It's extremely stable and as a cluster has no single point of failure. It won the Storage Magazine award for best NAS system this year.

So how long will it be before it's the number 1 NAS system in the market? What are the biggest hurdles still stopping the widespread adoption of the ZFSSA? From what I see, it's three things: 1. Administrators' comfort level with older legacy systems. 2. Politics. 3. Past issues with Oracle Support.

I see all of these issues crop up regularly. Number 1 just takes time and education. Number 3 takes time with our new, better, and growing support team. Many of them came from Oracle, and there were growing pains when they went from a straight software model to having to also support hardware. Number 2 is tricky, but it's the job of the sales teams to break through the internal politics and help their clients see the value in Oracle hardware systems. Benchmarks like this will help.

Thursday Apr 12, 2012

Hybrid Columnar Compression

You heard me in the past talk about the HCC feature for Oracle databases. Hybrid Columnar Compression is a fantastic, built-in, free feature of Oracle 11gR2. One used to need an Exadata to make use of it. However, last October, Oracle opened it up and now allows it to work on ANY Oracle DB server running 11gR2, as long as the storage behind it is a ZFSSA for dNFS, or an Axiom for FC.

If you're not sure why this is so cool or what HCC can do for your Oracle database, please check out this presentation. In it, Art will explain HCC, show you what it does, and give you a great idea why it's such a game-changer for those holding lots of historical DB data.

Did I mention it's free? Click here:

http://hcc.zanghosting.com/hcc-demo-swf.html

Monday Apr 02, 2012

New ZFSSA code release - April 2012

A new version of the ZFSSA code was released over the weekend.

In case you have missed a few, we are now on code 2011.1.2.1. This minor update is very important for our friends with the older SAS1 cards in the older 7x10 systems. This 1.2.1 minor release was made specifically for them, and fixes the issue that their SAS1 cards had with the last major release. They can now go ahead and upgrade straight from the 2010.Q3.2.1 code directly to 2011.1.2.1.

If you are on a 7x20 series, and already running 2011.1.2.0, there is no real reason why you need to upgrade to 1.2.1, as it's really only the Pandora SAS1 HBA fix. If you are not already on 1.2.0, then go ahead and upgrade all the way to 2011.1.2.1.

I hope everyone out there is having a good April so far. For my next blog, the plan is to work off the Analytics tips I did last week and expand on which Analytics you really want to keep your eyes on, and also how to set up alerts to watch them for you.

You can read more and keep up on your releases here: https://wikis.oracle.com/display/FishWorks/Software+Updates

Steve 

 

Wednesday Mar 21, 2012

Using all Ten IO slots on a 7420

So I had the opportunity recently to actually use up all ten slots in a clustered 7420 system. This actually uses 20 slots, or 22 if you count the Clustron cards. I thought it was interesting enough to share here. This is at one of my clients here in southern California.

You can see the picture below. We have four SAS HBAs instead of the usual two. This is because we wanted to split up the back-end traffic for different workloads. We have a set of disk trays coming from two SAS cards for nothing but Exadata backups. Then, we have a different set of disk trays coming off of the other two SAS cards for non-Exadata workloads, such as regular user file storage.
We have 2 Infiniband cards which allow us to do a full mesh directly into the back of the nearby, production Exadata, specifically for fast backups and restores over IB. You can see a 3rd IB card here, which is going to be connected to a non-production Exadata for slower backups and restores from it.
The 10Gig card is for client connectivity, allowing other, non-Exadata Oracle databases to make use of the many snapshots and clones that can now be created using the RMAN copies from the original production database coming off the Exadata. This allows a good number of test and development Oracle databases to use these clones without affecting performance of the Exadata at all.
We also have a couple FC HBAs, both for NDMP backups to an Oracle/StorageTek tape library and also for FC clients to come in and use some storage on the 7420.

 Now, if you are adding more cards to your 7420, be aware of which cards you can place in which slots. See the bottom graphic just below the photo. 
Note that the slots are numbered 0-4 for the first 5 cards, then the "C" slot, which is dedicated to the cluster card (called the Clustron), and then another 5 slots numbered 5-9.

Some rules for the slots (summarized in a small sketch after the list):

  • Slots 1 & 8 are automatically populated with the two default SAS cards. The only other slots you can add SAS cards to are 2 & 7.
  • Slots 0 and 9 can only hold FC cards. Nothing else. So if you have four SAS cards, you are now down to only four more slots for your 10Gig and IB cards. Be sure not to waste one of those four slots on an FC card, which can go into 0 or 9 instead.
  • If at all possible, slots should be populated in this order: 9, 0, 7, 2, 6, 3, 5, 4
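
Here is a small Python sketch that simply encodes the slot rules above as data, in case you want to sanity-check a proposed card layout before ordering. It is only an illustration of the bullets, not an official configuration tool, and the card type names are just labels I made up:

# Slot rules for a 7420 controller, taken from the bullets above.
SAS_DEFAULT_SLOTS = {1, 8}         # always hold the two default SAS HBAs
SAS_EXTRA_SLOTS   = {2, 7}         # the only other slots that can take SAS HBAs
FC_ONLY_SLOTS     = {0, 9}         # these two slots accept FC cards and nothing else
POPULATION_ORDER  = [9, 0, 7, 2, 6, 3, 5, 4]   # preferred fill order from the text

def slot_ok(slot, card):
    """Return True if a card of this type ('SAS', 'FC', 'IB', '10GbE') may go in this slot."""
    if slot in SAS_DEFAULT_SLOTS:
        return card == "SAS"               # already taken by a default SAS HBA
    if card == "SAS":
        return slot in SAS_EXTRA_SLOTS     # extra SAS HBAs only in slots 2 and 7
    if slot in FC_ONLY_SLOTS:
        return card == "FC"                # slots 0 and 9 are FC only
    return True                            # remaining slots take FC, IB, or 10GbE

print(slot_ok(0, "IB"))    # False - slots 0 and 9 are FC only
print(slot_ok(7, "SAS"))   # True  - slot 7 can take an extra SAS HBA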




Monday Mar 12, 2012

Good papers and links for the ZFSSA

So I have a pretty good collection of links and papers for the ZFSSA, and instead of giving them out one-at-a-time when asked, I thought it may be easier to do it this way. Many of the links from my old blog last May no longer work, so here is an updated list of some good spots to check out.

These are for ZFS in general, not the ZFSSA, but they give good insight into how ZFS functions:


Tuesday Mar 06, 2012

New 7420 hardware released today

Some great new upgrades to the 7420 were announced and released today. You can now get 10-core CPUs in your 7420, allowing you to have 40 cores in each controller. Even better, you can now also go to a huge 1TB of DRAM for your L1ARC in each controller, using the new 16GB DRAM modules.

So your new choices for the new 7420 hardware are 4 x 8-core or 4 x 10-core models. Oracle is no longer going to sell the 2 x CPU models, and they are also going to stop selling the 6-core CPUs, both as of May 31st. Also, you can now order 8GB or 16GB modules, meaning that the minimum amount of memory is now 128GB, and can go to 1TB in each controller. No more 64GB, as the 4GB module has also been phased out (starting today, actually).

Now before you get upset that you can no longer get the 2-CPU model, be aware that there was also a price drop, so the 4 x 8-core CPU model is a tad LESS than the old 2 x 8-core CPU model. So stop complaining.

It's the DRAM that I'm most excited about. I don't have a single ZFSSA client that I know of that has a CPU bottleneck. So the extra cores are great, but not amazing. What I really like is that my L1ARC can now be a whole 1TB. That's crazy, and will be able to drive some fantastic workloads. I can now place your whole, say 800GB, database entirely in DRAM cache, and not even have to go to the L2ARC on SSDs in order to hit 99% of your reads. That's sweet. 

Friday Feb 24, 2012

New ZFSSA code release today

The first minor release of the 2011.1.1 major release for the ZFSSA came out yesterday.

You can get the code via MOS, under the "Patches and updates" tab. Just click the "Product or Family (advanced)" link, and then type "ZFS" in the search window and it takes you right to it. Or search on its patch ID, which is 13772123.

Along with some other fixes, the most important piece of this update is the RPC flow control fix, which will greatly help those using the ZFSSA to back up an Exadata over InfiniBand.

If you're not already on the major release of 2011.1.1, I urge you to update to it as soon as you can. You can jump right to this new 2011.1.1.1 code, as long as you are already on 2010.Q3.2.1 or higher. You don't need to go to 2011.1.1 first, just jump to 2011.1.1.1.

If you are using your ZFSSA to back up an Exadata, I urge you to get on 2011.1.1.1 ASAP, even if it means staying late and scheduling special time to do it.

It's also important to note that if you have a much older ZFSSA (one of the 7x10 models that are using the older SAS1 HBAs, and not the SAS2 HBAs), you should NOT upgrade to the 2011.1 code. The latest code that supports your SAS1 systems is 2010.Q3.4.2.

 **Update 2-26-12: I noted a few folks saying the link was down; however, that may have been a burp in the system, as I just went into MOS and was able to get 2011.1.1.1 just fine. So delete your cookies and try again. - Steve

Thursday Feb 23, 2012

Great new 7320 benchmark

A great new benchmark has been put up on SPEC for our mid-class 7320. You can see it here:

http://www.spec.org/sfs2008/results/res2012q1/sfs2008-20120206-00207.html

What's cool about this benchmark is the fact that this is not only our middle-sized box, but that it used only 136 drives to reach this rather high 134,140 NFS Ops/sec number. If you look at the other systems tested here, you will notice that they must use MANY more drives (presumably at a much higher cost) in order to meet or beat those IOPS.

Check these out here... http://www.spec.org/sfs2008/results/sfs2008nfs.html

For example, a FAS6080 should be far faster than our smaller 7320, right? But it only scored 120,011, even though it used 324 disks. The Isilon S200 with 14 nodes and 679 drives only scored 115,911. I would hate to find out what that system's street price is. I'm pretty sure it's higher than our 7320 with 136 drives. Now, of course all of these benchmark numbers are unrealistic to most people, as they are done in perfect conditions with each manufacturer's engineers tuning and tweaking the system the best they can, right? True, but if that's the case, and the other folks tuned and configured those other boxes just like we did, it still seems like a fair fight to me, and our results are head and shoulders above the rest on a cost-per-IOP basis. I don't see anything on this site that touches our IOPS with the same number of drives and presumably the same price range. Please point it out if I missed anything here; I might be wrong.
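
To put some numbers behind that "same amount of drives" argument, here is a quick ops-per-drive calculation in Python, using only the published figures quoted above. Street prices are not public, so this only normalizes by spindle count, not by cost:

# NFS ops/sec and drive counts, straight from the SPECsfs2008 results quoted above.
results = {
    "Oracle 7320":             (134140, 136),
    "NetApp FAS6080":          (120011, 324),
    "Isilon S200 (14 nodes)":  (115911, 679),
}
for name, (ops, drives) in results.items():
    print("%-24s %6.0f NFS ops/sec per drive" % (name, ops / drives))

# Oracle 7320                 986 NFS ops/sec per drive
# NetApp FAS6080              370 NFS ops/sec per drive
# Isilon S200 (14 nodes)      171 NFS ops/sec per drive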

I really love the ones that go so far overboard on this site... Check out the 140 node Isilon. Let's see... Wow, it's over one million IOPS!!!! That's impressive, until you see it's using 3,360 disk drives. That's funny. PLEASE let me know if you have a 140 node Isilon up and running. I'd love to see it. I'd also love to know what it costs.

Tuesday Feb 07, 2012

Tip- Setting up a new cluster

I haven't given out a real tip for a while now, but this issue popped up on me last week, so I thought I would pass it along. I had a horrible time setting up a new 7320 cluster, for the sole reason that I screwed it up by not doing it in the right order. This caused my install, which should have been done in 1 hour, to take me over 3 hours to complete.

So let me tell you what I did wrong, and then I'll tell you the way I should have done it.

Out of the box, my client's two new 7320 controller heads were one software revision behind, at 2010.Q3.4.2, so I wanted to upgrade them to the newest version, 2011.1.1.1. So far, so good, right? Well, here was my mistake. I configured controller A via the serial interface, gave it IP numbers, went into the BUI, and did the upgrade to 2011.1.1.1. No problem. Now, I wanted to bring the other one up and do the same thing. However, I knew that controller B in a cluster must be in the initial, factory-reset state in order to be joined to a cluster. You can't configure it first, or if you do, you must factory-reset it in order to join a cluster. So I bring controller B up, but I don't configure it, and I go to controller A to start the cluster setup process. Big mistake. The process starts, but because the two controllers are on two different software versions, the cluster process cannot continue. This hoses me (that's southern California slang for "messes me up"), because now controller B has started the cluster setup process, and going to the serial connection just has it hung up in a "configuring cluster" state. Rebooting it does not help, as it's still in the "configuring cluster" state once it comes back up.

So.... now I have 2 choices. I can downgrade controller A back to 2010.Q3.4.2, or I can factory-reset controller B, bring it up as a single controller, upgrade it to 2011.1.1.1, then factory-reset it again, and then finally be able to add it to the cluster via controller A's cluster setup process. I opt for the second choice, as I do not want to downgrade controller A, which is working just fine. Remember, controller B is currently hosed, messed up, or wanked, depending on how you want to say it.
It's stuck. So to get it back to a state I can work with, I need to do the trick I talked about way back in this blog on May 31, 2011 (http://blogs.oracle.com/7000tips/entry/how_to_reset_passwords_on). I had to use the GRUB menu, use the -c trick on the kernel line, and reset the machine and erase all configuration on it. Now I could bring it up as a single controller, upgrade it, factory-reset it, and then have it join the cluster. That all worked fine, it just took me two hours to do it all.

Here's what I should have done.

Bring up controller A, configure it, and log into the BUI. Now bring up controller B. Do NOT configure it in any way. Using controller A, set up clustering in the cluster menu.

Once the two controllers are clustered and all is well, NOW go ahead and upgrade controller A to the latest code. Once it reboots, go ahead and upgrade controller B. Everything's fine. You see, if the cluster has already been made, it's perfectly fine to upgrade one controller at a time. The software lets you do that. The software does NOT let you set up a NEW cluster if the controllers are not on the same software level.
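
If it helps to remember the rule, here it is boiled down into a couple of illustrative Python checks. The function names and version strings are just my own labels for the lesson, not anything from the appliance software:

def can_setup_new_cluster(version_a, version_b, b_is_factory_fresh):
    # A NEW cluster only forms when both controllers run the same release
    # and the joining controller has never been configured.
    return version_a == version_b and b_is_factory_fresh

def can_rolling_upgrade(cluster_already_formed):
    # Once the cluster exists, upgrading one controller at a time is fine.
    return cluster_already_formed

print(can_setup_new_cluster("2011.1.1.1", "2010.Q3.4.2", True))   # False - my mistake above
print(can_setup_new_cluster("2010.Q3.4.2", "2010.Q3.4.2", True))  # True  - cluster first...
print(can_rolling_upgrade(True))                                   # ...then upgrade each side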

So that is the cluster setup safety tip of the day, kids. Have fun. 

Tuesday Jan 31, 2012

New Power Calculator is up

The Oracle Power Calculator for the new 3TB, 600GB, and 300GB drive versions of the ZFSSA is now up and running.

http://www.oracle.com/us/products/servers-storage/sun-power-calculators/calc/s7420-power-calculator-180618.html

From this page, you can click on the "Power Calculators" link on top to go back out to the main screen, where you will find power calculators for all of Oracle's hardware.

Friday Jan 20, 2012

New Storage Magazine awards for NAS... Check this out...

Well, it's hard to be quiet about this. Storage Magazine just came out with the January 2012 issue, showing Oracle Storage doing quite well (#1) with the Oracle ZFSSA 7420 and 7320 family. Check out pages 37-43 of this month's Storage Magazine.

Storage Magazine: http://docs.media.bitpipe.com/io_10x/io_103104/item_494970/StoragemagOnlineJan2012final2.pdf (pages 37-43)


Thursday Jan 12, 2012

New ZFSSA simulator download

I've just been informed that the simulator download has been updated to the latest version of 2011.1.1.

So instead of trying to upgrade your older simulator, you can download and install the new one already at the latest code. Mine upgraded just fine, but some people have reported errors during the upgrade, usually when the computer or laptop running it doesn't have enough memory, or because of a variety of other problems. You can get the simulator here:

http://www.oracle.com/webapps/dialogue/ns/dlgwelcome.jsp?p_ext=Y&p_dlg_id=10521841&src=7299332&Act=45

Tuesday Jan 10, 2012

Even more ZFSSA announcements

The new announcements for the ZFSSA just keep on coming.

Oracle released the 3TB drives for the 7420 and 7320 disk trays today. So you can now choose 2TB and 3TB 7,200 RPM drives and 300GB and 600GB 15,000 RPM drives in your 7420 and 7320 systems.

Now, the 2TB drives have a last order date of May 31, 2012, so after that it will be 3TB only for the slower-speed drives.

Also, has anyone checked out the new local replication feature that just came out in the 2011.1.1 software release? I'm going to play with it this week and I'll do a write up on it soon.

Steve 

Thursday Jan 05, 2012

New ZFSSA firmware release is available in MOS

 

In case you have not been paying attention, the new 2011.1.1.0 software release for the ZFSSA is out and available for download inside the My Oracle Support website.

To find it, go to the "Patches & Updates" tab, and then do the advanced family search. Type in "ZFSSA" and it will take you right to it (choose 2011.1 in the next submenu).

You need to have your systems on 2010.Q3.2.1 or greater in order to upgrade to 2011.1.1, so be prepared.

It also includes a new OEM Grid Control plug-in for the ZFSSA.

Here are some details about it from the readme file: 

Sun ZFS Storage Software 2011.1.1.0 (ak-2011.04.24.1.0)

This major software update for Sun ZFS Storage Appliances contains numerous bug fixes and important firmware upgrades. Please carefully review all release notes below prior to updating.
Seven separate patches are provided for the 2011.1.1.0 release:

Features

This release includes a variety of new features, including:

  • Improved RMAN support for Oracle Exadata
  • Improved ACL interoperability with SMB
  • Replication enhancements - including self-replication
  • InfiniBand enhancements - including better connectivity to Oracle Exalogic
  • Datalink configuration enhancements - including custom jumbogram MTUs
  • Improved fault diagnosis - including support for a variety of additional alerts
  • Per-share rstchown support

Performance

This release also includes major performance improvements, including:

  • Significant cluster rejoin performance improvements
  • Significant AD Domain Controller failover time improvements
  • Support for level-2 SMB Oplocks
  • Significant zpool import speed improvements
  • Significant NFS, iSER, iSCSI and Fibre Channel performance improvements due to elimination of data copying in critical datapaths
  • ZFS RAIDZ read performance improvements
  • Significant fairness improvements during ZFS resilver operations
  • Significant Ethernet VLAN performance improvements

Bug Fixes

This release includes numerous bug fixes, including:

  • Significant clustering stability fixes
  • ZFS aclmode support restored and enhanced
  • Assorted user interface and online help fixes
  • Significant ZFS, NFS, SMB and FMA stability fixes
  • Significant InfiniBand, iSER, iSCSI and Fibre Channel stability fixes
  • Important firmware updates

Wednesday Dec 28, 2011

New Storage Eye Charts

My new Storage Eye Chart is out. You can get it from the bookmark link on the right-hand side of this page.

Version 10 adds the Axiom and 2500M2 to a new page and also updates the ZFSSA with the new updates.

I hope everyone out there has a very happy New Year. See you in January. 

Tuesday Dec 06, 2011

New SSDs announced today

Thought you should know about the 3 new announcements for the ZFSSA.

--Write-flash-cache SSDs have gone from 18GB to 73GB each.
--New long-range transceivers for the 10GigE cards are now available.
--3TB drives for the 7120 model are here today. The 3TB drives for the 7320 and 7420 are NOT here yet, but close.


Overview
Effective December 6, 2011, we are pleased to announce three new options for Oracle’s Sun ZFS Storage portfolio:
1. Availability of a 73GB Write Flash Cache for 7320 and 7420. This new SSD features 4X the capacity and almost double the write throughput and IOPS performance of its predecessor. In comparison to the current 18GB SSD, this new 73GB SSD significantly enhances the system write speed. As an example, a recent test on a particular 7420 system demonstrated a 7% improvement in system write performance while using half the number of SSDs. The 73GB SSD is also available to our customers at a lower list price point. This is available as an ATO or X Option.
2. Availability of the standard Sun 10 GbE Long Range Transceiver for the current 1109A-Z 10GbE card as a configurable option for ZFS Storage Appliance.  This Long Range Transceiver enables 10 GbE optical connectivity for distances greater than 2 KM.
3. Availability of a new 7120 base model featuring integrated 11 x 3TB HDDs and a 73GB Write Flash Cache.  (Note that availability of the 3TB drive is limited to the 7120 base model internal storage only – it is not available in the disk shelves at this time.)


Additionally, we are announcing End-of-Life for the following two items:
1. 2TB drive-equipped base model of the 7120, with a Last Order Date of December 31, 2011.
2. 18GB Write Flash Cache, with a Last Order Date of January 10, 2012.

Thursday Oct 20, 2011

Shadow Migration

Still not talking about VDEVs? I know, I know, but hey, there's only so many hours in a day, folks, and I do have a life... So something came up this week and I want to talk about Shadow Migration, instead.

Now, built into the ZFSSA you have both Replication and Shadow Migration. Be sure to use the right one for the right job. Replication is used from one 7000 family system to a different 7000 system. This is important: it can NOT be used between two clustered controllers of the same system. That will mess you up. It is only for other 7000s, and cannot replicate to anything other than another ZFSSA. ***UPDATE- This is no longer the case. Replication inside the same system between two clustered controllers has been supported since October 2012.

Shadow Migration, on the other hand, is really handy both for migrating data from any non-ZFSSA NFS source (think a filer made by someone other than Oracle), and for migrating from a different pool between controllers on the SAME clustered ZFSSA system. This can be very cool when you have an important share on one pool, and you want to move it (and the data inside it) to a different pool. Maybe it's because you want it on your RAIDz2 pool instead of your Mirrored pool. Maybe it's because you want ControllerA in charge of the share but it got made months ago by mistake in the pool owned by ControllerB. I don't care; you just want data from some share, either local to the system or from an NFS share on a different system, to come over into a brand-new share in some pool. Maybe you want to suck in the data from an older, non-Oracle filer, but you know it will take a while, and you want people to still be able to get to the data while the migration is taking place.

Great. That's Shadow Migration. It can get data either from a local source (another share on the same system) or from any NFS mount from anywhere. While the migration is taking place, the original source turns read-only, and users start to mount and use your new share as it's being created. If the data being requested by a user has not been migrated over yet, the ZFSSA will go get it, while continuing to migrate in the background.
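
If it helps to picture the mechanism, here is a toy Python sketch of that read-through idea: serve a file from the new location if it has already been copied, otherwise pull it from the (read-only) source on demand and keep it, while a background sweep migrates the rest. This is purely conceptual, with class and path names I invented; it is not how the appliance actually implements Shadow Migration:

import os, shutil

class ShadowShare:
    """Toy model of a new share that shadows a read-only source directory."""
    def __init__(self, source_dir, target_dir):
        self.source, self.target = source_dir, target_dir

    def read(self, relpath):
        local = os.path.join(self.target, relpath)
        if not os.path.exists(local):                        # not migrated yet
            os.makedirs(os.path.dirname(local), exist_ok=True)
            shutil.copy2(os.path.join(self.source, relpath), local)
        with open(local, "rb") as f:                          # clients always read the new share
            return f.read()

    def background_sweep(self):
        # Walk the source and copy anything that has not been pulled over yet.
        for root, _dirs, files in os.walk(self.source):
            for name in files:
                rel = os.path.relpath(os.path.join(root, name), self.source)
                self.read(rel)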

Here's how to do a Local Shadow Migration, moving data from a share in one pool to another pool on the same system.

1. Check out the Shadow Migration service. Under Services, one can change how many threads the background service will use to do the migration. Make sure the green light is on here, while you're at it. **Update: I have been told that our internal team took this down from 8 to 2 for our large (13PB) migrations from various older filers to new 7000s for our Oracle data center. Oracle IT and our Oracle DC are now 100% ZFSSA.

2. I have a share called Share1A, inside Pool1, which is a mirrored pool. Note that I have about 85MB of stuff in it.
Be careful NOT to choose the replication area from here, or at all, from anywhere. You're not doing replication, remember? 
Do not confuse replication with shadow migration.

3. Now, I don't want that data inside pool1, I really want it in Pool2, which is a RAIDz1 pool. So, switch to Pool2, and create a brand-new share, just like normal.
Change pools with the Pools drop-down in the upper left, then click the plus sign.

 4. Now, in the new Share box, first choose the pool you want the new share to be in, and then be sure to choose "LOCAL" as your data migration source.
Instead of typing in the path to some external NFS share, you will type in the local path of another share on the same system, in this case it's "/export/Share1A"

5. Now it gets cool. Check out my new Shadow1 share. As the migration begins (right away), you will see the progress bar here on the left. You can actually stop it, and even change the source from here, mid-stream (although that would be strange and I don't think I would recommend that).  ***Update: To be fair, it was explained to me that this process may take a while to start. The process may have to read a large amount of metadata before you see the bar move. If you have very large directories in the share, especially at the top, then be patient.

6. When the migration is done (The Local version should go quite quickly), the Shadow Migration section goes away, and you will get an alert message on the top of the screen like this:

7. Also, you can view some Shadow Migration specific Analytics while it's running:

8. Now that it's done, I have 2 shares. My original Share1A, and my new Shadow1 in a different pool with the same data copied over.
I could now delete the first share or pool in order to rebuild the pool a different way. Or, if this was a migration from an older filer, I could re-purpose that filer as a nice planter in my garden.


Wednesday Oct 12, 2011

ARC- Adaptive Replacement Cache

I know, I know, I told you I was going to talk about the very important VDEVs next, but this other article came up in another blog, and it’s a rather good read about the ZFSSA cache system, called our ARC, or Adaptive Responsive Cache.

So, if you want to learn more about the ARC in a ZFSSA, go check it out. Our ARC has two levels. Level 1 ARC is our RAM. Almost the entire RAM in a ZFSSA is used for data caching, and that's the ARC, or L1ARC. Now, we go further by having an L2ARC. Once RAM is full, our L2ARC can hold even more cache by using any Readzillas you have in the system. That's right; our Readzilla SSDs are the L2ARC. We use SSDs for cache, not as storage space. (Logzillas, on the other hand, are for fast synchronous write acknowledgements, and have nothing to do with the ARC at all).

So a 7420 with 512GB of memory and four Readzillas has about a 500GB L1ARC and a 2TB L2ARC to work with. 500GB of that 2.5TB of space will be nanosecond speed, while 2TB of it will be microsecond speed. Still much faster than the millisecond speed you get when you have to go get data off a hard drive.
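
Here is the back-of-envelope math behind that example. The OS overhead and the per-Readzilla capacity below are my own assumptions, picked so the totals line up with the 500GB / 2TB / 2.5TB figures in the post:

dram_gb         = 512      # DRAM per controller in this example
os_overhead_gb  = 12       # assumed headroom for the OS and metadata
readzilla_gb    = 512      # assumed capacity per Readzilla SSD
readzilla_count = 4

l1arc_gb = dram_gb - os_overhead_gb          # RAM-based cache, nanosecond class
l2arc_gb = readzilla_gb * readzilla_count    # SSD-based cache, microsecond class
print("L1ARC ~%dGB, L2ARC ~%dGB, total cache ~%.1fTB"
      % (l1arc_gb, l2arc_gb, (l1arc_gb + l2arc_gb) / 1024.0))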

So cache is cool, and it's nice to have a high cache hit ratio, and it's easier to have a high cache hit ratio if you have more cache, right? With the new, lower-priced Readzillas, this should be easier to do.

Now, this other blog I’m pointing you to says we call our cache something else, but don’t worry about it, we use the name “Adaptive responsive Cache” in the Oracle ZFSSA world.

Go check out:
http://www.c0t0d0s0.org/archives/5329-Some-insight-into-the-read-cache-of-ZFS-or-The-ARC.html

Ok, VDEVs will come next!

Maybe.

Steve

Tuesday Oct 11, 2011

Where can you find info on updates?

Someone asked where one could find info on what is updated in each update. Once you download any update from MOS, there is a readme file inside of it with this info.

However, if you want to see the readme file first, go here:
http://wikis.sun.com/display/FishWorks/Software+Updates

Thursday Oct 06, 2011

New ZFSSA code release today - 2010.Q3.4.2

A new code was released on MOS today. We are now on code 2010.Q3.4.2. (ak-2010.08.17.4.2)

Our minimum recommended version is still 2010.Q3.4.0, but if you have the time and opportunity to upgrade to this new Q3.4.2 release, it would be a very good idea. It includes many minor bug fixes. You can view the readme file it comes with to see what it includes.

Download it under the "Patches & Updates" tab in My Oracle Support. 

Tuesday Oct 04, 2011

How to calculate your usable space on a ZFSSA

So let’s say you’re trying to figure out the best way to setup your storage pools on a ZFSSA. So many choices. You can have a Mirrored pool, a RAIDz1, RAIDz2, or RAIDz3 pool, a simple striped pool, or (if you’re REALLY anal) you can even have a Triple Mirrored pool.

How can you choose which pool to make? What if you want more than one pool on your system? How much usable space will you have when it’s all done?

All of these questions can be answered with Ryan Mathew’s Size Calculator. Ryan made a great calculator a while back that allows one to use the ZFSSA engine to give you back all sorts of pool results. You simply enter how many disk trays you have, what size drives they are, how many pools you want to make, and the calculator does the rest. It even shows you a nice graphical layout of your trays. Now, it’s not as easy as a webpage, but it’s not too bad, I promise. It’s a python script, but don’t let that scare you. I never used Python before I got my hands on this calculator, and it was worth loading it up for this. First, you need to go download and install Python 2.6 here: http://www.python.org/getit/releases/2.6/ Make sure you have 2.6 installed, as the calculator will not work with the newer 3.0 Python. In fact, I had both loaded, and had to completely uninstall 3.0 before it would work with my installed 2.6.

Now, get your hands on the Size Calc script. Ryan is making a new one that is for the general public. It will be out soon. In the meantime, ask your local Oracle Storage SC to do a calculation for you.

This is a copy from Ryan’s, but I fixed a few things to make it work on my Windows 7 laptop. If you’re not using Windows 7, you may find Ryan’s original blog and files here: http://blogs.oracle.com/rdm/entry/capacity_sizing_on_7x20

So now you’re ready. Go to a command line and get to the Python26 directory, where you have also placed the “size3.py” script.

Type “size3.py ZFSipaddress password 20”
Use your ZFSSA for the IP address and your root password for the password. You can use the simulator for this. Remember, the simulator is the real code and has no idea it's not a 'real' system.

Mine looks like this: “Size3.py 192.168.56.102 changeme 20” Now, you will see the calculator present a single tray with 20 drives, and all the types of pools you can make with that.

So now, make it bigger. Along with the first tray that has 20 drives (because of the Logzillas, right?), we also want to add a 2nd and a 3rd tray, each full with 24 drives. So type “Size3.py 192.168.56.102 changeme 20 24 24”  You could do this all day long. Notice that now you have some extra choices, as the NSPF (no single point of failure) pools are now allowed, since you have more than two trays.

That’s it for the basics. Pretty simple. Now, we can get more complicated. Say you don’t want one big pool, but want to have an active/active cluster with two pools. Type “Size3.py 192.168.56.102 changeme 10/10 12/12 12/12”


This will create two even pools. They don’t have to be even. Check this out. I want to make two pools, one with the first 2 disk trays with 8 logzillas plus half of full trays 3 and 4. So the second pool would only be the other half of trays 3 and 4. I used “Size3.py 192.168.56.102 changeme 20/0 20/0 12/12 12/12”

Here's the last one for today. Say you already have a 2-disk-shelf system, with 2 pools, and you set it up like this: "Size3.py 192.168.56.102 changeme 10/10 12/12" Simple. Now, you go out and buy another tray of 24 drives, and you want to add 12 drives to each pool. You can use the "add" command to add a tray onto an existing system. It's very possible that adding a tray will give you different results than if you configured 3 trays to begin with, so be careful. This is a good example. Note that you get different results if you do "10/10 12/12 12/12" than if you do "10/10 12/12 add 12/12".
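
If you are curious what the calculator is roughly doing under the covers, here is a very simplified Python sketch of the usable-space math. It ignores spares, Logzilla slots, the metadata reserve, and drive right-sizing, and the vdev widths in the example are just numbers I picked, so treat the output as ballpark only; Ryan's calculator is still the real answer:

def usable_tb(data_drives, drive_tb, layout, vdev_width=None):
    """Rough usable capacity for common pool layouts (no spares, no reserve)."""
    if layout == "stripe":
        return data_drives * drive_tb
    if layout == "mirror":
        return (data_drives // 2) * drive_tb
    if layout == "triple_mirror":
        return (data_drives // 3) * drive_tb
    parity = {"raidz1": 1, "raidz2": 2, "raidz3": 3}[layout]
    vdevs = data_drives // vdev_width
    return vdevs * (vdev_width - parity) * drive_tb

# e.g. 44 data drives of 2TB each, with example vdev widths:
for layout, width in [("mirror", None), ("raidz1", 4), ("raidz2", 11)]:
    print(layout, usable_tb(44, 2, layout, width), "TB usable (roughly)")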

Our next lesson will be about VDEVs. When you add the "-v" option right after "size3.py", you may notice a new column in the output called "VDEVS". These are the most important aspect of your pool. It's very important to understand what they are, how many you need, and how many you have.

It's so important, I'm going to save it for another blog topic. Have a great day!!!! :)

Monday Oct 03, 2011

New SPC benchmark for the 7420

Oracle announced today a great new benchmark on SPC (Storage Performance Council) for our 7420. Instead of re-writing everything already written, please go see this excellent blog entry by Roch at http://blogs.oracle.com/roch/entry/fast_safe_cheap_pick_3

It explains the new results and why they're so cool.

Go to http://www.storageperformance.org/results/benchmark_results_spc1 to see the results. Scroll down to the "O" section for Oracle, and the 7420 result is the first one.

Friday Sep 30, 2011

Hooray! The HCC announcement came out today...

Check this out.

http://www.oracle.com/us/corporate/press/508020

The 7000 now supports the Oracle 11gR2 HCC feature. This is something that, until now, you could only get inside an Exadata. Now, one can use HCC and see huge savings not only in space used, but also in performance, as long as your database is stored on the ZFSSA 7000 or Axiom 600 Oracle hardware families.

Very cool. No license. It's ready to go.

Friday Sep 16, 2011

Resetting your 7420 password

So, you have a 7420 demo or test system, and either forgot which password you used or it came from another department or company and it still has a password.

You're up a creek, right?

No, there is a way to fix it. We had to do this with a demo box that was wiped clean of all data, but still had a non-standard password on it. We could have sent it back for the demo pool engineers to fix, but then we would have had to wait a week or more to start testing. So here is a document I made with the steps I took to fix both the ILOM password and the Fishworks Appliance Kit password on this 7420.

Disclaimer--- Yes, physical security is VERY IMPORTANT. If someone can touch your system, they can do this. Of course, if someone can touch your system, they can also unplug it, remove the drives, or un-rack it and take it, couldn't they? 

http://blogs.oracle.com/7000tips/resource/Reset7420password.pdf

Monday Sep 12, 2011

New Q3.4.1 code out now for the 7000 ZFSSA family

Code release Q3.4.1 is available now on MOS for download.

This is very exciting news, even if you don't need to upgrade. Why? Because Q3.4.1 contains NO bug fixes. The only thing it does is allow your system to use the new drive trays with 15K-speed drives.

You can mix-n-match drive trays in the same system, and create a new, faster ZFS pool using the 15K drives in the same system you already have. You do not want to mix the high-capacity, slower drives with these in the same pool, however. Make sure they are in different pools to be most effective. Not everyone needs these faster drives to drive their performance, as the ZFSSA really tries to drive performance with its huge amount of L1 and L2 cache. However, some people run workloads that either fill up the cache, or bypass the Logzillas during writes altogether, and the faster, 15K-speed spindles will matter a lot to them.

Steve 

Tuesday Aug 02, 2011

Important - Clean your OS drives before you update your systems.

Some folks out there are not reading the readme file before they update, and are running into trouble. Remember the "Important Safety Tip" line in Ghostbusters about not crossing the streams? This is kind of like that.

Have you cleaned up your OS drives lately?

It's important. You should do this every now and then, but especially before you do a big update, such as the update to Q3.4 (which may have you running multiple updates, as explained in my previous post).

You want to remove system updates that are getting old. You may want to keep the previous system software in order to roll back, but do you really need the one before that and the one before that? Are you really going to roll back your 7000 to the system code you used in September 2010? I doubt it. Don't be a hoarder. Delete it. Call Dr. Drew if you need to.
Then, let's check your Analytics datasets. These can get big before you know it. If you leave them running all the time, even when you don't need to collect data, you're asking for trouble. Either only run them when you need to collect data (you can do this manually or via a script or alert), or for goodness' sake export them as a .CSV file at the end of each month and delete them off the system so they don't get too large. There have been reports of problems with systems once these files got much too large. Part of those issues have been addressed in Q3.4, but you still need to keep things clean.
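
If you want a quick way to spot the offenders, here is a small Python sketch that parses a listing in the same format as the "analytics datasets show" output shown in the readme excerpt further down, and flags anything over the 2G guideline. The listing below is sample text (dataset-003 is a made-up oversized entry so the filter has something to flag); paste in your own output:

import re

listing = """
dataset-000 active 745K 2.19M arc.accesses[hit/miss]
dataset-001 active 316K 1.31G arc.l2_accesses[hit/miss]
dataset-002 active 238K 428K arc.l2_size
dataset-003 active 1.05M 4.57G nfs3.ops[op]
"""

UNITS = {"K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}   # rough decimal conversion is fine here

def to_bytes(size):
    m = re.match(r"([\d.]+)([KMGT])", size)
    return float(m.group(1)) * UNITS[m.group(2)]

for line in listing.strip().splitlines():
    name, state, incore, ondisk, dataset = line.split(None, 4)
    if to_bytes(ondisk) > 2e9:
        print("%s (%s) is %s on disk - consider exporting it to CSV and deleting it"
              % (name, dataset, ondisk))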

Very bad things will happen if you let your OS drives fill up. Think of all the atoms of your body moving away from each other at the speed of light. Furthermore, when you do an update, the update will go much more smoothly if you clean up the OS drives first.
See the readme section below, which I'm sure you forgot to read...

The following section comes from the Q3.4 readme file:

1.4 System Disk Cleanup

Remove any old unused Analytics datasets, especially any over 2G in size. The following command can be used to list the datasets.

node:> analytics datasets show
Datasets:

DATASET STATE INCORE ONDISK NAME
dataset-000 active 745K 2.19M arc.accesses[hit/miss]
dataset-001 active 316K 1.31G arc.l2_accesses[hit/miss]
dataset-002 active 238K 428K arc.l2_size
dataset-003 active 238K 428K arc.size
dataset-004 active 1.05M 2.80M arc.size[component]
dataset-005 active 238K 428K cpu.utilization

Also remove old support bundles or old software updates. It's important that the system disk has at least 20 to 25 percent free space. The following commands can be used to compare the amount of free space to the system disk size. In this case, 289G free / 466G size = 0.62 or 62% free space, which is reasonable. If you have trouble getting enough system disk free space, call Oracle Support.

node:> maintenance system disks show
Properties:
                       profile = mirror
                          root = 4.44G
                           var = 14.3G
                        update = 1.12G
                         stash = 1.38G
                          dump = 131G
                         cores = 29.7M
                       unknown = 15.0G
                          free = 289G

Disks:

DISK        LABEL      STATE
disk-000    HDD 1      healthy
disk-001    HDD 0      healthy

node:> maintenance hardware select chassis-000 select disk select disk-000 show
Properties:
                         label = HDD 0
                       present = true
                       faulted = false
                  manufacturer = SEAGATE
                         model = ST95000NSSUN500G
                        serial = 9SP123EQ
                      revision = SF03
                          size = 466G
                          type = data
                           use = system
                        device = c2t0d0
                     interface = SATA
                        locate = false
                       offline = false
 
 

Friday Jul 08, 2011

Oracle DB high availability paper for ZFSSA

I read a great whitepaper today on best practices for snapshots and clones of an Oracle database on a ZFSSA.

If you use Oracle software and are looking at the ZFSSA to store it, you'll want to check out this paper.

**Sorry, I had the wrong link... Fixed... Thanks to whoever pointed that out**

http://www.oracle.com/technetwork/database/features/availability/maa-db-clone-szfssa-172997.pdf


Tuesday Jun 21, 2011

New price reduction on some ZFSSA parts

For those of you keeping track, you may have noticed a price reduction today on the ZFSSA Readzillas, Logzillas, and spinning drives. Ask your friendly neighborhood Sales Consultant for a new quote.

Sweet.

 

Wednesday Jun 08, 2011

Upgrade to Q3.3.1 notes -

Ok, so there is a good reason why you folks want to upgrade. These upgrades fix some significant bugs that other clients may have found, and you have just been lucky that they haven't affected you yet. Another reason is that at some point, when you DO want to upgrade, you may be too far behind to upgrade directly to the newest version. Check out this screenshot. In trying to upgrade to Q3.3.1, the update informs me that I won't be able to do this until I upgrade to Q3.2.0.

Just something to be aware of, so you can plan for additional time if you need to upgrade twice during your maintenance window.

 

 

Monday Jun 06, 2011

New Q3.3.1 code- DID YOU KNOW?

Did you know you can also upgrade your 7000 Simulator to the new 2010.Q3.3.1 code that came out last week?
Remember, your 7000 simulator is the real software of a ZFSSA. The only thing being simulated is the hardware.

If you're having a hard time finding the download, once you get into MOS just click on the "Patches & Updates" tab, and type 12622199 in the "Patch Name or Number" search box. That will take you right to it. Thanks to Jon B for suggesting this tip.

 

About

This blog is a way for Steve to send out his tips, ideas, links, and general sarcasm. Almost all of it is related to the Oracle 7000, code-named ZFSSA, or Amber Road, or Open Storage, or Unified Storage. You are welcome to contact Steve.Tunstall@Oracle.com with any comments or questions.
