August 9, 2007

Fencing - yet again

Sheesh. It is amazing to me how this topic continues to spin. Clearly people just like to speculate, and I suppose a little controversy can serve to energize, but this topic seems to have taken on a life of its own. The real question in my mind is: why do you care? Fencing is a core function of the cluster infrastructure. You can't control it or influence it in any way. It has to be there in some form, or bad things will happen (corruption being one of them). And if the particular fencing implementation in Oracle Clusterware were fundamentally flawed, it would have been exposed long ago, over the course of its 5+ years of existence and thousands of deployments.


So any discussion of fencing and the Oracle implementation is purely theoretical, and largely academic, since it has more than proven itself. OK. I enjoy lively academic or theoretical technical debate, particularly over a beer or 2, but not at the expense of ignoring reality. So let's pull apart the discussions I've seen, and address them point by point. Note that this discussion is focused solely on Oracle Clusterware used in conjunction with RAC.


Oracle Clusterware uses the STONITH algorithm. This is only partially true. Oracle's fencing mechanism is based on the STONITH algorithm. However, there is no general design rule for how that algorithm should be implemented. Strict use of the algorithm is complicated, or perhaps even prevented, by the fact that many platforms have no API for doing a remote power-off reset of the system. So the current implementation is in fact a suicide, as opposed to an execution: the node being fenced resets itself rather than being shot by another node. As system/OS vendors make such APIs available, Oracle will be able to make use of them.
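To make the suicide-versus-execution distinction concrete, here is a minimal sketch of self-fencing in Python. It is a toy model, not the Oracle Clusterware implementation: the class name, the 30-second timeout, and the `reset` callback (a stand-in for a local reboot) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SelfFencingNode:
    """Toy model of 'suicide' fencing: the node resets itself."""
    name: str
    timeout_s: float                  # hypothetical eviction deadline
    reset: Callable[[str], None]      # stand-in for a local reboot
    last_heartbeat: float = 0.0
    fenced: bool = False

    def heartbeat_ok(self, now: float) -> None:
        # Record a successful exchange with the rest of the cluster.
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        # If the node has been out of contact for too long, it takes itself
        # out of the cluster instead of waiting for a remote power-off.
        if not self.fenced and now - self.last_heartbeat > self.timeout_s:
            self.fenced = True
            self.reset(self.name)
        return self.fenced


resets: List[str] = []
node = SelfFencingNode("node2", timeout_s=30.0, reset=resets.append)
node.heartbeat_ok(now=0.0)
node.check(now=10.0)   # still in contact: nothing happens
node.check(now=45.0)   # deadline missed: the node resets itself
```

In an execution-style STONITH, the `reset` call would instead be made by a surviving node through a remote power-off API, which is exactly the piece that many platforms do not offer.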


Suicide is not reliable because you are expecting an already unhealthy system to respond to some other directive. Sure. There are corner cases where this is a possibility, but these have proven to be very rare, they have been fixed when they appeared, and the real underlying concern, which is exposure to data corruption, is nonexistent (see next point). This issue is actually related to the FUD we often see about some cluster managers running in kernel mode vs. user space, where Oracle Clusterware runs. Well ... if the OS kernel is misbehaving, then it doesn't really matter where the clusterware runs - bad things are going to happen. (We've seen this occur in several situations.) If someone makes a programming error in the clusterware code and it is running in kernel mode, then the OS kernel is exposed. (This is theoretical since Oracle Clusterware does not run in kernel mode, but it's not like this hasn't happened before in other environments where user/application code is allowed to run in kernel space.) And lastly, if the clusterware runs in user space and other user space programs misbehave, then the obvious concern is the sensitivity the cluster has to that misbehaving application - like not being able to get CPU time to communicate in a timely manner. We have certainly seen this kind of scenario many times, but in general it is easily mitigated by renicing or otherwise increasing the priority of the key background communication processes. Bottom line is that suicide has proven sufficiently reliable. Any claim to the contrary is pure speculation.
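The renicing mitigation mentioned above can be sketched in a few lines of Python on a Unix-like system. This is only an illustration: the helper name `renice` is made up here, and which process IDs to target and what nice value to use depend on your platform and release, so treat both as assumptions.

```python
import os


def renice(pid: int, niceness: int) -> int:
    """Set the nice value of `pid` and return the value now in effect.

    Lower nice values mean higher scheduling priority. Going below the
    current value normally requires root privileges.
    """
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    me = os.getpid()
    current = os.getpriority(os.PRIO_PROCESS, me)
    # Re-applying the current value is always permitted; a real use would
    # pass the PID of a cluster communication process and a lower value.
    print(renice(me, current))
```

The same effect is available from the shell with the standard `renice` command; the point is simply that the communication processes should not have to compete on equal terms with a runaway application.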


Because suicide is unreliable, you are exposed to data corruptions. Not true, either in theory or in practice. It's no secret that RAC does unbuffered IO (bypassing the OS cache), and any IO done in a RAC environment is in complete coordination with the other nodes in the cluster. Cache Fusion assures this. And this holds true in a split-brain condition. If RAC can't coordinate the write with the other nodes as a result of interconnect failure, then that write is put on hold until communication is restored, or until the eviction protocol is invoked.
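The behavior described above, where an uncoordinated write is held rather than issued, can be pictured with a small Python model. It is a deliberately simplified sketch and not how RAC is implemented: the `Instance` class, its attributes, and the choice to discard held writes on eviction are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, List


class Instance:
    """Toy model: a write reaches disk only after cluster coordination."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.interconnect_up = True
        self.evicted = False
        self.pending: Deque[str] = deque()   # writes waiting for coordination
        self.on_disk: List[str] = []         # writes actually issued

    def write(self, block: str) -> None:
        self.pending.append(block)
        self._flush()

    def _flush(self) -> None:
        # Writes are issued only while the node can coordinate with its
        # peers and is still a cluster member.
        while self.pending and self.interconnect_up and not self.evicted:
            self.on_disk.append(self.pending.popleft())

    def interconnect_restored(self) -> None:
        self.interconnect_up = True
        self._flush()

    def evict(self) -> None:
        # In this model an evicted node never issues its held writes.
        self.evicted = True
        self.pending.clear()


node = Instance("rac1")
node.write("block-1")            # coordinated, goes to disk
node.interconnect_up = False     # split brain begins
node.write("block-2")            # held, not written
held_during_split = list(node.pending)
node.evict()                     # eviction resolves the split
```

Either exit from the split (communication restored, or eviction) leaves the disk holding only coordinated writes, which is the property the corruption argument has to defeat.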


This is obviously oversimplified, but frankly, so are the criticisms in this area. The challenge to any non-believer is the following: find me a repeatable test case where interconnect failure, and the resulting fencing algorithm implemented in Oracle Clusterware, results in database corruption. If you are successful, I will:


1. Fall off my chair in disbelief


2. Write "They were right, I was wrong" 1000 times in my blog, and apologize profusely to anyone who may have taken offence at the claims made in this posting.


3. File a bug, and get the damn thing fixed.


Now that I think about it, it would probably be prudent to reverse 2 and 3. Note, however, that in the off chance you are successful, it is a bug, and will be fixed as such, as opposed to a fundamental architectural flaw.


So let's put this one to bed. Next topic.

June 27, 2007

Who's afraid of the big bad RAC?

OK. This was a topic suggested by a reader. So let's see where it takes us.


People tend to fear what they don't know, what they don't understand, or what they perceive as a direct threat to them. Certainly, RAC introduces a bunch of new concepts and technology components. So there is unquestionably a learning curve, and that could invoke some fear, unless you are naturally adventurous. I'm actually not certain how you can survive in this industry without being adventurous and open to learning new things almost every day, but I suppose that is a different topic. In terms of a threat, we (Oracle) probably did a bit of a disservice to customers by not fully respecting traditional job boundaries when we packaged and released 10g. In 9i, it was just RAC, as an option to the database, and nothing else. It required the cluster infrastructure (hardware and software) to be fully in place and functional prior to installing 9i RAC.


With 10g, we bundled our own clusterware, added in ASM, and shipped it all on the Oracle CD to the DBA. Now, instead of just learning RAC, the DBA had to learn cluster management, which was historically the domain of sysadmins; ASM, which storage administrators perceived as a threat; and a healthy dose of network administration for the VIPs and interconnect, which encroached into the domain of the network admin. This all worked just fine if you were a DBA in a medium-sized shop with a sysadmin who also did network and storage admin, and who also happened to be your weekend paintball buddy. Unfortunately, it flies much like a lead balloon if you are in a large shop with highly compartmentalized job functions and organization structures, and the inevitable politics and communication challenges that exist across organizational boundaries. So in many cases, fear of RAC is/was not totally unjustified.


So how to overcome it? First, the knowledge gap/learning curve. It is impossible to know everything in this industry, so there is no crime in not knowing something. But there is fault in not recognizing what you don't know and, if necessary, taking steps to address it. In the case of RAC, the things that are different from single-instance Oracle are the shared, concurrent-access storage, the interconnect infrastructure, and the concurrent-access cluster framework. At a high level, not a big deal, but not something that should be trivialized from a budgeting and project planning perspective. It costs time and money to acquire the knowledge in these areas, both in terms of detailed implementation and in terms of operational management. This is no different from the adoption of ANY new technology - whether it be in the new home you are building, or in your IT shop. Yet this issue is consistently not addressed, or insufficiently addressed, in the vast majority of implementations we see. There are many ways to acquire the knowledge - classroom training, engaging consultants who have done it before, in-house testing or proofs of concept, online research and discussion or special interest groups, etc. (One obvious word of caution: don't believe everything you read. There is a lot of misinformation out there, particularly in the RAC area.)


Second, the threat aspect, and crossing job boundaries. Today, there is no really clean way of doing this, other than to establish a working team that covers the necessary skill sets - system, database, storage, network - and work from there. In some shops, we have seen the sysadmins just leave it to the DBAs, because they have dedicated database clusters, and they don't want to learn anything Oracle. In other cases, the sysadmins have been adamant about learning, owning, and operating the Oracle cluster infrastructure. In most cases, when the storage administrators understand ASM, they are more than happy leaving that in the DBA domain, since it generally does make both of their lives easier. In upcoming releases, we are taking steps to create documentation that respects more traditional job boundaries, and presents the Oracle-supplied technology in terms familiar to the specific domains.


In summary, there is nothing to fear. No guts, no glory. Recognize what you don't know. Build a plan to address it. Get management commitment to fund it. The technology is not perfect today, but once you're over the learning curve, you can't help but appreciate the engineering that has gone into creating a protocol for database cache synchronization across multiple machines. Scalability beyond the capacity of a single machine - it could well be database nirvana.

June 13, 2007

Who are the RAC Pack?

Interesting that in the first 24 hrs of having this blog in place, I already had a few pings with suggestions for discussion topics. Maybe there really is pent-up demand. Keep the suggestions coming, and I will attempt to get to them.


I thought I'd start with an obvious question. It's not so much that people really ask this question, but I see the term RAC Pack used in so many different contexts that are really out of context. I was reading a recent Gartner paper where they referred to the RAC Pack as essentially Oracle's set of RAC customers. LOL. That was a first. It's funny how a name like RAC Pack can seem to take on a life of its own. It's been quite entertaining to see how any issue even remotely related to RAC (architectural, implementation-related, sales positioning, troubleshooting, etc.) would result in someone saying "we need to call the RAC Pack". Kinda like Ghostbusters - "Who Ya Gonna Call?" I'm sure there is a lesson rooted in Marketing 101 in there somewhere - something like "Simplicity Sells".


So the RAC Pack is a relatively small team of product specialists in Oracle Server Technologies. The team was formed shortly after RAC was first released with 9i back in 2001, and consisted of about 6 of us, under the leadership of Sohan Demel, as part of Angelo Pruscino's Cluster and Parallel Storage development org. The concept for the team was very simple, and based on the premise that reference selling is the shortest path to getting deals done and growing revenue. So the charter for the team was to work directly with customers to establish real-world proof points for the technology. By offering assistance to customers directly from the Development organization, we were able to mitigate some of the obvious risk associated with adopting a first release of a new product, and in return the customers were willing to be referenceable and tell others about their experiences with this new piece of technology. And while we don't write the code, we work side by side with the people who do, so we have the luxury of not only being able to help customers successfully implement the technology, but also being able to understand the details of how the code actually works, and get things fixed fairly quickly when things break.


So the team and the charter have evolved somewhat since then. We are now a team of 21 people, spread around the globe. There is a group that provides a business development function, managed by Greg LeMay, and a group of us that provides an engineering and technical delivery function. And as you might have guessed, as the team has grown, it is now me who owns the engineering function. In addition to working with customers to continually raise the bar in terms of real-world proof points (bigger databases, higher transaction volumes, larger numbers of nodes or CPUs, latest releases, etc.), we actively provide feedback to the development teams based on the customer experiences, we help train the Support, Consulting, and Sales organizations, and we perform a variety of testing functions, typically through the beta cycle. And of course the product stack in 9i was simply RAC, running on top of the various platforms. But in 10g the Oracle cluster stack has grown to include Oracle Clusterware and ASM, as well as RAC, and most customers also want to go beyond that and implement things like Grid Control, or Data Guard, or PQ, etc. - so you can quickly see why the team has grown the way it has. The net result, though, is a team of highly skilled cluster database technology specialists. And while some of the skill sets may overlap with people in other LoBs, the function we perform is entirely distinct from Support, Consulting, and technical pre-sales roles.


I think because we are part of Development, and the bulk of our customer and field interactions are very technically focused, people more commonly think of the RAC Pack as the engineering team. This is actually a bit unfortunate since the business development function is truly a critical success factor that takes the technical deliverables, and turns them into gold that other customers can use to create their own success, and that Oracle can leverage to continue to drive revenue. Without it, the engineering function would simply be a cost of Development with no tangible benefit.


Let me know if I've missed any details, and I'll be happy to fill them in.


 

April 19, 2007

First Post



For this first entry, I thought I would take a minute to provide an overview of what I would like to discuss in this forum. FWIW, I'm not a habitual blogger, but this blogging thing is actually quite an interesting vehicle. There are the usual, more formal vehicles like press releases, analyst briefings, CVC presentations, or conference presentations that can be used to broadcast a message or some information. The problem with these, of course, is that they are relatively formal - in that they are edited, scrutinized, reviewed, censored, or whatever, before they are actually published. So smaller news items, pieces of interesting information, or perhaps more controversial discussion topics are left without a communication vehicle. And in the course of our day-to-day activities, I often come across topics for which there is no easy outlet beyond my immediate team. So we will see how this goes.


I would like to discuss issues that come up at customer visits, respond to comments on my postings, and provide some hints and tips on architectures, configurations, and technologies. Additionally, there is a lot of misinformation out there in the RAC and/or Grid space - some of which is intentional FUD, some of which is just honest misunderstanding - and perhaps we can use this to clear some of that up. My hope is that we can have both technical entries and some higher-level conversations. Ask away, and I will do my best to respond.


More to come in the near future....
