Friday Feb 22, 2013

How to Configure the Linux Kernel's Out of Memory Killer


Operating systems sometimes behave like airlines. Since the airlines know that a certain percentage of the passengers won't show up for their flight, they overbook the flights. As anyone who has been to an airport in the last 10 years knows, they usually get it wrong and have to bribe some of us to get on the next flight. If the next flight is the next morning, we get to stay in a nice hotel and have a great meal, courtesy of the airline.

That's going to be my lodging strategy if I'm ever homeless.

The Linux kernel does something similar. It promises memory to its processes ahead of time. Since it knows that most processes won't use all the memory promised to them, it over-commits. In other words, it hands out a sum total of memory that is more than it actually has. Once in a while, too many processes claim the memory that the kernel promised them, all at the same time. When that happens, the Linux kernel resorts to an option that the airlines wish they had: it kills off processes one at a time. In fact, it actually has a name for this function: the out-of-memory killer.
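If you want to see how overbooked your own system is, here is a minimal Python sketch. It is not from Robert's article. It assumes the CommitLimit and Committed_AS fields of /proc/meminfo are present; the function names and the SAMPLE numbers are made up for illustration. Keep in mind that in the kernel's default heuristic mode CommitLimit is not strictly enforced, so a negative headroom is a hint that memory is over-promised, not a prediction that the OOM killer is about to strike.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # values are in kB
    return fields


def commit_headroom_kb(fields):
    """CommitLimit minus Committed_AS; negative means over-promised."""
    return fields["CommitLimit"] - fields["Committed_AS"]


# Hypothetical numbers, used only when /proc/meminfo is unavailable.
SAMPLE = """\
MemTotal:        8000000 kB
CommitLimit:     6000000 kB
Committed_AS:    7500000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # not on Linux: fall back to the made-up sample
    info = parse_meminfo(text)
    print("Promised (Committed_AS):", info["Committed_AS"], "kB")
    print("Limit (CommitLimit):    ", info["CommitLimit"], "kB")
    print("Headroom:               ", commit_headroom_kb(info), "kB")
```

Run it as an ordinary user; reading /proc/meminfo needs no special privileges.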

Robert Chase explains.

How to Configure the Out of Memory Killer

Robert Chase describes how to examine your syslog and how to use the vmstat command for clues about which processes were killed, and why. He then shows you how to configure the OOM killer to behave the way you prefer. For instance, you can make certain processes less likely to be killed than others. Or more. Or you can instruct the kernel to reboot instead of killing processes.
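To give a flavor of the "less likely to be killed" knob, here is a rough Python sketch. It assumes the per-process file /proc/<pid>/oom_score_adj, which accepts values from -1000 (never pick this process) to 1000 (pick it first); Robert's article may use a different method, and the helper names below are mine, not his. Lowering the value for a process generally requires root. The reboot-instead-of-kill behavior is normally set through the vm.panic_on_oom and kernel.panic sysctls and is not shown here.

```python
import os

# Valid range for oom_score_adj on Linux.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def clamp_oom_score_adj(value):
    """Force a requested adjustment into the range the kernel accepts."""
    return max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, int(value)))


def oom_score_adj_path(pid, proc_root="/proc"):
    """Build the path of the adjustment file for a process."""
    return os.path.join(proc_root, str(pid), "oom_score_adj")


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write a clamped adjustment for pid and return the value written."""
    value = clamp_oom_score_adj(value)
    with open(oom_score_adj_path(pid, proc_root), "w") as f:
        f.write(str(value))
    return value


def read_oom_score(pid, proc_root="/proc"):
    """Read the kernel's current badness score for pid."""
    with open(os.path.join(proc_root, str(pid), "oom_score")) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    me = os.getpid()
    try:
        print("oom_score before:", read_oom_score(me))
        # Raising our own value makes this process a more likely victim.
        set_oom_score_adj(me, 500)
        print("oom_score after: ", read_oom_score(me))
    except OSError as err:
        print("Could not adjust (not Linux, or not permitted):", err)
```

The proc_root parameter exists only so the helpers can be tried against a scratch directory before you point them at a real system.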

More Oracle Linux Resources

- Rick

Follow me on:
Blog | Facebook | Twitter | YouTube

(psst! and don't forget to follow the Great Peruvian Novel!)

Thursday Feb 21, 2013

Can You Figure Out Which Teenager Took the Cash?


Dads like me are familiar with a phenomenon known as Silent Dollar Disappearance. This tends to occur when there is a confluence of money in your wallet and teenage children in your home. You never actually see it happen, but if you are paying attention, you might detect that it has happened. As when, for instance, you try to pay for beer and brats at the grocer. It becomes difficult to know for sure whether it was the teenagers. What if you already spent the money on something else? That's what my teenage daughters always said. Or perhaps you had a wallet malfunction, and it flew out. So difficult to pinpoint the actual cause.

Linux, like any OS, is vulnerable to a similar phenomenon. It's called silent data corruption. It can be caused by faulty components, such as memory modules or storage systems. It can also be caused by (God forbid) administrative error. As with Silent Dollar Disappearance, it's difficult to detect data corruption while it is actually happening, or to determine the exact cause. But, as with Dads and teenagers, you eventually figure out that it has happened.

It may be impossible to identify the culprit after the data has been corrupted, but it's not impossible to stop the culprit ahead of time. Oracle partnered with EMC and Emulex to do just that. And they were kind enough to explain how they did it and how you can benefit. In this article:

Preventing Silent Data Corruption in Oracle Linux

An excerpt ...

"Data integrity protection is not new. ECC and CRC are available on most, if not all, servers, storage arrays, and Fibre Channel host bus adapters (HBAs). But these checks protect the data only temporarily within a single component. They do not ensure that the data you intended to write does not become corrupt as it travels down the data path from the application running in the server to the HBA, the switch, the storage array, and then the physical disk drive. When data corruption occurs, most applications are unaware that the data that was stored on the disk is not the data that was intended to be stored.

"Over the last several years, EMC, Emulex, and Oracle have worked together to drive and implement the Protection Information additions to the T10 SBC standard, which enables the validation of data as it moves through the data path to ensure that silent data corruption does not occur."
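For the curious, here is a toy Python sketch of the idea behind the excerpt. It assumes the commonly described layout of 8 bytes of protection information per 512-byte sector: a 16-bit CRC guard tag (polynomial 0x8BB7), a 16-bit application tag, and a 32-bit reference tag that, in this sketch, holds the low 32 bits of the block address. The function names are mine, and this is an illustration only, not how the Oracle, EMC, or Emulex components actually implement it.

```python
import struct

T10_DIF_POLY = 0x8BB7  # CRC-16 polynomial used for the guard tag


def crc16_t10dif(data):
    """Bitwise CRC-16 with poly 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_protection_info(sector, lba, app_tag=0):
    """Pack guard tag, application tag, and reference tag into 8 bytes."""
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)


def verify_protection_info(sector, lba, pi):
    """Return True if the data and its intended address both still match."""
    guard, _app_tag, ref = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(sector) and ref == (lba & 0xFFFFFFFF)


if __name__ == "__main__":
    sector = bytes(512)                      # a sector full of zeros
    pi = make_protection_info(sector, lba=42)
    print("intact:   ", verify_protection_info(sector, 42, pi))
    flipped = b"\x01" + sector[1:]           # one corrupted byte
    print("corrupted:", verify_protection_info(flipped, 42, pi))
    print("misplaced:", verify_protection_info(sector, 43, pi))
```

The point of carrying the tags alongside the data is that every hop in the data path can rerun the same check, so a flipped bit or a write aimed at the wrong block is caught before it reaches the disk.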

Interesting stuff. Give it a read.

- Rick


About

Contributors:
Rick Ramsey
Kemer Thomson
and members of the OTN community
