Tuesday May 20, 2008

Solaris 10 Throughput tested 50x Greater than Linux !! (Article)

The recent article from Enterprise Linux News describes a customer, formerly an ALL Linux shop, that has migrated production systems to Solaris 10.

The reasons noted include :

  • Advanced Lights-Out Manager (ALOM) that enables them to manage the server without the need to shut it down first.
  • Zones (Containers) Partitioning, where they adopted Zones to create separate environments on the same machine.
  • The ZFS File system where they implemented ZFS to facilitate programs that perform massive file caching.
  • and MOST importantly .... Solaris 10 had a throughput that was 50 times better than production #'s using Fedora Linux on AMD cpu's.

 

One additional item that should be noted is that the target platform tested was using Sun's T1 cpu, which offers 8 cores, each with 4 HW threads of execution (aka.. 32 HW threads within one CPU).

Sun's latest Coolthreads/CMT CPU's (T2's) offer roughly double the performance of the former T1 cpu's (up to 8 cores per cpu, each with 8 HW threads = 64 threads/cpu).. and above and beyond single socket 5120 and 5220 T2 (cpu) based systems, Sun now offers dual socket T2 systems including the T5140 and T5240 systems (totaling up to 128 HW threads of execution in a single 1U or 2U system !!!).
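The thread counts quoted above are simple products of sockets, cores, and HW threads per core. A minimal sketch of that arithmetic (the function name is only illustrative):

```python
def hw_threads(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Total hardware threads of execution in one system."""
    return sockets * cores_per_socket * threads_per_core


# T1: 1 socket x 8 cores x 4 HW threads per core
t1_single_socket = hw_threads(1, 8, 4)
# T2: 1 socket x 8 cores x 8 HW threads per core
t2_single_socket = hw_threads(1, 8, 8)
# Dual socket T2 systems (T5140 / T5240)
t2_dual_socket = hw_threads(2, 8, 8)

print(t1_single_socket, t2_single_socket, t2_dual_socket)  # 32 64 128
```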

 To read the article in its entirety click here :

http://searchenterpriselinux.techtarget.com/news/article/0,289142,sid39_gci1313798,00.html

Wednesday Nov 07, 2007

Processors and Performance : Chips, MIPS, and Sizing blips..

The following post is a close approximation of an article that I published in this month's Sun "Technocrat" (November 2007) issue. Hopefully you'll enjoy this discussion regarding the past and present relationship of CPU's and architecture to system performance.

In today's fast paced world of ever increasing demands for system throughput, the foundation of discussion and expectations typically all hinge upon the same topic.. CPU performance.   This article will be an examination of CPU's and system architecture, as they relate to performance and capacity planning as a whole.   From our last discussion "The Many Flavors of System Latency ..." , we will extend the context to focus on past and present competing aspects of system/CPU architecture,  including a brief history of how we got to the current competitive landscape we find ourselves in today.  (The photo to the left is of a Sun T1 "coolthreads" 8 core, 32 thread cpu)

CISC vs. RISC

Going back to the early days of microprocessor design (and also likely a familiar topic from your Computer Science Bachelor's curriculum), you will recall much conversation, speculation, and competition surrounding two competing approaches to CPU architecture :  CISC vs. RISC.

In a nutshell, CISC (Complex Instruction Set Computers) designs emphasize the use of "complex" instructions within the HW to minimize the amount of Assembly code (SW) required.  Other benefits are that compilers don't need to be as complex, as well as CISC CPU's requiring less RAM to store instructions.  However, this approach sometimes requires more than one clock cycle of a processor to complete processing a complex instruction.

RISC (Reduced Instruction Set Computers), just as the name refers, offer a reduced number of simple instructions that can complete execution within a single clock cycle, but which might require multiple instructions for a complex operation such as "multiply".  The RISC approach has the nice side-effect of also requiring less chip "footprint" (in # of transistors, etc..)  reserving more die space for memory registers, while also typically offering streamlined execution within the same window of execution time as CISC counterparts.  Modern compilers (such as Sun's Studio 12 line) offer significant performance benefits that should not be overlooked, especially critical when developing and compiling binaries on RISC based architecture (Sun has seen performance optimized benefits of 200+ % when using the latest releases of Sun Studio Compilers vs. generic "gcc" compiled code).

Today, the market is segmented in more or less the same 2 camps, but they are divided down the lines of "x86" compatible CPU's (modern CISC designs from Intel and AMD) and the modern RISC competition from Sun (SPARC) and IBM (Power), where HP and DEC (now gone altogether) have stepped back from RISC manufacturing.

Is there more to Moore's Law ?

For the past 30+ years, we have seen processor performance double according to Moore's Law (approximately every 2 years), along with a movement from Uniprocessor based systems to those that scale vertically as multi-processor systems, which we've become familiar with in the Unix world as SMP (Symmetric MultiProcessor) systems.   It's funny that Gordon Moore made his original claim in a 1965 article in Electronics magazine, originally regarding the trend of Integrated Circuit component counts doubling every year, only loosely predicting that this trend might continue until 1975, though he was uncertain that any further projections could be made (he later revised his prediction in 1975 to "doubling every 2 years", which has held relatively steady ever since).   Since past trends of IC / CPU transistor counts have closely correlated to CPU performance, the association of Moore's Law to performance was made.
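To get a feel for what "doubling every N years" compounds to, here is a small sketch. The function name is hypothetical, and it assumes perfectly regular doubling, which real products only approximate:

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Multiplier after `years`, assuming one doubling per doubling period."""
    return 2.0 ** (years / doubling_period_years)


# Doubling every 2 years, sustained for a decade: 5 doublings
print(growth_factor(10, 2))  # 32.0
# Moore's original 1965 pace (doubling every year) over the same decade
print(growth_factor(10, 1))  # 1024.0
```

The gap between the two results shows why the difference between a yearly and a 2 year doubling period matters so much over a decade.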

In order to accommodate this rapid rate of growth in processor performance, the number of transistors contained within a CPU has been just one of the key characteristics that have climbed (in addition to clock speeds, etc..).   Amazing as it may sound, 45nm (nanometer) manufacturing is expected from Intel and others, hopefully in 2008 (when just 10 years ago we had 500nm manufacturing).  Even at the levels that we were at in 2004 with 90nm manufacturing, the width of a transistor (50nm across) was 1/2 the diameter of an influenza virus !   Given that we are approaching atomic dimensions of gate thickness (at < 1nm), the pace of transistor density and clock frequency increases (using current designs) appears to be approaching the physical limitations of manufacturing.  In addition to these concerns, power and related heat issues have become very pronounced in today's "green" computing campaigns.  Luckily, Sun has dealt a hand worthy of industry recognition that includes CPU innovations to keep us progressing forward, addressing more than simple "transistor counts": more aptly, the efficiency of moving an SMP-like architecture "onto" a piece of silicon, hence.. a system on a chip (which is essentially what we have with the T2).

How many ways can you weave those THREADS ?

SMP systems required an Operating System kernel that had a means of fairly sharing the CPU resources among the processes awaiting execution within the system (via priority-based scheduling classes).  This capability within Unix for "time slicing" between "runnable" processes and available kernel and physical CPU resources on systems hinges on the kernel dispatcher associating processes with "light weight processes" that in turn get bound to kernel "Threads", in order to be run within the "context" of physical processor registers and an execution pipeline (aka, HW Threads).  For the most part, this is how today's Solaris OS functions, along with the appropriate dose of preemption and locking mechanisms.

Over the past decade, and along with the increased demands of internet traffic (and associated application workloads), applications have gradually become better able to scale vertically within systems, primarily through the use of SW multi-threading.   Within a multi-threaded application, many threads of execution can run simultaneously across available CPU's within a system, allowing for an application to scale as close to "linearly" as possible (doubling the application throughput as the # of available CPU's doubled).

    THREAD (of Execution) :
  • noun          One of many software "threads" of execution that can be processed simultaneously on a computer system.
    Linear Scalability :
  • noun          The ability to increase system performance (throughput) at the same rate that resources (cpu's, ..) are added.
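One way to put a number on "how close to linear" a workload scales is to compare the measured speedup against the growth in resources. This is a minimal sketch; the function name and the throughput figures are hypothetical, not measurements:

```python
def scaling_efficiency(base_throughput: float, base_cpus: int,
                       new_throughput: float, new_cpus: int) -> float:
    """Measured speedup divided by the resource increase; 1.0 is perfectly linear."""
    speedup = new_throughput / base_throughput
    resource_ratio = new_cpus / base_cpus
    return speedup / resource_ratio


# Hypothetical example: 1000 tx/sec on 4 CPU's grows to 1900 tx/sec on 8 CPU's
efficiency = scaling_efficiency(1000, 4, 1900, 8)
print(f"{efficiency:.0%} of linear")
```

A result well below 1.0 suggests contention somewhere (locks, memory, I/O) rather than a lack of CPU's.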

CPU's and Memory :  UMA and NUMA

Modern computing systems also offer a "shared memory" model that has very specific performance and latency related characteristics, depending upon the system design (physical system interconnect type [bus/crossbar], memory management controllers, proximity to physical cache/RAM, etc..), the Operating System Kernel memory management (memory mgmt libraries, JVM Garbage collection, etc..), as well as the Application SW execution characteristics and memory requirements (cache hit ratio's [Instruction vs. Data Cache], TLB miss characteristics, RAM requirements, ..).

Among the modern types of vertically scalable parallel computer architectures available for multiprocessor systems, one of the most common high performance designs has become NUMA/ccNUMA (Non Uniform Memory Access / cache coherent NUMA).  Sun's E25K systems fall within this category, as do most large systems that offer vertical scalability of many independent CPU / Memory boards, acting together and communicating across a shared system interconnect (backplane/centerplane/crossbar).

Previously, local proximity of memory and the latency associated with accessing that memory was uniform and predictable.  With the advent of system growth beyond single system (cpu/memory) boards, UMA (Uniform Memory Access) could no longer be guaranteed.  One common performance issue that must be addressed within large NUMA environments is the aggregate impact of both physical memory proximity, alongside the gap between processor speed and memory latency (see below).   Solaris addressed this issue (with a Solaris 9 update) by introducing MPO (Memory Placement Optimization) that associates memory physically closer to a cpu to minimize the additional cross interconnect memory latency (this couples Solaris and the underlying HW along with Cache Coherency within ccNUMA architectures).  Note, other optimizations have been used to address large Sun Enterprise system centerplane latencies, including kernel cage splitting, removal (if DR isn't required), and the introduction of S10 enhancements.   <\*"busstat" can be used to diagnose these issues\*>

Lucky for us, we are quickly approaching system architectures that once again allow for UMA designs, offering memory access predictability (with the T2 and related offerings moving forward, where density of cpu cores/threads brings greater multithreaded capacity within a small footprint).  However, the current state of the industry isn't quite as lucky if you look across the bow of the competition, and within our client production environments.

The growing CPU - Memory gap ...


As you can see from the diagram to the left, over the past several years as microprocessor design has moved rapidly, doubling the performance and clockspeed of CPU's roughly every 2 years, the same increases have NOT been matched within the Memory arena.  This "wait" time that threads of execution must incur, coupled with the additional latency required for electricity to travel greater distances to access memory across the system interconnect (not physically local to a CPU, but rather on another CPU board or memory bank) can impact processor efficiency and overall system / application throughput dramatically.   (this slide and the next are from http://www.OpenSparc.net )
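To make the "wait" time concrete, memory latency can be expressed in CPU clock cycles: the faster the clock, the more cycles a thread loses for the same memory access. A rough sketch, where the 100 ns latency is an assumed round number for illustration and not a measured value for any particular system:

```python
def stall_cycles(memory_latency_ns: float, clock_ghz: float) -> float:
    """CPU cycles that elapse while one memory access is outstanding.

    A clock of N GHz completes N cycles per nanosecond.
    """
    return memory_latency_ns * clock_ghz


ASSUMED_LATENCY_NS = 100.0  # hypothetical memory access latency

for clock_ghz in (0.5, 1.4, 3.0):
    cycles = stall_cycles(ASSUMED_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz} GHz: ~{cycles:.0f} cycles per memory access")
```

With latency held constant, raising the clock speed only increases the number of cycles spent waiting, which is the gap the diagram describes.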

Chip Multi-Processing + Hardware Multi-Threading = Chip Multi-Threading

Realizing the implications of the memory latency "lag" (and somewhat against the tide of relying upon ever-increasing CPU clock speed increases), several years ago Sun made the decision to address this with an acquisition of Afara Websystems to bolster its processor lineup and chart the industry's new course toward multi-core CPU's.


From the diagram on the left, it can be seen that Sun's adoption of both CMP (US-IV/+), aka Chip level Multi-Processing (having multiple cores per cpu, each with an execution pipeline), alongside HMT .. Hardware Multi-Threading (adding a multi-threaded execution pipeline within a core/cpu), shores up and nearly eliminates the issue of idle CPU cycles waiting on memory operations (by offering many physical threads of simultaneous execution within a CPU).  This is the foundation of what Sun calls CMT (Chip Multi-Threading), which is reflected in Sun's CPU roadmap for both the Niagara (T1 and T2 CPU's), Rock CPU, as well as the Sun/Fujitsu Olympus CPU (as a follow-on to the US-IV+).
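The effect of hardware multi-threading on idle cycles can be approximated with a very simplified model (my own illustration, not Sun's): each thread alternates between a burst of compute cycles and a memory stall, and the core switches to another ready thread during each stall. The cycle counts below are hypothetical:

```python
def pipeline_utilization(hw_threads: int, compute_cycles: int, stall_cycles: int) -> float:
    """Fraction of cycles in which the core's pipeline has work to do.

    Idealized model: every thread repeats `compute_cycles` of work followed by
    `stall_cycles` waiting on memory, and switching threads costs nothing.
    """
    busy_fraction = hw_threads * compute_cycles / (compute_cycles + stall_cycles)
    return min(1.0, busy_fraction)


# Hypothetical workload: 10 cycles of work, then a 70 cycle memory stall
for threads in (1, 4, 8):
    print(threads, pipeline_utilization(threads, 10, 70))
```

Under these assumptions a single thread keeps the pipeline busy only 12.5% of the time, while 8 threads per core are enough to cover the stalls entirely. Real workloads are less regular, so treat this as intuition rather than a sizing tool.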

Why so much $cache ?

In order to further minimize system latency and ensure peak performance, modern architectures include Memory Management and CPU memory access mechanisms "on-chip", such as the L1 / L2 cache and MMU located in Sun's CMT products.  Below is a block diagram of Sun's latest "T2" SPARC Core Architecture (based on the 64 bit SPARC V9 instruction set).

    Critical components for optimal kernel CPU / memory performance:
          • Level 1 (L1) Data Cache / Instruction Cache ..
          • Level 2 (L2) Secondary Cache, on-chip and shared for Sun CMT processors
          • Level 3 (L3) <only Sun's US-IV+ cpu offers L3 cache on-chip>
          • I-TLB (L1 / L2) Instruction -Translation Lookaside Buffers
          • D-TLB (L1 / L2) Data -Translation Lookaside Buffers
          • Buffer cache (Filesystem Kernel "page cache", taken from the "free list" of available RAM; this is a kernel structure)
Note:  The kernel memory page "size" is determined in part by the CPU chosen, since x86/x64 CPU's (Intel/AMD) offer 4K pagesizes, while the UltraSparc I through IV... offer 8K / 64K / 512K / 4MB page sizes.   (\*monitor with the pagesize, trapstat, cpustat, and pmap ..\*)
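Page size matters because it determines how much memory the TLB can map without taking a miss (often called "TLB reach"). A quick sketch using the page sizes listed above; the 64 entry TLB is an assumed figure for illustration only, so check your CPU's documentation for real values:

```python
KB = 1024
MB = 1024 * KB


def tlb_reach_bytes(tlb_entries: int, page_size_bytes: int) -> int:
    """Memory covered by a fully populated TLB at a single page size."""
    return tlb_entries * page_size_bytes


ASSUMED_TLB_ENTRIES = 64  # hypothetical entry count

for label, page_size in (("4K", 4 * KB), ("8K", 8 * KB), ("64K", 64 * KB),
                         ("512K", 512 * KB), ("4MB", 4 * MB)):
    reach_kb = tlb_reach_bytes(ASSUMED_TLB_ENTRIES, page_size) // KB
    print(f"{label} pages: {reach_kb} KB of TLB reach")
```

Larger pages let the same number of TLB entries cover far more memory, which is why TLB miss rates (trapstat) and page size (pagesize, pmap) are worth examining together.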

Examining the key attributes of your system's Workload as Requirements  :

Once again, in order to select the appropriate architecture for a production deployment, the all-inclusive entity that we need to examine in its entirety is the "Application Environment", all of its subsystems, as well as individual / concurrent workload characteristics :
  • Single Threaded vs. Multi-Threaded Applications (OLTP, DSS, HPC, Web/AppSvr, .. ?)
  • Compute / HPC Intensity (MPI, horizontally scaled compute farm requirements, ..)
  • Network I/O (long-lived connections vs. short-lived; large vs. small packets, # inbound RX pkts/sec..)
  • Memory Intensive Workload (shared / distributed memory req's, etc)
  • Storage Workload (R/W %'ages, Cache cfg, # Controller Interrupts/sec, # Files opened, shared FS,etc.)
  • Integer vs. Floating Point Calculations (T1's are not well suited to FP workloads)
  • 32 vs. 64 bit needs/benefits (address space needs beyond 4GB RAM ?)
  • Do SLA/SLC reqmt's focus on Throughput, BandWidth (IO/Net/Mem), Availability, and/or Response Times ?


Choosing the right CPU for your Workload :

Sun SPARC based CMP / CMT CPU's :

  • US-IV+ :  Well suited for very large vertically scaled configurations where the 32MB of L3 cache makes a big difference, such as DB Servers (many of these environments have large single threaded processes, including batch/OLTP).  Sun's E25K systems scale up to 72 CPU's per single domain (144 cores).
  • Sun / Fujitsu SPARC64 (US-VI Olympus) :  Follow-on to the US-IV+ CPU.  (~1.5\* performance of a US-IV+)  This CPU should have higher clockspeeds starting at 2.15GHz, with an additional HW thread/core, but no L3 cache on-chip.  Note that the up-coming CMT ROCK chip from Sun in 2008 will fall within this segment.
  • T1 (Niagara 1) :  Well suited for small to medium sized Multi-Threaded Workloads (that don't have much /any FP processing.  Best for : WebSvrs, AppSvrs, DNS, etc..).  Each single socket system offers up to 32 HW Threads of execution.
  • T2 (Niagara 2) :  Well suited for medium to large Multi-Threaded Workloads.  These systems should also offer good general purpose computing performance, given the addition of FGU's per CPU core, along with built-in 10G Ethernet, etc..  Each 5x20 system presently offers a single socket with up to 64 HW Threads.  Look for multi-socket systems based upon the T2 cpu in the not so distant future ;)

Sun's "World Class" T2 (Niagara 2) CPU :

At a glance, the new T2 CPU offering from Sun is a true "system on a chip" that lives up to its reputation as "the world's fastest processor" (the new world record benchmarks listed further down in this article can attest to the validity of that statement and show how Sun's latest CMT CPU's are changing the landscape of computing efficiency, part of the reason why Sun calls this "CoolThreads" and/or "Throughput" computing).

T2 CPU Highlights :
    • 8 cores \* 8 Threads each = 64 Threads of Execution
    • 65nm, initially running @ 1.4 GHz
    • 8 Floating Point / Graphics Units (FGU's, one per core)
    • on-chip Crossbar providing : 180 GB/s R + 90GB/s W
    • built in 2\* 10Gb Ethernet, MMU, Encryption, etc...

The following table provides a high level comparison of Sun's T2 and T1 CPU's  :

(for the complete Microprocessor Review report on the T2, click here)


(For further details comparing the T2 to other recent Sun CPU's for #transistors, etc., click here)
(For photos inside the new Sun T5x20 systems, based upon the T2 cpu, click here)

Solaris kernel (CPU related) Performance Metrics and Utilities :

The following table is only listed as a "high level" sample of common metrics frequently used as part of Solaris CPU-related performance analysis.  This is by no means a comprehensive list of metrics available, but rather an introduction for those that aren't familiar with the essentials.   An up-coming set of blogs will include much more detailed examples with command line output, also including discussions of kstat and Dtrace visibility available.

Note: \*  vmstat, cpustat, trapstat, intrstat  reflect system-wide statistics, while  mpstat, cputrack reflect per CPU statistics. \*

Metric | Description | Utility
------ | ----------- | -------
Run Queue | Kernel Threads Runnable, but not executing (best if 0, or at most < # cores) | vmstat (r)
Blocked Kthr | Blocked Kernel Threads (typically ID's an IO bottleneck, see also %wt, lockstat, ..) | vmstat (b)
System Calls | Number of System Calls (calls made into the OS kernel, accounting towards %Sys) | vmstat (sys)
Interrupts | Number of System Interrupts per interval (interrupts have the highest priority on the system) | vmstat (in)
% CPU (U/S/I) | % CPU utilization (% User space / % System kernel / % Idle);  % User should be 2x % Sys | vmstat
Cross Calls | Per CPU Cross-Calls (either for cross processor interrupts, and/or maintaining cpu virtual memory translation consistency .. aka cache consistency with MME and mapping TLB entries, etc.) | mpstat (xcal)
Cpu Interrupts | Per CPU Interrupts (also use intrstat, as well as lockstat for system correlation) | mpstat (intr)
Context Switches | Involuntary context switching (icsw reflects preemption..) vs. voluntary context switching (csw) | mpstat (i/csw)
CPU Migrations | Per Cpu Migrations .. A more inclusive migration off of and onto another CPU. | mpstat (migr)
Shared Mutex | Mutex exclusion lock activity (per cpu); p/lockstat gives the best visibility of this activity. | mpstat (smtx)
% CPU Waiting | % of a single CPU spent Waiting (during the sampling interval).  See also b kthr. | mpstat (%wt)
Instr TLB Misses | % of MMU related Instruction Translation Lookaside Buffer Misses (see also pagesize, pmap, cpustat..) | trapstat -t/T
Data TLB Misses | % of MMU related Data Translation Lookaside Buffer Misses (see note above as pgsize is related) | trapstat -t/T
CPU Counters | VARIOUS CPU specific HW event counters (Cache, Instruction level, FP, TLB; man cputrack for your HW specific counters available) | cputrack
CPU Counters | VARIOUS System Wide CPU event Counters (man cpustat for your HW specific counters) | cpustat
BUS Statistics | Available System Specific Bus Device / Instance Counters & Events (use busstat -l for your HW) | busstat
kernel Statistics | ALL kernel statistics are available individually via kstat (module:instance:name:class) | kstat
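Two of the rules of thumb in the table (run queue below the number of cores, and % User roughly twice % Sys) are easy to check automatically once you have the numbers in hand. The sketch below takes already-collected values as plain arguments rather than parsing vmstat output. The function name is hypothetical, and reading "% User should be 2x % Sys" as "at least 2x" is my own interpretation:

```python
def check_cpu_sample(run_queue: int, cores: int,
                     pct_user: float, pct_sys: float) -> list:
    """Return a list of warnings for one vmstat-style sample."""
    warnings = []
    # Run queue: best if 0, or at most below the number of cores
    if run_queue >= cores:
        warnings.append(f"run queue {run_queue} is not below core count {cores}")
    # User vs. system time: expect user time to be about twice system time
    if pct_sys > 0 and pct_user < 2 * pct_sys:
        warnings.append(f"%sys {pct_sys} is high relative to %usr {pct_user}")
    return warnings


# Hypothetical samples, not real measurements
print(check_cpu_sample(run_queue=0, cores=8, pct_user=60, pct_sys=20))   # []
print(check_cpu_sample(run_queue=12, cores=8, pct_user=30, pct_sys=40))  # two warnings
```

A warning here is only a prompt to dig deeper with mpstat, lockstat, or trapstat, not a diagnosis in itself.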

\*\*NOTE: if you'd like to try a single Solaris utility that can run in minutes to automate the performance / workload correlation and reporting for you, take a look at sys_diag if you haven't done so already (or the README).  It includes both high-level (vmstat, mpstat, iostat, netstat, kstat, ...) snapshot and analysis, as well as a Deep analysis mode which includes extended Dtrace / dexplorer and lockstat probing.   (all output is summarized and color coded in an HTML report header / Dashboard with a Table of Contents for analysis details) \*\*


Common CPU Benchmarks and What they mean :

The following list provides a set of definitions and examples for some of today's most common independent (industry accepted) computing benchmarks.

  • SPEC CPU2006  :  CPU-intensive benchmark suite, stressing a system's processor, memory subsystem and compiler. SPEC designed CPU2006 to provide a comparative measure of compute-intensive performance across the widest practical range of hardware.  This benchmark suite includes both the SPEC int_rate2006 and SPEC fp_rate2006 benchmark tests.
NOTE that SPEC is an independent, non-profit 3rd party benchmarking organization, providing the comparative examples that follow.


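Benchmark | Sun T2 system | IBM System p570 | HP ProLiant DL360 G5
--------- | ------------- | --------------- | --------------------
SPECint_rate2006 | 78.5 | 60.9 | 61.3
SPECfp_rate2006 | 62.3 | 58 | 38.8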

NOTE:  For the benchmark example above, and those that follow, this shows how Sun's latest CMT T2 cpu is changing the landscape of computing efficiency, part of the reason that Sun calls this "CoolThreads" and/or "Throughput" computing.
  • SPEC jbb2005 :  SPECjbb2005 (Java Business Benchmark) measures the performance of a Java implemented application tier (server-side Java). The benchmark is based on the order processing of a wholesale supplier application. The metrics given are number of SPECjbb2005 bops (Business Operations per Second) and SPECjbb2005 bops/JVM (bops per JVM instance). 

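Metric | Sun T2 system | IBM p6 570 | HP 2660 | Dell 2950
------ | ------------- | ---------- | ------- | ---------
Space (RU) | 1 | 4 | 2 | 2
Power Consumption (Watts) | 464 | 560 | 563 | 300
Performance (BOPS/JVM) | 170,153 | 87,737 | 80,884 | 74,218
Performance / Watt | 366.7 | 156.7 | 143.7 | 247.4
SWaP | 366.7 | 39.2 | 71.8 | 123.7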
  • SPEC jAppServer2004  :  SPECjAppServer2004 is the only industry-standard benchmark used for Java Enterprise Edition application servers. In addition to testing application server performance, it also tests the database performance of servers deployed to support the application tier.
Database Tier, SPECjAppServer2004 2-Node Comparison Table

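Metric | Sun T2 system | HP rx2660 | Dell 2900 | IBM p5+ 550
------ | ------------- | --------- | --------- | -----------
Space (RU) | 1 | 2 | 5 | 4
Power Consumption (Watts) | 338 | 559 | 350 | 770
Performance (SPECjApp JOPS) | 2,000.92 | 874.17 | 652.95 | 1,197.51
Performance / Watt | 5.2 | 1.6 | 1.9 | 1.6
SWaP | 5.2 | 0.8 | 0.4 | 0.4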

A word on SWaP

Given the state of environmental concerns (global warming), not to mention Power, Cooling, and floorspace costs, Sun has created the SWaP metric to compare and reflect the relative performance when taking into account the "space" (Rack Units), as well as "power consumption" (Watts).

The calculation is :      SWaP  =  Performance (operations or transactions per interval)  /  ( Space (RU)  x  Power Consumption (Watts) )
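As a worked example, the calculation can be applied to three of the SPECjbb2005 entries listed earlier (space in RU, power in Watts, performance in BOPS/JVM). The function name is just for illustration, and the inputs assume the column order in which the figures appear above:

```python
def swap_metric(performance: float, space_ru: float, power_watts: float) -> float:
    """SWaP = Performance / (Space x Power Consumption)."""
    return performance / (space_ru * power_watts)


# (performance in BOPS/JVM, space in RU, power in Watts) from the SPECjbb2005 comparison
systems = {
    "IBM p6 570": (87_737, 4, 560),
    "HP 2660": (80_884, 2, 563),
    "Dell 2950": (74_218, 2, 300),
}

for name, (bops, space_ru, watts) in systems.items():
    print(f"{name}: SWaP = {swap_metric(bops, space_ru, watts):.1f}")
```

Rounded to one decimal place, these come out to 39.2, 71.8, and 123.7, matching the SWaP row of the SPECjbb2005 comparison.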

Sun Benchmarks and Comparative Methodology :

For the purpose of internal-only comparative benchmarking, Sun provides and maintains (internal-only) results for AMP v.2 and M-value benchmarks.

As stated very clearly at both sites (from the URL's noted above) :

The most appropriate way to size a Sun server for a specific application is to engage Sun's Competency Centers.  These centers will recommend a system using their knowledge about the application performance on Sun systems, and based on specific information from the customer.

Over the past 12 years with Sun, on several occasions I've been brought into a mission critical production environment having performance issues just after going live.   The most common causes of this include :

  • NOT doing any type of actual Pre-Production Testing on the "target" configuration to be deployed, including :
    • NOT doing any sort of formal POC (Proof of Concept) with the target configuration to be purchased or migrated to.  A proof of concept is typically not an all-out formal benchmark effort, but would minimally give you the opportunity to conduct Functional Testing, in addition to some simulated "production like" load tests against the configuration, that could demonstrate that the staged target environment will meet a representative sample of "production like" workload.
    • NOT doing formal benchmarking, using a copy of Production data, and simulating actual samples of the most active production workloads (DB queries, Client Access patterns, Network traffic, etc..) with a tool such as LoadRunner.
  • While the reason above is nearly always the case, the cause that frequently crops up in most of these scenarios is that the only "Sizing" done was to compare the "before" and "after" M-Values to generate quotes !   I have encountered this even in mission critical environments, where NO pre-production staging or load testing was done ! This is just WRONG, it's more than a bad practice, possibly one that could get you fired if your (or your customer's) production environment goes down in flames after lots of $$ was spent !  Push back if necessary !
Use proper process and methodology in actually staging a new configuration and perform representative "production like" load testing.

\*\* Single-Core to Multi-Core (CMT) M-Value ("on paper") Comparisons Should CAUTIOUSLY be evaluated !! \*\*


Regarding the trend of migrating production environments from single-core to multi-core architectures, Beware!  Even though a generic benchmark test might reflect much higher #'s with fewer HW threads (and/or cores) than a current production configuration has, realize that there is a lot more to proper sizing and capacity planning (realizing that each configuration is unique) than is reflected in M/GHz (or in M-values) ! A lot can be said for certain types of workload requiring a specific # of HW cores for their environment to perform optimally "on cpu" without much cpu/kernel contention (locking, High TLB misses, context switching, and/or cpu-migrations..).


Hopefully this article has helped you reflect on the wide variety of CPU options available, as well as how they play such a significant role as the "cornerstone" of system architecture and overall performance of our customers' production environments.  Enjoy, and "let the chips fall (or rise) as they may"... :)

For more information regarding Performance Analysis, Capacity Planning, and  related Tools, see Todd's Blog at : http://blogs.sun.com/toddjobson/category/Performance+and+Capacity+Planning

\* Copyright 2007 Todd A. Jobson \* 

Sunday Sep 30, 2007

The Many Flavors of System Latency.. along the Critical Path of Peak Performance


From an article that I wrote last month, published in the September 2007 issue of Sun's Technocrat, this examination of System Latency starts where we left off with the last discussion What is Performance ? .. in the Real World .  That discussion identified the following list of key attributes and metrics that most in the IT world associate with optimal system performance :
  • Response Times (Client GUI's, Client/Server Transactions, Service Transactions, ..) Measured as "acceptable" Latency.
  • Throughput (how much Volume of data can be pushed through a specific subsystem.. IO, Network, etc...)
  • Transaction Rates (DataBase, Application Services, Infrastructure / OS / Network.. Services, etc.).  These can be either rates per Second, Hour, or even Day... measuring various service-related transactions.
  • Failure Rates (# or Frequency of exceeding High or Low Water Marks .. aka Threshold Exceptions)
  • Resource Utilization (CPU Kernel vs. User vs. Idle, Memory Consumption, etc..)
  • Startup Time (System HW, OS boot, Volume Mgmt Mirroring, Filesystem validation, Cluster Data Services, etc..)
  • FailOver / Recovery Time (HA clustered DataServices, Disaster Recovery of a Geographic Service, ..)  Time to recover a failed Service (includes recovery and/or startup time of restoring the failed Service)
  • etc ...

Each of the attributes and perceived gauges of performance listed above have their own intrinsic relationships and dependencies to specific subsystems and components... in turn reflecting a type of "latency" (delay in response). It is these latencies that are investigated and examined for root cause and correlation as the basis for most Performance Analysis activities.

How do you define Latency ?

In the past, the most commonly used terminology relating to latency within the field of Computer Science had been "Rotational Latency". This was due to the huge discrepancy between the responsiveness of an operation requiring mechanical movement, vs. the flow of electrons between components, where previously the discrepancy was astronomical (nano seconds vs. milliseconds).  Although the most common bottlenecks do typically relate to physical disk-based I/O latency, the paradigm of latency is shifting.  With today's built in HW caching controllers and memory resident DB's, (along with other optimizations at the HW, media, drivers, and protocols...), the gap has narrowed. Realize that in 1 nanosecond (1 billionth of a second), electricity can travel approximately one foot down a wire (approaching the speed of light). 
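The "one foot per nanosecond" figure is easy to verify from the speed of light. The sketch below treats that as an upper bound; the 0.7 velocity factor used for a signal in a real conductor is an assumed illustrative value, not a measured one:

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458
METERS_PER_FOOT = 0.3048


def feet_per_nanosecond(velocity_factor: float = 1.0) -> float:
    """Distance a signal covers in 1 ns, as a fraction of the speed of light."""
    meters = SPEED_OF_LIGHT_M_PER_S * velocity_factor * 1e-9
    return meters / METERS_PER_FOOT


print(f"at the speed of light: {feet_per_nanosecond():.3f} ft per ns")
print(f"at an assumed 0.7c:    {feet_per_nanosecond(0.7):.3f} ft per ns")
```

At multi-GHz clock speeds a single cycle lasts well under a nanosecond, so even a few feet of interconnect costs several cycles each way.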

However, given the industry's latest cpu's running multiple cores at clock speeds upwards of multiple GigaHertz (with >= 1 thread per core,  each theoretically executing > 1+ billion  instructions per second...), many bottlenecks can  now easily be realized within memory, where the densities have increased dramatically, the distances across huge supercomputer buses (and grids) have expanded dramatically, and most significantly.. the latency of memory has not decreased at the same rate as cpu speed increases. In order to best investigate system latency, we first need to define it and fully understand what we're dealing with.

LATENCY :

  • noun               The delay, or time that it takes prior to a function, operation, and/or transaction occurring.  (my own definition)
  • adj   (Latent)   Present or potential but not evident or active.
BOTTLENECK :
  • noun               A place or stage in a process at which progress is impeded.
THROUGHPUT :
  • noun              Output relative to input; the amount of data passing through a system from input to output.
BANDWIDTH :
  • noun              The amount of data that can be passed along a communications channel in a given period of time.

(definitions cited from www.dictionary.com)

 

The "Application Environment" and its basic subsystems :

 

Once again, the all-inclusive entity that we need to realize and examine in its entirety is the "Application Environment", and its standard subsystems :

  • OS / Kernel (System processing)
  • Processors / CPU's
  • Memory
  • Storage related I/O
  • Network related I/O
  • Application (User) SW

 

The "Critical Path" of (End-to-End) System Performance :

Although system performance might frequently be associated with one (or a few) system metrics, we must take 10 steps back and realize that overall system performance is one long inter-related sequence of events (both parallel and sequential). Depending on the type of workload and services running within an Application Environment, the Critical Path might vary, as each system has its own performance profile and related "personality". Using the typical OLTP RDBMS environment as an example, the Critical Path would include everything (and ALL Latencies incurred) between :

Client Node / User -> Client GUI -> Client Application / Services -> Client OS / Kernel -> Client HW -> NICs -> Client LAN -> (network / naming services, etc.. ) -> WAN (switches, routers, ...) -> ... Network Load Balancing Devices

-> Middleware / Tier(s) -> Web Server(s) -> Application Server(s) -> Directory, Naming, NFS... Servers/Services->

-> RDBMS Server(s) [Infrastructure Svcs, Application SW, OS / kernel, VM, FS / Cache, Device Drivers, System HW, HBA's, ...] -> External SAN /NAS I/O [ Switches, Zones/Paths, Array(s), Controllers, HW Cache, LUN(s), Disk Drives, .. ] -> RDBMS Svr ... LAN ...... -> ... and back to the Client Node through the WAN, etc... <<-

(NOTE: MANY sub-system components / interactions are left out in this example of a transaction and response between a client and DB Server)
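
To make the idea concrete, the critical path can be modeled as a chain of stages: latencies of sequential stages add up, while a group of stages running in parallel costs only as much as its slowest member. The following is a minimal Python sketch of that bookkeeping. The stage names and millisecond values are made-up illustrations, not measurements from any real environment.

```python
# Hypothetical per-stage latencies in milliseconds (illustrative values only).
# A float is a sequential stage; a list is a group of stages running in parallel.
critical_path = [
    ("client", 2.0),
    ("wan", 40.0),
    ("web_and_app_tier", [5.0, 8.0]),  # parallel: only the slowest member counts
    ("rdbms", 12.0),
    ("san_io", 6.0),
]


def stage_cost(cost):
    """Latency contributed by one stage (slowest member for a parallel group)."""
    return max(cost) if isinstance(cost, list) else cost


def end_to_end_latency(path):
    """Total latency along the path: the sum of every stage's contribution."""
    return sum(stage_cost(cost) for _name, cost in path)


def worst_stage(path):
    """Name of the stage contributing the most latency."""
    return max(path, key=lambda stage: stage_cost(stage[1]))[0]


print(end_to_end_latency(critical_path))
print(worst_stage(critical_path))
```

In this toy model the largest single contributor is the natural first place to look, which is the same reasoning applied to real measurements in the sections that follow.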

 

Categories of Latency :

Latency, in and of itself, simply refers to a delay of sorts.  In the realm of Performance Analysis and Workload Characterization, an association can generally be made between certain types of latency and a specific sub-system "bottleneck".  However, in many cases the underlying "root causes" of bottlenecks are the result of several overlapping conditions, none of which individually cause performance degradation, but together can result in a bottleneck. It is for this reason that performance analysis is typically an iterative exercise, where the removal of one bottleneck can easily result in the creation of another "hot spot" elsewhere, requiring further investigation and / or correlation once a bottleneck has been removed.

 

Internal vs. External Latency ...

Internal Forms of Latency :

  • CPU Saturation (100% Utilization, High Run Queues, Blocked Kthreads, Cpu Contention ... Migrations / Context Switching / ... SMTX, ..)
  • Memory Contention (100% Utilization, Allocation Latency due to either location, Translation, and/or paging/swapping, ...)
  • OS Kernel Contention Overhead ( aka .. "Thrashing" due to saturation.. )
  • IO Latency ( Hot Spots, High Svc Times, ...)
  • Network Latency
  • OS Infrastructure Service Latency (Telnet, FTP, Naming Svcs, ...)
  • Application SW / Services (Application Libraries, JVM, DB, ...)

External Forms of Latency :

  • SAN or External Storage Devices (Arrays, LUNS, Controllers, Disk Drives, Switches, NAS, ...)
  • LAN/WAN Device Latency (Switches, Routers, Collisions, Duplicate IP's, Media Errors, ....)
  • External Services (DNS, NIS, NFS, LDAP, SNMP, SMTP, DB, ...)
  • Protocol Latency (NACK's, .. Collisions, Errors, etc...)
  • Client Side Latency


Perceived vs. Actual Latency ...

Anyone who has worked in the field with end-users has likely experienced scenarios where users attribute a change in application behavior to a performance issue, in many cases incorrectly. The following is a short list of the top reasons for a lapse in user perception of system performance :

  • Mis-Alignment of user expectations, vantage points, anticipation, etc.. (Responsiveness / Response Times, ...)
  • Deceptive expectations based upon marketing "PEAK" Throughput and/or CPU clock-speed #'s and promised increases in performance.  (high clock speeds do NOT always equate to higher throughput or better overall performance, especially if ANY bottlenecks are present)
  • PEAK Throughput #'s can only be achieved if there is NO bottleneck or related latency along the critical path as described above. The saturation of ANY sub-system will degrade the performance until that bottleneck is removed.

    The PEAK Performance of a system will be dictated by the performance of its most latent and/or contentious components (or sub-systems) along the critical path of system performance. (eg. The PEAK bandwidth of a system is no greater than that of its slowest components along the path of a transaction and all its interactions.)
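
As a small illustration of that rule, the sketch below takes the theoretical peak figures quoted in the bandwidth table later in this article and reports which component caps the path. The choice of these three components as one transaction path is an assumption made purely for the example.

```python
# Theoretical peak bandwidth per component, in MBps, as quoted in this article's
# bandwidth table. Treating these three as a single path is a hypothetical setup.
path_mbps = {
    "PCI-X 64 bit @ 133 MHz": 1066,
    "Ultra 320 SCSI": 320,
    "1 Gigabit Ethernet": 125,
}


def peak_path_bandwidth(components):
    """Return (name, MBps) of the slowest component, which caps the whole path."""
    name = min(components, key=components.get)
    return name, components[name]


print(peak_path_bandwidth(path_mbps))
```

However fast the bus or the disks are in this example, the path as a whole cannot move data faster than the network link.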

    As the holy grail of system performance (along with Capacity Planning.. and ROI) dictates, ... a system that allows for as close to 100% of CPU processing time as possible (vs. WAIT events that pause processing) is what every  IT Architect and System Administrator strives for.   This is where systems using CMT (multiple cores per cpu, each with multiple threads per core) shine, allowing for more processing to continue even when many threads are waiting on I/O.
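
A rough way to see why multiple hardware threads per core help is a deliberately simplified probabilistic model. It assumes every thread independently spends the same fraction of its time waiting (on I/O or memory), and that a core is doing useful work whenever at least one of its threads is runnable. Real CMT scheduling is more involved, so treat the sketch below as an intuition aid rather than a sizing formula. The 80% wait fraction is an arbitrary example value.

```python
def core_utilization(wait_fraction, hw_threads):
    """Probability that at least one of hw_threads is runnable.

    Simplified model: each thread waits independently for wait_fraction of the time,
    so all threads are waiting at once with probability wait_fraction ** hw_threads.
    """
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    if hw_threads < 1:
        raise ValueError("hw_threads must be at least 1")
    return 1.0 - wait_fraction ** hw_threads


# Arbitrary example: threads that wait 80% of the time.
for n in (1, 4, 8):
    print(n, round(core_utilization(0.8, n), 3))
```

Under these assumptions, adding threads per core raises the fraction of time the core has something to execute, which is the effect described above.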

     

     

    The Application Environment and its Sub-Systems ... where the bottlenecks can be found

     

    Within Computing, or more broadly, Information Technology, "latency" and its underlying causes can be tied to one or more specific "sub-systems". The following list reflects the first level of "sub-systems" that you will find for any Application Environment :

| Subsystem / Components | Attributes and key Characteristics | Related Metrics, Measurements, and/or Interactions |
|---|---|---|
| System "Bus" / Backplane | Backplane / centerplane, I/O Bus, etc.. (many types of connectivity and media are possible, all with individual response times and bandwidth properties). | Busstat output, aggregated total throughput #'s (from kstat, etc..) |
| CPU's | # Cores, # HW Threads per core, Clock speed / Frequency in GHz (cycles per second), Operations (instructions) per Sec, Cache, DMA, etc.. | vmstat, trapstat, cpustat, cputrack, mpstat, ... (Run Queue, Blocked Kthreads, ITLB_Misses, % S/U/Idle Utilization, # lwp's, ...) |
| Memory / Cache | Speed/Frequency of Bus, Bandwidth of Bus, Bus Latency, DMA Config, L1/L2/L3 Cache Locations / Sizes, FS page cache, Physical Proximity of Cache and/or RAM, FS page caching, tmpfs, pagesizes, .. | vmstat, pmap, mdb, kstat, prstat, trapstat, ipcs, pagesize, swap, ... (Cache Misses, DTLB_misses, Page Scan Rate, heap/stack/kernel sizes, ..) |
| Controllers (NIC's, HBA's, ..) | NIC RX Interrupt Saturation, NIC Overflows, NIC / HBA Caching, HBA SW vs. HW RAID, Bus/Controller Bridges/Switches, DMP, MPxIO, ... | netstat, kstat (RX Pkts / Sec, Network Errors, ...), iostat, vxstat.. (Response Times, Storage device Svc_times..), lockstat, intrstat, ... |
| Disk Based Devices | Boot Devices, RAID LUN's, File Systems (types, block sizes, ...), Volumes, RAID configuration (stripes, mirrors, RAID Level, paths, ...), physical fragmentation, MPxIO, etc.. | iostat, vxstat, kstat, dtrace, statspack, .. (%wait, Service Times, blocked kernel threads, ... FS/LUN Hot Spots) |
| OS / Kernel | Process Scheduling, Virtual Memory Mgmt, HW Mgmt/Control, Interrupt handling, polling, system calls, ... | vmstat (utilization, interrupts, syscalls, %Sys / % Usr, ...), prstat, top, mpstat, ps, lockstat (for smtx, lock, spin.. contention), ... |
| OS Infrastructure Services | FTP, Telnet, BIND/DNS, Naming Svcs, LDAP, Authentication/Authoriz., .. | prstat, ps, svcadm, .. various .. |
| Application Services | DB Svr, Web Svr, Application Svr, ... | various... |

     

Note, if you want a single Solaris utility to do the heavy lifting, performance / workload correlation, and reporting for you, take a look at sys_diag if you haven't already done so (or the README).

 

Media/ Transport Bandwidth and related Latencies :

 

The following table demonstrates the wide range of typical operating frequencies and latencies PER Sub-System, Component, and/or Media Type :

| Component / Transport Media | Response Time / Frequency / Speed | Throughput / Bandwidth |
|---|---|---|
| CPU | 1+ GHz (1+ billion cycles per second) \* (# cores \* HW Threads / core) | > 1 billion operations per second (huge theoretical #ops/s per system) |
| Memory | DDR (PC-3200 @ 200MHz / 200MHz bus) ~5ns; DDR2 (PC2-5300 @ 166MHz / 333MHz bus) ~6ns; DDR2 (PC2-8500 @ 266MHz / 533MHz bus) ~3.75ns <TBD>; measured in nanoseconds (billionths of a second) | DDR-400 Peak Transfer 3.2 GB/s; DDR2-667 Peak Transfer 5.3 GB/s; DDR2-1066 Peak Transfer 8.5 GB/s <TBD> |
| Disk Devices | Service Times : ~5+ ms = ~X ms Latency + Y ms Seek Times (1 millisecond = 1000th of a second) [platter size, # cylinders / platters, RPM, ...] | varies greatly, see below |
| Ultra 320 SCSI (16 bit) parallel | (high performance, cable & dev limitations..) | Up to 320 MBps |
| SAS [Serial Attached SCSI] | Current; Future <TBD> | Current: > 300 MBps (> 3 Gbps); Future: Up to 1200 MBps <TBD> |
| SATA [Serial ATA] | low cost, higher capacity (poor performance); Future <TBD> | Current: Up to 300 MBps; Future: Up to 600 MBps <TBD> |
| USB 2.0 | 10-200+ microseconds (1 microsecond [us] = 1 millionth of a second) | Up to 480 Mbps (60 MBps), ~40 MBps Real-World Usable |
| FireWire (IEEE 1394) | | Up to 50 MBps |
| Fiber Channel (Dual Ch) | 4 Gb (4 / 2 / 1 Gb) \*2; 8 Gb (8 / 4 / 2 Gb) \*2 <TBD> | 4 Gb: Up to 1.6 GBps (1 GB Usable); 8 Gb: Up to 3.2 GBps (1.8 GB Usable) |
| 1 Gigabit Ethernet | \*\* Latency ~ 50 us [microseconds] \*\* | 125 MBps (~1 Gbps) theoretical |
| 10 Gigabit Ethernet | | Up to 20 Gbps (<= 9 Gbps Usable) |
| Infiniband (Dual Ported HCA) | x4 (SDR / DDR) Dual Ported = \*2; x8 (DDR) \*2 <TBD>; \*\* Latency < 2 microseconds \*\* | x4: 2\*10Gb = 20 Gbps (16 Gbps Usable); x8: Up to 40 Gbps (32 Gbps Usable) |
| PCI 2.2 | 32 bit @ 33 MHz; 64 bit @ 33 MHz; 64 bit @ 66 MHz | 133 MBps; 266 MBps; 533 MBps |
| PCI-X | 64 bit bus width @ 100 MHz (parallel bus); 64 bit bus width @ 133 MHz (parallel bus) | Up to 800 MBps; 1066 MBps (1 GBps) |
| PCI-Express | v.1 serial bus / bi-directional @ 2.5 GHz; v.2 @ 5 GHz <TBD>; (10's - 100's of nanoseconds for latencies) | 4 GBps (x16 lanes) one direction; 8 GBps (x32 lanes) one direction; Up to 16 GBps bi-directional (x32); 32 GBps bi-directional (x32 lanes) |
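
The figures above mix bits and bytes per second, and the disk row leaves rotational latency as an unknown, so a few small helpers can keep the arithmetic straight. The sketch below is a minimal Python illustration: it converts megabits to megabytes, computes average rotational latency as the time for half a revolution (a standard approximation), and gives a best-case transfer time that ignores latency and protocol overhead. The 15000 RPM drive and the 1000 MB payload are example inputs, not values from the table.

```python
def mbps_to_mbytes(megabits_per_sec):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return megabits_per_sec / 8.0


def avg_rotational_latency_ms(rpm):
    """Average rotational latency in ms: the time for half a revolution."""
    return 0.5 * 60_000.0 / rpm


def transfer_time_s(size_mbytes, link_mbytes_per_sec):
    """Best-case seconds to move size_mbytes, ignoring latency and protocol overhead."""
    return size_mbytes / link_mbytes_per_sec


print(mbps_to_mbytes(480))               # USB 2.0 signalling rate in MBps
print(mbps_to_mbytes(1000))              # 1 Gigabit Ethernet in MBps
print(avg_rotational_latency_ms(15000))  # example 15000 RPM drive
print(transfer_time_s(1000, 125))        # example 1000 MB payload over 1 GbE
```

Real-world numbers will be lower than these theoretical peaks, as the "Usable" annotations in the table already suggest.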

 

Other Considerations Regarding System Latency :

Other considerations regarding system latency that are often overlooked include the following, which offer us a more holistic vantage point of system performance and of the items that might work against "Peak" system capabilities :

  • For Application SW that supports advanced capabilities such as Infiniband RDMA (Remote Direct Memory Access), interconnect latencies can be virtually eliminated via Application RDMA "kernel bypass".  This would be applicable in an HPC grid and/or possibly  Oracle RAC Deployments, etc. (confirming certifications of SW/HW..).
  • Level of Multi-Threading vs. Monolithic serial or "batch" jobs (If Applications are not Multi-Threaded, then SMP and/or CMT systems with multiple processors / cores will likely always remain under-utilized).
  • Architectural configurations supporting load distribution across multiple devices / paths (cpu's, cores, NIC's, HBA's, Switches, LUNs, Drives, ...)
  • System Over Utilization (too much running on one system.. due to under-sizing or over-growth, resulting in system "Thrashing" overhead)
  • External Latency Due to Network and/or SAN I/O Contention
  • Saturated Sub-Systems / Devices (NIC's, HBA's, Ports, Switches, ...) create system overhead handling the contention.
  • Excessive Interrupt Handling (vs. Polling, Msg passing, etc..), resulting in overhead where Interrupt Handling can cause CPU migrations / context switching (interrupts have the HIGHEST priority within the Solaris Kernel, and are handled even before RT processing, preempting running threads if necessary).   Note, this can easily occur with NIC cards/ports that become saturated (> ~25K RX pkts/sec), especially for older drivers and/or over-utilized systems.
  • Java Garbage collection Overhead (sub-par programming practices, or more frequently OLD JVM's, and/or missing compilation optimizations).
  • Use of Binaries that are compiled generically using GCC, vs. HW optimized compilations using Sun's Studio Compilers (Sun Studio 12 can give you 200% + better performance than gcc binaries).
  • Virtualization Overhead (significant overhead relating to traps and library calls... when using VmWare, etc..)
  • System Monitoring Overhead (the cumulative impact of monitoring utilities, tools, system accounting, ... as well as the IO incurred to store that historical performance trending data).
  • OS and/or SW ... Patches, Bugs, Upgrades (newly applied, or possibly missing)
  • Systems that are MIS-tuned are accidents waiting to happen.  Only Tune kernel/drivers if you KNOW what you are doing, or have been instructed by support to do so (and have FIRST tested on a NON-production system).  I can't tell you how many performance issues I have encountered that were due to administrator "tweaks" to kernel tunables (to the point of taking down entire LAN segments !).  The defaults are generally the BEST starting point unless a world-class benchmarking effort is under-way.
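
The interrupt-handling item in the list above quotes a rule of thumb of roughly 25K received packets per second before a NIC port is likely saturated. A packet rate like that is normally derived from two samples of a cumulative counter. The following Python sketch shows only that calculation; the counter values are hypothetical, and how the counter is collected (for example from kstat or netstat output) is left out.

```python
RX_SATURATION_PPS = 25_000  # rule-of-thumb threshold quoted in the list above


def rx_packets_per_sec(count_start, count_end, interval_secs):
    """Rate computed from two samples of a monotonically increasing RX packet counter."""
    if interval_secs <= 0:
        raise ValueError("interval_secs must be positive")
    if count_end < count_start:
        raise ValueError("counter went backwards (wrap or reset)")
    return (count_end - count_start) / interval_secs


def looks_saturated(pps, threshold=RX_SATURATION_PPS):
    """True when the packet rate exceeds the rule-of-thumb threshold."""
    return pps > threshold


# Hypothetical counter samples taken 2 seconds apart.
rate = rx_packets_per_sec(1_000_000, 1_060_000, 2)
print(rate, looks_saturated(rate))
```

The threshold is a starting point for investigation rather than a hard limit, since it depends on the driver and on how busy the rest of the system is.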

 

The "Iterative" nature of Performance Analysis and System Tuning

No matter what the root causes are found to be, in the realm of Performance Analysis and system Tuning, ... once you remove one bottleneck, the system processing characteristics will change, resulting in a new performance profile, and new "hot spots" that require further data collection and analysis. The process is iterative, and requires a methodical approach to remediation.

Make certain that ONLY ONE (1) change is made at a time, otherwise, the effects ( + or - ) can not be quantified.
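
One way to enforce that discipline is to compare the configuration before and after a tuning step and refuse to attribute a result unless exactly one setting differs. The sketch below is a minimal Python illustration of that check; the tunable names and throughput figures are placeholders, not real Solaris parameters or measurements.

```python
def changed_settings(before, after):
    """Sorted names of settings whose values differ between two configurations."""
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))


def percent_change(baseline, measured):
    """Relative change of a metric versus its baseline, in percent."""
    return (measured - baseline) / baseline * 100.0


# Hypothetical tunables and throughput figures, for illustration only.
before = {"tunable_a": 64, "tunable_b": 4096}
after = {"tunable_a": 128, "tunable_b": 4096}

diff = changed_settings(before, after)
if len(diff) != 1:
    raise SystemExit(f"expected exactly one change, found: {diff}")
print(diff[0], percent_change(1200.0, 1380.0))
```

With a single changed setting, the measured difference (positive or negative) can be attributed to that change alone.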

Hopefully at some point in the future we'll be operating at latencies measured in attoseconds (10 \^-18th, or 1 quintillionth of a second), but until then .... Happy tuning :)

For more information regarding Performance Analysis, Capacity Planning, and related Tools, review some of my other postings at :  http://blogs.sun.com/toddjobson/category/Performance+and+Capacity+Planning

 

Copyright 2007  Todd A. Jobson

Wednesday Aug 22, 2007

It's Time for a Change... "TIME TRAP .. Today"



Well, well.. for all you calendar viewers, daily planners, and clock watchers...

Today's blog is a poem that I jotted down in 2006 (8/31/2006), revised recently.. and just as it was then.. is still fresh in my mind and close to the heart..  reflections on TIME.. and Living for Today.   Realizing how absolutely TIME crazed our world is.. and that very few of us... actually do "Seize the day" [Latin : "Carpe diem" Horace], living Today as if it were our last.. and valuing the precious (priceless) commodity that it is.

For all of us, we seem to reach for the future or get caught up reflecting on the past, with little effort spent on Living in the "present" (a gift not to be wasted).

We will never know which "today" will be the last.. so we best start Living .. and Loving more.. ENJOYING life and its simple pleasures.. because any day could be our last.

Reflecting on the history and origin of all things, as I tend to do.., it amuses me how EVERYTHING is time centric.. given that it's a man-made instrument of humanity .. the illusive gauge that simultaneously enabled and handicapped mankind in our quest to control, understand, and harness the productivity of the world around us.

Take the definition of time itself as an example, ..seconds, minutes, hours, days, weeks, months, etc... all of which originated in prehistory and were refined over millennia.   The days of the week (7 days likely originated by dividing the period of the moon.. ~28 days by 4 phases = 7) , eventually personified.. named and associated with the gods of the times (Anglo-Saxon, Nordic, Greco-Roman, ..), as were the 7 known "heavenly" bodies :

Sun, Moon, Tiw/Mars, Woden/Mercury, Thor/Jupiter, Freya/Venus, Saturn

== Sunday/ Monday/ Tuesday/ Wednesday/ Thursday/ Friday/ Saturday

These were the key Gods worshiped by mankind as pagans (Polytheism) when the Julian Calendar was eventually finalized by Julius (July) and Augustus (August) Caesar.  This was before science started to take hold and Monotheism gained popularity post-AD (Constantine in 313AD), when science began its uphill battle counteracting superstition by explaining reasons for why things happen (farming/agriculture, water/waste management, nutrition/medicine, ...), diminishing the need for many Gods as people understood and controlled more of the world around them.

You probably never realized that many variations in the number of days in a week have existed over time.. 7-10 being the standard range (the Roman calendar had 10 months with 9 days per week.. the 9th being market day, .. and with the French Revolutionary Calendar.. 10 days per week for 36 weeks a year).

The beginning of the year has also changed many, many times, initially marking time according to lunar/annual events (Solstice, equinox, etc..), then following the mandates of governments/ religions (Jan 1, March 1/ 15/ 25, May1, July 1, Aug 29-31, Sept 1, Dec 25,...), and even the birthday of Augustus Caesar (Sept 23).

The ancient Sumerians/Babylonians had an affinity to base-6 math, which resulted in the calculation of PI for calculating accuracy of geometric circles from a known radius (probably originating from drawing a circle from rotating a stick at its center.. the stick's length, its radius, able to outline a circle by creating a 6-sided hexagon.. the natural shape that bees use in a honeycomb.. with 6 uniform segments).  This base-6 math also gave us the 24 hour day, 12 months, 360 degrees in a circle.. and 360+ days in a year.. tweaked over time to 365, etc...

Once again.. mankind trying to control the world around us, creating order out of chaos, which grounds us.. and allowed for civilization to thrive (Imagine everything before sundials where people tried to coordinate things based on sunrise/sunset and high noon .. you get the concept .. lots of disorganized .. & LATE people).  Society and civilization as a whole couldn't become more productive and thrive without it (I suppose ;) ).

Well.. either way, for this blog....it's "time" for a change.. Let me know what you think.. hope you enjoy... and don't get sucked into that old "Time Trap...".


Todd  :)

"TIME TRAP .. Today :


As the clocks now click at internet speed,
slow them please to be there.. some "today" for me.


 
Time is my curse,
there every day.
 
Too young, too old,
they take it away.
 
 
Time Past,
Time Planned,
but none for today.

 
No Present, just reaching,
hearts and minds ..gone astray.



Sun-dials and sand-timers,
the urge to control..
a gauge, the imaginary sage..
.. a need, within us all.


All the time,
Every Time,
our collusion, delay.
 
Are we late ?,
are we early ?, ..
.. perpetual disarray.


Quiet Time,
Past Time,
..but no time to Play.

Yours... hours,
the intangible price we pay.


 
Time for War,
and Peace,
but none for Love.

 
Break the clocks for me..
just give them a shove.

 
 
We've embraced the delusion,
.. TIME, the grand illusion ..
though we waste the moments away.
 
While yesterday's tomorrow
..forever absent.. 
is always another time, we call..  Today.

 

As you dance through life,
along the critical paths we weave,
ask yourself ..
  "is this the Time of my life ?
   or the Life of my Time ?" ..

to some.. disbelief, to others .. sublime.
 
 
Through Winter, Spring,
and Summer I call ..
to the dimension of time,
though invisible to all.

 
From the Sun and Moon,
the speed of light does speak,
as myths once told, every day anew ..
imparts on us, ... the strength in life we seek.


Upon the Earth, ever spinning,
we're speeding our lives away..
.. slow it down to a crawl,  so I can sow time's seeds..
... to blossom and harvest... Today.
 
 
We're here every day,
but just for the ride.
So come have a seat, right by my side.
 
But just as it slows, ... you get farther away,
so please speed it up, I can't wait for Today.
 
 
And while I wait, please make me some time.
Put it away for tomorrow, "Today" is not mine.
 
For I with You,
Here's my "present",
....  on borrowed time.
 
 
I simply dream for "Today" ... with you..
..just one more time.   "   

 
 
 Todd J.

( Copyright 2006-2007  Todd A. Jobson )

Leaving you with a little... TIME in a Bottle .. the other thing that recursively gets ReCycled .... ENJOY !! :)




Saturday Aug 04, 2007

Sun Cluster 3.2 Aug'07 News: ..Sun is OpenSource#1..

If you haven't been up to date on the latest news regarding Sun's High Availability offerings with Sun Cluster 3.2 .. (even Open Sourcing it !).. I thought this would be the perfect opportunity for a quick recap with a few key articles, WP's, and related links ( many of these are specific to Oracle RAC integration with Sun Cluster ) :


\* Sun Cluster 3.2 Offers the Strongest Integration with Oracle RAC 10gR2..

> Sun Cluster 3.2 Software: Making Oracle Database 10G R2 RAC Even More Unbreakable

 

\* Sun offers up Solaris Cluster as Open Source Gem : (Network World link)

> Sun will post code to High Availability Clusters community on OpenSolaris.org

In case you didn't know, Sun is actually ranked #1 in terms of SW Contributions to the Open Source community! Even BEFORE the JAVA (OpenJDK) OR Sun Cluster contributions, the European Commission on FLOSS (Free/Libre Open Source Software) has reflected that Sun contributes to and participates in more open source projects than any other commercial company, including IBM, RedHat, Novell and HP.  See page 51 of the following report for the breakdown :  http://ec.europa.eu/enterprise/ict/policy/doc/2006-11-20-flossimpact.pdf



\*Sun™ Cluster Oracle 10g Grid Reference Architecture Optimizing Scalability and Performance


\*Sun™ Cluster and Oracle Real Application Clusters (RAC) : High Availability, Scalability, and Ease of Management (Oracle's internal use is quoted..)


\*Installation Guide for Solaris Cluster 3.2 Software and Oracle 10g Release 2 RAC



 
Sun Cluster 3.x is a Best In Class High Availability Suite that offers everything from single node clustering.. all the way up to Global / Geographic Clustering for Disaster Recovery. You can find Sun's external Availability Suite page at : http://www.sun.com/software/solaris/cluster/index.xml



Give it a test drive for free .. and let me know what you think !

Cheers, Todd




Friday Aug 03, 2007

System Profiling 101 : Getting started using sys_diag v.7.04


The following entry is a variation of an article that I created for this month's Sun "Technocrat" publication (Aug. 2007).

This posting demonstrates the art of system profiling (from a high level overview) by introducing a few sample screenshots of the sys_diag  .html  report (its header, Dashboard, and Table of Contents).. demonstrating how in a few minutes, sys_diag can present you with an accurately depicted system profile !

Note, the .html report snapshot samples presented here, match the command line output from my previous blog postings (from the same run of sys_diag).

If you haven't had the chance to try out sys_diag yet, this should give you the highlights of what you can expect in the .html report header sections.

Enjoy and let me know what you think,

Todd



Real world PROFILING ..

So.. what is Profiling.. ?? ... Well, in the real world, you can define profiling in many ways.

.. from the "profile" of the person standing next to you (what you see from your vantage point), to the personality "profiles" that we've all heard of in psychology (characterization based upon key attributes) ..

 
In your standard dictionary you'll find a definition such as this (from Dictionary.com) :

PROFILE :
     (-noun)

  •  the outline or contour of the human face, esp. the face viewed from one side.
  •  a verbal, arithmetical, or graphic summary or analysis of the history,
    status, etc., of a process, activity, relationship, or set of
    characteristics: a biochemical profile of a patient's blood; a profile of national consumer spending.
  •  a set of characteristics or qualities that identify a type or category of person or thing: a profile of a typical allergy sufferer.
  • Psychology. a description of behavioral and personality traits of a person compared with accepted norms or standards.

System Profiling .. in the World of Computing

Well, in the world of technology, and more specifically.. Computing, "profiling" takes on its own connotation, though similar to many of the more technical definitions noted above.

To some, system profiling simply includes a high level summary of resource utilization and bottleneck identification of a system during some period of data collection (point in time or over a duration).

"Broad Spectrum" System Profiling ..

System profiling to me is the characterization of a system as a whole, given a set of data, either for one event/point in time, or over a duration.  This characterization goes beyond workload (as you'll typically hear the term "workload characterization"), which is why I call it "Broad (or Full) Spectrum" profiling, more broadly taking into account and including :

  • Configuration characteristics (of all sub-systems/components within the "application environment" being profiled).
  • System Performance metrics captured, reflecting the variations in system/subsystem activity measurements (utilization, contention, throughput, latency, etc...).
  • Workload Characterization : Details correlating the Workload that was ongoing during the data collection (workload characteristics.. TPS/ Response Times/ Mbps/ ...), with the measurements taken.   (beyond the internal system workload identified, External sources need to be correlated)
  • A Characterized (Summarized) "Profile" of overall System Efficiency and "health" based upon Performance Analysis findings (system/sub-system Avg/Peaks.., Utilization vs. Workload, etc ...)
  • Notable Events and/or Exceptions encountered from the data available.
  • ... etc ...

"Narrow Spectrum" System Profiling ..

This would be in contrast to "Narrow Spectrum" (Focused) System Profiling, where attention to detail is focused in a very "narrow" and specific area of interest for analysis (typically in determining a Root Cause where a specific bottleneck is known within a sub-system or specific component of the system).

Note the common themes of defining Requirements, the "Application Environment", etc.. as presented in my previous postings .. (eg.  "What is Performance ? .. in the Real World", etc..).. and likely to be common themes.. in the Real World.. ;)

 
Look for more details on this and much more in an up-coming blog entry more thoroughly delineating the distinction between Profiling, Workload Characterization, Performance Analysis, and Capacity Planning....  

For now, enjoy the following discussion on how sys_diag can have you profiling in no time at all ... :)



Profiling with sys_diag ...

Automating Solaris Performance and Configuration Analysis


In the arena of Performance and Configuration Analysis, the freely available Solaris utility “sys_diag” offers the capability to automatically capture this information in a single .html report (also .txt and .ps) after running one easy to use ksh script. Typically, several utilities/tools need to be run separately, requiring manual collection, aggregation and correlation of the data, prior to conducting the analysis of data.

Sys_diag automates this legwork, by running over 100 Solaris built in commands/utilities (depending on the parameters used) and presenting the data as a structured report with a summarized header, a color-coded “dashboard” (broken down by high level workload characterization, sub-system findings, followed by a Table of Contents), all with links to the corresponding report sections with detailed configuration and/or performance analysis findings.
 

sys_diag 's  HTML Report Header :



sys_diag 's  HTML Performance "Dashboard" :

 The following sample .html system performance “Dashboard” (a portion of which is shown below) reflects the 4 key sub-systems (CPU/Kernel Profiling, Memory, IO, and Networking) as a summarized depiction of sub-system “health”, based against a list of rules / thresholds that the captured data is compared to during post-processing.

These rules and thresholds are listed in the Performance Analysis section (Section #24) and can be easily tuned / modified to offer more stringent or lenient identification of performance exceptions that contribute to the Green/ Yellow / Red (OK / Warning / Critical) color-coded status within each dashboard section. Within each section are listed the key performance metrics and a summary of exceptions, along with Average and Peak (High Water Mark) values present during the collection/sampling period. At the end of each section is a list of key “links” to the substantiating detailed data analysis within the report.  
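
The kind of rule described above can be pictured as a pair of ascending thresholds applied to both the average and the peak of a sampled metric. The Python sketch below illustrates that idea only. The sample values and the 70 / 90 thresholds are placeholders chosen for the example and are not sys_diag's actual rules, which are the ones listed in Section #24 of its report.

```python
def summarize(samples):
    """Return (average, peak) for a list of numeric samples."""
    return sum(samples) / len(samples), max(samples)


def status(value, warn_at, crit_at):
    """Map a value to a dashboard color using two ascending thresholds."""
    if value >= crit_at:
        return "RED"
    if value >= warn_at:
        return "YELLOW"
    return "GREEN"


# Hypothetical CPU utilization samples (%) and placeholder thresholds.
cpu_busy = [35, 50, 92, 60, 43]
avg, peak = summarize(cpu_busy)
print(avg, status(avg, warn_at=70, crit_at=90))
print(peak, status(peak, warn_at=70, crit_at=90))
```

Reporting the average and the peak separately matters because, as in this example, a comfortable average can hide a short-lived spike.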




When run for performance data gathering (-g or -G), 2 types of performance data are captured :

\* vmstat, mpstat, iostat, netstat, .. data for a duration (-T total_secs), captured at specified sampling rates (-I interval_#secs). The default duration is 5 minutes of data capture @ 2 second intervals if -I / -T are not specified.

\* 3 Point in Time detailed snapshots (beginning, mid point, end point). If -G is used, and Solaris 10 is the OS, then DTrace and detailed lockstat, pmap/pfiles, cputrack, ...  snapshots will be taken (beyond the core “-g” snapshots that include  ps, netstat -i, vxstat, kstat,  ...).
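
For planning a collection run, it helps to know how many samples a given duration and interval imply. The short Python sketch below does that arithmetic for the defaults stated above (5 minutes at 2 second intervals) and for one hypothetical longer run. Whether each underlying utility records exactly this many lines is an assumption, since some tools print an extra summary line first.

```python
def sample_count(total_secs, interval_secs):
    """Number of whole sampling intervals that fit in the collection window."""
    if total_secs <= 0 or interval_secs <= 0:
        raise ValueError("durations must be positive")
    return total_secs // interval_secs


DEFAULT_TOTAL_SECS = 5 * 60   # default duration stated above
DEFAULT_INTERVAL_SECS = 2     # default sampling interval stated above

print(sample_count(DEFAULT_TOTAL_SECS, DEFAULT_INTERVAL_SECS))
print(sample_count(3600, 10))  # hypothetical one-hour run at 10 second intervals
```

Shorter intervals give finer granularity at the cost of more data to store and post-process, which ties back to the monitoring overhead discussed earlier.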
 

 
Sys_diag has been run on virtually all models of Sun systems running Solaris 2.6 or > (from x86 laptops up to fully loaded E25K's), offering extensive Solaris 10 configuration and performance data, including DTrace snapshots. It creates a single .tar.Z compressed archive (including all raw, snapshot and post-processed datafiles) that can be emailed/ ftp'd.. for performing system configuration and/or performance analysis off-site.. from virtually anywhere.

This is one of the key characteristics that sys_diag offers.. to save a LOT of time.. not requiring many separate manual runs / collection / correlation of data, or the need for any 3rd party tools, libraries, or agents to be installed on a system other than downloading the "sys_diag" ksh script itself. Virtually no learning curve is required for loading, running, and reflecting basic performance profiling, including high level sub-system bottlenecks (deeper root cause correlation might require some level of advanced system administration knowledge, though virtually all the data needed will have been already captured by sys_diag).

This utility has been used extensively in the field over the last several years, run on literally hundreds of production systems as part of escalation root cause analysis, in addition to providing the basis for dozens of Architectural and/or Performance Assessments (including formal Capacity Planning / Benchmarking). Graphing of the data captured (vmstat, netstat...) is also easy to do using StarOffice as explained in the README file that sys_diag creates.


sys_diag 's  HTML Report "Table of Contents" :

The screenshot below shows the Table of Contents and related sections available within the .html report :

 

Although this tool isn't meant to replace long-term historical Performance Trending and Capacity Planning packages (Teamquest, etc..), it provides the foundation and basis for a very robust starting point (and actually is much better at point in time workload characterization and root cause analysis of bottlenecks, where very granular detailed data correlation is required).

Over the time that sys_diag has been posted on BigAdmin, many Sun customers around the globe have downloaded and commented positively on their experiences with it. For more information, or to download and try it out for yourself , the following URL's should help you get started :

 

 

 

The latest release of sys_diag (v.7.04) is available from BigAdmin
(unpackaged ksh) at :

http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sys_diag__solaris_c
http://www.sun.com/bigadmin/scripts/submittedScripts/sys_diag.txt


sys_diag  is also available as part of the "SunFreeware" Distribution
(packaged with the README) at :

http://www.sunfreeware.com/programlistsparc10.html#sys_diag


The following recent blog postings provide an extended overview of sys_diag and its capabilities :

Solaris Performance Analysis and Monitoring Tools... at what cost ?...
http://blogs.sun.com/toddjobson/entry/solaris_performance_monitoring_tools

What is sys_diag ?? .. Automating Solaris Performance Profiling and Workload Characterization.
http://blogs.sun.com/toddjobson/entry/what_is_sys_diag_automating

sys_diag v.7.04 command line output ...
http://blogs.sun.com/toddjobson/entry/sys_diag_v_7_04

 

\*\*Note: read the ksh script header pages or the README file prior to using, and ALWAYS test first on a representative non-production system.. as is the best practice when making ANY production environment changes... ;)

(Copyright 2007, Todd A. Jobson) 




Monday Jul 30, 2007

sys_diag 7.04 on SunFreeware.com !

As of 7/30/2007, SunFreeware.com includes the Solaris utility "sys_diag" as part of their distributions (Solaris 8 -> Solaris 10 for both SPARC and x86).

 The format at SunFreeware differs from BigAdmin's distribution only in that BigAdmin provides the raw ksh script, whereas sys_diag on SunFreeware is packaged along with its README file in the standard Solaris package format (pkgadd).

 SunFreeware.com :     sys_diag v.7.04

 SunFreeware.com : README_sys_diag.txt

If you're not yet familiar with SunFreeware.com, you should be ! .. check it out asap for a great selection of Solaris freeware (already compiled and packaged for you !!).


Tuesday Jul 17, 2007

sys_diag v.7.04 command line output ...


The output below was captured recently from running sys_diag v.7.04 on a (Solaris 10u3) Sun Ultra60 2 cpu test system in my lab.

Note the list of utilities run and types of data captured, as well as the final performance summary (a small summary of the complete color coded HTML dashboard available in the full .html report).

sys_diag has been run on virtually every type of Sun system, running Solaris 2.6 -> S10.  I have personally conducted dozens of Performance Analysis, Capacity Planning/ Benchmarking, and/or Architectural Assessments using sys_diag in production environments.. x86.. up to fully loaded E25K environments.

The latest release of sys_diag (v.7.04) is available from either BigAdmin (unpackaged ksh) or SunFreeware.com (pkg'd with the README) at the following URLs :
http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sys_diag__solaris_c

or   http://www.sunfreeware.com

Realize that more than half of sys_diag's benefit is in working from the aggregated .html report file, which links and correlates all the independent data files together with findings and exceptions via a nice color-coded header / dashboard and Table of Contents.  (the legwork is all done for you !)

I'll try to get a sample snapshot of a report header/dashboard for an upcoming blog... but for now, just download and test run sys_diag (v.7.04 is recommended), review the final .html report, and forward any questions/comments back to me.. along with RFEs for future releases.

(Read the last sections of the README for a detailed description of all datafiles created/available...)

With a little practice, it should save you many hours.. if not days.. of effort as it does for me.


Enjoy and let me know what you think,  

Todd


The following example does the deepest level of Performance data Gathering (-G, which includes Dtrace and pmap/pfiles snapshots vs. -g for light-weight perf gathering), Verbose output (-V), in addition to creation of a long/detailed configuration report (-l). The sampling rate used is 1 second intervals (-I1) for a total duration of 298 seconds (-T298).

\*Without -I || -T, the defaults are 2 second samples for 5 minutes of total data gathering. Also note that when -G && -V are used together, the initial Dtrace and Lockstat snapshots take a couple of minutes to complete before the 298 seconds of data collection begins (since the probe duration is expanded to 5 seconds with -V, vs. 2 seconds with -G alone, or a minimal 1 second of lockstat sampling with -g, which does no Dtrace probing).
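
As a rough illustration of how the -I and -T values relate to the collector command lines in the output that follows (for example "vmstat -q 1 298"), the sketch below derives a sample count from a duration and an interval. It assumes the count is simply the duration divided by the interval; the function name sample_count is hypothetical and is not part of sys_diag.

```shell
#!/bin/sh
# Hypothetical helper (not part of sys_diag): derive the sample count
# handed to vmstat/iostat/mpstat from a total duration and an interval.
# The fallback values mirror the documented defaults (-T300, -I2).
sample_count() {
    duration=${1:-300}   # total seconds of gathering (-T)
    interval=${2:-2}     # seconds between samples (-I)
    echo $(( duration / interval ))
}

sample_count 298 1   # the -I1 -T298 run
sample_count         # the defaults: 300 seconds at 2 second intervals
```

Keeping the interval above 1 second reduces the number of samples (and the collection overhead) for the same total duration.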

root@/var/tmp #  ./sys_diag -G -V -l -I1 -T298

sys_diag:0717_033209: GATHER Extra PERFORMANCE DATA (-G)
sys_diag:0717_033209: VERBOSE (-V)
sys_diag:0717_033209: INTERVAL : 1 second sampling (-I1)
sys_diag:0717_033209: TIME Duration: 298 seconds (-T298)
sys_diag:0717_033209: LONG report (-l)
sys_diag:0717_033209: # Creating ... README_sys_diag.txt ...

sys_diag: ------- Beginning Process SNAPSHOT (# 0) -------
sys_diag:0717_033209: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 0)
sys_diag:0717_033209: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 0)
sys_diag:0717_033209: Dtrace: systemwide IO / IO wait... (_diow Snap 0)
sys_diag:0717_033235: Dtrace: Syscall count by process... (_dcalls_ Snap 0)
sys_diag:0717_033243: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 0)
sys_diag:0717_033251: Dtrace: Read bytes by process... (_dR_ Snap 0)
sys_diag:0717_033258: Dtrace: Write bytes by process... (_dW_ Snap 0)
sys_diag:0717_033306: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 0)
sys_diag:0717_033314: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 0)
sys_diag:0717_033321: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 0)
sys_diag:0717_033321: # ps -e -o ...(by %CPU) ... Snapshot # 0
sys_diag:0717_033321: # ps -e -o ...(by %MEM) ... Snapshot # 0
sys_diag:0717_033332: # pmap -xs 519 ...
sys_diag:0717_033332: # pmap -S 519 ...
sys_diag:0717_033332: # pmap -r 519 ...
sys_diag:0717_033332: # ptree -a 519 ...
sys_diag:0717_033332: # pfiles 519 ...
sys_diag:0717_033333: Dtrace: IO by process 519 ... (_dpio Snap 0)
sys_diag:0717_033339: # pmap -xs 448 ...
sys_diag:0717_033339: # pmap -S 448 ...
sys_diag:0717_033339: # pmap -r 448 ...
sys_diag:0717_033339: # ptree -a 448 ...
sys_diag:0717_033339: # pfiles 448 ...
sys_diag:0717_033340: Dtrace: IO by process 448 ... (_dpio Snap 0)
sys_diag:0717_033346: # pmap -xs 90 ...
sys_diag:0717_033346: # pmap -S 90 ...
sys_diag:0717_033346: # pmap -r 90 ...
sys_diag:0717_033346: # ptree -a 90 ...
sys_diag:0717_033346: # pfiles 90 ...
sys_diag:0717_033347: Dtrace: IO by process 90 ... (_dpio Snap 0)
sys_diag:0717_033353: # pmap -xs 825 ...
sys_diag:0717_033353: # pmap -S 825 ...
sys_diag:0717_033353: # pmap -r 825 ...
sys_diag:0717_033353: # ptree -a 825 ...
sys_diag:0717_033353: # pfiles 825 ...
sys_diag:0717_033353: Dtrace: IO by process 825 ... (_dpio Snap 0)
sys_diag:0717_033353: # /usr/bin/netstat -i -a ...
sys_diag:0717_033400: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_033409: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_033419: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_033431: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_033446: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_033521: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 0)
sys_diag:0717_033526: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 0)
sys_diag:0717_033531: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 0)

sys_diag: --\*\*-- (Background) DATA COLLECTION FOR 298 secs STARTED --\*\*--
sys_diag:0717_033531: # /usr/bin/vmstat -q 1 298 > ./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033531: # /usr/bin/iostat -xn 1 298 > ./sysd_socrates_070717_0332/sysd_io_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033531: # /usr/bin/mpstat -q 1 298 > ./sysd_socrates_070717_0332/sysd_mp_socrates_070717_033209.out 2>&1 &
sys_diag:0717_033537: # /usr/bin/netstat -i -I lo0 1 298 > ./sysd_socrates_070717_0332/sysd_net1_socrates_070717_033537.out 2>&1 &
sys_diag:0717_033537: # /usr/bin/kstat -p -T u -n lo0 1> ./sysd_socrates_070717_0332/sysd_knetb_lo0_socrates_070717_033537.out 2>&1
sys_diag:0717_033538: # /usr/bin/netstat -i -I hme0 1 298 > ./sysd_socrates_070717_0332/sysd_net2_socrates_070717_033538.out 2>&1 &
sys_diag:0717_033538: # /usr/bin/kstat -p -T u -n hme0 1> ./sysd_socrates_070717_0332/sysd_knetb_hme0_socrates_070717_033538.out 2>&1
sys_diag:0717_033538: # /usr/sbin/snoop ...

sys_diag: ------- (Foreground) Gathering System Configuration Details -------
sys_diag:0717_033539: # uname -a ...
sys_diag:0717_033539: # hostid ...
sys_diag:0717_033539: # domainname (DNS) ...
sys_diag:0717_033539: ###### SYSTEM CONFIGURATION / DEVICE INFO ######
sys_diag:0717_033539: # prtdiag ...
sys_diag:0717_033539: # prtconf | grep Memory ...
sys_diag:0717_033539: # /usr/sbin/psrinfo -v ...
sys_diag:0717_033539: # /usr/sbin/psrinfo -pv ...
sys_diag:0717_033539: # /usr/sbin/psrset -q ...
sys_diag:0717_033539: # cfgadm -l ...
sys_diag:0717_033539: # cfgadm -al ...
sys_diag:0717_033539: # cfgadm -v ...
sys_diag:0717_033539: # cfgadm -av | grep memory | grep perm ...
sys_diag:0717_033541: ###### E10K / E25K / SunFire System INFO ######
sys_diag:0717_033541: # Checking Kernel Cage settings ...
sys_diag:0717_033541: # eeprom ...
sys_diag:0717_033541: # /usr/bin/coreadm ...
sys_diag:0717_033541: # /usr/sbin/dumpadm ...
sys_diag:0717_033541: # modinfo ...
sys_diag:0717_033541: # /usr/sbin/lustatus ...
sys_diag:0717_033541: # cat /etc/path_to_inst ...
sys_diag:0717_033542: ###### WORKLOAD CHARACTERIZATION ######
sys_diag:0717_033542: # prstat -c -a 1 1 ...
sys_diag:0717_033542: # prstat -c -J 1 1 ...
sys_diag:0717_033542: # prstat -c -Z 1 1 ...
sys_diag:0717_033542: # prstat -c 1 2 ...
sys_diag:0717_033544: # prstat -c -v 1 3 ...
sys_diag:0717_033546: # ps -e -o ...(by %CPU) ...
sys_diag:0717_033546: # ps -e -o ...(by %MEM) ...
sys_diag:0717_033546: # ps -e -o ...(by LWP) ...
sys_diag:0717_033546: ###### PERFORMANCE PROFILING (System / Kernel) ######
sys_diag:0717_033547: # vmstat 1 5 ...
sys_diag:0717_033551: # /usr/bin/mpstat 1 3 ...
sys_diag:0717_033551: # /usr/bin/isainfo -v ...
sys_diag:0717_033553: # /usr/bin/ipcs -a ...
sys_diag:0717_033553: # /usr/bin/pagesize ...
sys_diag:0717_033553: # swap -l ...
sys_diag:0717_033553: # swap -s ...
sys_diag:0717_033553: # /usr/bin/vmstat -s ...
sys_diag:0717_033553: # /usr/bin/kstat -n system_pages ...
sys_diag:0717_033553: # /usr/bin/kstat -n vm ...
sys_diag:0717_033554: # /usr/sbin/trapstat 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -t 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -l ...
sys_diag:0717_033554: # /usr/sbin/trapstat -t 1 2 ...
sys_diag:0717_033554: # /usr/sbin/trapstat -T 1 2 ...
sys_diag:0717_033554: # /usr/sbin/intrstat 1 2 ...
sys_diag:0717_033554: # /usr/bin/vmstat -i ...
sys_diag:0717_033554: ###### KERNEL ZONES/ SRM / Acctg / TUNABLES ######
sys_diag:0717_033554: # /usr/sbin/zoneadm list -v ...
sys_diag:0717_033554: # /usr/bin/projects -l ...
sys_diag:0717_033554: # /usr/sbin/psrset -i ...
sys_diag:0717_033554: # /usr/sbin/psrset -p ...
sys_diag:0717_033554: # /usr/sbin/psrset -q ...
sys_diag:0717_033554: # /usr/sbin/rctladm -l ...
sys_diag:0717_033554: # /usr/bin/priocntl -l ...
sys_diag:0717_033554: # /usr/sbin/acctadm ...
sys_diag:0717_033554: # /usr/sbin/acctadm -r...
sys_diag:0717_033554: # tail -80 /etc/system ...
sys_diag:0717_033554: # sysdef | tail -85 ...
sys_diag:0717_033554: # tail -40 /etc/init.d/sysetup ...
sys_diag:0717_033554: # cat /etc/power.conf ...
sys_diag:0717_033612: ###### STORAGE / ARRAY INFO ######
sys_diag:0717_033612: # prtconf -pv ...
sys_diag:0717_033613: # luxadm probe ...
sys_diag:0717_033614: ###### STORAGE VOLUME MANAGEMENT INFO ######
sys_diag:0717_033614: ###### SOLARIS (SDS/SVM) VOLUME MANAGER Info ######
sys_diag:0717_033614: # /sbin/metadb ...
sys_diag:0717_033614: # /sbin/metastat ...
sys_diag:0717_033614: # /sbin/metastat -p...
sys_diag:0717_033614: ###### Sun STMS / MPxIO Info ######
sys_diag:0717_033614: # cat /kernel/drv/fp.conf ...
sys_diag:0717_033614: # cat /kernel/drv/fcp.conf ...
sys_diag:0717_033614: ###### FILESYSTEM INFO ######
sys_diag:0717_033614: # df ...
sys_diag:0717_033614: # df -k ...
sys_diag:0717_033614: # mount -v ...
sys_diag:0717_033614: # /usr/sbin/showmount -a ...
sys_diag:0717_033614: # cat /etc/vfstab ...
sys_diag:0717_033614: # /usr/bin/cachefsstat ...
sys_diag:0717_033614: ###### I/O STATS ######
sys_diag:0717_033614: # /usr/bin/iostat -nxe 3 2 ...
sys_diag:0717_033614: # /usr/bin/iostat -xcC 3 2 ...
sys_diag:0717_033614: # /usr/bin/iostat -xnE ...
sys_diag:0717_033614: ###### NFS INFO ######
sys_diag:0717_033614: # /usr/bin/nfsstat ...
sys_diag:0717_033614: # /usr/bin/nfsstat -m ...
sys_diag:0717_033614: ###### NETWORKING INFO ######
sys_diag:0717_033614: # cat /etc/hosts ...
sys_diag:0717_033614: # /usr/sbin/ifconfig -a ...
sys_diag:0717_033614: # /usr/bin/netstat -i ...
sys_diag:0717_033614: # /usr/bin/netstat -r ...
sys_diag:0717_033614: # /usr/sbin/arp -a ...
sys_diag:0717_033614: # /usr/sbin/ping -s 192.168.200.1 56 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s 192.168.200.1 1016 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s google.com 56 10 ...
sys_diag:0717_033614: # /usr/sbin/ping -s google.com 1016 10 ...
sys_diag:0717_033614: # cat /etc/hostname.hme0 ...
sys_diag:0717_033614: # cat /etc/inet/networks ...
sys_diag:0717_033614: # cat /etc/netmasks ...
sys_diag:0717_033614: # tail -30 /etc/inet/ntp.server ...
sys_diag:0717_033614: # /usr/sbin/dladm show-dev ...
sys_diag:0717_033614: # /usr/sbin/dladm show-link ...
sys_diag:0717_033614: # /usr/sbin/dladm show-aggr ...
sys_diag:0717_033614: # /usr/sbin/pntadm -L ...
sys_diag:0717_033703: # /usr/bin/kstat -c net ...
sys_diag:0717_033703: # ndd -get /dev/tcp ...
sys_diag:0717_033703: # ndd -get /dev/udp ...
sys_diag:0717_033703: # ndd -get /dev/ip ...
sys_diag:0717_033706: # ndd -set /dev/hme instance 0 ...
sys_diag:0717_033706: # ndd -get /dev/hme ...
sys_diag:0717_033706: # /usr/bin/netstat -a ...
sys_diag:0717_033711: # /usr/bin/netstat -s ...
sys_diag:0717_033711: ###### TTY / MODEM INFO ######
sys_diag:0717_033711: # /usr/sbin/pmadm -l ...
sys_diag:0717_033711: # cat /etc/remote ...
sys_diag:0717_033711: # cat /var/adm/aculog ...
sys_diag:0717_033711: ###### USER / ACCOUNT / GROUP Info ######
sys_diag:0717_033711: # w ...
sys_diag:0717_033711: # who -a ...
sys_diag:0717_033711: # cat /etc/passwd ...
sys_diag:0717_033711: # cat /etc/group ...
sys_diag:0717_033711: ###### SERVICES / NAMING RESOLUTION ######
sys_diag:0717_033711: # /usr/bin/svcs -v ...
sys_diag:0717_033711: # cat /etc/services ...
sys_diag:0717_033711: # cat /etc/inetd.conf ...
sys_diag:0717_033711: # cat /etc/inittab ...
sys_diag:0717_033711: # cat /etc/nsswitch.conf ...
sys_diag:0717_033711: # cat /etc/resolv.conf ...
sys_diag:0717_033711: # cat /etc/auto_master ...
sys_diag:0717_033711: # cat /etc/auto_home ...
sys_diag:0717_033712: # /usr/bin/ypwhich ...
sys_diag:0717_033712: # /usr/bin/nisdefaults ...
sys_diag:0717_033712: ###### SECURITY / CONFIG FILES ######
sys_diag:0717_033712: # cat /etc/syslog.conf ...
sys_diag:0717_033712: # cat /etc/pam.conf ...
sys_diag:0717_033712: # cat /etc/default/login ...
sys_diag:0717_033712: # tail -250 /var/adm/sulog ...
sys_diag:0717_033712: # /usr/bin/last reboot ...
sys_diag:0717_033712: # /usr/bin/last -200 ...
sys_diag:0717_033712: # /usr/sbin/ipf -T list ...
sys_diag:0717_033712: # cat /etc/ipf/ipf.conf ...
sys_diag:0717_033712: # cat /etc/ipf/pfil.ap ...
sys_diag:0717_033712: # /usr/sbin/ipnat -vls ...
sys_diag:0717_033713: ###### HA/ CLUSTERING INFO ######
sys_diag:0717_033713: ###### SUN N1 Configuration INFO ######
sys_diag:0717_033713: ###### APPLICATION / ORACLE CONFIG FILES ######
sys_diag:0717_033713: ###### PACKAGE INFO / SOLARIS REGISTRY ######
sys_diag:0717_033713: # /usr/bin/prodreg browse ...
sys_diag:0717_033713: # /usr/bin/pkginfo ...
sys_diag:0717_033713: # /usr/bin/pkginfo -l ...
sys_diag:0717_033713: ###### PATCH INFO ######
sys_diag:0717_033713: # /usr/bin/showrev -p ...
sys_diag:0717_033713: # /usr/sadm/bin/smpatch analyze NOT RUN, passwd required....
sys_diag:0717_033753: ###### CRONTAB FILE LISTINGS ######
sys_diag:0717_033753: ###### FMD / SYSTEM MESSAGE/LOG FILES ######
sys_diag:0717_033753: # /usr/sbin/fmadm config ...
sys_diag:0717_033753: # /usr/sbin/fmdump ...
sys_diag:0717_033753: # /usr/sbin/fmstat ...
sys_diag:0717_033753: # tail -250 /var/adm/messages ...
sys_diag:0717_033753: # /usr/bin/dmesg ...
sys_diag:0717_033753: # tail -500 /var/log/syslog ...
sys_diag:0717_033754: ...WAITING 12 seconds for midpoint data collection...

sys_diag: ------- MidPoint Process SNAPSHOT (# 1) -------
sys_diag:0717_033806: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 1)
sys_diag:0717_033806: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 1)
sys_diag:0717_033806: Dtrace: systemwide IO / IO wait... (_diow Snap 1)
sys_diag:0717_033832: Dtrace: Syscall count by process... (_dcalls_ Snap 1)
sys_diag:0717_033840: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 1)
sys_diag:0717_033847: Dtrace: Read bytes by process... (_dR_ Snap 1)
sys_diag:0717_033855: Dtrace: Write bytes by process... (_dW_ Snap 1)
sys_diag:0717_033903: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 1)
sys_diag:0717_033911: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 1)
sys_diag:0717_033918: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 1)
sys_diag:0717_033918: # ps -e -o ...(by %CPU) ... Snapshot # 1
sys_diag:0717_033918: # ps -e -o ...(by %MEM) ... Snapshot # 1
sys_diag:0717_033929: # pmap -xs 4188 ...
sys_diag:0717_033929: # pmap -S 4188 ...
sys_diag:0717_033929: # pmap -r 4188 ...
sys_diag:0717_033929: # ptree -a 4188 ...
sys_diag:0717_033929: # pfiles 4188 ...
sys_diag:0717_033929: Dtrace: IO by process 4188 ... (_dpio Snap 1)
sys_diag:0717_033935: # pmap -xs 4181 ...
sys_diag:0717_033935: # pmap -S 4181 ...
sys_diag:0717_033935: # pmap -r 4181 ...
sys_diag:0717_033935: # ptree -a 4181 ...
sys_diag:0717_033935: # pfiles 4181 ...
sys_diag:0717_033936: Dtrace: IO by process 4181 ... (_dpio Snap 1)
sys_diag:0717_033942: # /usr/bin/netstat -i -a ...
sys_diag:0717_033942: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_033952: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_034002: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_034015: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_034037: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_034051: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 1)
sys_diag:0717_034056: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 1)
sys_diag:0717_034101: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 1)

sys_diag: ------- EndPoint Process SNAPSHOT (# 2) -------
sys_diag:0717_034101: # /usr/bin/kstat -p -T u -n lo0 2>&1
sys_diag:0717_034101: # /usr/bin/kstat -p -T u -n hme0 2>&1
sys_diag:0717_034107: Dtrace: TCP write bytes by process ...(_dtcp_tx Snap 2)
sys_diag:0717_034107: Dtrace: TCP read bytes by process ... (_dtcp_rx Snap 2)
sys_diag:0717_034107: Dtrace: systemwide IO / IO wait... (_diow Snap 2)
sys_diag:0717_034133: Dtrace: Syscall count by process... (_dcalls_ Snap 2)
sys_diag:0717_034141: Dtrace: Syscall count by syscall... (_dsyscall_ Snap 2)
sys_diag:0717_034149: Dtrace: Read bytes by process... (_dR_ Snap 2)
sys_diag:0717_034156: Dtrace: Write bytes by process... (_dW_ Snap 2)
sys_diag:0717_034204: Dtrace: Sysinfo counts by process... (_dsinfo_ Snap 2)
sys_diag:0717_034212: Dtrace: Sdt_counts ... (_dsdtcnt_ Snap 2)
sys_diag:0717_034220: Dtrace: Interupt Times [sdt:::intr].. (_dintrtm_ Snap 2)
sys_diag:0717_034220: # ps -e -o ...(by %CPU) ... Snapshot # 2
sys_diag:0717_034220: # ps -e -o ...(by %MEM) ... Snapshot # 2
sys_diag:0717_034230: # pmap -xs 519 ...
sys_diag:0717_034230: # pmap -S 519 ...
sys_diag:0717_034230: # pmap -r 519 ...
sys_diag:0717_034230: # ptree -a 519 ...
sys_diag:0717_034230: # pfiles 519 ...
sys_diag:0717_034231: Dtrace: IO by process 519 ... (_dpio Snap 2)
sys_diag:0717_034237: # pmap -xs 448 ...
sys_diag:0717_034237: # pmap -S 448 ...
sys_diag:0717_034237: # pmap -r 448 ...
sys_diag:0717_034237: # ptree -a 448 ...
sys_diag:0717_034237: # pfiles 448 ...
sys_diag:0717_034238: Dtrace: IO by process 448 ... (_dpio Snap 2)
sys_diag:0717_034244: # pmap -xs 90 ...
sys_diag:0717_034244: # pmap -S 90 ...
sys_diag:0717_034244: # pmap -r 90 ...
sys_diag:0717_034244: # ptree -a 90 ...
sys_diag:0717_034244: # pfiles 90 ...
sys_diag:0717_034245: Dtrace: IO by process 90 ... (_dpio Snap 2)
sys_diag:0717_034251: # pmap -xs 825 ...
sys_diag:0717_034251: # pmap -S 825 ...
sys_diag:0717_034251: # pmap -r 825 ...
sys_diag:0717_034251: # ptree -a 825 ...
sys_diag:0717_034251: # pfiles 825 ...
sys_diag:0717_034251: Dtrace: IO by process 825 ... (_dpio Snap 2)
sys_diag:0717_034251: # /usr/bin/netstat -i -a ...
sys_diag:0717_034258: # Snapshot Kernel Memory Usage.. ::memstat | mdb -k ...
sys_diag:0717_034307: # /usr/sbin/lockstat -IW -n 100000 -s 13 sleep 5 ...
sys_diag:0717_034317: # /usr/sbin/lockstat -A -n 90000 -D15 sleep 5 ...
sys_diag:0717_034329: # /usr/sbin/lockstat -A -s8 -n 90000 -D10 sleep 5 ...
sys_diag:0717_034344: # /usr/sbin/lockstat -AP -n 90000 -D10 sleep 5 ...
sys_diag:0717_034358: Dtrace: Involuntary Context Switches (icsw) by process .. (_dmpc Snap 2)
sys_diag:0717_034404: Dtrace: Cross CPU Calls (xcal) caused by process ........ (_dmpc Snap 2)
sys_diag:0717_034408: Dtrace: MUTEX try lock (smtx) by lwp/process ............ (_dmpc Snap 2)

sys_diag:0717_034408: ------- Data Collection COMPLETE -------
sys_diag:0717_034408: ###### SYSTEM ANALYSIS : INITIAL FINDINGS ... ######
sys_diag:0717_034414: ###### PERFORMANCE DATA : POTENTIAL ISSUES ######
_____________________________________________________________________________________

sys_diag:0717_034414: ## Analyzing VMSTAT CPU Datafile :
	./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out ...

\* NOTE: 2.6936 % : 8 of 297 VMSTAT CPU entries are WARNINGS!! \*


TOTAL CPU AVGS : RUNQ= 0.1 : BThr= 0.0 : USR= 15.0 : SYS= 11.2 : IDLE= 73.5
PEAK CPU HWMs : RUNQ= 8 : BThr= 0 : USR= 51 : SYS= 96 : IDLE= 0

___________________________________________________________________________________

sys_diag:0717_034414: ## Analyzing VMSTAT MEMORY from Datafile :
./sysd_socrates_070717_0332/sysd_vm_socrates_070717_033209.out ...

\* NOTE: 0.673401 % : 2 of 297 VMSTAT MEMORY entries are WARNINGS!! \*


TOTAL MEM AVGS : SR= 0.0 : SWAP_free= 747697.4 K : FREE_RAM= 287786.6 K
PEAK MEM Usage: SR= 0 : SWAP_free= 500128.0 K : FREE_RAM= 57080.0 K


___________________________________________________________________________________

sys_diag:0717_034414: ## Analyzing MPSTAT Datafile : ./sysd_socrates_070717_0332/sysd_mp_\*.out ...


\* NOTE: 5.20134 % : 31 of 596 MPSTAT CPU entries are WARNINGS!! \*

CPU MP AVGS: Wt= 0: Xcal= 736: csw= 120: icsw= 3: migr= 5: smtx= 3: syscl= 1024
PEAK MP HWMs: Wt= 0: Xcal= 51771: csw= 14108: icsw= 32: migr= 55: smtx= 79: syscl= 25836


NOTE: 0.2% CPU cycles handling TLB MISSES (0.0% ITLB_misses: 0.2% DTLB_misses)

_____________________________________________________________________________________

sys_diag:0717_034414: ## Analyzing IOSTAT Datafile :
./sysd_socrates_070717_0332/sysd_io_\*.out ...


\* NOTE: 14.4578 % : 24 of 166 IOSTAT entries are WARNINGS!! \*

TOP 10 Slowest IO Devices (\* AVG of non-zero device entries \*) :

  r/s    w/s    kr/s   kw/s   actv  wsvc_t  asvc_t   %w   %b   device   # I/O Samples

 32.6   10.8   263.6   24.6    0.8     0.0    13.7  0.0   19   c0t0d0   164
 34.0    7.5    10.8    0.0    0.0     0.0     0.0  0.0    0   c0t1d0   2

_____________________________________________________________________________________

CONTROLLER IO : AVG and TOTAL Throughput per HBA (\*active/non-zero entries only\*) :
------------

c0 : AVG : 32.6 r/s | 10.8 w/s | 260.6 kr/s | 24.3 kw/s |
c0 : TOTAL: 5408 r | 1790 w | 43258 kr | 4037 kw | 166 entries

_____________________________________________________________________________________


sys_diag:0717_034414: ## Analyzing NETSTAT Datafiles : ...

\* lo0 : NOTE: 0 % : 0 of 297 NETSTAT entries are WARNINGS!! \*
\* hme0 : NOTE: 0 % : 0 of 297 NETSTAT entries are WARNINGS!! \*


------------    \*MAX_RX_PKTS\*  AVG_RX_PKTS  AVG_RX_ERRS  AVG_TX_PKTS  AVG_TX_ERRS  AVG_COLL
NET1 : lo0  :         4              0.0          0.0          0.0          0.0        0.0

------------    \*MAX_RX_PKTS\*  AVG_RX_PKTS  AVG_RX_ERRS  AVG_TX_PKTS  AVG_TX_ERRS  AVG_COLL
NET2 : hme0 :        14              0.4          0.0          0.4          0.0        0.0
     : hme0 :   TOT_RX_Bytes  TOT_TX_Bytes  TOT_RX_Packets  TOT_TX_Packets  TOTAL_Seconds
                   22210         30348          124             112             328
     : hme0:1:  TOT_RX_Packets  TOT_TX_Packets
     : hme0:1:        0               0

NOTE: \*\* 2 ESTABLISHED connections (sockets) exist\*\*

_____________________________________________________________________________________


\* NOTE: CPU=GRN : MEM=GRN : IO=YEL : NET=GRN \*

_____________________________________________________________________________________

sys_diag:0717_034417: ... gen_html_hdr ...
sys_diag:0717_034417: ... gen_html_rpt ...


sys_diag:0717_034419: ## Generating TAR file : ./sysd_socrates_070717_0332.tar ...

tar -cvf ./sysd_socrates_070717_0332.tar ./sysd_socrates_070717_0332 1>/dev/null
compress ./sysd_socrates_070717_0332.tar
Data files have been TARed and compressed in :

\*\*\* ./sysd_socrates_070717_0332.tar.Z \*\*\*

------- Sys_Diag Complete -------
#
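
For readers curious how summary lines like the AVGS / PEAK HWMs / WARNINGS entries above could be derived from a raw data file, here is a minimal awk-based sketch. It is not sys_diag's own code: the function name summarize_column, the column choice, and the threshold of 90 are all assumptions made for illustration. As an aside, the percentages in the output above (2.6936, 0.673401, ...) appear consistent with awk's default six-significant-digit number formatting.

```shell
#!/bin/sh
# Hypothetical sketch (not sys_diag's own code): summarize one numeric
# column as an average, a peak, and the share of entries above a
# warning threshold.
# usage: summarize_column COLUMN_NUMBER THRESHOLD < datafile
summarize_column() {
    awk -v col="$1" -v limit="$2" '
        $col ~ /^[0-9.]+$/ {              # skip headers and non-numeric rows
            n++
            sum += $col
            if ($col + 0 > peak + 0)  peak = $col
            if ($col + 0 > limit + 0) warn++
        }
        END {
            if (n == 0) { print "no samples"; exit }
            printf "AVG= %.1f : PEAK= %s : %s %% : %d of %d entries are WARNINGS\n",
                   sum / n, peak + 0, warn / n * 100, warn, n
        }'
}

# Example: the second column stands in for a "sys" CPU percentage.
printf '%s\n' 'r sy' '0 5' '3 96' '1 10' '0 5' | summarize_column 2 90
```

The same function can be pointed at any whitespace-separated column, which is roughly the shape of the per-subsystem summaries in the report.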


( Copyright 2007, Todd A. Jobson )

Thursday Jul 12, 2007

What is sys_diag ?.. Automating Solaris Performance Profiling and Workload Characterization.

The following is an excerpt from the README_sys_diag.txt file, which gives a high-level overview of the sys_diag capabilities and command line arguments / usage examples. I've created this over many years to automate and reduce the amount of time it takes to gather and correlate system data for conducting off-site (remote) Performance and Configuration Analysis.  With sys_diag, all you need to do is download the ksh script.. and you're on your way. After it's run.. you get a single .tar.Z that you can upload or email for remote analysis.. (-G even includes a wide variety of DTrace examination..).

Read the following introduction.. and more specific examples will follow.

Enjoy,
Todd



sys_diag v.7.04 Overview :
________________________

BACKGROUND / INTRODUCTION :

  sys_diag is a Solaris utility (ksh script) that can perform several
  functions, among them, system configuration 'snapshot' and reporting
  (detailed or high-level) plus workload characterization/profiling via performance
  data gathering (over some specified duration or point-in-time 'snapshot'),
  high-level analysis, and reporting of findings/exceptions (based upon
  perf thresholds that can be easily changed within the script header).

  The output is provided in a single .tar.Z of output and corresponding
  data files, and a local sub-directory where report/data files are stored.
  The report format is provided in .html, .txt, and .ps as a single file
  for easy review (without requiring trudging through several subdirectories
  of separate files to manually correlate and review).

  sys_diag runs on any Solaris 2.6 (or above) Sun platform, including
  reporting of new Solaris 10 capabilities (zone/containers, SVM,
  zfspools, fmd, ipfilter/ipnat, link aggr, Dtrace probing, etc...).

  Beyond the Sun configuration reporting commands [System/storage HW config,
  OS config, kernel tunables, network/IPMP/Trunking/LLT config, FS/VM/NFS,
  users/groups, security, NameSvcs, pkgs, patches, errors/warnings, and
  system/network performance metrics...], sys_diag also captures relevant
  application configuration details, such as Sun N1, Sun Cluster 2.x/3.x,
  Veritas VCS/VM/vxfs.., Oracle .ora/listener files, etc.. detailed
  configuration capture of key files (and tracking of changes via -t), etc ...

  Of all the capabilities, the greatest benefits are found by being able
  to run this single ksh script on a system and do the analysis  from one single report/
  file... offline/elsewhere (in addition to being capable of  historically
  archiving system configurations, for disaster recovery.. or to allow for
  tracking system chgs over time.. after things are built/tested/certified).

  One nice feature for performance analysis is that the vmstat and netstat
  data is exported in a text format friendly to import and create graphs
  from in StarOffice or Excel.. as well as creating IO and NET device
  Averages from IOSTAT / Netstat data (# IO's per device, AVG R/W K, etc..)
  along with peak exceptions for CPU / MEM / IO / NET ..
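
To give a feel for what "spreadsheet friendly" can mean in practice, here is a small sketch that turns vmstat-style text into CSV. This is not how sys_diag exports its data; the function name to_csv and the assumed header layout (a banner line starting with "kthr" or "procs", followed by a column row starting with "r") are illustrative assumptions only.

```shell
#!/bin/sh
# Hypothetical helper (not part of sys_diag): convert whitespace-separated
# vmstat-style output into CSV for import into a spreadsheet.
to_csv() {
    awk -v OFS=, '
        /^[ \t]*$/                    { next }           # drop blank lines
        $1 == "kthr" || $1 == "procs" { next }           # drop the group banner line
        $1 == "r"                     { if (seen++) next }  # keep only the first column header
        { $1 = $1; print }                               # rebuild the record using OFS
    '
}

# Example with a shortened, made-up sample (real vmstat has more columns).
printf '%s\n' ' kthr memory cpu' ' r b id' ' 0 0 99' ' r b id' ' 1 0 42' | to_csv
```

The repeated header rows that vmstat prints during long runs are dropped so that the result has a single header line followed by data rows.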

  Although this tool isn't meant to replace long-term historical Performance
  Trending and Capacity Planning packages (Teamquest, etc..), it provides the
  foundation and basis for a very robust starting point (and actually is much
  better at point in time workload characterization and root cause analysis of
  bottlenecks, where very granular detailed data correlation is required).

  Even though I'm a Sun employee, this has been personally developed over many
  years, in my spare time, in order to make my life a lot easier and
  more efficient.  Hopefully others will find this utility capable of
  doing the same for them, also making use of its legwork.. to streamline
  the admin/analysis activities required of them.  This has been an invaluable
  tool used to diagnose / analyze hundreds of performance and/or config issues.

  Regarding the system overhead, sys_diag runs all commands in a serial
  fashion (waiting for each command to complete before running the next)
  impacting system performance the same as if an admin were typing these
  commands one at a time on a console.. with the exception of the background
  vmstat/mpstat/iostat/netstat that's done when gathering performance data
  (-g | -G) over some interval for report/analysis (which generally has minimal
  impact on a system, especially if the sample interval [-I] is not every
  second, or if the lighter weight -g is run vs. -G detailed/Dtrace snapshots).
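
  The collection pattern described above (a few samplers left running in the
  background while one-shot commands run strictly one after another) can be
  sketched as follows. This is only an illustration under assumed names: the
  functions start_collector and run_serial, the output directory, and the
  chosen commands are hypothetical and are not taken from the sys_diag script.

```shell
#!/bin/sh
# Hypothetical sketch of the collection pattern; names are illustrative.
outdir=${TMPDIR:-/tmp}/sysd_sketch.$$
mkdir -p "$outdir"

# Start a long-running sampler in the background, capturing its output.
start_collector() {   # usage: start_collector NAME COMMAND [ARGS...]
    name=$1; shift
    "$@" > "$outdir/$name.out" 2>&1 &
}

# Run a one-shot command in the foreground; the next command does not
# start until this one has finished. A failure is recorded, not fatal.
run_serial() {        # usage: run_serial NAME COMMAND [ARGS...]
    name=$1; shift
    "$@" > "$outdir/$name.out" 2>&1 || echo "exit status $?" >> "$outdir/$name.out"
}

start_collector vm vmstat 1 2      # sampler runs concurrently ...
run_serial uname uname -a          # ... while config commands run one at a time
run_serial df df -k
wait                               # block until the background sampler exits
echo "output files are in $outdir"
```

  Because only the samplers run concurrently, the load added at any moment is
  roughly one foreground command plus the periodic sampling.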

  sys_diag is generally run from /var/tmp as "sys_diag -l"  for creating
  a detailed long report, or via "sys_diag -g -l" (for gathering
  performance data and generating a long/detailed config/analysis report),
  however it offers many command line parameters documented within the header,
  or via "sys_diag -?".   \*\* READ the Usage below, as well as the Performance
  Parameters sections for further enlightenment.. ;)


  NOTE: For the best .html viewing experience, Do NOT use MS Internet Explorer browser
        as it varies in support of HTML stds for formatting and iframe file inclusion
        (ending up opening many windows vs embedding output files within
        the single .html report).  \*\* USE Netscape, Mozilla, Firefox, etc.. browsers,
        ensuring that your display resolution is set to the maximum resolution, and
        font sizes are defaults or not made too large (for best viewing open full screen)

    \*\*\* As is the best practice for any environment, first TEST thoroughly on a representative
        TEST configuration PRIOR to running this or making any production system changes.
        (read the sys_diag ksh headers for disclaimer and support notes) \*\*\*

\*\* See  :  http://blogs.sun.com/toddjobson/  for other entries relating to system performance,
                capacity planning, and systems architecture / availability.

\*\* For the latest release of sys_diag see either BigAdmin or SunFreeware.com at the following URLs :

     http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sys_diag__solaris_c

     http://www.sunfreeware.com 

___________________________________________________________________________________________
___________________________________________________________________________________________

___________________________________________________

Common Command Line usage and available parameters :
___________________________________________________

COMMAND USAGE :

	# sys_diag [-a -A -c -C -d_ -D -f_ -g -G -H -I_ -l -L_ -n -o_ -p -P -s -S -T_ -t -u -v -V -h|-? ]


-a              Application details (included in -l/-A)
-A              ALL Options are turned on, except Debug and -u
-c              Configuration details (included in -l/-A)
-C              Cleanup Files and remove Directory if tar works
-d path         Base directory for data directory / files
-D              Debug mode (ksh set -x .. echo statements/variables/evaluations)
-f input_file   Used with -t to list configuration files to Track changes for
-g              gather Performance data (2 sec intervals for 5 mins, unless -I |-T exist)
-G              GATHER Extra Perf data (S10 Dtrace, more lockstats, pmap/pfiles) vs -g
-h | -?         Help / Command Usage (this listing)
-H              HA config and stats
-I secs         Perf Gathering Sample Interval (default is 2 secs)
-l              Long Listing (most details, but not -g,-V,-A,-t,-D)
-L label_descr_nospaces   (Descriptive Label For Report)
-n              Network configuration and stats (also included in -l/-A except ndd settings)
-o outfile      Output filename (stored under sub-dir created)
-p              Generate Postscript Report, along with .txt, and .html
-P -d ./data_dir_path     Post-process the Perf data skipped with -S and finish .html rpt
-s              Security configuration
-S              SKIP POST PROCESSing of Performance data (use -P -d data_dir to complete)
-t              Track configuration / cfg_file changes (Saves/Rpts cfg/file chgs \*see -f)
-T secs         Perf Gathering Total Duration (default is 300 secs =5 mins)
-u              unTar ed: (do NOT create a tar file)
-v              version Information for sys_diag
-V              Verbose Mode (adds path_to_inst, network dev's ndd settings, mdb, snoop..)
                Longer message/error/log listings. Additionally, pmap is run if -g ||-G,
                and the probe duration for Dtrace and lockstat sampling is widened
                from 2 seconds (during -G) to 5 seconds (if -G && -V). Ping is
                also run against the default route and google.com to gauge latency.


NOTE: NO args equates to a brief rpt (No -A,-g/I,-l,-t,-D,-V,..)

\*\* Also, note that option/parameter ordering is flexible, as is the use of white
space before arguments to parameters (or not). The only requirement is to list
every option/parameter separately with a preceding - (-g -l , but not -gl).

BOTH of the following command line syntax examples are functionally the same :

eg. ./sys_diag -g -I 1 -T 1800 -t -f ./config_files -l
OR
./sys_diag -g -l -t -f./config_files -I1 -T1800

 

------------------------------------------------------------------------------------------------

  eg.   Common Usage :
-------------------
./sys_diag -l
        Creates a LONG / detailed configuration rpt (.html/.txt).
        Without -l, the config report created has basic system cfg details.

./sys_diag -g -l
        Gathers performance data at the default sampling rate of 2 secs for
        a total duration of 5 mins, adding a color coded performance header /
        Dashboard Summary section and any performance findings /
        exceptions found to the long (-l) cfg rpt. Also takes (3) starting /
        midpoint / endpoint snapshots using minimal lockstat/kstat (1 sec).

        NOTE: -g is meant to gather perf data with minimal overhead, therefore
        only 1 second lockstat samples are taken. Use -G and/or -V
        for more detailed system probing (see examples and notes below).
        Using -V with -g adds pmap/pfiles snapshots, vs. using -G
        to also capture Dtrace and extended lockstat probing.

        \* Any time that sys_diag is run with either -g or -G, the command
        line output is appended to the file sys_diag_perflog.out, which
        gets copied and archived as part of the final .tar.Z output file.

./sys_diag -g -I 1 -T 600 -l
        Gathers perf data at 1 sec samples for 10 mins and
        creates a long config rpt as noted above. Also does
        basic start/mid/endpoint sampling using lockstat/kstat/pmap.

./sys_diag -l -C
        Creates a long config rpt, and Cleans up..
        aka removes the data directory after tar.Z completes.

./sys_diag -d base_directory_path
        Changes the base dir for datafiles from the current dir.

./sys_diag -G -I 1 -T 600 -l
        Gathers DEEP performance & Dtrace/lockstat/pmap data
        at 1 sec samples for 10 mins & creates a long cfg rpt
        (in addition to the standard data gathering from -g).

        \*NOTE: this runs all Dtrace/Lockstat/Pmap probing during 3 snapshot intervals
        (beginning_0, midpoint_1, and endpoint_#2 snapshots), limiting probing
        overhead to BEFORE/AFTER the standard data gathering begins
        (vmstat, mpstat, iostat, netstat, .. from -g).
        The MIDPOINT probing occurs at a known point so as not to confuse this
        activity with other system processing.

        \*Because of this, standard data collection may not start for 30+ seconds,
        or until the beginning snapshot (snapshot_#0) is complete.
        (-g snapshot_#0 activities only take a couple of seconds to complete, since
        they do not include any Dtrace/lockstat.. beyond 1 sec samples).

./sys_diag -G -V -I 1 -T 600
        Gathers DEEP, VERBOSE performance & Dtrace/lockstat/pmap
        data at 1 sec samples for 10 mins, using 5 second Dtrace and
        Lockstat snapshots vs. the 2 second probes used for -G alone
        (in addition to the standard data gathering from -g).

./sys_diag -g -l -S
        Gathers perf data, runs long config rpt, and SKIPS Post-Processing
        and .html report generation.

        \*\* This allows for completing the post-processing/analysis activities
        either on another system, or at a later time, as long as the data_directory
        exists (which can be extracted from the .tar.Z, then referred to as
        -d data_dir_path ). \*\* See the next example using -P -d data_path \*\*

./sys_diag -P -d ./data_dir_path
        Completes Skipped Post-Processing & .html rpt creation.
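
As a rough aid for choosing -I and -T values, the number of samples a run collects can be estimated as the total duration divided by the sample interval. The short Python sketch below is only an illustration of that arithmetic; the function name `sample_count` is hypothetical and is not part of sys_diag, and the exact number of samples sys_diag records may differ slightly.

```python
def sample_count(total_secs=300, interval_secs=2):
    """Approximate number of samples for a run lasting total_secs
    with one sample every interval_secs (defaults mirror -T 300 / -I 2)."""
    if total_secs <= 0 or interval_secs <= 0:
        raise ValueError("duration and interval must be positive")
    return total_secs // interval_secs


# The option combinations used in the usage examples above.
examples = [
    ("-g (defaults: -I 2 -T 300)", 300, 2),
    ("-g -I 1 -T 600", 600, 1),
    ("-g -I 1 -T 1800", 1800, 1),
]

for label, total, interval in examples:
    print(f"{label}: about {sample_count(total, interval)} samples")
```

Narrower intervals and longer durations both increase the volume of data that later has to be post-processed, which is one reason the -S / -P split can be handy.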



(Copyright 2007, Todd A. Jobson)

Tuesday Jul 10, 2007

Solaris Performance Analysis and Monitoring Tools... at what cost ?

In the area of Performance Analysis and related Monitoring tools, you'll find a plethora available for the Solaris environment. Each of them has its own intrinsic costs, listed here :


  • Monetary Costs ($$) :

    • Purchase Cost (Media, Documentation, etc..)
    • License Fees
    • Centralized or Management Server Required ? (HW Costs for System / Storage)
    • Hourly Costs of a Staff / SME / Consultant to Install/Config, Correlate, Interpret, Rpt findings...


  • Time / Effort Costs :

    • 3rd Party Installation / Configuration Pre-Requisites (libraries, tools Perl/Python.., etc..)
    • Server OS and Tools Design Requirements (Security, OS rev's, RAM, CPU, Storage, FS, Patches,..)
    • Server Installation (Rack/Stack, Network, Power, OS Install, Patching, Storage Cfg,..)
    • Server Toolset Installation (Installation, Configuration, License downloads, ..)
    • Client Node Agents Required for Installation / Configuration ?
    • Project/Manpower Time and Coordination Dependencies for an SME/Consultant vs. Other Resources (system, network, storage, etc...).
    • Time spent Installing, Configuring, Testing, Patching/Tweaking, Running, Correlating data, Analyzing/ Interpreting data, Reporting Findings...
    • Time spent learning the Toolset and how to interpret the raw and correlated data (thresholds, etc?)


  • System Overhead Costs :

    • CPU Consumption (% standard overhead vs. PEAK Load overhead)
    • Memory Consumption (RAM footprint.. standard vs. Peak load)
    • Storage Requirements (Toolset Installation space vs. active / historical storage vs. archiving req's)
    • RunTime Requirements (Running Constantly vs. During Specific PEAK load Intervals, Sampling Rate, ..)
    • Network Overhead (bandwidth and/or interrupt overhead due to data passed between client/server repository vs. local storage)
    • I/O Overhead (overhead of performing local IO.. generally depends on volume of data stored and sampling rates)


The Benefits of Accurate, Detailed, and Complete Data Gathering ...


\*\* NOTE: .. a key Attribute often left out is the ACCURACY and RELEVANCE of performance data captured (based upon the time it was captured, the sampling rates, and the level of detail provided).

This in many instances requires weighing the costs of having point in time event "detailed" snapshots (where the sampling rate intervals are very narrow.. per sec, etc.), vs. long-term historical trending data (where samples are aggregated and averaged over longer timeframes, minimizing the storage requirements, but also smoothing out the Peak load visibility). For example, if you use a toolset or individual utility that can capture performance data at 1 second intervals, you will see a very granular view of systems utilization and PEAK load activity (resource consumption, contention events, etc.).. VS.. using a historical trending toolset that can only save data at 1, 5, or 10 minute Averages.. (due to the constraints of storage space available for the long periods of data that must be kept).

This might not seem like much would be missed, however.. even the difference between 1 second and 1 minute samples can be astronomical. For example, if 80% of the one-second samples show 95% idle while the other 20% show 100% utilization (0% Idle) and a huge run queue, they get "smoothed" out to a one minute sample where the box "appears" only 24% utilized (76% idle).. although the system is thrashing 20% of the time.  Even within the period of a second, over a billion instructions can run on modern cpu's at GHz + clock rates (Billions of cycles per second).. yet there is only one aggregated sample for that period.
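
The smoothing effect described above can be checked with a few lines of Python. This is only a sketch with made-up sample data: it takes the same 80/20 split, expressed as 60 hypothetical one-second %idle samples (48 quiet, 12 saturated), and compares the one-minute average against the peak detail that the average hides. The `summarize` helper is an illustrative name, not part of any tool mentioned here.

```python
from statistics import mean


def summarize(idle_samples):
    """Summarize per-second %idle samples: the averaged utilization a
    trending tool would store, plus the peak detail the average hides."""
    util = [100.0 - idle for idle in idle_samples]
    return {
        "avg_util": mean(util),
        "peak_util": max(util),
        "saturated_secs": sum(1 for u in util if u >= 100.0),
    }


# Hypothetical minute: 48 seconds at 95% idle, 12 seconds at 0% idle.
one_minute = [95.0] * 48 + [0.0] * 12
summary = summarize(one_minute)

print(f"1-minute average : {summary['avg_util']:.0f}% utilized")
print(f"peak 1-sec sample: {summary['peak_util']:.0f}% utilized")
print(f"saturated seconds: {summary['saturated_secs']} of {len(one_minute)}")
```

The stored average lands at 24% utilized, while the per-second view shows 12 fully saturated seconds in that same minute.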

For complete end-to-end Capacity Planning and Performance Analysis capabilities, BOTH types of data are generally required (longer term trending for Capacity Planning purposes via graphs, etc.. VS. short term detailed drill down of system activity for point in time PEAK LOAD periods, allowing for detailed performance and utilization assessment / correlation).

Without detailed and granular data during peak periods, there can be no real correlation of root causes or specific bottlenecks... and in the same regard, without long-term, historical data that shows growth rates in capacity and cycles (patterns and models) of utilization and Peak activity.. accurate Capacity Planning isn't feasible.
\*\*

 

..  if data captured doesn't include peak activity, or the granularity of samples is too sparse.. (not reflecting peak events), ...  then that data can only be useful for defining a BASELINE of Average Utilization.

 


MANY, many, .. tools

 

A wide variety of performance tools can be found.. from the high end.. using end-to-end third party products such as Teamquest (which provides a graphical, historical vantage point).. that need to be purchased, installed, and trained on... to the OS built-in utilities and the freely available open source / public domain varieties.

However, either way you go, be prepared for the required learning curve.. along with the extensive manual process and time required to identify and run the utilities, before you can capture and begin the extensive correlation process on the data from several disparate utilities (before you even get to do the analysis of your findings).

Each approach has its own strengths and weaknesses (3rd party purchased suites might save you time in graphical aggregation and correlation.. but tend to limit the level of detail and granularity available vs. what the OS utilities will provide).

The basic list of KEY "built-in" tools historically available for monitoring performance applies to nearly any Unix/Linux distribution, including the following partial list of common utilities used ... following the basic breakdown of computing subsystems :

\*\* CPU / Kernel Utilization :

--> vmstat (vm system cpu and kernel utilization metrics \*\* a great starting pt \*\*)
--> mpstat (multi processor .. per cpu performance statistics)

\*\* Memory / Kernel Utilization :

--> vmstat
--> ipcs
--> swap
--> top

\*\* I/O Performance :

--> iostat (Standard IO.. ufs, .. IO performance utility)
--> vxstat (Veritas vxfs filesystem IO performance)

\*\* Network Utilization :

--> netstat
--> ping
--> traceroute

\*\* Process / Kernel :

--> ps
--> top
--> prstat

--> ...

\* sar (provides most basic types of high level performance metrics, assuming that system accounting is turned on, which does incur some level of system overhead when always running)

 


SOLARIS 10 ... Above and Beyond other Unix / Linux Distributions ... 

 

In addition to the basic toolsets available, Solaris 10 provides the following key additions, which set it apart from the other Unix / Linux variants.

\*\* DTrace (Dynamic Tracing via "D" language scripting and probe/providers)

__ Dtrace is the "Electron microscope" of performance analysis for a Solaris 10 system
See the DtraceToolkit for a long list of specific Dtrace scripts (several of which are used
within sys_diag, among others created)

\*\* lockstat (uses the kernel dtrace infrastructure) Summarizes system lock/mutex contention

\*\* Mdb (Modular Debugger)

\* kstat (Kernel statistics .. counters, etc..)

\* cpustat / cputrack (cpu statistics, system-wide or per process)

\* intrstat, trapstat (interrupt and system trap, I/DTLB_miss statistics, ..)

\* ... & many more.. [this list will be re-done in a future blog with a more thorough breakdown.. ]

___________________________________________________________________________________

The Time Saving.. automated nature... of SYS_DIAG   :)


Over the past several years, I have created a utility called "sys_diag" that automatically captures performance statistics using nearly all available system utilities, aggregates the data, and performs analysis and HTML report generation of findings (with a color coded dashboard.. and links to detailed analysis findings). Sys_diag creates a single .tar.Z compressed archive that can be emailed/ftp'd.. for performing system configuration and/or performance analysis off-site.. from virtually anywhere.. saving a LOT of time.. and not requiring any 3rd party tools or agents to be installed on a system, other than downloading the "sys_diag" ksh script itself.  Virtually no learning curve is required for loading, running, and reviewing basic performance profiling, including high level subsystem bottlenecks (deeper root cause correlation might require some level of advanced sys admin knowledge).

Beyond performance analysis, sys_diag can be used to also generate a detailed configuration snapshot report, including OS, HW, Storage, SW, 3PP configuration attributes, among several other capabilities that it provides.

\*\* See the next blog entry for more details and examples on sys_diag \*\*.
The published repository and high level description of sys_diag is always available at BigAdmin using the following URL :
http://www.sun.com/bigadmin/jsp/descFile.jsp?url=descAll/sys_diag__solaris_c

(Copyright 2007 Todd A. Jobson)



Wednesday May 16, 2007

What is Performance ? .. in the Real World


When we think of "Performance", the definition can have/take many connotations...

In the context of computing, the dictionary defines it as : (http://dictionary.reference.com/browse/performance)

PERFORMANCE (noun) :

"The manner in which or the efficiency with which something reacts or fulfills its intended purpose." or "the execution or accomplishment of work, acts, feats, etc."

 

From this definition, it can be readily seen that the "efficiency" and overall "utilization" of resources is a key characteristic of the "performance" of a system (also leaving room for some subjective interpretation).

 

Real World Performance.. and the holistic viewpoint

The other key aspect of assessing performance, whether in the real world or that of a system, relates directly to the volume of productive OUTPUT that a system produces over a duration of TIME.

In the arena of Information Technology.. as in real world performance (auto's, economics, the human body, etc..) the entity as a whole needs to be examined, allowing for symptoms to be identified in one or more areas... aka.. "sub-systems".   Hence, the complete Integrated "system" as a whole.. comes to life with its own unique dynamics and patterns that need to be examined.

(eg.   one analogy might be the "performance" of a race car .. dependent upon the design/architecture of the vehicle.. and all its constituent components.. the chassis [weight, flexibility, ..],  the steering [responsiveness, turn ratio..], the engine [horsepower, air/fuel intake, exhaust output, ..], the Transmission [gear ratios, latency in shifting.., MTBF of clutch,..], braking [responsiveness, 60-0 secs, ..], tires [ G's on the skidpad, wear rate, ..], .. and Overall Performance.. [0-60 acceleration, MPG, top speed, slalom speed, ..] .. individually each can be measured easily.. but as a whole.. the INTEGRATED "system dynamics" come into play ). 

The same can be said of (and is analogous to) most "systems"... hence, looking at the environment as a whole is crucial ...

 

The "Application Environment" ... 

That holistic entity in the arena of Computing is called the "Application Environment".. comprised of all the systems and underlying nested/encapsulated sub-systems. In the IT world, an Application Environment is composed of all the underlying infrastructure that together provides and supports the "Service(s)" (environments, systems, networks, storage, OS, Application Software, etc...).

 

"Perceived" Performance and Expectations :

For any system (or environment), the ultimate gauge is in the "Perception" of its performance, relating to whether or not it can fulfill the expectations of its client user community.

How efficient, proficient, and/or productive we perceive something to be, is in large part.. a product of our vantage point (perception), and how we judge or evaluate it... according to our expectations, pre-conceived notions (rules), and the means available to us for measuring it (tools, etc..).
The perception of one impatient user doesn't always accurately reflect the responsiveness, efficiency, or other attributes for evaluating the performance / workload characterization of a system.

 

Understanding , Metrics, and Measurement ...

From this vantage point, it becomes evident that in Assessing a system, there must be measurement of key attributes .. aka.. METRICS... and in order to define key metrics that can/should be monitored, we must first UNDERSTAND the system and how it works (components, mechanics, inputs / outputs, among other items that can be measured).

Hence, "If you can't understand it... you can't effectively measure it, .. and if you can't measure it.. you can't assess it..". (T.Jobson 7/2006)

 

Requirements dictate Measurements... driven by Service Level Commitments :

Of the various vantage points from which a system's performance is gauged, the following attributes (relating to specific Metrics that can be sampled) are typically those upon which Service Level Agreements (SLA's) and/or Commitments (SLC's) are based (reflecting Customer Requirements and "acceptable" Thresholds.. ) :

  • Response Time (Client GUI's, Client/Server Transactions, Service Transactions,..) Measured as "acceptable" Latency.

  • Throughput (how much Volume of data can be pushed through a specific subsystem.. IO, Net..)
  • Transaction Rates (DataBase, Application Services, Infrastructure / OS / Network.. Services, etc.).  These can be either rates per Second, Hour, or even Day... measuring various service-related transactions.
  • Failure Rates (# or Frequency of exceeding  High or Low Water Marks .. aka Threshold Exceptions)
  • Resource Utilization (CPU Kernel vs. User vs. Idle, Memory Consumption, etc..)
  • Startup Time (System HW, OS boot, Volume Mgmt Mirroring, Filesystem validation, Cluster Data Services, etc..)
  • FailOver / Recovery Time (HA clustered DataServices, Disaster Recovery of a Geographic Service, ..)  Time to recover a failed Service (includes recovery and/or startup time of restoring the failed Service)

  • etc ...

    For any exceptions to the "acceptable" thresholds listed above, SLA's typically reflect PENALTIES ($$$).
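
As a small illustration of how such thresholds turn into measurable exceptions, the Python sketch below counts the response-time samples that exceed a high water mark and reports the resulting failure rate. The sample values, the 300 ms threshold, and the function names are all hypothetical; a real SLA would define its own metrics, thresholds, and measurement windows.

```python
def threshold_exceptions(samples, high_water_mark):
    """Return the samples that exceed the agreed high water mark."""
    return [s for s in samples if s > high_water_mark]


def failure_rate(samples, high_water_mark):
    """Fraction of samples that exceed the high water mark."""
    if not samples:
        raise ValueError("at least one sample is required")
    return len(threshold_exceptions(samples, high_water_mark)) / len(samples)


# Hypothetical response times (milliseconds) and a hypothetical SLA limit.
response_times_ms = [120, 95, 310, 101, 88, 450, 99, 130, 105, 92]
SLA_MAX_RESPONSE_MS = 300

exceptions = threshold_exceptions(response_times_ms, SLA_MAX_RESPONSE_MS)
rate = failure_rate(response_times_ms, SLA_MAX_RESPONSE_MS)

print(f"threshold exceptions: {exceptions}")
print(f"failure rate        : {rate:.0%} of sampled transactions")
```

The same pattern applies to the other attributes in the list (throughput, transaction rates, recovery time), with a low water mark check where falling below a value is the exception.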

 

Latency ... the heart of a Bottleneck ...

Each of the attributes and perceived gauges of performance listed above has its own intrinsic relationships and dependencies to specific subsystems and components... in turn reflecting a type of "latency" (delay in response). It is these latencies that are investigated and examined for root cause and correlation as the basis for most Performance Analysis activities.

\*\* STAY TUNED \*\*..  Look for my up-coming blog entry on "The many Flavors of system Latency..".

Future blog entries will expand upon this baseline definition of performance.. so keep your eyes peeled.. and look at the world around you.. from as many vantage points as possible... Perspective is key.. hand in hand with understanding the world around us... Don't be afraid to ask why.. and dig deeper.. there's typically a reason for everything if you look at it with an open mind.. understanding the fundamentals first !

Todd ;) :)

(Copyright 2007, Todd A. Jobson)



Monday May 14, 2007

Destined or decided..?

Don't Forget...

"We're each the sum of our life's experiences... the product of the decisions that we make... and the actions that we take.. or DON'T take..". Todd Jobson Sept.2001

Be responsible.. open your eyes.. don't be afraid to ask WHY, challenge yourself to learn something new every day.. explore, take time to ingest life's simple pleasures...but most importantly.. ENJOY !!

Todd

(Copyright 2001-2007 Todd A. Jobson )

Wednesday Feb 28, 2007

BACK... and Empowered by Perseverance !


LOOKOUT Folks.. He's back.. :)

With a fire in my belly.. and no room for failure..
I'm refueling my determined inspiration to regain
some lost ground.. and capture some new turf ..
along with the help of a renewed perseverance.

Accomplishments all start with hope, belief, and faith
in yourself.. as part of a larger vision of success.. all
built upon the foundation to achieve.. as we strive for happiness.

Sometimes, life seems to leave us with our hands tied behind our
backs... perpetually kept distant from the carrot of success .. just
out of our reach. However, with renewed conviction.. and commitment,
there comes a point, when focused determination, love, and adversity
collide head on... this is when passionate perseverance WILL prevail !!!

This time, it's a NEW GAME ... with new rules.. and the wind's at YOUR back !!
Here's to that steadfast persistence, your conviction to succeed ... no matter
what obstacles are thrown before you .. your tenacity will ring true !

Below is a little poem I just wrote .. reflecting on the task at hand.. charting the
course.. carving our destiny ..one decision at a time.. and then acting on those decisions..
no longer being the bystanders in our own lives..

Let's turn up the heat and ignite the fire within each of us ...
... with a little extra added... ROCK and ROLL... thrown in for good measure .. ;)

Stand back .. warm yourselves.. toast some marshmallows.. and turn up the volume,
cuz... the party's about to begin.... ALL Over Again... ;) :)


Perseverance :


With a bounce in his step..
and more spring to his stride..

.. he resurfaces anew..
empowered.. and Glowing inside !


Focused on goals..
his mission is clear.

Inspired and Determined,
he devours all fear !


His patient perseverance
once again prevails

..steadfast, his princess
sets wind to his sails.


Off again he goes,
into the eye of the storm..

Strengthened and Hardened
his heart's been reborn.


He looks down from below
but jumps from above

clutching his heart in battle,
he's fueled by his love.


With his mind set on success
and a belief in himself,

The future, his present,
conviction .. his wealth.


The achievement, his purpose
his passion to compete

.. Persistent, tenacious,
overcoming defeat.


As he stands tall in victory
with a hunger to thrive,

he basks in the glory
persevering, once again .. ALIVE !


( Copyright 02/18/2007, Todd A. Jobson )


Don't forget .....
the origin of Everything ... starts somewhere !! ..

"...every new beginning.. comes from some other beginning's end.."

Turn UP your volume and Listen to my tribute to persevering ...
and plug in your appliances.. cuz this man's Batteries are Recharged..
 :) ;) ENJOY !!

Todd




Nickelback : Hero




Friday Jan 26, 2007

Deception.. my Reflections..


Take a close look.. Not everything in Life is as it Appears...








Deception, Trust, .. and Online Security ...



In this fast-paced, internet-based culture that we're all a part of, there are a few things that we all need to be aware of.   Among the long list of items that we all need to be cognizant of is online deception in all its forms (MySpace/ Yahoo 360 Identity Fraud, Identity Theft, Security of our personal information, players, con-men/women, etc..). 

Unfortunately, an ever growing majority of those on-line are NOT honest.. trustworthy.. or even who they say they are.    MOST of these deceivers are fulfilling some inner need (conning others, living vicariously, boosting personal confidence, a fling.. a cheap thrill.. who knows).. most compelled to continue.. or addicted to the virtual recognition or relationships they hold (not honestly knowing the person on the other end of the wire).

While we need to protect our identities and personal information, we also need to work on being honest once we establish a friendship (granted.. we do need some level of keeping our guard up initially to protect ourselves.. take things slow.. don't be in a rush to give out too much about yourself.. be careful).



No matter what.. nearly every page online is backed by a person that can be hurt.. so I've written the following poem to reflect on the impact of deception  :



Reflections on ....


DECEPTION

The Players, the Liars, the Pathological pseudo-sympathizers..
they prey on the vulnerable,

the weak.. those sad and meek.


Those lonely, or alone,
feeling broken.. or trapped at home,

They're the ones they seek..
those helpless and weak.


Fakes and impostors,
posers with on-line rosters.

Perception, just a reflection..
a Relative Reality of a friendship defection.

.. but Truth and honesty is the exception,
when deception is my reflection...
..a window on a world where trust is an election.


A Disillusioned Delusion
of deceitful collusion..

.. on-line love.. is it real,
or just an illusion ?



For many... Exploitation is their Specialization,
their duality of sexuality..

.. a vacation for excitation.
Their lapse.. in realization

.. to others.. real hell under magnification.


Expectations and elation,
all relate to a perverted sensation.

Temptation or Emancipation,
just casual exploits of the internet generation.


To be used and discarded..
left mentally scarred and broken hearted..

our feelings shattered... hearts guarded..
to the abuser ... a victory, to the victim... abuse.


Elusive invaders,
"users" .. the Invasive persuaders

Some weekend offenders,
others weathered pretenders,
all on the loose.


Predators that pray
and prey that Surrenders,

they cast their veil by day
the lure, who remembers,
most free to run their course.

Let's hope Justice prevails
..criminals punished, some in jails,
infused with a sense of remorse.


But before we claim victory.. elated,
let's ask ourselves..
Is Jealousy and Deception.. anticipated ?


Could it just be ..

.. the "Invasion" of an Illusion.. ?
.. or simply, the "Illusion" of an Invasion ?


To me, they're all Reflections..

.. Deception .. Our perception of an expectation.


Todd A. Jobson    1/22/2007

Copyright   2006-2007  Todd A. Jobson



Be careful.. and choose your words wisely.. Broken Hearts and shattered dreams are just a keystroke away.


TRUE FRIENDS ...


TRUE Friends are ones that won't run away on a moment's notice..
when the going gets tough.. (those are fair weather friends that we each
have encountered time and time again.. always looking to fulfill their own needs,
irrespective of the needs that others might have.. theirs always coming first).


Everyone hears what you say.. but TRUE Friends LISTEN to what you
are thinking about..and feel.. there for you during your darkest moments.


Caring and compassion is a key, but honesty and trust is a must.

True friends can share fears and hopes, tears and their deepest aspirations.


When all else in your life seems bleak or is gone .. you've lost hope and feel all alone..
True Friends will be there.


True Friends won't try to sway one another with Political or Religious debates.. or throw
a temper tantrum when they don't get their way. They will forgive and say they're sorry..
they won't let bad days or mood swings get between them and those around them.

True friends will always care about more than just themselves.. shining when the clouds set in...

Few of us appreciate what we have.. until we lose it..

Think about the simple things in life that we have lost appreciation for.. and give thanks
to those around us just as they deserve.. reminding them of just how much they mean to us..
because here today.. could easily be .. gone tomorrow..
(Our Children, our spouses / significant others, parents/ family, friends, .. our health, our careers, our finances, our freedoms, the safety that we take for granted, the abundance of food / shelter, ..our lifestyle.. etc.. etc..)


It all underscores the topic at hand ... Loyalty, Honesty, Integrity.. all those Virtues and Values that each of us should uphold in any relationship, whether it's a friendship.. or something more.. we all need to Give Thanks !

Ask yourself... If I (or another friend) needed to reach out in a time of need... would you be there ? ?
.. and do I appreciate what those friends have done for me.. or do I just expect it ??..

.. am I only fulfilling MY needs ??.. or am I also reciprocating and fulfilling the needs of those that have been there for me ??..


Balance in life is a must. To sustain what we have.. everything needs to be a 2 way street.. give thanks and don't take life .. (including those that care about you) for granted !!


Your Trustworthy blogger ..

Todd J.

Copyright :  Todd A. Jobson  2006-2007


Monday Jan 22, 2007

More Reflections on Life : Our Journey.. Goals, Success, Confidence...



\*Caution\* : This should probably be multiple blogs.. but.. If someone wants "food for thought" in the interim.. well here you go ! Be prepared, since a few verbose tangents are sure to follow. :)

This posting is a collection of : thoughts.. self help (success..setting goals, Confidence/failure cycles), science, psychology, philosophy, some lessons learned.. all Reflections on Life... 

Also note.. this blog is a follow-on to my early Nov '06 blog.. "Lessons Learned.."   So, if you haven't read that one recently.. you probably should do so as a Part 1 of this Blog (I apologize if there's any overlap).  Together, you'll get a good grasp of who I am.. and what I represent.  Though this one is a bit more fragmented than I had hoped for (lots of notes put together here).. it's a starting point for future editing to come.. when I get the time. (in most of my blogs.. clicking on the highlighted words will bring you to other links for that topic..give it a try)


Please don't be shy and Comment openly. With that said.. buckle your seat belts.. OPEN your minds... turn off your cell phones.. and away we go ... the wisdom of the elders is awaiting you..(sounds good at least .. lol).



The most Valuable things in life take Time, Planning, .. and Effort..

Each of us should ALWAYS have a list of priorities in our lives. This includes
some definition of Goals... WHAT we want to get out of life.. and the steps we need
to take to get there. Together.. all this gives us guidance and STRUCTURE that we
can lean on and strive towards. It allows us to maintain FOCUS on what's important,
at the same time, allowing us to block out what's just a distraction.. simply taking us
farther from achieving our goals.. or wasting our time. Just as importantly, if not more so, clearly defined objectives allow us to gain a "sense of Accomplishment" and/or recognition that we all need to fuel our motivation.. a core need that boosts our energy through self esteem and the resulting confidence that we all need to succeed in life.


Make up your Mind & Decide... then ACT on your Decisions...

If you don't have any set goals / objectives.. or even if there's no clear purpose in your life.. then CREATE IT !

Realize that virtually NOTHING is PreDestined !! The end result --> the Outcome of
situations is usually a complex combination of MANY variables (all the entities, actions, and reactions..). Unfortunately, the ones that are unpredictable are the ones that involve People (indecisive buggers!). However, the remaining aspects are reasonably predictable through common probability and statistics (the 1 in 2.. aka 50-50.. chance of tossing tails with a coin.. or the 1/6 probability of rolling any given side of a die) and laws of Physics (outcomes based on laws of motion, energy, matter, mechanics, etc..).

We can sabotage our own lives.. just as easily as we can etch paths to success that will bring us closer to our end Goals.. one step at a time. Our probability of success goes down with fewer opportunities, with less confidence, or by not defining or acting towards our end goals. We can increase our probability of success by increasing the number of "predictable" steps (attempts / tries..) that we take... not giving up, but instead learning from our failures.
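For those who like to see the numbers.. here is a minimal sketch of why more attempts raise the odds of success. It leans on a big simplifying assumption that real life rarely honors: every attempt is independent and has the same chance of working (the function name below is just an illustrative label of mine). Under that assumption, the chance of at least one success in n tries is 1 - (1 - p)^n.

```python
from fractions import Fraction


def chance_of_at_least_one_success(p, attempts):
    """Chance that at least one of `attempts` tries succeeds.

    Assumes (for illustration only) that every try is independent and
    succeeds with the same probability p.
    """
    return 1 - (1 - p) ** attempts


coin_tails = Fraction(1, 2)  # 50-50 chance of tossing tails
die_side = Fraction(1, 6)    # 1/6 chance of rolling any given side

# The odds of seeing a chosen die side at least once grow with every roll.
for rolls in (1, 3, 6):
    odds = chance_of_at_least_one_success(die_side, rolls)
    print(f"{rolls} roll(s): {odds} (about {float(odds):.2f})")

# Two coin tosses give a 3 in 4 chance of at least one tails.
print(chance_of_at_least_one_success(coin_tails, 2))
```

Even at long odds per try, the total keeps climbing with each additional attempt.. which is the arithmetic behind "don't give up".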


Although there are few clearly delineated "true" Black or White Decisions in life.. we can always "narrow down" the alternatives to a handful of choices by taking a look at the big picture around us.. to reflect on ourselves, our actions.. our OPTIONS... and those around us.

To make the best decisions possible, write down all possible alternatives.. and then associate advantages (pros) and disadvantages (cons) with each... and the consequences (End Result) of each alternative. If one path brings you closer to your end objectives (the positives outweighing the negatives..), then that's the choice to make.
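The pros and cons list above can be sketched in a few lines of code. Everything in this example is hypothetical: the two options, the items listed under each, and the 1 to 5 importance weights are made up purely to show the bookkeeping.. your own list and weights are what matter.

```python
def net_score(option):
    """Sum of the weighted pros minus the sum of the weighted cons."""
    return sum(option["pros"].values()) - sum(option["cons"].values())


# Hypothetical alternatives, each item weighted 1 (minor) to 5 (major).
options = {
    "stay in current job": {
        "pros": {"stable income": 3, "familiar team": 2},
        "cons": {"little room to grow": 4},
    },
    "take the new role": {
        "pros": {"room to grow": 4, "closer to end goal": 5},
        "cons": {"relocation": 3, "uncertainty": 2},
    },
}

# Print each alternative with its net score, then pick the highest one.
for name, option in options.items():
    print(f"{name}: net score {net_score(option)}")

best = max(options, key=lambda name: net_score(options[name]))
print("Best choice on paper:", best)
```

A simple sum is only one way to weigh things out.. a single overwhelming negative might deserve a veto no matter what the total says.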

Although at times we feel as limited as "Pawns in the game of Life" .. if we carefully plan and weigh out our decisions.. the road to goals (along with happiness) can be etched out in front of us (play a game of chess once in a while at games.yahoo.com to brush up on your decision making skills).


Not Acting on our decisions is actually worse than not Deciding .. in that we end up perpetuating the same scenario .. reinforcing the negative aspects of our lives.


EVERYTHING is RELATIVE to the scenario !! Take 10+ steps back and don't act rashly.. Make Decisions based upon facts and Goals.. not just emotions (though they help act as our compass and guide us).


If something is important to you.. then it's definitely worth planning for.. most likely even Fighting for.. so Make up your mind (DECIDE) .. and ACT NOW !!


No more .. Should Have, Could Have... Would Have...
Don't be a bystander in your life... TAKE CHARGE and Make A Difference !

Among the simple concepts of Goals and the importance of Structure and a sense of Accomplishment, I've outlined the following cycles that I've experienced or witnessed countless times in life.. to explain the dynamics and semantics (along with common pitfalls) of succeeding in life.


The Goal Cycle :

  • Outline Priorities in your life and create a List of GOALS and Objectives
  • Envision the END POINT of Success for each Goal (where you want to be)
  • Sell yourself on the purpose and Course of action (plan) for Each Goal.
  • Decide what steps are needed to get you from here to there.
Without clearly defined goals.. our life wanders aimlessly. There's little hope since there are no targets for accomplishment or achievement. Don't be afraid to Decide and change your life for the best ! (people do typically resist change.. as it touches on the fears of not being able to control what is unknown.. vs. what is familiar.. see the Failure Cycle below)

RULE: Every time you achieve a Milestone along the way (the critical path), or reach your end GOAL(s)... make sure that you take steps back and Re-Assess the big picture to ensure that your plans are still in alignment with your overall Goals and Objectives. Personally, I do just that.. making sure that the direction I'm heading is where I want to be.. and if so, raising the bar every time I reach my endpoints.


When you achieve a major milestone (might not even be the end Goal)..you need to TREAT yourself (embrace the sense of accomplishment and bask in its glow). A common pitfall along the path to success is "BURN out".. so don't fall victim. PACE yourself so you still have time to enjoy the fruit of your labors !! (take the occasional break to smell the roses along the path to success !! .. and be PROUD of yourself and what you have accomplished !)

In life.. there's always room for continuous improvement.




The Success Cycle :

  • Get Motivated.. Inspire yourself by envisioning a sense of Accomplishment
  • Surround yourself with POSITIVE, Motivated, Successful individuals
  • Learn from and emulate those that have succeeded in these endeavours
  • Commit to and believe in yourself and your viability of Success.
  • Visualize yourself achieving ("at") the end point.. successfully reinforcing a positive outlook.
  • Sell yourself on the course of action (plan) and ACT on it !
  • Stay Focused... block out any external distractions.
  • Use all the resources at your disposal.. delegate if possible.
  • Commit to doing whatever it takes to reach our Milestones along the critical path to success.
  • Maintain and grow the confidence within yourself with each accomplishment.
  • Time is one of your most precious assets.. use it wisely.

It only takes one success to build upon.
Success "BREEDS" Success !

Generally, the more effort that you put in.. the more you will get out. However,
it's best to work SMARTER.. than to BLINDLY work HARDER.
(couch potatoes .. usually result in rotten home fries)

Failures can slow us down.. but need to be looked upon as ONLY temporary
setbacks to be learned from.... The successful don't quit easily.. learn from your mistakes !



The Failure Cycle :

Within each of us.. much of success and failure hinge upon our past failures and resulting LOW self esteem and confidence.

Ultimately at our core, we equate nearly everything as either positive (good) or negative (bad). With this in mind we associate good outcomes as pleasant.. or PLEASURABLE.. and bad outcomes with PAIN. Hence.. the pleasure / pain principle.. (avoid pain at all costs.. and seek pleasure whenever possible). [I could digress into a tangent on how this also has formed a basic tenet of most mythology in cultures.. Yin/Yang, etc.. as well as how good and bad have progressed to the opposite forces of GOOD (GOD) and EVIL (dEVIL) .. simplifying the balance of harmony in the world around us & our understanding of how it works.. but I think I just did].

Self Esteem and Confidence.. are psychological mechanisms common in all animals... Allowing for those behaviors that succeed to embolden us and be repeated. However, those actions that result in failure (or lack of control --> Fear) ratchet back our confidence and underlying self esteem, as they did more than a million years ago in our ancestors.. resulting in innate DNA etched fears passed down from those that survived to have children...[eg. not to go out at night (we couldn't see our foe..and predators were lurking -> fear of the dark), not to stand at the edge of a cliff (fear of heights), snakes, etc..   Those that had these primal fears survived in much greater #'s.. growing the populace over time to dominate with those genes and behaviors].

Vestiges of our past ancestors... the ones that had "fears" survived.. and the ones with no fear.. ended up dead and unable to pass on their DNA. Success and Failure play on these primal fears and underlying instincts to curb our behavior. However, many of us get to a point where the fear seems to overtake our ability to "TRY AGAIN".. resulting in .. "middle of the road" (aka "Safe") lifestyles..rituals and habits that are less than fulfilling. Others experience scenarios that etch within us.. intense fears.. that manifest as panic attacks and phobias that keep us from encountering those scenarios.

Take Responsibility for your actions... Live like you're not afraid.. but be prepared for any side effects or consequences.. (hence plan accordingly).

Not all plans end up as expected.. but what
sets apart those that end up successful is that they don't give up and quit ! Don't be a QUITTER !!


Understand that the Human Brain has an underlying "Primitive" subsystem as part of the brain stem (common in ALL animals from Fish onward..) which produces the same fearful (fight or flight) reactions.. (in mammals also the Amygdala, etc..). However, within humans.. we have a special differentiation.. (a HUGE prefrontal cortex) that allows our brain to "short circuit" and override these impulses / emotions..and/or fears by deep analytical (NON-EMOTIONAL) thought.  It is a fact that you can set your mind to FOCUS and concentrate on OTHER activities that do not open the door to emotions...  not allowing those thoughts of failure and FEAR to creep into your mind...  and that anyone can move past these points of failure and retrain their brain in the process (the same way that we each get beyond basic fears such as public speaking).

Much in Life is a Self Fulfilling Prophecy .... You Think it.. and so Shall it Be ! (maintain that optimistic edge with a positive attitude !!) .. Don't fall victim to self-doubt.. Be Confidently Determined !!


The old adage.. "Idle minds breed gossip and fear" .. is not far from the mark. By keeping yourself busy (in a positive way) you keep the mind from manifesting itself with perceptions of negativity and failure. We can each convince ourselves of nearly ANYTHING. Our likes and/or dislikes come about this way throughout our lives.. what we deem "fact" and believe.. becomes our own REALITY.. ending up as the "truth" to us. (eg. If we have a horrible elementary school teacher that doesn't teach us the basics of math.. and we develop a feeling of "not being good at" math since we're lost .. playing catch-up.. from that point on.. then we will associate a dislike with math.. resulting in our reality being "I'm not good at math" .. ensuring that I'll never excel in Math .. since I've given up before I've even started. This can be applied to ANY area of our life.. and is how most of us evaluate and judge the world around us.. via past experiences).

Hence... we are each the sum of our past experiences.. both the decisions that we've made or not made.. and the actions that we've taken or not taken.


The Confidence Cycle :

Success "breeds" Confidence.. which ultimately Breeds Success. Without Confidence, Success is unlikely. If you think you can't accomplish the task at hand, then you won't try hard enough.. or long enough.. or persevere through obstacles. Self sabotage will result in a self fulfilling prophecy.

Confidence allows us to block out the basic probabilities that govern reality (chances of failure), which normally inhibit or restrict our behavior (remaining focused and trying harder if we first don't succeed).

One interesting paradigm within human psychology is that Over-Confidence generally has other associated detrimental effects, such as arrogance, where those shielded from failure (or in denial of it) exude an aura of perfection.. to the point of invincibility that ultimately ends in its own form of demise (failure, lost social acceptance, lost friendships..). Remember, modesty goes a long way .. so eat some humble pie on occasion and give yourself a reality check !


Unless you Understand it, you probably can't Control it ...

Most problems in Life are trivial in nature. Before you get upset.. look at the overall
scheme of the "big picture".... is it a life/death situation ?.. does it involve major health concerns that can't be overcome in time ? ... Just about ANY other scenario can be overcome and worked around. For those situations that can't be changed, we unfortunately must accept the cards that we're dealt and the responsibilities and impacts that come with them.. along with the help of others (true friends, family, faith, prayer, meditation, etc..).


Fear and Uncertainty : It's the "Not Knowing" that will eat you up inside !

The Fear of the Unknown is the Greatest fear that ALL humans have.  For humans, not Knowing (aka. not Understanding) .. means that we can't predict what will happen in the future.. showing us a glimpse of how large the world around us is.. and that we are not in Control of it all.  It's the underlying primal fear that drives us all to create explanations and ways for coping with those difficult moments in our lives that we can not understand or explain. It's why the ages have had a continuing progression of formal systems to explain the unknowns around us .. hence.. Mythology.. Many gods for many unknowns (Sun, Moon, Rain, etc.. etc..), Superstitions (the #13, .. umbrellas, etc.. to the Papal decree aimed at preventing the Plague from spreading during the Dark Ages.. said to be the origin of our saying "God Bless You" when someone sneezes), or Astrology (through the movement of the gods.. aka planets or constellations.. though the alignment of planets that "supposedly" corresponds to our birth compatibilities.. hasn't been aligned in the same way for thousands of years, and is ever changing. Even though the premise that everyone born within the same month has the same personality and emotional compatibility attributes is one that has no factual basis.. fortune telling.. is still practiced in cultures around the globe.. and horoscopes are read by hundreds of millions of less scientific and generally less educated people every day around the globe to try and feel comforted and somewhat in "control" of their lives).

 

We have the Critical Thinkers and Non-Conformists to thank for nearly all the major advances of History

Luckily, over the years, the foundation of explaining the world around us and how everything works.. (with proof.. causes and effects.. scientific deduction,..) has progressed even through centuries of restrictions during the dark ages. From the first critical thinkers.. not afraid to ask WHY?.. to the period of enlightenment and the "Age of Reason" when science was finally embraced and became an accepted discipline.  It took bravery and inspired devotion to a cause to say the TRUTH and endure the inevitable condemnation by the masses, during times when most "deviants" became social outcasts, if not killed or imprisoned.   Freedom of the mind and speech has been, and still is, resisted by the masses that would rather be "accepted" (ignorance is bliss) than exercise free thought and question what their parents and grandparents might have taught them.  Unfortunately, speaking out against a culture's dominant religion is still today an issue, just as it was for Socrates, Copernicus, Galileo,  and Darwin...  Never forget however, that we have those original inquisitive minds that voiced the TRUTH (in light of the mythology, superstitions, and BLIND Faith before them), to thank for the productivity gains, technical revolutions, and basic UNDERSTANDING that we have today.   The quality and longevity of life that we all expect complacently.. is all a result of science.

People will GRASP at and fight tooth and nail to justify their belief systems.. no matter how much evidence is put in front of them, they will latch onto coincidence as fact, allowing for HOPE that their beliefs (and the intricate web of understanding that we create to support those beliefs) will not be dismantled. It's human nature to be in Denial and "protect" our underlying belief systems.. so that our foundation of Confidence and Self Esteem are protected. What's WISE, however.. is to look at our beliefs with an open mind.. allowing for us to grow ourselves in a positive direction, whenever possible not seeing life through a false veil that only clouds our judgments.

We manifest our own fears by dwelling on what we're unable to control, predict, or understand. Being in control of your life means that through better understanding and planning (reducing the number of unknowns) ..also comes better management and acceptance of those things outside your control.


Each time we make a decision based upon "intuition".. we're subconsciously letting our brain make the best choice from all possible options.. factoring in our knowledge of past experiences. Even though intuition creates a guiding emotional "feeling" within us.. its basis is to help us avoid failure using logic and past outcomes.



Searching first within ourselves is the first step to inner peace and happiness.

This same understanding.. (or Misunderstanding) taken "within" ourselves.. can allow each of us to see those moments when we are in a state of denial.. simply to stubbornly protect our "stance" .. whether it's regarding a belief or a position in an argument... An open mind and the ability to self-reflect.. will enlighten each and every one of us. Even though the truth can sting.. at least initially... it's an essential step that can resolve or avoid many common pitfalls in life.

\* Until we see and admit our own flaws, we will never change or grow. \*


Living with a Clean conscience comes from having Integrity, Admitting to our mistakes, and taking Responsibility for our Actions (even the ones that don't go as we had planned..).


Having HOPE and maintaining FAITH are the common elements that we each need in some shape or form (whether it's in ourselves, our mission, our goals, those around us, or in our own definition of God..).

Hope, Faith, and Focused Determination.. together, they allow us to escape the clutches of defeat.. when we might otherwise give up in the face of adversity. (..when we might otherwise not try that one last time that results in success !)    HAVING FAITH IN YOURSELF is the starting point !!


Big Ripples make Waves ...
(\* Skip if you aren't in for a Tangent on science \*)

Life is full of waves... of all forms, shapes, and dimensions.. each with their own set of laws, characteristics, and Patterns.  From sound ... (music to language), to light ... (fireworks to Movies).. and on.   Taking this concept further, to that of patterns in the real world, some of these "ripples" in our daily lives can be "detractors" (detours, tangents, distractions...) that we encounter along the path of life.. hopefully capable of being hurdled or sidestepped, so that we don't become derailed from our target destination(s). Ultimately, we should be growing ourselves along the path, enriching and fueling our curiosities, savoring the simple pleasures along with the major milestones... hopefully working ourselves in a positive direction.. towards our goals.

.. Little Ripples show us that each and Every one of us can have an effect on our surroundings
(the world around us) !! By taking the events that life throws at us.. and comparing them to "waves" that propagate all around us (radio waves, sound waves, water, ...) we can see that each wave has a repeating cycle of oscillation (frequency).. just as life ebbs and flows with highs and lows (crests and troughs), so does the typical sine wave.. transferring energy in the process (contributing to or eroding from the end result).

The complete spectrum of energy around us is composed and differentiated by specific characteristics (for waves.. it is the wavelength). The visible spectrum of light that we are familiar with (as represented by ROY G. BIV) is a subset of the complete electromagnetic spectrum of waves that exist.. (the invisible InfraRed, UltraViolet, x-rays, gamma rays, radio waves, etc..), .. JUST Like in every day life, events within our circle of influence are but a subset of those external events that are outside of our visibility (or control). Identifying the "Patterns" and recognizing ways to "flatten" out and minimize the troughs is a key to maximizing our highs (happiness, success, etc..).


For EVERY system there's a point of equilibrium...or "balance" (Yin / yang) governed by the dynamics of the system itself (physical,
environmental, ethical, .. laws). Whether it's room temperature, a
stock price, or the number of fish that can live in a tank
.. there will always be an average that is influenced by the chaos and entropy of the world.


All interactions.. have the potential to Transfer Energy.. either TO.. or FROM us.. as described in Newton's Laws, as well as the Laws of Thermodynamics :

First law of thermodynamics

In any process, the total energy of the universe remains constant.
This can be seen simply by measuring the overall change in temperature in a room
when a new object or air is introduced or removed... (Unless Everything remains constant).. the overall Energy of the room (based upon.. heat / temperature, light, humidity, etc..) will change to factor in the differences.. either having a Positive or Negative overall effect (.. add cold air.. or remove the hottest object.. and the overall temp/energy of the room will go down). The same can be said relating to the impact that Events and/or people have on our every day lives.. either contributing to or taking away from the Positive aspects of our life. Our choices and actions can help us filter out the negatives.. and when they can't be avoided.. we must learn from them.. to better manage and limit our exposure to these eroding / corrosive entities that will have a cumulative effect over time.

In the "clean" world of Physics, ALL actions cause equal and opposite reactions (see Newton's Third Law of Motion). However, Newton's Laws of Motion didn't have to account for interpersonal unpredictability that we need to account for in the real world. You push.. either they push back.. something falls down.. ???... OR.. a door opens up ! .. it ALL depends on the Situation at hand .. take those 10 steps back and look at the big picture to make your choices wisely (once you make your bed.. you will need to sleep in it .. and live with those decisions !) !!

... the key is in harnessing the "ripples" and "waves" .. using them to your advantage !! In EVERY scenario.. avoid the detractors (distractions) and look for the openings (the doors) ... and Ride the WAVEs to your advantage !! (many events are a blessing in disguise.. inspect what you have at your disposal.. and don't be afraid to examine things from many vantage points.. step out of your shoes and see the world from other perspectives.. you might be surprised to see what you've been missing out on !!).


Grow Faith and Confidence in Yourself, Define Purpose and Goals within your Life.. and Act on them !!

EVERY Moment that passes.. becomes History.. out of our reach.. that can not be rewound. Embrace the past and learn from it.. but be careful not to repeat the mistakes... learning instead from the lessons already learned. Break from those unjustified fears of the past and extend your reach into the world of fulfilling happiness and Love.

Life begins by "Finding (understanding) Ourselves", but LIVING... starts when we allow ourselves to GROW.. "Defining Ourselves" in the process (not simply allowing others and the world around.. to define US).



Hopefully, this posting has made a few of us reflect on our own scenarios... allowing us to move one step closer to establishing and/or achieving something beyond what we have today. :)

Feel free to post comments.

My Best Wishes to you all.. :)
Todd J.



( Copyright 01/22/2007  Todd A. Jobson )



About

This blog does not reflect the viewpoint or opinions of Oracle or Sun Microsystems. All comments are personal reflections and responsibility of Todd A. Jobson, and are copyrighted from the posted year to current year, to that effect.
