Wednesday Aug 09, 2006

Shell Scripting and PATHs

Ever seen a PATH like this....

/opt/sybase/ASE-12.5/bin:/opt/sybase/ASE-12.5/install:/usr/local/bin:/opt/sybase/ASE-12.5/bin\
:/opt/sybase/ASE-12.5/install:/usr/local/bin:/usr/bin:/bin

I have over the last couple of weeks been installing users on UNIX systems and considering the issue of the PATH variable. The PATH above is caused by not checking whether a directory is already in the PATH before adding it. This is particularly a problem for shells that execute one file on login and another each time a new shell is invoked, such as ksh & bash. Also, for users created to own shared services (such as sybase in this case), people may enter the user from several entry points, including login, su & su -. The problem is exacerbated when the environment definitions are written by a third party, in this case Sybase, but other packages have their own problems.

I have decided to undertake a test prior to amending the path. I'd like to write a function 'prepend2path', but came to the conclusion that the level of indirection required was too much, for me at least. I use a function, contains.

contains()
{
    # usage: contains "$PATH" directory
    # quote $1 so an empty or unset value doesn't break the pipeline
    /usr/bin/echo "$1" | /usr/bin/grep "$2" > /dev/null
    return $?
}

This allows me to test whether $PATH contains a given directory.

  if [ -d ${directory} ]
  then
      contains "${PATH}" ${directory}
      if [ $? -ne 0 ]
      then
          PATH=${directory}:${PATH}
      fi
  fi

These statements should probably be placed in the startup file of the user's default login shell; however, they may be located elsewhere for good reason, such as in a script, so that it can be run from anywhere and the script becomes responsible for obtaining its own environment. The function can be placed in an external functions file and invoked from wherever that seems sensible.

This code model can be used for LD_LIBRARY_PATH and MANPATH. Both of these variables may be unset, so the contains test needs to be OR'd with a -z test, as sketched below. We can also iterate the test over several directories.
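For the unset case, a minimal sketch, using LD_LIBRARY_PATH (with ${STEM} pointing at the package's install tree, as in the loop below):

  # LD_LIBRARY_PATH may be unset, so test -z before testing containment
  directory=${STEM}/lib
  if [ -d ${directory} ]
  then
      if [ -z "${LD_LIBRARY_PATH}" ]
      then
          LD_LIBRARY_PATH=${directory}
      else
          contains "${LD_LIBRARY_PATH}" ${directory}
          if [ $? -ne 0 ]
          then
              LD_LIBRARY_PATH=${directory}:${LD_LIBRARY_PATH}
          fi
      fi
      export LD_LIBRARY_PATH
  fi

The iterated version for PATH looks like this.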

  for directory in ${STEM}/install ${STEM}/bin
  do
    if [ -d ${directory} ]
    then
        contains "${PATH}" ${directory}
        if [ $? -ne 0 ]
        then
          PATH=${directory}:${PATH}
        fi
    fi
  done

As I write this article, it becomes clear that I should put the -d test inside the contains function.

contains()
{
    # usage: contains "$PATH" directory
    # returns 99 if the directory does not exist
    if [ -d "$2" ]; then
        /usr/bin/echo "$1" | /usr/bin/grep "$2" > /dev/null
        return $?
    else
        return 99
    fi
}

This leaves the path assignment statement looking like this.

  for directory in ${STEM}/install ${STEM}/bin
  do
    contains "${PATH}" ${directory}
    if [ $? -ne 0 ]
    then
      PATH=${directory}:${PATH}
    fi
  done

Of course, there are those who say that if you don't echo $PATH, you don't have the problem either!


Tuesday Nov 01, 2005

Making Sybase Scream

This article is about running Sybase on a sophisticated UNIX. It discusses sizing Sybase's max engines parameter, the effect of resource management tools, and leveraging UNIX & consolidation. Also note that this is not a Sun Blueprint; it's meant to show you that you can, not how to.


I am often asked, "Given that Sybase recommend one engine/CPU, how can you consolidate Sybase onto a large computer?"

The first thing to say is that, in my view, consolidation is about architecture and co-operating systems, not merely an excuse to inappropriately pimp big-iron; an ideal consolidation host may be of any size. However, it is a fact that higher levels of utilisation are more likely if the available system capability is organised in large systems rather than the equivalently capable number of smaller systems.

Sybase is a database, and most databases are good citizens within the operating system. This is true of Sybase, which is a good Solaris citizen and is hence easy to consolidate.

It is absolutely true that Sybase have traditionally recommended that a Sybase engine (implemented as a UNIX process) should be mapped onto a physical CPU, although a common alternative rule is that the number of engines configured (max engines) should be the number of CPUs, leaving one spare. Either

E=#CPU or E=#CPU - 1

where E is number of Engines, #CPU is the number of CPUs.

The reason these rules are important is that "A well tuned database server will be CPU bound".

However, the equations above are only performance tuning rules and, what's more, they're out of date. One of the key reasons is that CPUs are now much faster and more capable than when the rules were first developed. Due to the increased capability of modern CPUs, potentially only fractions of a CPU are required to support the application's requirement, but only integer numbers of engines can be deployed. The poverty of these rules is compounded by the fact that they do not take into account the capability of the CPUs, so they are of little use in deciding the capacity requirements of replacement systems, or in comparing between system architectures.

The tuning rules above have been useful where Sybase is the only piece of work the system is running and only one instance of the database server is running. In these cases, the designer needs to determine how many cycles/MIPS/specINTs etc. the server needs to deliver the expected/required performance. BTW, I shall define the delivered power of a CPU as 'omph' (O) for the rest of this article. With Sybase, if the server requires more than one CPU's worth of omph, then multiple engines will be required.

E=roundup(OReq/Ocpu,0)

where OReq is the amount of CPU required and Ocpu is the capability of the CPU. If we are looking to use the Solaris Resource Manager, then we need to translate this rule into SRM concepts and talk about shares. The rule above has stated OReq, and Ocpu * #CPU defines the capability of the system (or pool). In the world of managed resource, the number of engines is,

E = roundup((SReq/Stot) * #CPU, 0)

where S is the number of SRM shares either required (SReq) or defined (Stot), and roundup() takes the arguments expression and sig.figs (set to zero (0), not O for omph). This assumes that the system proposed is more than powerful enough, i.e.

SReq/Stot < 1
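
To put invented numbers on this: an instance granted SReq = 30 of Stot = 100 shares in a twelve CPU domain gives E = roundup((30/100) * 12, 0) = roundup(3.6, 0) = 4 engines, and SReq/Stot = 0.3 satisfies the condition above.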

Or in other words, the number of CPUs in a system must be greater than (or equal to) the number of engines. Where this is not the case, the amount of omph delivered will be limited to the number of CPUs' worth, unless constrained by SRM.

We now have three cases:

  1. where SRM is used to constrain the system capability available to the database server,
  2. where the engine/#CPU ratio is used to constrain the system capability available to the database server, or
  3. where the database server consumes all available system resource.

These might be expressed in our equation language as

O = ( SReq/Stot * #CPU ) || E/#CPU * #CPU || 1 * #CPU

where O is the proportion of the system consumed and the conditions are,

SRM on || E < #CPU, SRM off || E > #CPU, SRM off, respectively.

These conditions are important; they show the massive difference in configuration rules depending on how actively the systems are resource managed. A single or multiple Sybase instance can then be placed under resource management, using the S ratio {SReq/Stot} to define the resources allocated to the Sybase instance. This can be enforced by user namespace design (S8) or projects (S9/S10), as sketched below. The permitted resources can be enforced using the Solaris Resource Manager or processor sets, with or without domain dynamic reconfiguration. It should be noted that the max engines parameter can be used to enforce rule two in a consolidation scenario; more engines than CPUs can be configured. In one assignment undertaken, the customer required a ratio of two engines/CPU and rationed between server instances by varying the number of engines.
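
As an illustration of the projects route, a minimal sketch in Solaris 10 syntax, assuming the service owner runs under a project named user.sybase and using an invented share count of 30:

  # make the Fair Share Scheduler the default scheduling class
  dispadmin -d FSS

  # create the project with 30 CPU shares for the sybase user
  projadd -c "Sybase ASE instance" \
          -K "project.cpu-shares=(privileged,30,none)" user.sybase

  # verify the allocation
  prctl -n project.cpu-shares -i project user.sybase

The 30 shares only have meaning relative to Stot, the total shares granted across the projects on the system.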

I have written previously here... in my blog about designing the database instance schema as an applications map, and about why consolidating Sybase instances onto fewer Solaris instances makes sense. These multiple server instances can be managed using SRM or the Solaris scheduler. (I propose to research and write something more comprehensive about scheduler classes and zones.) I recommend that the operating system should be the resource rationing tool: only the OS knows about the hardware and the system capability and, unlike Sybase, Solaris with its project construct can approximate an application, and can therefore ration in favour of, or against, such objects. A Sybase server will not discriminate between applications, nor between queries.

In a world of virtual servers and managed resource, another factor is that the number of CPUs in a domain is no longer static within an operating system instance, or any resource management entity. Our ability to move resources into and out of a resource management entity permits us to change the constraints that underpin the rule above. For example, an ASE instance can be started with eight engines & eight CPUs within its processor set, and the processor set later shrunk to four CPUs; the ASE originally consumes eight CPUs' worth of omph and, after the configuration change, only four (see the sketch below).

These options permit configuration rules where the maximum number of engines for a single instance of Sybase may be less than or equal to the maximum number of CPUs planned over time. Also, where multiple data servers are running on a single system, the total number of engines is likely to exceed the number of CPUs. (One leading Sybase user in the City aims at two engines/CPU, while another moves system resources between domains on an intra-day basis.) These are both key consolidation techniques.

The number of engines for each ASE server instance should be set to the maximum required, and the resources underpinning it can be changed using dynamic domaining, the Solaris processor management utilities or the Solaris Resource Manager. Consolidating database servers permits the resource management functionality of the OS (or hardware) to allocate CPU resource to the server, and these allocations can be changed dynamically. This means configuring sufficient engines to take advantage of the maximum available CPUs; i.e. if for a short time of day Sybase is required to use 12 CPUs, and for the rest of the day only four, then 'max engines' needs to be set to 12 and constrained at the times of day when only four are required.
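
The processor set route might look like this minimal sketch, run as root; the CPU ids and the set id are invented, and pgrep is used to find the dataserver process:

  # create a processor set from eight CPUs; psrset prints the new set id, say 1
  psrset -c 0 1 2 3 4 5 6 7

  # bind the running ASE dataserver (and hence its engines) to set 1
  psrset -b 1 $(pgrep dataserver)

  # later, shrink the set to four CPUs; ASE keeps its eight engines
  # but now receives only four CPUs' worth of omph
  psrset -r 4 5 6 7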

In summary, in a world where consolidation & virtualisation exist (or are coming), the value of max engines can no longer be assumed to be based on the simple rules of the past.


Thursday Sep 15, 2005

Sybase 15

I popped into the Brewer's Hall at the invitation of Sybase for their UK launch of Sybase 15. They have three great new features to help them compete with Oracle and the free ones, and it's good to see them spend some money and, more importantly, intellect on the database server.

The optimiser has been given a major overhaul. They've looked long and hard at what they can do and at what academia and the market say is possible, and adopted more modern, flexible and cheaper query plans, including multiple entry points, eliminating work tables, minimising the use of nested loops, new storage structures, more row-level-locked defaults etc. Some of the query performance improvements they claim would be unbelievable if you didn't understand what they've done. This has been done to support their re-engineering of the size limits to allow VLDB implementations, together with changes to enable more effective OLAP mixed workload solutions. However, performance improvements can/will be obtained by more traditional OLTP implementations.

Sybase 15 has improved XML support, both storage and, more excitingly, the ability to expose stored procedures as a web service. This might not be enough to encourage people to locate the data server in a DMZ, but either Sybase replication or standard web proxy architectures should be sufficient to protect the solutions.

Another featured highlight is the LDAP implementation. I was discussing this with Mark Hudson of Sybase because it extends the solutions design capability based on Sun/Sybase co-operation, and he pointed out that Sybase have been on a road to LDAP implementation since V12.0, initially offering support for the server maps but now offering user authentication. I expect that understanding the definition, storage and acquisition of privileges and aliases will need a fuller reading of the documentation.

They've also responded to customer calls and the market's growing requirement for more pervasive encryption. My final personal highlight is that they've improved statement level statistics capture, permitting simpler tracking of rogue queries.

Mark also let slip that Grid enablement is next. It'll be interesting to see what competition will do to the market's solutions designs when active/active HA databases become non-proprietary.


Sunday Apr 03, 2005

Consolidating Sybase

When designing system platforms for Sybase based applications, three patterns are available.

  1. Traditionally, a single Sybase instance is installed onto a system host. This serves an application and user community. This remains a popular pattern for ISV products, however, where a rigorous environment management policy is adopted, this leads rapidly to 'server sprawl' as each DBMS instance requires not only production but also a development, testing and contingency system.
  2. Alternatively, multiple applications can be collected into a single Sybase instance. This is referred to by Sun as ‘Aggregation’. It is best done in conjunction with a single data model so that, for instance, only one logical customer (or counter party) table exists.
  3. The final implementation model is to install multiple Sybase instances within a single instance of the operating system. As noted above, with Solaris 10, these can be installed within their own Zone, or they can share zones. This Sun refers to as database server ‘Consolidation’.

Sybase's development of multiple workspaces has again extended the parallelisation of the server and reduces the number of serial bottlenecks. This is a technique that might permit multiple applications to be hosted within a single ASE server instance, i.e. adopt pattern two above. ASE has for many years had multiple ‘databases’ within it, but a database is a unit of recovery, not of applications logic, and while Sybase has dramatically reduced the number of scarce resources within the ASE instance, it has no concept of application, and its resource management algorithms have no concept of priority nor of service quality.

These weaknesses can be overcome by the Solaris Resource Manager, which permits an application, via a project, to have its system resources guaranteed against anti-social behaviour by other applications. Having multiple Sybase instances within a Solaris instance would allow a more sophisticated resource management policy to be declared. A further reason for ‘consolidating’ is that regression tests for application changes are simpler: a change in one application will not require tests in the databases and procedures that are required by others, and start/stop requests do not impact other applications. Any conflict around the values of the server run time parameters can be resolved using the consolidation model, to the extent that even different versions & EBFs of Sybase can be applied to different applications (using the file system name space to enforce and differentiate between versions, as sketched below). This last factor is very important if the platform designers do not own the Sybase version definition policy, such as when ISV code is being used.
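
A minimal sketch of the name space idea; the directory names and server name are invented for illustration:

  # two versions co-exist, each in its own tree
  #   /opt/sybase-12.5.3    ASE 12.5.3 + EBFs for application A
  #   /opt/sybase-12.5.1    ASE 12.5.1 frozen for an ISV package

  # each instance's start script selects its version by pointing SYBASE
  # at the right tree before invoking startserver
  SYBASE=/opt/sybase-12.5.3; export SYBASE
  . $SYBASE/SYBASE.sh
  $SYBASE/ASE-12_5/install/startserver \
      -f $SYBASE/ASE-12_5/install/RUN_appA_server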

Platform designers have a choice and they should carefully consider which of the three patterns they wish to implement as they develop their infrastructure plans.


Wednesday Nov 24, 2004

Consolidation & Sybase

This article explores a couple of basic Sybase consolidation techniques and the UNIX and Solaris technologies that support consolidation. It offers a couple of configuration tips on the way.

When consolidating multiple Sybase (or any RDBMS) hosts, designers have the option to implement either an aggregation solution, which would involve implementing each application as a separate database within a server, or alternatively co-locating multiple ASE server instances within a single operating system environment. The first of these techniques I refer to as aggregation, and the second as consolidation (or work load sharing). The former technique may involve developer time because the data & business models require reconciling; it may also lead to contention for scarce resources within the Sybase instance, typically Sybase memory objects or system database resources. Where a choice between aggregation & consolidation exists, consolidation is easier and delivers more benefits.

Sybase is implemented in Solaris as a number of processes that attach themselves to a single shared memory segment. (The processes are referred to as Sybase engines.) The processes are linked in the process table and inherit a Sybase name from the master..sysservers table. (Actually, at run time the name is inherited from the -S switch in the run server file.) The data in the master..sysservers table needs to be replicated into the interfaces file. (More recent versions of Sybase can utilise LDAP for this purpose.) The purpose of the interfaces file is to map the Sybase server name onto a tcp/ip address consisting of {tcp/ip address:port no}. Each data server requires a pair of entries, master & query, on the same system. The default Solaris syntax has been to map the Sybase name directly to a tcp/ip address, although using Solaris' name aliasing is syntactically supported and a superior method. As a diversion, you should always convert the generated addresses into hostname:port no format. Also, Solaris will support multiple tcp/ip addresses for each network interface. It is thus possible to co-locate multiple Sybase names within a single instance of the operating system (without containers) and ensure that remote processes can find the correct database server without changing the application's code, or client configuration parameters.
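
For illustration, a minimal sketch of the aliasing, with an invented server name, host alias and address (the master & query lines in the interfaces file must be indented with a tab):

  # /etc/hosts (or your name service): alias an address to the service name
  10.1.2.3    syb-trading

  # interfaces file: the server resolves via the alias, not the physical host
  SYB_TRADING
          master tcp ether syb-trading 5000
          query tcp ether syb-trading 5000

Moving SYB_TRADING to another host, or to another address on the same host, then requires only the alias to change; clients and applications are untouched.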

The number of processes capable of running in a Solaris instance/domain is defined and controlled by the /etc/system file. It is folklore and good practice to derive the number of engines as the same as the number of CPUs (or related in some way to the number of CPUs); there is no technical constraint in either Solaris or Sybase that mandates this tuning rule. In the world of workload sharing, the share of a domain or system utilised by a Sybase server instance needs to be actively managed, and I recommend that the Solaris Resource Manager is used to perform this function. This is a superstructure product in Solaris 8 and integrated into the OS from Solaris 9 on.

The default memory configuration of Solaris systems is to implement Sybase's shared memory as 'intimate'. The effect is to 'pin' Sybase's buffer caches into real memory. This leads to the configuration rule that the sum of the consolidated servers' caches needs to be less than real memory; if this rule is not observed, it is likely that one (or more) Sybase ASE instances will fail to start. (Oh, by the way, always set your SHMMAX parameter to HIGH VALUES; it saves a reboot when you breach a bound constraint. Systems on sale today (64 bit systems) are generally configured with ample memory for this not to be a problem. Thirdly, SHMMAX is a permissive parameter, so setting it higher than needed is free.) Also, Sybase DBAs have historically chosen/had to install their databases on raw disks, and hence all the data cache is configured as database cache through the "Total Memory" parameter. (The UNIX file system cache is of no relevance to the DBMS.) Most systems administrators and solutions designers are aware of this and 'reserve' memory for Sybase. In the world of consolidation, the solutions designer needs to be sure that real memory is greater than the sum of "Total Memory" across the instances, plus the UNIX kernel image.
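
A minimal /etc/system sketch for Solaris 8/9; the values are illustrative, not recommendations, and on Solaris 10 the shmsys tunables give way to the project.max-shm-memory resource control:

  * shared memory settings for a consolidated Sybase host
  * a deliberately generous SHMMAX; it is permissive, not a reservation
  set shmsys:shminfo_shmmax=0xffffffffff
  * allow a process to attach enough segments for its instance
  set shmsys:shminfo_shmseg=32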

Sybase has a very rich semantic for abstracting disk objects into tables and rows, and this can be used to scale IO resource. I recommend that raw disks are presented by the systems administrator with permission bits set to 600, and that file system links with relevant names are then created. The disk init command should use the link name as the physname argument. This creates a level of indirection in the naming conventions and also permits a rich naming convention, so that stored procedures such as sp_helpdevice can actually tell the DBA something about the disks in use. Discovering that the database is mounted on /dev/rdsk/md22 is not helpful! The abstraction means that data can be moved between disks without changing the database configuration (although the database can't be running at the time). The other huge advantage of this technique is that striping and locating the Sybase devices across multiple disks, RAID devices, controllers or switches becomes transparent to the database's mounting script. It allows DBAs to begin to use the language of storage attribution, and to leverage the system's scalability.
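
For illustration, a sketch with invented device and link names (older ASE releases also require a vdevno argument to disk init):

  # as root: present the raw slice to the sybase user
  chown sybase:sybase /dev/rdsk/c2t1d0s4
  chmod 600 /dev/rdsk/c2t1d0s4

  # as sybase: give the device a name that means something
  ln -s /dev/rdsk/c2t1d0s4 /sybase/devices/trading_data01

  # then, from isql as sa, initialise the device against the link:
  #   disk init name = 'trading_data01',
  #       physname = '/sybase/devices/trading_data01',
  #       size = '2048M'

Now sp_helpdevice reports trading_data01 rather than a cryptic controller path, and remounting the data on a different slice is a matter of re-pointing the link while the server is down.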

The UNIX file system will permit multiple versions of Sybase to exist within a file system hierarchy. The default Sybase installation model incorporates the Sybase version into the UFS directory name for the install tree. This copes with the circumstances where the applications portfolio is running two (or more) versions of the database. A further advantage is that the effort to support patch management is reduced: consolidation will force increased standardisation, which will lead to a reduction in the breadth of the problem, as a reduced number of hardware platforms hosting Sybase need to be patched and tested against diverse requirements.

In summary

  • Decide to Aggregate or Consolidate, or how to combine these strategies
  • Design your engine/process/instance map
  • Design your memory implementation
  • Design your disk map and its abstraction interface

Sunday Aug 01, 2004

Sybase commits to Solaris x86???

Since starting to write, I have been reading others' entries in the Sun blogging community. This is a link & repeat of John Gardner's web log entry about Sybase & Sx86. Click here .... for John's original blog. The entry & hyperlink interested me as Sybase commit their ASE to Solaris x86, even as a developer cut. Click here .... for the download. I've been using Sybase for a long time, as those of you who have access to internal publications know, and am glad to see them follow me & Sun into fabric, Solaris based computing. This will give those sites committed to (or locked into) Sybase two advantages. The first is that, since Sybase have historically worked hard to ensure that hardware is not a performance constraint, internal contention is more frequently the problem than with its competitors; utilising Solaris x86 and the x86 CPUs will allow Sybase to 'crack' serial hot spots or "code paths" more effectively. Secondly, Sybase has always had an effective 'data movement' solution, and the new platform changes the price and hence enables the deployment of distributed database solutions, based on caching patterns or real life geography.

I've started to write a Sun paper on consolidating Sybase on Solaris; it's 30% written and you may see some of it here. I will be downloading the sybase binaries onto my Sun laptop sometime soon.
