September 3, 2008

New version of datagenerator available

I've uploaded a new build of datagenerator. New features include

  • Support for indexes and sequences
  • New command line options
  • Better multithreading support
  • New scalable data builds
  • Number generators can reference row counts from other tables
  • Better database performance
  • Ability to generate only the DDL of a schema
  • Numerous bug fixes

The new build can be downloaded here

I've also updated the install and added some additional walkthroughs (in the swingbench section).

More details can be found here

Creating larger swingbench data sets

I've added some new pages describing how to build large scale "SOE" and "SH" schemas. I've tested them both to 500GB in size and will create larger scale versions as soon as I can borrow hardware to test them at multi-terabyte levels. If you're interested in giving it a go, let me know via the comments page and I can try to assist. You can find the instructions on how to create 100GB+ schemas for "SOE" here and how to create 100GB+ "SH" schemas here.

SQL Timing utility

Sometimes I feel like I've missed out on a whole chunk of functionality in Oracle products. One little nugget is the "timing" function in SQL*Plus. This allows you to time groups of operations.

Timing of individual statements is turned on with the "set timing on" command, e.g.

SQL> set timing on

SQL> select count(1) from all_objects;

COUNT(1)
----------
68653

Elapsed: 00:00:03.95

SQL>

Which is great, but what if you want to time multiple operations? Use the timing function and simply give the timer a name, in this case statement_timer.

SQL> timing start statement_timer
SQL> select count(1) from all_objects;

COUNT(1)
----------
68653

SQL> /

COUNT(1)
----------
68653

SQL> timing show statement_timer;
timing for: statement_timer
Elapsed: 00:00:30.85
SQL>

This times anything that went on between the timer starting and finishing, in this case including my typing of the commands. It's a fantastic utility for timing stages in a batch job, including callouts to OS operations.

September 28, 2007

Swingbench Trace Analyzer

It's been a while since I've posted on this blog... Apologies, however I have been keeping my personal blog up to date.

Since my last post here Swingbench, TraceAnalyzer and Datagenerator have had some big updates. So if you're interested head over to dominicgiles.com and take a look.

TraceAnalyzer has had some improvements made

New features include


  • Support for 10046 traces
  • Bind parameters
  • Explain Plans
  • Wait Events
  • Formatting of SQL

It has a few issues... the capture of bind values is a little flaky.... still trying to get my head around some of the parsing. I'll try and fix a few of the more obvious issues over the coming weeks...

I have had a few emails stating that on some trace files TraceAnalyzer falls apart... I can't reproduce this on any trace files I have, so if you have had this issue and can send me the trace file I'm happy to take a look.

Dom.

November 14, 2006

Swingbench Trace Analyzer

So I haven't released much in the way of an update to
swingbench lately mainly because work has been so
busy...




However between meetings I put together a little
program that parses Oracle trace files. Now I know
TKProf does a fine job of this but I've never been
really comfortable with having to continually rerun
TKProf to change the ordering and filter out classes of
statements. This came to a head just recently after
looking through a big trace file and trying to figure
out what SQL to work on first. I also thought that
perhaps I could use a richer user interface to give a
better overview of what has happened in a particular run.
So I started with the intention of figuring out how to
parse the file and come up with some ideas on what to
do with the results... This turned out to be pretty
trivial because of its structure and Java's regular
expression support. With that taking much less time
than expected I put them into a Java Swing JTable just
to verify the results, which led on to the next thing
and then the next.... Needless to say the code is far
from perfect but it does give a feel as to what could
be achieved. If there is no interest I'll stop now and
go back to finishing swingbench 2.3.
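
To give a feel for why the regular expression approach is straightforward, here is a minimal sketch. It is written in Python rather than Java and is not the tool's code. It assumes the call lines follow the common PARSE/EXEC/FETCH layout with c= (CPU), e= (elapsed) and p= (physical reads) fields in that order; the sample lines and the summarise function are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the start of call lines such as:
#   FETCH #3:c=1000,e=2500,p=4,cr=10,cu=0,mis=0,r=1,dep=0,og=1,tim=1162002600
# Assumption: fields appear in this order; anything after p= is ignored.
CALL_LINE = re.compile(
    r"^(?P<call>PARSE|EXEC|FETCH) #(?P<cursor>\d+):"
    r"c=(?P<cpu>\d+),e=(?P<elapsed>\d+),p=(?P<physical>\d+)"
)

def summarise(lines):
    """Total cpu, elapsed and physical reads per cursor number."""
    totals = defaultdict(lambda: {"cpu": 0, "elapsed": 0, "physical": 0})
    for line in lines:
        match = CALL_LINE.match(line)
        if match is None:
            continue  # WAIT, STAT, header lines and so on are skipped here
        stats = totals[match.group("cursor")]
        for key in ("cpu", "elapsed", "physical"):
            stats[key] += int(match.group(key))
    return {cursor: dict(stats) for cursor, stats in totals.items()}

# Invented sample lines, not taken from a real trace file.
SAMPLE = [
    "PARSE #3:c=0,e=120,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1162000000",
    "EXEC #3:c=0,e=80,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1162000100",
    "FETCH #3:c=1000,e=2500,p=4,cr=10,cu=0,mis=0,r=1,dep=0,og=1,tim=1162002600",
    "WAIT #3: nam='db file sequential read' ela= 900 p1=1 p2=2 p3=1",
    "FETCH #7:c=0,e=300,p=0,cr=2,cu=0,mis=0,r=1,dep=0,og=1,tim=1162003000",
]

if __name__ == "__main__":
    for cursor, stats in summarise(SAMPLE).items():
        print(cursor, stats)
```

A real parser would also need to tie cursor numbers back to their SQL text and cope with the layout differences between database versions, which this sketch leaves out.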




So in summary, what does it do?

  • Parse trace files.
  • Profile the data via a bar on the right-hand side
    of the scroll bar.
  • Support dynamic filtering and sorting of the
    data.
  • Highlight the 5 worst performing pieces of SQL
    (elapsed, CPU, physical etc.)
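
As an illustration of the "5 worst" idea only: once per-statement totals exist, picking the worst few by any one measure is a small step. The sketch below is Python, not the tool's code, and the labels and figures in STATS are invented.

```python
import heapq

def worst_statements(stats, metric, n=5):
    """Return the n rows with the largest value of the given metric."""
    return heapq.nlargest(n, stats, key=lambda row: row[metric])

# Invented per-statement totals; the labels and numbers are placeholders.
STATS = [
    {"label": "stmt_a", "elapsed": 5200, "cpu": 900, "physical": 40},
    {"label": "stmt_b", "elapsed": 300, "cpu": 250, "physical": 0},
    {"label": "stmt_c", "elapsed": 9100, "cpu": 1200, "physical": 310},
    {"label": "stmt_d", "elapsed": 150, "cpu": 100, "physical": 2},
    {"label": "stmt_e", "elapsed": 4700, "cpu": 4100, "physical": 5},
    {"label": "stmt_f", "elapsed": 800, "cpu": 60, "physical": 120},
]

if __name__ == "__main__":
    for metric in ("elapsed", "cpu", "physical"):
        labels = [row["label"] for row in worst_statements(STATS, metric)]
        print(metric, labels)
```

The same statement can rank very differently depending on the measure, which is why being able to switch between elapsed, CPU and physical reads is useful.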



What I'd like to add to it

  • Highlight concurrent SQL

  • Explain Plans

  • Display bind variables

  • Dump SQL to flat files

  • Generate load files for swingbench

  • Create a plugin for SQL*Developer



However unless there is interest I probably won't
bother..



You can download the code here




Leave your comments here.

October 16, 2006

New build of datagenerator


I've uploaded the latest build of datagenerator here. New
features include

  • User definable number of threads

  • Data insertion directly into the database
    (Oracle)

  • Ordered generation of data (largest first)

  • Lots and lots of small bug fixes

Apologies for not updating swingbench recently but
I've been struggling with the amount of "proper" work
at the moment. I'll try and find some time over the
next few weeks to include the updates I promised. Let me
know if you encounter any problems in the code... now
is the time to get them fixed.

September 1, 2006

On the subject of I/O

One of the things that constantly surprises me when talking with clients about hardware for a new database server is that I/O is always at the bottom of the list. Typically the list will look something like this (listed in order of perceived importance)

  • CPUs, have we enough. Fast as possible.
  • Memory, as much as we can put in the box. Oracle don't charge us for that
  • SAN, big as possible.

At this stage the purchase order is usually given the nod and the hardware supplier will ship yet another run of the mill box. Don't get me wrong. Many experienced DBAs have been through this process many times before and realise that not only is the list in the wrong order but it's missing some critical components.

  • HBAs, need to specify these in proportion to the CPUs and attached storage
  • NICs, might need a lot of these, e.g. public, cluster interconnect, storage, management, backup. And typically in multiples for resilience or performance.
  • Backup, are we using the existing backup infrastructure?

I don't blame anyone for this way of thinking, it's the way it's always been. When discussing a new server the first question that people tend to ask is "So what's this monster packing? 16 CPUs!!!" followed by lots of very macho grunts and hollering. The standard licensing model (not just Oracle's) doesn't help. It starts with the premise of a CPU describing the power of a server, and to a large degree it does, but it misses the point of what a database is all about, and that's information. Typically that information is held in ones and zeros on a bunch of spinning scrap metal. The real power of a database comes from its ability to aggregate, analyze and process those ones and zeros, turn them into information and push results out to interested parties. Paraphrasing a little, "It's all about I/O, stupid".

With this in mind I'm constantly surprised by the imbalance of I/O put into servers, both disk and network. It's not unusual to see a 4 CPU server running with the latest generation Intel and AMD CPUs but with a single HBA and dual ported NIC. Whilst memory is cheap many of these servers still run 32 bit kernels. This typically means only a small proportion of the database is cached in memory, be it in the SGA or file cache (don't get me started on file cache). I'd make a rash guess that whilst the size of the memory in a typical database server has increased, the average size of the SGA hasn't increased in line with this trend. To make matters worse the typical size of a database has got significantly bigger. This has to lead us to the conclusion that less of the database is cached and as a result a bigger proportion of it is located on disk. As I said this is just a guess but it's backed up by real customer engagements. What would be of interest is an analysis over the last 10 years to see whether the wait events for scattered and sequential reads had decreased or increased as a proportion of total wait events in production databases.

What I'm driving at is the need to move I/O way up the agenda when sizing a server for databases. The number of CPUs needs to be married to the number of I/O channels available. It makes no sense to buy database licenses for a machine that will simply sit and wait on I/O; it's simply wasting money. Equally it makes no sense to stuff a 4 CPU machine full of HBAs for a database application that will perform index lookups on an index that fits comfortably in the cache. Adding HBAs later to an existing server isn't necessarily a simple option either, especially for a mission critical application or one that has hard coded paths to disk.

The next obvious question is "well that's well and good but how do I size the ratio of HBAs to CPUs?" and in a typically vague fashion I reply "well that depends". The type of application and the type of processor should heavily influence the decision. Certainly the CPU has been winning the race in terms of performance over the last few years and it needs a lot more I/O to keep it busy. But the equation also needs to be balanced with the amount of memory available on the box. A large SGA will certainly reduce the need to visit disk. The best advice I can give is to speak to your hardware supplier and find out what the current state of play is. Also check what the latest TPC-C and TPC-H figures show. Whilst these are generally edging towards the extremes of performance they do show what a hardware supplier believed was needed to show their hardware in the best light.

August 12, 2006

Icon Design

One of the problems when building a bespoke application is that you can never find an icon that represents exactly the action you need. Sure there are hundreds of sites on the web that have "free" icons but these tend to be designed for the desktop. You can sometimes find sets that look very professional and you'd be proud to have them in your application. However you still have the issue that you don't only need icons to represent "file open" or "delete record" but also ones to represent the new action that is going to make your application a best seller, and the last thing you need is an icon that sticks out like an ugly sore thumb.

So you're left with a couple of choices. Go ahead and buy that expensive icon set and hope that no one notices your childlike attempts at graphic design. Or bite the bullet and commit yourself to building the whole set yourself. Now this is not something you should attempt if you have no artistic aptitude or are short of patience. To be fair I've always been interested in graphic design but have never had the inclination or need to commit to building my own icon set. That was until I started working on datagenerator and swingbench. I'm a one man team and I know many of you will be questioning the sense in spending time working on icons when it could better be invested in fixing bugs or working on new features, however one of my objectives when I started working on these projects was to touch on a number of disciplines that my day to day job (core database) doesn't allow.



So once you've committed yourself to building your own bespoke set, what tools do you use? Well there's no shortage of tools, from bespoke icon editors to top end tools such as Adobe Illustrator and Photoshop, Paintshop Pro etc. I've shied away from icon editors in the past simply because I find them too restrictive. You find yourself spending too much time trying to reproduce effects such as shadows and gradients, which are pretty much the de facto standard on a modern desktop. I use two platforms these days, my Apple iMac and a Linux notebook. If I was designing icons and other graphics for a living I would invest the big bucks in a product like Adobe Illustrator... I wouldn't even hesitate, from my limited experience nothing comes close, however I don't do this for a living and it doesn't make any sense to spend a couple of thousand pounds for a dozen icons (although if someone has a spare license lying around...).



Luckily the open source market has a number of viable alternatives. I don't have the time to list the various projects building superb tools to compete with the various commercial offerings but two stand out: The Gimp (worth using just to shout across the office "I'm working on the Gimp") and Inkscape. The Gimp is primarily used as a raster or digital paint tool and is comparable to Adobe Photoshop. Inkscape is a vector paint tool and is comparable to Adobe Illustrator in terms of use if not functionality; it also has the added benefit of working directly in SVG. Working in a scalable format, such as SVG, is a real asset to icon design: it means you can work on a large scale and then shrink or enlarge your design with very little loss in quality.



So I've committed to Inkscape and I'm very impressed so far. It appears rock solid, has ports for MacOS and Linux, has tools for viewing your designs as they would appear as icons and has some genuinely innovative features. However it does have some flaws.... The documentation is very weak, some of the dialogues are confusing at best and it's not a native port to MacOS (that really would set the cat amongst the pigeons).



That said I've started work and it's pretty straightforward to put together some consistent icons. I'll post the results shortly... don't laugh

August 7, 2006

New build of datagenerator

I've uploaded a new build of datagenerator (46). The update to this build is primarily to include new debug code to try and diagnose an issue with reading the default config files shipped with datagenerator. This prevents users from reading the data inside of the XML file. This appears to be a national language issue since the data is shipped in UK format. If you have this issue, uncomment the debug line in the shell script or bat file that starts datagenerator and post the output to me via the comments page.

July 29, 2006

New features for swingbench 2.4

It won't come as a surprise to people that know me that I have a hard time focusing on work that's nearly done. And so finishing a release of swingbench is the hardest thing I have to do... I'm always tempted to start thinking about the next release in preference to sorting out the final issues and documenting all of the new features and functionality...

With this in mind I'm interested to know what features people would like to see in the next release... Some ideas might be


  • Load generated from trace files

  • More benchmarks

  • More sophisticated clusteroverview


Anyway let me know what you think... while I concentrate on finishing 2.3

About Dominic Giles

I'm an Oracle employee based in the UK working on high transaction and very large databases. I've been in Oracle for a while. I'm also the author of the "Swingbench" load generator. You can view my personal blog at www.dominicgiles.com/blog/blog.html
