Monday Sep 21, 2009

What's going on inside your MySQL server?

Your MySQL server is under heavy load or refuses any new connections because MySQL runs out of available threads. Ever wondered why?

For me the easiest way to get a first and very helpful real-time insight in what is happening in your database server is to use the show processlist statement:

mysql> show processlist;
+--------+---------+-----------+-------------+----------------+------+--------------------+------------------------------------------------------------------+
| Id     | User    | Host      | db          | Command        | Time | State              | Info                                                             |
+--------+---------+-----------+-------------+----------------+------+--------------------+------------------------------------------------------------------+
| 426144 | DELAYED | localhost | statistics  | Delayed insert |  170 | Waiting for INSERT |                                                                  |
| 431669 | root    | localhost | NULL        | Query          |    0 | NULL               | show processlist                                                 |
| 431677 | mantis  | localhost | mantis      | Query          |    0 | Sending data       | SELECT \* FROM mantis_custom_field_table WHERE id='3'             |
+--------+---------+-----------+-------------+----------------+------+--------------------+------------------------------------------------------------------+
3 rows in set (0.00 sec)

Only the first 100 characters of a statement are shown in the Info column. If you really need more information simply call show full processlist to get the whole complete statement information.

Friday Sep 18, 2009

Save energy! Stop using CGI!

The day CGI was invented was a great day for the Internet, but a dark day for the history of how-to-do-thinks-right. CGI was great, because it gave us (standard computer nerds) the ability to easily implement dynamically generated HTML pages - the predecessor of todays web applications. The interface was so ingeniously simple and powerful that everyone could use his favorite programming language for implementing web services. You could use C, Perl, AWK, PostScript or even the Bourne Shell to write your web application. And people did.

However, there is another side, a darker side, of CGI. It is not just ingeniously simple and powerful, in fact, it's also terribly slow. For every incoming HTTP request your system needs to fork a new process, and forking a process is an extremely expensive operation.

It was a great blessing, but nowadays there are so many programming languages especially suited for web development - designed to run in a web server environment - armed with the features and functions best fitting in the needs of a web application. I say PHP, you say Java. Whatever.

But if you stuck to a programming language of the past, please don't use CGI any more, use FastCGI or something comparable to connect your application with your web server. Or if you're lucky and your programming language is already aware of the web, there may be an even easier and better way.

For example, if your're a real Perl programmer, then use mod_perl. It's so simple to run unaltered(!) Perl-CGI scripts with mod_perl:

PerlModule ModPerl::PerlRun
<Files ~ "\\.pl$">
      SetHandler perl-script
      PerlResponseHandler ModPerl::PerlRun
      PerlOptions +ParseHeaders
      Options +ExecCGI
</Files>

Compared to a simple Hello World in Perl with CGI...

% ab -n 1000 http://demo/helloworld.pl
...
Requests per second:    304.84 [#/sec] (mean)
...

... the same script with mod_perl...

% ab -n 1000 http://demo/helloworld.pl
...
Requests per second:    956.16 [#/sec] (mean)
...

...is three times faster. Without touching your Perl script.

And if I actually write a Hello World which takes advantage of mod_perl, I get:

# ab -n 1000 http://demo/helloworld
...
Requests per second:    1777.50 [#/sec] (mean)
...

Nearly six times faster, and that's only a simple example script!

Stop using CGI! Green your IT! Save energy! Save the world! Save the Cheerleader!

Wednesday Sep 16, 2009

Hide X-Powered-By: PHP

About two weeks ago I showed a simple way to beautify your URLs and hide the use of PHP as the backend of your web site. Since I got a lot of emails from people indirectly asking me how to also hide the X-Powered-By: PHP header which is still showing up in ones web server's HTTP response.

As so much in our beautiful world of IT it's very easy:

  1. Open your php.ini in the editor of your trust.
  2. Find this line:
    ; Decides whether PHP may expose the fact that it is installed on the server
    ; (e.g. by adding its signature to the Web server header).  It is no security
    ; threat in any way, but it makes it possible to determine whether you use PHP
    ; on your server or not.
    expose_php = On
    
  3. Change the On to an Off:
    expose_php = Off
    
  4. Reload your Apache and the X-Powered-By: PHP header is gone.

Very easy, no magic(k), and no rocket science.

Tuesday Sep 15, 2009

I (wireframing) Robot

Last week one my primary occupation was wireframing. It took me quite a long time to finish the first wireframes using a normal vector graphics editor and I started to look after a simpler and more efficient way to create my mockups. I quickly stumbled over Balsamiq Mockups.

balsamicmockups.jpg

It's by far not as featureful as products as OmniGraffle, or if you're living on the dark side of the moon: as Visio, but it's very easy to use and it's acutally fun using it.

So, if you need to do wireframes from time to time, Balsamiq Mockups is probably the right tool for you. If you do wireframes and nothing but wireframes all day long, you probably better stay with the tool you're already using.

With US$ 79 it's quite pricey, and I personally think that's too expensive. But if you need it, you'll pay.

Monday Sep 14, 2009

PHP Unconference 2009

DSC07076.jpg

Last weekend Germany's most intellectual elite of PHP ubergeeks came together for two days full of interesting and surprisingly entertaining sessions and discussions. Although it was a very informal and student-friendly event (just drop an eye on the picture above), I was pleasantly surprised by the high level and quality of this event. Yes, events like unconferences, barcamps or BoFs become serious competitors for all so-called professional conferences.

I was also very surprised by the high level of professionals and enterprises working with PHP and today I'm more confident than ever that PHP is coming to the enterprise in a big way over the next few years. I can strongly advise IT strategists to start learning about the Personal Home Page tools.

Duke, be aware of the elephant.

Friday Sep 11, 2009

Snow Leopard and SWAN access

Preparing myself and my hardware for the PHP Unconference this weekend in Hamburg and just realized that the new Mac OS X 10.6 Snow Leopard contains a native Cisco IPSec VPN client out-of-the-box. Just migrate your VPN settings and you're free to uninstall Cisco's antiquated VPNClient.

Travel expenses

Just finished my first travel expense report at Sun, exciting as a tax return and easy as a sendmail.cf.

Thursday Sep 10, 2009

How to Live Without Magic Quotes

In PHP 5.3 the magic quotes feature has been deprecated and with PHP 6 magic quotes will be removed completely. There are good reasons why magic quotes are bad, but how to live without?

I think the most promising solution is to (finally) start using MySQLi instead of PHP's classic MySQL extension. It's not as easy and straightforward to use as the classic one, but it's easier and probably more secure than to fight SQL injections by ones own hands.

Wednesday Sep 09, 2009

MySQL and UTF-8 recipe

If you're leaving the secure world of ASCII characters you're usually enter a dangerous land of confusion. It's actually a very complex topic and from a technical point of view there is no one to blame.

But if you don't care about the basics, don't care why or how, and just want a reliable and working system, simply follow these three advices to attain happiness and inner peace in your developer's life.

Advice 1

Define the character set for every column where you plan to store international character data:

CREATE TABLE t1 ( col1 VARCHAR(42) CHARACTER SET utf8 );

Or at least for every table:

CREATE TABLE t1 ( col1 VARCHAR(42) ) CHARACTER SET utf8;

The later example saves you some typing, but usually you don't need UTF-8 in every string column.

If you omit this parameter, MySQL assumes a default value which may be different from system to system.

Advice 2

After connecting to MySQL always do a SET NAMES 'utf8' before doing anything else.

For example in PHP:
$mysqli=mysqli_connect('localhost','demo','demo','demo');
$mysqli->query("SET NAMES 'utf8'");

If you omit this parameter, MySQL assumes a default value which may be different from system to system.

Advice 3

Before sending any text to the browser set the charset in the Content-Type header of your HTTP response.

For example in PHP:

header("Content-Type: text/html; charset=utf-8");

If you omit this parameter, your browser assumes a default value which may be different from system to system.

A final word...

This is a very simple recipe made for people who don't want to be bothered with technical details, but it's very important to follow all these advices, and not just one or two.

Tuesday Sep 08, 2009

SELECT LAST_INSERT_ID() FROM LEGEND

I don't know why or who was the first who introduced this legend, but if you review source code from time to time, you see one weird MYSQL SQL statement showing up regularly: SELECT LAST_INSERT_ID() FROM some table.

If you google for »SELECT LAST_INSERT_ID() FROM« you'll get 98,100 hits. And if you use Google Code you actually find all the public available sources using this phrase: 216 times in some PHP code, 120 times in Perl code, 102 times in Java code, 21 times in Python code, and finally 1 time in Ruby code.

But what's so bad about this statement? Let's go into the lab:

mysql> SELECT \* FROM pages;
+----+---------+-----------------------------------------+
| id | name    | content                                 |
+----+---------+-----------------------------------------+
|  1 | welcome | Dear Traveler, welcome to my AMP world! | 
|  2 | about   | This is about AMP!                      | 
|  3 | team    | Apache, MySQL, and PHP.                 | 
+----+---------+-----------------------------------------+
3 rows in set (0.00 sec)

A simple table I used in an earlier blog entry. Now let's add some data:

mysql> INSERT INTO pages (name,content) VALUES('faq','Typical mistakes');
Query OK, 1 row affected (0.00 sec)
And now we want to know which id our new data set got:
mysql> SELECT LAST_INSERT_ID() FROM pages;
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                4 | 
|                4 | 
|                4 | 
|                4 | 
+------------------+
4 rows in set (0.00 sec)

4 times a »4«? Sure, because LAST_INSERT_ID() got calculated for every row in that table. And since we have 4 rows in that table, we got the LAST_INSERT_ID() for 4 times. In this case it's not a big problem, but in case of a really large table this can create some serious issues. Especially if you know that this whole list of last IDs is also transferred from the server to the client.

The statement is syntactically correct, but it's definitely not what you want.

To get what you want, simply omit the FROM and the table name:

mysql> SELECT LAST_INSERT_ID();
+------------------+
| LAST_INSERT_ID() |
+------------------+
|                4 | 
+------------------+
1 row in set (0.00 sec)

Good.

Monday Sep 07, 2009

Perlify your command line

I'm not a Perl expert, not even someone who know how to actually program in Perl. So please don't expect anything miraculous about Perl coming through my keyboard. I happen to prefer PHP or AWK personally, but I respect Perl very much. And sometimes, even recently, it has saved my life.

The directory tree of a big server changed and I needed to update outdated path details in some hundreds configurations files.

Usually I do this using sed, but this time I was on a system without any GNU sed available:

[oswald@solaris10 ~] cat life
My life is so bad!
[oswald@solaris10 ~] sed -i 's/bad/good/g' life
sed: illegal option -- i

In this case Perl was my life saver:

[oswald@solaris10 ~] perl -pi -e 's/bad/good/g' life
[oswald@solaris10 ~] cat life
My life is so good!

Thank you, Perl.

Friday Sep 04, 2009

Remember the time, security awareness was a topic?

Setting up this blog remembered me on an old security issue I stumbled upon some years ago. At that time I was simply lazy to investigate in more detail, but this time I took a closer look.

Let's do the time warp...

Before getting into this security thing, let's do a little time travel: Do you remember telnet and the so called rtools like rlogin/rsh/rcp/...? As ethernet hardware became cheaper and available for private use, these commands quickly disappeared from the TCP/IP dance floor and were quickly replaced by SSH counterparts. Do you also remember why did this happen? Yes, because these tools allowed an IP-based authentication and/or sent the user's password plain text over the network. They have done so since many years, but now because of the cheap network hardware it became easy for "everyone" to spy on a network and fetch all the passwords walking by.

SSH stopped this. For a while...

The presence

Today, with Web 2.0, security awareness seems to be a lost ideal from the past. No one cares about the privacy of private data or the security of their communication. So many of those fancy and stylish social network services use plain text password authentication over HTTP. Or if you take a look at the incredible popular and narcissistic blogosphere, where XML-RPC based APIs were used by blogging clients to talk to their online publishing system, you'll find this:

An additional downside of XML-RPC based APIs is their lack of security. Although security can be introduced by other means such HTTPS, in their basic operation these transports deliver user credentials as plain text. This leaves an open door to password interception.
("Beginning RSS and Atom programming" by Danny Ayers and Andrew Watt, 2005, page 525)

Or before that in 2003 Mark Pilgrim stated in »The Atom API« about the MetaWeblog API:

Passwords are sent in the clear, as plain text.

That was 6 years ago. At least 6 years of plain text passwords. 6 years of free and unencrypted WLAN access at Starbucks or whatever other public place. You don't need to be a genius to smell a rat there.

Who to blame?

And you can't actually blame MetaWeblog or other XML-RPC APIs. Why should the API concern about encryption, if it's so easy to implement in a lower layer? Like in HTTPS as mentioned above. It's so easy to set up HTTPS and use it for the XML-RPC API. Sure, big and popular blog hosting services already do it this way, but myriads of self-maintained, self-hosted blogs don't. Doesn't matter if it's a private or a corporate blog.

Can you blame the user? No. The user should be able to turn the computer on and off, and USE it. He's there to use the computer and not to think about it. I don't want to think about how an ATM secures my financial transactions.

Can you blame the system administrators? Yes, a big yes. Whoever is responsible for the IT infrastructure should be aware of this issue and should take appropriate actions.

Who else? Of course the software developers! Another big yes. Why do they make it so easy for the people to set up insecure systems? They can easily implement something to force the use HTTPS or at least, inform the users and ask them to actively do something to enable unencrypted XML-RPC over HTTP.

Thursday Sep 03, 2009

PHP Unconference in Hamburg

Yippee, finally found the time to book train and hotel for the upcoming PHP Unconference on September 12th and 13th in Hamburg. Looking forward to meet old friends, talk about bleeding edge technologies and have one or two beers.

Wednesday Sep 02, 2009

Easily boost your AMP website combo with mod_rewrite

The Scenario

Imagine: You run a popular website using AMP technologies, but your hardware is awfully old and over the years, as your website became more and more popular, it's got slower and slower and slower. Today it runs with a load of 42 and smoke pours out of the TCP ports. And because you're a silly idealistic fool you've no money for a new hardware or to pay a reasonable hosting service. What now?

The answer is in the book, the book of caches.

The Example

Let me set up a simple example:

You've a PHP (or whatever) based system for your website and your requests usually look like:

http://domain/index.php?page=welcome

To get human readable and more SEOish addresses you're already using Apache's mod_rewrite in your .htaccess:

RewriteEngine on
RewriteRule \^([\^/]\*).html$ /index.php?page=$1 [L]
Now all your website's URLs look like static HTML pages:
http://domain/welcome.html

Perfecto.

Now let's focus on the backend. For this example I'll use the following very(!) simple(!) PHP code:

<?php
    $mysqli=mysqli_connect('localhost','user','password','db');

    if (!$mysqli)
        die("Can't connect to MySQL: ".mysqli_connect_error());

    include("header.php");
    include("navigation.php");

    $stmt = $mysqli->prepare("SELECT content FROM pages WHERE name=?");
    $stmt->bind_param('s', $_REQUEST['page']);
    $stmt->execute();
    $stmt->bind_result($content);
    $stmt->fetch();

    echo $content;

    $stmt->close();
    $mysqli->close();

    include("footer.php");
?>

The MySQL table pages looks like this:

+----+---------+-----------------------------------------+
| id | name    | content                                 |
+----+---------+-----------------------------------------+
|  1 | welcome | Dear Traveler, welcome to my AMP world! | 
|  2 | about   | This is about AMP!                      | 
|  3 | team    | Apache, MySQL, and PHP.                 | 
+----+---------+-----------------------------------------+

The files header.php, navigation.php and footer.php contain some mix of PHP and HTML to build the navigation and some basic page layout.

Everything put together may look like this in a browser:

screenshot.png

The Benchmark

Now, I'm using ApacheBench (included in every Apache installation) and fire 1000 sequential request at my website:
% ab -n 1000 http://demo/welcome.html
...
Requests per second:    397.25 [#/sec] (mean)
...
In this case my AMP system was able to serve 397 request per second. Not bad, but it's also a very(!) simple(!) PHP script.

Setting up the cache

First, I add some lines of code to my previous PHP script.

One line just before the include("header.php") statement:

    ob_start();

And this four lines at the end just after the include("footer.php") statement:

    $output=ob_get_contents();
    file_put_contents("cache/".basename($_REQUEST['page']).".html", $output);
    ob_end_clean();
    echo $output;

ob_start() instructs PHP keep the generated output into an internal buffer. And the last 4 lines tell PHP to save this buffer into a file, for example: cache/welcome.html.

Now I create a directory called cache next to my index.php file and make sure my Apache is able to write and access that directory:
% mkdir cache
% chmod a+rwx cache
If I now reload my welcome page in my browser, a file named welcome.html gets created in this cache directory:
% ls -l cache
total 4
-rw-r--r-- 1 www-data www-data 732 2009-09-02 13:02 welcome.html
Now I add this lines to my mod_rewrite configuration (new lines highlighted):
RewriteEngine on

RewriteCond %{REQUEST_URI} \\.html$
RewriteCond %{DOCUMENT_ROOT}/cache/%{REQUEST_URI} -s
RewriteRule . /cache/%{REQUEST_URI} [L]

RewriteRule \^([\^/]\*).html$ /index.php?page=$1 [L]

These three lines reads like this: (first line) For all requests ending with ".html": (second line) If there is a file in the cache directory, named exactly like the resource my web server's got asked for, than (third line) send this file to the browser. If there is no such file, continue with calling the PHP script.

The Rerun of the Benchmark

That's all, now I rerun my benchmark from earlier:
% ab -n 1000 http://demo/welcome.html
...
Requests per second:    1287.57 [#/sec] (mean)
...
Wow, that's about three times faster as the regular PHP version. And in this example I'm using a very(!) simple(!) PHP script. On a more complex system, the boost will be much higher. For example on www.apachefriends.org we're using a cache based on this recipe and we got a performance win of 300 times. (That's because we have a very complex - some may say crappy - CMS running.)

Pros & Cons

Pros:
  • quite easy to set up
  • no additional software is needed, just an Apache with mod_rewrite
  • very high performance win on slow systems
  • works with every web programming language, not only PHP
Cons:
  • the cache will never refresh
  • the system doesn't work with user sessions
But all these drawbacks can be relatively easily solved by adding some more lines of program code or mod_rewrite configurations.

Tuesday Sep 01, 2009

Snow Leopard with PHP 5.3

Crazy stuff: The new Mac OS X Snow Leopard 10.6 ships PHP 5.3.

Of course with the usual annoyances:

[sunnyside:~] oswald% echo '<? strftime(0) ?>' | php
Warning: strftime(): It is not safe to rely on the system's timezone settings. You are 
\*required\* to use the date.timezone setting or the date_default_timezone_set() function. 
In case you used any of those methods and you are still getting this warning, you most 
likely misspelled the timezone identifier. We selected 'Europe/Berlin' for 'CEST/2.0/DST' 
instead in /Users/oswald/- on line 1

Looking forward how the Mac community will respond to that.

About

Kai 'Oswald' Seidler writes about his life as co-founder of Apache Friends, creator of XAMPP, and technology evangelist for web tier products at Sun Microsystems.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today