"Having" a grand old time

One of my current MySQL Web Team projects involves the creation of a "Tag" system for the web site. The idea is quite simple, we have lots of different types of content and we want to be able to link it all together by tags. For example, a customer maybe tagged with a specific region or industry and same goes true for a white paper, case study, web page, image and so on.

Hopefully I'll get a chance to describe this tag system in more detail, but in creating some administration pages for it I had to use a bit of SQL that few people talk about. I thought, hey why not throw it out there in case someone is learning SQL and finds it useful. My standard disclaimer: There are many ways to skin a cat as they say, so this is just one take and feel free to post some suggestions. Also this example is not optimized nor performance tuned so you might want to test this carefully on a set of tables with a lot of data.

Tables used in this example:

CREATE TABLE tag (
id int(10) unsigned not null auto_increment,
name varchar(50) not null default '',
PRIMARY KEY (id)
);

CREATE TABLE tag_applied (
tag_id int(10) unsigned not null,
app_id varchar(50) not null,
UNIQUE KEY(tag_id, app_id),
KEY (tag_id)
);
The table "tag" contains the listing of tags, the table "tag_applied" contains the tag_id matched against a specific app_id. As you can imagine, an app_id must be able to have many different tags.

Let's add some sample data just for kicks before I describe the problem:

INSERT INTO tag (id, name) VALUES (1, "Red"), (2, "Green"), (3, "Blue"), (4, "Orange"), (5, "Black"), (6, "White"), (7, "Violet"), (8, "Magenta"), (9, "Indigo");
For the applied tags, imagine we're using the tags to describe certain objects:

INSERT INTO tag_applied (tag_id, app_id) VALUES (1, "cat"), (9, "cat"), (1, "dog"), (2, "horse"), (8, "horse"), (5, "horse"), (8, "mouse"), (9, "mouse"),(1, "mouse"), (2, "mouse"), (5, "door"), (6, "door"), (8, "door");

Now here's the problem young Skywalker: "Give me back all the objects which have the colour 'Red' AND the colour 'Magenta'".

Some folks might be tempted to simply do a query and programatically find the answer. Hey, I've even seen the use of weird joins for this, as I've said, there are many ways, some better than others, some better depending on the situation. Here's a simple answer:


SELECT app_id, count(tag_id)
FROM tag_applied
WHERE tag_id IN (1, 8)
GROUP BY app_id
HAVING count(tag_id) = 2;

Answer:


+--------+---------------+
| app_id | count(tag_id) |
+--------+---------------+
| mouse  |             2 |
+--------+---------------+


"HAVING" as noted by the sage of our times, Wikipedia, is the ability to specify which results to bring back based on some sort of aggregate function (count and sum are such functions). In this case, the count that satisfies the condition of being both "Red" and "Magenta". "HAVING" should normally refer to columns in the "GROUP BY" or in use by some aggregate function in the select string. The MySQL Documentation goes into very specific detail about how it's used. You might notice as you read it that different versions of MySQL have treated "HAVING" in different, non-standard ways. I'd read up a bit more about it before using it if you are curious, or stick to the standard.

Comments:

The GROUP BY app_id and HAVING COUNT(tag_id) = 2 will produce a temporary filesort. To eliminate this, you could do:

SELECT ta2.app_id, count(\*)
FROM tag t1 CROSS JOIN tag t2
INNER JOIN tag_applied ta1
ON t1.id = ta1.tag_id
INNER JOIN tag_applied ta2
ON ta1.app_id = ta2.app_id
AND ta2.tag_id = t2.id
WHERE t1.id = "Magenta"
AND t2.id = "Red"
GROUP BY ta2.app_id;

Cheers!

Jay

Posted by Jay Pipes on October 28, 2008 at 04:38 AM EDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

This is the blog of Dups... currently I'm one of MySQL's Community Relations Managers for Sun Microsystems, post, contact me, I want to hear from you!

Search

Archives
« August 2015
SunMonTueWedThuFriSat
      
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
     
Today