"Having" a grand old time
By Duleepa Wijayawardhana on Oct 19, 2008
One of my current MySQL Web Team projects involves the creation of a "Tag" system for the web site. The idea is quite simple, we have lots of different types of content and we want to be able to link it all together by tags. For example, a customer maybe tagged with a specific region or industry and same goes true for a white paper, case study, web page, image and so on.
Hopefully I'll get a chance to describe this tag system in more detail, but in creating some administration pages for it I had to use a bit of SQL that few people talk about. I thought, hey why not throw it out there in case someone is learning SQL and finds it useful. My standard disclaimer: There are many ways to skin a cat as they say, so this is just one take and feel free to post some suggestions. Also this example is not optimized nor performance tuned so you might want to test this carefully on a set of tables with a lot of data.
Tables used in this example:
CREATE TABLE tag (The table "tag" contains the listing of tags, the table "tag_applied" contains the tag_id matched against a specific app_id. As you can imagine, an app_id must be able to have many different tags.
id int(10) unsigned not null auto_increment,
name varchar(50) not null default '',
PRIMARY KEY (id)
CREATE TABLE tag_applied (
tag_id int(10) unsigned not null,
app_id varchar(50) not null,
UNIQUE KEY(tag_id, app_id),
Let's add some sample data just for kicks before I describe the problem:
INSERT INTO tag (id, name) VALUES (1, "Red"), (2, "Green"), (3, "Blue"), (4, "Orange"), (5, "Black"), (6, "White"), (7, "Violet"), (8, "Magenta"), (9, "Indigo");For the applied tags, imagine we're using the tags to describe certain objects:
INSERT INTO tag_applied (tag_id, app_id) VALUES (1, "cat"), (9, "cat"), (1, "dog"), (2, "horse"), (8, "horse"), (5, "horse"), (8, "mouse"), (9, "mouse"),(1, "mouse"), (2, "mouse"), (5, "door"), (6, "door"), (8, "door");
Now here's the problem young Skywalker: "Give me back all the objects which have the colour 'Red' AND the colour 'Magenta'".
Some folks might be tempted to simply do a query and programatically find the answer. Hey, I've even seen the use of weird joins for this, as I've said, there are many ways, some better than others, some better depending on the situation. Here's a simple answer:
SELECT app_id, count(tag_id)
WHERE tag_id IN (1, 8)
GROUP BY app_id
HAVING count(tag_id) = 2;
| app_id | count(tag_id) |
| mouse | 2 |
"HAVING" as noted by the sage of our times, Wikipedia, is the ability to specify which results to bring back based on some sort of aggregate function (count and sum are such functions). In this case, the count that satisfies the condition of being both "Red" and "Magenta". "HAVING" should normally refer to columns in the "GROUP BY" or in use by some aggregate function in the select string. The MySQL Documentation goes into very specific detail about how it's used. You might notice as you read it that different versions of MySQL have treated "HAVING" in different, non-standard ways. I'd read up a bit more about it before using it if you are curious, or stick to the standard.