Retro-tagging a WordPress blog

I’ve got a WordPress blog with a couple thousand posts, upgraded several times, from a TWiki blog to a Radio Userland blog to WordPress and a couple of upgrades. And I’d like to add the new taggin features available in WordPress 2.3. Here’s what I’ve figured out from poking around: what used to be wp_category is now wp_terms and terms contains both categories and tags. The categories are hierarchical from a self-join using the group column to link back into the term’s term_id primary key. wp_terms in turn links into wp_term_taxonomy: that’s got a column named taxonomy that contains groups like ‘category’ and ‘link_tag’ and a parent column that self links into term_id, which I think is a duplication of what wp_terms is doing. There’s also a count field, meaning that we’re denormalizing the count of categories and tags, meaning if we hack at them, we’ll need to manually update this (see below). Finally, the posts are related to categories and tags using the table wp_term_relationships that’s a many-to-many link from the post’s primary key, object_id, to the term_taxonomy_id primary key of the wp_term_taxonomy table.

So, here’s the recipe to retrofit tags into some posts. I used phpMyAdmin as the web interface into the MySQL table that holds WordPress’ data. You can do the same via the mysql terminal interface or using other remote connectivity like ODBC or your choice of tools.

1. Create or update a tag in the wp_terms table. Note the term_id PK.
2. Insert the term_id and category of ‘post_tag’ into the wp_term_taxonomy table.
3. Create the linking records in wp_term_relationships with the following SQL. In this case, I’m tagging with “Ruby” any post that has “Ruby” in the title or post.

INSERT INTO wp_term_relationships
SELECT wpp.id object_id, `term_taxonomy_id`
FROM wp_posts wpp, `wp_term_taxonomy` wptt
JOIN wp_terms wpt ON wptt.term_id=wpt.term_id
WHERE wpt.name='Ruby'
AND wptt.taxonomy='post_tag'
AND (wpp.post_content LIKE '%Ruby%'
OR wpp.post_title LIKE '%Ruby%' )
ON DUPLICATE KEY UPDATE
wp_term_relationships.term_taxonomy_id=
wp_term_relationships.term_taxonomy_id

(The ON DUPLICATE code is just a quick hack to tell MySQL to ignore it if you’ve already tagged some posts Ruby by hand.)

4. Hack away. When you’re all done, you’ll want to fix the count in the table with the code below:

UPDATE wp_term_taxonomy wptt
SET wptt.count=
(SELECT count(*)
FROM wp_term_relationships wptr
WHERE wptr.term_taxonomy_id = wptt.term_taxonomy_id
GROUP BY term_taxonomy_id)

, , ,

Powered by WordPress. Designed by Woo Themes

This work by Ted Roche is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States.