Click to See Complete Forum and Search --> : Optimize many-to-many queries


FeelLikeANut
02-19-2009, 08:03 PM
First let me show you what I have.

Table: MusicAlbums
* ID
* Title

Table: MusicGenres
* ID
* Genre

Table: MusicAlbums_Genres
* AlbumID
* GenreID

My question is about selecting albums that belong to certain genres. So far I've been achieving the result like this.

select
Title
from
MusicAlbums
where
ID in (
select AlbumID
from MusicAlbums_Genres
where GenreID = (
select ID
from MusicGenres
where Genre = 'Pop'
)
)
and ID in (
select AlbumID
from MusicAlbums_Genres
where GenreID = (
select ID
from MusicGenres
where Genre = 'Rock'
)
)

This works, but I assume it can be done another way more efficiently. Any suggestions?

NogDog
02-20-2009, 05:20 AM
I think you could just do a join:

SELECT Title
FROM MusicAlbums
INNER JOIN MusicAlbums_Genres ON MusicAlbums.AlbumID = MusicAlbums_Genres.AlbumID
INNER JOIN MusicGenres ON MusicGenres.GenreID = MusicAlbums_Genres.GenreID
WHERE MusicGenres.Genre IN ('Pop', 'Rock')

FeelLikeANut
02-20-2009, 07:22 AM
Hmm... seems close, but not quite right. My original query will find albums belonging to both Pop and Rock. The query you posted will find albums belonging to either Pop or Rock. :(

NogDog
02-20-2009, 02:06 PM
. . . WHERE MusicGenres.Genre = 'Pop' AND MusicGenres.Genre = 'Rock'

FeelLikeANut
02-20-2009, 04:33 PM
Still not there yet. That filters out every record. After the join, an album will be listed on multiple rows, one row for each of its genres. But any given row will not be both Pop and Rock.

NogDog
02-20-2009, 06:00 PM
Ah, yeah, I see what you mean. I'll have to think some more....

NogDog
02-20-2009, 06:31 PM
OK: I set up a little DB and tested this, and it works. As to whether it is "more efficient", that's a whole other question. :)

SELECT
Title
FROM
MusicAlbums AS ma
INNER JOIN MusicAlbums_Genres AS mag_1 ON ma.ID = mag_1.AlbumID
INNER JOIN MusicAlbums_Genres AS mag_2 ON ma.ID = mag_2.AlbumID
INNER JOIN MusicGenres AS mg_1 ON mg_1.ID = mag_1.GenreID
INNER JOIN MusicGenres AS mg_2 ON mg_2.ID = mag_2.GenreID
WHERE
mg_1.Genre = 'Pop' AND
mg_2.Genre = 'Rock'


PS: Here are the tables I used:

-- phpMyAdmin SQL Dump
-- version 2.11.6
-- http://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: Feb 20, 2009 at 06:33 PM
-- Server version: 5.0.51
-- PHP Version: 5.2.6

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";

--
-- Database: `test`
--

-- --------------------------------------------------------

--
-- Table structure for table `MusicAlbums`
--

CREATE TABLE `MusicAlbums` (
`ID` int(10) unsigned NOT NULL auto_increment,
`Title` varchar(255) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=6 ;

--
-- Dumping data for table `MusicAlbums`
--

INSERT INTO `MusicAlbums` (`ID`, `Title`) VALUES
(1, 'Dark Side of the Moon'),
(2, 'Chicago Transit Authority'),
(3, 'Abba''s Greates Hits'),
(4, 'Goodbye Yellow Brick Road'),
(5, 'I Am');

-- --------------------------------------------------------

--
-- Table structure for table `MusicAlbums_Genres`
--

CREATE TABLE `MusicAlbums_Genres` (
`AlbumID` int(10) unsigned NOT NULL,
`GenreID` int(10) unsigned NOT NULL,
PRIMARY KEY (`AlbumID`,`GenreID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

--
-- Dumping data for table `MusicAlbums_Genres`
--

INSERT INTO `MusicAlbums_Genres` (`AlbumID`, `GenreID`) VALUES
(1, 2),
(2, 1),
(2, 2),
(3, 1),
(4, 1),
(4, 2),
(5, 1),
(5, 3);

-- --------------------------------------------------------

--
-- Table structure for table `MusicGenres`
--

CREATE TABLE `MusicGenres` (
`ID` int(10) unsigned NOT NULL auto_increment,
`Genre` varchar(255) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;

--
-- Dumping data for table `MusicGenres`
--

INSERT INTO `MusicGenres` (`ID`, `Genre`) VALUES
(1, 'Pop'),
(2, 'Rock'),
(3, 'Soul'),
(4, 'Blues');

And here is the query result:

Chicago Transit Authority
Goodbye Yellow Brick Road

FeelLikeANut
02-20-2009, 07:16 PM
Yeah, I think you're right. I think that's the only way to do it with joins. Unfortunately, after testing, it seems terribly less efficient. And if I were to filter on a third genre, I'd need a third MusicAlbums_Genres join and a third MusicGenres join, and then it runs even slower. Significantly slower.

So I suppose what I had originally might just be the best way.

NogDog
02-20-2009, 09:14 PM
One other thought:

SELECT Title FROM (
SELECT Title, COUNT( * ) AS id_count
FROM MusicAlbums
INNER JOIN MusicAlbums_Genres ON MusicAlbums.ID = MusicAlbums_Genres.AlbumID
WHERE MusicAlbums_Genres.GenreID IN (
SELECT ID FROM MusicGenres WHERE Genre IN ('Pop', 'Rock')
)
GROUP BY MusicAlbums.Title
) AS subq
WHERE subq.id_count = 2

Again, I have no idea how it performs compared to your original solution, but it looks cool. :cool: