Click to See Complete Forum and Search --> : Social Networking Algorithm/SQL


avekm
06-20-2006, 05:21 PM
Alright I figured I'd let you geniuses have a stab at this since I don't know where else to start. How does social networking sites like myspace etc... link users so quickly when there is no direct link between them. For example user A <--B--> C. In this example myspace is able to tell me that user A is linked to user C through user B. So with myspaces millions and millions of users how the heck can they do this so fast and to the 6th degree none the less.

I'm sure that there are tons of ways to accomplish this but for me the only way I can think to do this is through SQL to join the friends table to itself thus getting me to user C. For example I have a table:

Friends
user_id friend_id
A B
B C



I could do this with SQL:
SELECT friend_id
from friends a, friends b
where a.friend_id = b.user_id
and a.user_id = 'A'

That would get me C but with millions and millions of records that would be an insane query especially if I tried to take it a few degree's further. So how can this be accomplished with minimizing the hits to the database and using some server processing power instead. Does anyone know of any algorithms etc... that would help accomplish this? My brain is too small to comprehend this complexity. Have a super great day folks.

chazzy
06-20-2006, 06:13 PM
i believe myspace has their table setup like this:

friend_cross_product:
left_friend right_friend
user_a user_b
user_b user_a
user_c user_b
user_b user_c

so the friends in common between user_a and user_c would be:

select right_friend from friend_cross_product where right_friend in (select right_friend from friend_cross_product WHERE left_friend='user_a') and left_friend='user_c'

i don't believe anyone goes beyond 2 levels of connectiveness, and even then most only go to 1. when determining the # of people in your network - it's typically the sum of the counts of distinct friends.