Facebook-like news feed
I've searched Google for an hour now and haven't found anything that answers my question. I'm wondering how the News Feed at Facebook works. How is it put together, how is the database behind it structured, and how is it done?
If any of you have any ideas, please share.
It will be based on your id linked to your friends' ids. The news feed takes the activity from your friends and posts it in the feed. It's straightforward, but you need a good understanding of how tables relate to each other.
This thread is an example of how to relate tables.
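As a rough illustration of linking your id to your friends' ids, here is a minimal sketch. Everything in it is an assumption made for the example: the `users`, `friends` and `threads` tables, their columns, the sample rows, and the use of SQLite (via Python's built-in `sqlite3`) as a stand-in for whatever database the site actually runs on.

```python
import sqlite3

# Hypothetical schema; SQLite is only a stand-in for the site's real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per direction: user_id sees friend_id's activity.
CREATE TABLE friends (
    user_id   INTEGER NOT NULL REFERENCES users(id),
    friend_id INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE threads (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    title      TEXT NOT NULL,
    created_at TEXT NOT NULL
);
INSERT INTO users VALUES (1, 'Ann'), (2, 'John'), (3, 'Zed');
INSERT INTO friends VALUES (1, 2);
INSERT INTO threads VALUES
    (1, 2, 'This is my thread', '2009-01-05 10:00:00'),
    (2, 3, 'Not a friend of Ann', '2009-01-06 09:00:00');
""")

def friend_threads(conn, user_id):
    """Return (name, title, created_at) for threads by user_id's friends, newest first."""
    return conn.execute(
        """
        SELECT u.name, t.title, t.created_at
        FROM friends f
        JOIN threads t ON t.user_id = f.friend_id
        JOIN users u   ON u.id = t.user_id
        WHERE f.user_id = ?
        ORDER BY t.created_at DESC
        """,
        (user_id,),
    ).fetchall()

rows = friend_threads(conn, 1)  # Ann (id 1) only sees John's thread
```

The `friends` table is the piece that ties a reader to the people whose activity should show up; each kind of activity then joins back through `friend_id`.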
Last edited by SyCo; 01-06-2009 at 03:20 PM.
There's no reason to cross-post.
But the activity of your friend is in many different tables, I would guess. Will I have to JOIN each and every table that has anything to do with the activity of my friend and order it by date?
Or is it better to create a new separate table, called "news_feed" for example, and dump every activity in there with a simple text like "John just wrote a new thread called 'This is my thread'" and then query that table?
Thanks for any suggestions!
Depends on how you want to structure your tables. Either way has pros and cons.
You can select from multiple tables every time the news feed page is read, but that carries the overhead of many SQL queries, which requires more memory. It's still not going to be a huge amount. With web development you want to reduce the number of queries you send; memory is cheap, and even if you join 20 tables it's not a huge overhead unless you're storing tons of blob data in multiple tables and the site is incredibly busy. If it's just varchars and ints with a couple of texts it's no big deal.
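One way to read "select from multiple tables" in a single round trip is a UNION ALL over each activity table, sorted by date. That is a swapped-in technique, not necessarily the exact join meant above. The sketch below is hedged accordingly: the `threads` and `photos` tables, their columns, and the sample rows are all hypothetical, and SQLite stands in for the real database.

```python
import sqlite3

# Hypothetical activity tables; SQLite is only a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friends (user_id INTEGER, friend_id INTEGER);
CREATE TABLE threads (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT, created_at TEXT);
CREATE TABLE photos  (id INTEGER PRIMARY KEY, user_id INTEGER, caption TEXT, created_at TEXT);
INSERT INTO friends VALUES (1, 2), (1, 3);
INSERT INTO threads VALUES (1, 2, 'This is my thread', '2009-01-05 10:00:00');
INSERT INTO photos  VALUES (1, 3, 'Sunset', '2009-01-06 08:30:00');
INSERT INTO photos  VALUES (2, 4, 'Stranger photo', '2009-01-07 12:00:00');
""")

# Each branch maps one activity table onto the same four columns,
# so the combined rows can be filtered and sorted as a single list.
FEED_SQL = """
SELECT kind, user_id, summary, created_at FROM (
    SELECT 'thread' AS kind, user_id, title   AS summary, created_at FROM threads
    UNION ALL
    SELECT 'photo'  AS kind, user_id, caption AS summary, created_at FROM photos
) AS activity
WHERE user_id IN (SELECT friend_id FROM friends WHERE user_id = ?)
ORDER BY created_at DESC
LIMIT ?
"""

def read_feed(conn, user_id, limit=20):
    """Build the feed at read time from every activity table."""
    return conn.execute(FEED_SQL, (user_id, limit)).fetchall()

feed = read_feed(conn, 1)
```

Every new kind of activity means another UNION ALL branch, which is the maintenance cost of this approach.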
You might choose to store it in a news feed table, and for that I'd set up triggers on the other tables, but then you have non-normalized data requiring more storage.
Perhaps as a compromise you could add a news feed table and simply store the ids of the records of any table that was changed or added to. Triggers would populate the news feed table and the storage required would be minimal. It would be less server-intensive, so you could then add an AJAX heartbeat to look for new feed items without incurring much of a memory hit.
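The compromise could be sketched roughly like this. The table, column and trigger names are made up for illustration, the trigger syntax shown is SQLite's (MySQL's CREATE TRIGGER differs in detail), and `new_items` is a hypothetical helper for the kind of query a heartbeat request might run.

```python
import sqlite3

# Hypothetical schema; trigger syntax is SQLite's, used here as a stand-in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE threads (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
CREATE TABLE photos  (id INTEGER PRIMARY KEY, user_id INTEGER, caption TEXT);

-- The feed stores only a pointer to the source row, not a copy of its content.
CREATE TABLE news_feed (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id      INTEGER NOT NULL,
    source_table TEXT    NOT NULL,
    source_id    INTEGER NOT NULL,
    created_at   TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER threads_to_feed AFTER INSERT ON threads
BEGIN
    INSERT INTO news_feed (user_id, source_table, source_id)
    VALUES (NEW.user_id, 'threads', NEW.id);
END;

CREATE TRIGGER photos_to_feed AFTER INSERT ON photos
BEGIN
    INSERT INTO news_feed (user_id, source_table, source_id)
    VALUES (NEW.user_id, 'photos', NEW.id);
END;
""")

# Normal application writes; the triggers fill news_feed as a side effect.
conn.execute("INSERT INTO threads (user_id, title) VALUES (2, 'This is my thread')")
conn.execute("INSERT INTO photos (user_id, caption) VALUES (3, 'Sunset')")

def new_items(conn, after_id=0):
    """Feed rows newer than the last id the client has already seen."""
    return conn.execute(
        "SELECT id, user_id, source_table, source_id "
        "FROM news_feed WHERE id > ? ORDER BY id",
        (after_id,),
    ).fetchall()

items = new_items(conn)
```

Because the poll only compares against the last seen `id`, each heartbeat is a cheap lookup on one small table; the full content is fetched from the source table only for rows that are actually new.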
Last edited by SyCo; 01-23-2009 at 12:55 PM.
Advancing the Topic
Think about it... what drives the News Feed is what happens on the individual profiles/Mini-Feeds. If that holds true then the Mini-Feed could be driven by a trigger:
If you post a new thread, a trigger is executed and your Mini-Feed publishes the details.
The Mini-Feed is also broadcasting the RSS data, which is identical to what appears on the Mini-Feed, and the News Feed is essentially an RSS reader that displays the information.
I'm not that great at programming, yet, but it seems to me like building a whole "underside" of tables is just extra weight slowing down the site. Since you're already using user profiles, let the user profile itself be the de facto "table." Lemme know what you think.
All of what you just said... totally over my head. I'll pay you to finish my site, interested?
lol, knowledge is just a Google away.
I don't freelance anymore but plenty of people here do and there are freelance websites where people bid on your contracts.
Aw, come on! You seem like one of the few people with a handle on the News Feed concept, which is really all I need; I can do the rest. If you don't want to take on the whole project I get that, that's cool, but you know as well as I do the News Feed is a complicated piece of technology... at the very least just develop that part for me and I'll implement it from there.
I'm a full-time programmer, and on the weekend I photograph weddings and portraits. I really don't have time to freelance. I make more money with my camera anyway.
I respect that, thanks for the info.
Don't mean to step on your toes or anything, SyCo ... just wanted to clarify: varchars are actually more likely to slow down your queries than blobs and texts will. Blobs and texts are often stored apart from the actual table rows, depending on the storage engine and row format. The real efficiency concerns are storage engine choice, indexing, and query structure: if you're using MySQL, use InnoDB, make sure every table has an auto-incrementing PK, and be smart (and educated) with your queries.
... it's not a huge overhead unless you're storing tons of blob data in multiple tables and the site is incredibly busy. If it's just varchars and ints with a couple of texts it's no big deal.
You'll probably be perfectly fine launching your site with those enormous joins for frequent SELECTs. Just be ready to change things up in a variety of ways if your traffic increases. For the sake of being safe and ready, be aware that it will probably help more than it hurts to add a news table to keep updated alongside other news-worthy updates. And if your site grows significantly in the future, you'll even want to start updating that news table on a periodic basis with a cron job or something.
You'll want to consider what tables/rows need to be locked to process each page and how often those pages are accessed in relation to each other. My impression is that your news feed will be accessed much more frequently than it will be updated. That fact pushes the advantage towards using that separate news table--possibly the only table in your schema that will be truly advantageous to use MyISAM with (because it should never receive any UPDATEs or be part of any JOINs; only simple SELECTs and INSERTs).
This is all highly dependent on your particular schema, query structures/methodologies, and traffic patterns though. I would recommend picking up a book on the topic -- personally, I would suggest getting High Performance MySQL from O'Reilly. Again ... presuming you're using MySQL. And even if you're using some other DBMS, you can bet that most of the same performance practices from MySQL apply to other DBMSs.
You could discuss to death which option is better without finding a solid answer. Enter benchmarking .... once you've got a reasonable dataset to work with, copy that data over to the alternate schema and benchmark them both. Then you can tell us which one is better under your circumstances : )
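A toy version of that kind of benchmark might look like the sketch below. It is heavily hedged: the schema, row counts and queries are invented for the example, SQLite in memory behaves nothing like a busy MySQL server, and the timings it prints mean little beyond showing the method. The one thing worth copying is the shape: load the same data into both layouts, check that both queries return the same rows, then time them.

```python
import sqlite3
import timeit

def build(n=2000):
    """Load the same n synthetic activities into both hypothetical layouts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, user_id INTEGER, created_at INTEGER);
    CREATE TABLE photos  (id INTEGER PRIMARY KEY, user_id INTEGER, created_at INTEGER);
    CREATE TABLE news_feed (id INTEGER PRIMARY KEY, user_id INTEGER,
                            source_table TEXT, source_id INTEGER, created_at INTEGER);
    CREATE INDEX news_feed_created ON news_feed (created_at);
    """)
    for i in range(n):
        table = "threads" if i % 2 == 0 else "photos"  # fixed names, not user input
        user = i % 50
        cur = conn.execute(
            f"INSERT INTO {table} (user_id, created_at) VALUES (?, ?)", (user, i)
        )
        conn.execute(
            "INSERT INTO news_feed (user_id, source_table, source_id, created_at) "
            "VALUES (?, ?, ?, ?)",
            (user, table, cur.lastrowid, i),
        )
    conn.commit()
    return conn

# Layout A: assemble the feed from the activity tables at read time.
UNION_SQL = """
SELECT source_table, source_id, created_at FROM (
    SELECT 'threads' AS source_table, id AS source_id, created_at FROM threads
    UNION ALL
    SELECT 'photos' AS source_table, id AS source_id, created_at FROM photos
) ORDER BY created_at DESC LIMIT 20
"""
# Layout B: read the separate news table directly.
FEED_SQL = """
SELECT source_table, source_id, created_at
FROM news_feed ORDER BY created_at DESC LIMIT 20
"""

conn = build()
union_rows = conn.execute(UNION_SQL).fetchall()
feed_rows = conn.execute(FEED_SQL).fetchall()

# Time each query over the same data.
union_secs = timeit.timeit(lambda: conn.execute(UNION_SQL).fetchall(), number=200)
feed_secs = timeit.timeit(lambda: conn.execute(FEED_SQL).fetchall(), number=200)
print(f"read-time union: {union_secs:.4f}s   news table: {feed_secs:.4f}s")
```

For a real comparison you would run this against the actual DBMS with a copy of production-sized data and include the write side as well, since the news table makes every activity insert slightly more expensive.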
Svidgen please say you will do my site.
Haha ... I don't even have time to work on my sites ... Svidgen.com has a growing list of intended features, but hasn't seen an update in months, and thepointless.com has been broken in IE6 for almost a year ...
Anyway ... let me emphasize the main point: if you figure out what schema works best, you should share it with us. Cuz theory ain't got nothin' on reality.