www.webdeveloper.com

Thread: Login to facebook using urllib

  1. #1
    Join Date
    Jul 2012
    Posts
    4

    Login to facebook using urllib

    Hi!

    I am trying to log in to Facebook and scrape some data programmatically. I would like to use Python and urllib2. This shouldn't be too difficult, but I'm having trouble getting the login to work and I'm not sure what is wrong.

    Please, take a look at my code below, specifically the login function, and point me in the right direction. Any help is greatly appreciated.

    Full disclosure: I'm new to Python and am still learning how HTTP works, but I know the basics.

    Code:
    import urllib, urllib2, cookielib, gzip, StringIO
    
    # will have this info passed in the future
    email = "***@gmail.com"
    passwd = "***"
    
    # One cookie jar and one opener shared by every request, so cookies set by
    # the first GET are sent back automatically with the login POST.
    # build_opener() already includes the default redirect handler.
    cookies = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookies))
    
    headers={
    	'User-Agent':'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1',
    	'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    	'Accept-Language':'en-us,en;q=0.5',
    	#'Accept-Encoding':'gzip',
    	'Connection':'keep-alive',
    	'Cache-Control':'max-age=3600',
    	'Content-Type':'application/x-www-form-urlencoded'
    	}
    
    urls=	{
    	"findfriends":"http://www.facebook.com/find-friends/browser/",
    	"facebook":"http://www.facebook.com/",
    	"index":"http://www.facebook.com/index.php",
    	# Presumably the login form's action; this URL was previously passed as origin_req_host.
    	"login":"https://www.facebook.com/login.php?login_attempt=1"
    	}
    
    def getPage(url,data=None):
    	"""Opens url with the shared opener and returns the response object.
    	url  -- address to request
    	data -- urlencoded POST body; None sends a GET
    	"""
    	req = urllib2.Request(url,data,headers)
    	return opener.open(req)
    
    def login(email,passwd):
    	"""Logs into facebook with passed credentials and returns the resulting HTML
    	"""
    
    	getPage(urls["index"]).read() # Go to facebook.com to get initial cookies
    	
    	# lsd and lgnrnd were copied from one browser session. They look like
    	# per-session values, so they should be read from the page fetched above
    	# instead of being hard-coded (except email and password).
    	opts= (
    	('lsd','AVrB8vRK'),
    	('email',email),
    	('pass',passwd),
    	('persistent','1'), # 1 for a persistent login, 0 for not
    	('default_persistent','1'),
    	# Decoded here because urlencode() below encodes it again
    	('charset_test',urllib.unquote('%E2%82%AC%2C%C2%B4%2C%E2%82%AC%2C%C2%B4%2C%E6%B0%B4%2C%D0%94%2C%D0%84')),
    	('timezone','240'),
    	('return_session','0'),
    	('legacy_return','1'),
    	('display',''),
    	('session_key_only','0'),
    	('lgnrnd','191955_7tXF'),
    	('lgnjs','n'),
    	('login','Log In') # raw value; urlencode() turns the space into '+'
    	)
    
    	data = urllib.urlencode(opts)
    	# origin_req_host only affects cookie policy decisions and is never sent to
    	# the server, so it is dropped and the login URL is used as the POST target.
    	response = getPage(urls["login"],data)
    	
    	# Decode gzip encoding
    	body = response.read()
    	if response.info().get('Content-Encoding') == 'gzip':
    		body = gzip.GzipFile(fileobj=StringIO.StringIO(body)).read()
    	return body
    
    if __name__=="__main__":
    	
    	# Write output to fb.html, then check fb.html to see if we are logged in.
    	with open('./fb.html', 'wb') as f:
    		f.write(login(email,passwd))
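
    On the open question of how to get the form values out of the first response: below is a minimal sketch, not a tested Facebook login. It assumes values such as lsd and lgnrnd appear as hidden <input> fields in the login page HTML. HiddenFieldParser, build_login_body and SAMPLE_HTML are hypothetical names, and the sample markup is made up to stand in for the real page. The imports try the Python 3 module names first and fall back to the Python 2 names used in the code above.

```python
# Sketch: collect the hidden <input> fields of a login form and build a POST body.
try:  # Python 3 module names
    from html.parser import HTMLParser
    from urllib.parse import urlencode
except ImportError:  # Python 2 module names
    from HTMLParser import HTMLParser
    from urllib import urlencode


class HiddenFieldParser(HTMLParser):
    """Collects (name, value) pairs from every <input type="hidden"> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<input ... />) are routed here as well by default.
        if tag != "input":
            return
        attrs = dict(attrs)
        if (attrs.get("type") or "").lower() == "hidden" and attrs.get("name"):
            self.fields.append((attrs["name"], attrs.get("value") or ""))


def build_login_body(html, email, passwd):
    """Returns a urlencoded body: hidden form fields plus the credentials."""
    parser = HiddenFieldParser()
    parser.feed(html)
    # Pass raw values; urlencode() percent-encodes each one exactly once.
    return urlencode(parser.fields + [("email", email), ("pass", passwd)])


# Made-up markup standing in for the real login page (an assumption).
SAMPLE_HTML = """
<form id="login_form" action="/login.php" method="post">
  <input type="hidden" name="lsd" value="AVrB8vRK" />
  <input type="hidden" name="lgnrnd" value="191955_7tXF" />
  <input type="text" name="email" />
  <input type="password" name="pass" />
  <input type="submit" value="Log In" />
</form>
"""

body = build_login_body(SAMPLE_HTML, "user@example.com", "secret pass")
print(body)
# lsd=AVrB8vRK&lgnrnd=191955_7tXF&email=user%40example.com&pass=secret+pass
```

    In the script above, the HTML passed to build_login_body would be the body of the first GET, and the returned string would be the data for the login POST. Reading the fields from the page each time avoids stale per-session values, and passing raw strings to urlencode avoids encoding them twice.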

  2. #2
    Join Date
    Jul 2012
    Posts
    4
    There are plenty of views but no responses. Is my question too broad or unclear? Or maybe you're just on vacation and have better things to do.
