
Posts Tagged ‘imdb’

This is a sample question from the codejam contest conducted by MobME Wireless. The solution uses the IMDbPY Python module to retrieve movie information from IMDb.

For those who don’t know about this event: http://codejam.mobme.in/

Sample Question :

Write a program that ranks Hollywood actors based on the number of their appearances in a list of top 100 movies. There are a number of top movie lists on the Internet and it’s up to you to choose one. We’d prefer you choose one that has an open API.

Solution in Python :

#!/usr/bin/env python

__author__ = "Rag Sagar.V"
__email__ = '@'.join(['ragsagar','.'.join([_ for _ in ['gmail','com']])])


from twisted.internet import reactor, threads
import re,imdb,itertools


actors_rating = {} #actors_rating['actor name'] = rank
rank = 0
count = 1
current_rank = 0
concurrent = 5
finished = itertools.count(1)
reactor.suggestThreadPoolSize(concurrent)


try:
	imdb_access = imdb.IMDb()
	top_100 = imdb_access.get_top250_movies()[:100]
except imdb.IMDbError, err:
	# Without a working IMDb connection there is nothing to rank, so exit.
	raise SystemExit(err)


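def populate_actors(mid):
	# Runs in a worker thread: fetch one movie and count its first two billed actors.
	try:
		movie = imdb_access.get_movie(int(mid))
		for person in movie['cast'][:2]:
			actor_name = person['name']
			actors_rating[actor_name] = actors_rating.get(actor_name, 0) + 1
	finally:
		# Count the job as done even if the lookup failed, so the reactor always stops.
		if finished.next() == added:
			# reactor.stop() is not thread-safe; schedule it on the reactor thread.
			reactor.callFromThread(reactor.stop)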
	
added = 0
for movie in top_100:
	added += 1	
	req = threads.deferToThread(populate_actors, movie.getID())

try:
	reactor.run()
except KeyboardInterrupt:
	reactor.stop()	

   
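# Print actors by appearance count, highest first. Ties share a rank and
# the next rank skips ahead (1, 1, 3, ...).
for actor in sorted(actors_rating, key=actors_rating.get, reverse=True):
	previous_rank = current_rank
	current_rank = actors_rating[actor]	# appearance count, despite the name
	if previous_rank != current_rank:
		rank += count
		count = 1
	else:
		count += 1
	print rank,actor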
    
    

Dependencies :
imdbpy, twisted
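
The two halves of the script, counting appearances in a pool of worker threads and then ranking with ties, can be tried offline. Below is a minimal sketch in Python 3, not the contest solution itself: it swaps Twisted's deferToThread for the standard library's ThreadPoolExecutor, and `FAKE_CASTS`, `fetch_cast`, `count_appearances` and `rank_with_ties` are hypothetical names with made-up data standing in for the IMDb lookups.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Made-up cast lists standing in for IMDb lookups (illustration only).
FAKE_CASTS = {
    1: ["Actor A", "Actor B", "Actor C"],
    2: ["Actor A", "Actor C"],
    3: ["Actor B", "Actor A"],
    4: ["Actor D", "Actor C"],
}

def fetch_cast(movie_id):
    """Hypothetical stand-in for a network lookup: first two cast names."""
    return FAKE_CASTS[movie_id][:2]

def count_appearances(movie_ids, workers=5):
    """Fetch casts in a thread pool, then tally names in the calling thread."""
    counts = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Only the lookups run in worker threads; the tally is updated here,
        # so no dictionary is mutated from several threads at once.
        for names in pool.map(fetch_cast, movie_ids):
            counts.update(names)
    return counts

def rank_with_ties(counts):
    """Return (rank, actor, appearances); ties share a rank, the next rank skips."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    ranked = []
    rank = 0
    previous = None
    for position, (actor, appearances) in enumerate(ordered, start=1):
        if appearances != previous:
            rank = position
            previous = appearances
        ranked.append((rank, actor, appearances))
    return ranked

for rank, actor, appearances in rank_with_ties(count_appearances([1, 2, 3, 4])):
    print(rank, actor, appearances)
# prints:
# 1 Actor A 3
# 2 Actor B 2
# 2 Actor C 2
# 4 Actor D 1
```

Tallying in the calling thread rather than inside each worker is a design choice of this sketch; it avoids read-modify-write races on a shared dictionary.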


Here is a script I wrote last night. It finds the IMDb rating of a movie.

#ragsagar[at]gmail.com
from mechanize import Browser
from BeautifulSoup import BeautifulSoup
import sys,re

if len(sys.argv) != 2:
	print "\nSyntax: python %s 'Movie title'" % (sys.argv[0])
	sys.exit(1)
else :
	movie = '+'.join(sys.argv[1].split())

br = Browser()
br.open("http://www.imdb.com/find?s=tt&q="+movie)
try :
	# Follow the first search result that links to a title page (/title/tt<digits>).
	link = br.find_link(url_regex = re.compile(r"/title/tt\d+"))
	res = br.follow_link(link)
	soup = BeautifulSoup(res.read())
	title = soup.find('h1').contents[0].strip()
	rating = soup.find('span',attrs='rating-rating').contents[0]
	print "Movie : ",title
	print "Rating: ",rating,"/ 10.0"
except Exception:
	# No matching link, or the page did not contain the expected tags.
	print "Not Found!"
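
Joining the words of the title with '+' works for simple titles, but a character such as '&' would split the query string. Here is a small sketch in Python 3 of building the same search URL with the standard library's `urllib.parse.urlencode`; `build_search_url` and `SEARCH_URL` are hypothetical names, not part of the script.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://www.imdb.com/find"  # same endpoint the script opens

def build_search_url(title):
    """Hypothetical helper: return the title-search URL with the query escaped."""
    # A list of pairs keeps the parameter order fixed. urlencode escapes each
    # value, so spaces become '+' and '&' becomes '%26'.
    return SEARCH_URL + "?" + urlencode([("s", "tt"), ("q", title)])

print(build_search_url("The Dark Knight"))
# prints: http://www.imdb.com/find?s=tt&q=The+Dark+Knight
print(build_search_url("Fast & Furious"))
# prints: http://www.imdb.com/find?s=tt&q=Fast+%26+Furious
```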

[Screenshot: output of the script]
