imdb | r4g54g4r's h4ckl0g

Posts Tagged ‘imdb’

Python script that ranks Hollywood actors based on number of appearances in top 100 movies

Posted in Python, Scripts, tagged actors ranking, codejam, imdb, imdbpy, Python on June 15, 2011

This is a sample question from the CodeJam contest conducted by MobME Wireless. The IMDbPY Python module is used to retrieve movie information from IMDb.

For those who don’t know about this event: http://codejam.mobme.in/

Sample Question :

Write a program that ranks Hollywood actors based on the number of their appearances in a list of top 100 movies. There are a number of top movie lists on the Internet and it’s up to you to choose one. We’d prefer you choose one that has an open API.

Solution in Python :

#!/usr/bin/env python

__author__ = "Rag Sagar.V"
__email__ = '@'.join(['ragsagar','.'.join([_ for _ in ['gmail','com']])])


from twisted.internet import reactor, threads
import imdb, itertools


actors_rating = {} #actors_rating['actor name'] = rank
rank = 0
count = 1
current_rank = 0
concurrent = 5
finished = itertools.count(1)
reactor.suggestThreadPoolSize(concurrent)


try:
	imdb_access = imdb.IMDb()
	top_100 = imdb_access.get_top250_movies()[:100]
except imdb.IMDbError as err:
	# Nothing can be ranked without the movie list, so exit with the error message.
	raise SystemExit(str(err))


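def record_actors(names):
	# Runs in the reactor thread, so the shared dict and counter need no lock.
	for actor_name in names:
		actors_rating[actor_name] = actors_rating.get(actor_name, 0) + 1
	if next(finished) == added:
		reactor.stop()

def populate_actors(mid):
	# Runs in a worker thread; only the first two billed cast members are counted.
	names = []
	try:
		movie = imdb_access.get_movie(int(mid))
		names = [person['name'] for person in movie.get('cast', [])[:2]]
	finally:
		# Report back even on failure so the reactor is always stopped at the end.
		reactor.callFromThread(record_actors, names)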
	
added = 0
for movie in top_100:
	added += 1	
	req = threads.deferToThread(populate_actors, movie.getID())

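# reactor.run() installs its own SIGINT handler, so Ctrl-C stops the reactor cleanly;
# calling reactor.stop() after run() has returned would raise an error.
reactor.run()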

   
for actor in sorted(actors_rating, key=actors_rating.get, reverse=True):
	previous_rank = current_rank
	current_rank = actors_rating[actor]
	if previous_rank !=  current_rank :
		rank += count
		count = 1
	else:
		count += 1	
	print rank,actor

Dependency :
imdbpy
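
The two ideas in the script, fetching casts concurrently and ranking actors so that ties share a rank, can be tried without network access. What follows is only a minimal Python 3 sketch under stated assumptions, not the script above: it swaps Twisted for the standard library's `concurrent.futures`, and the names `count_lead_actors`, `competition_rank` and `FAKE_CASTS` are made up for illustration. The `fetch_cast` argument stands in for whatever IMDb lookup is used.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_lead_actors(movie_ids, fetch_cast, per_movie=2, workers=5):
    """Fetch each cast in a worker thread and tally the first `per_movie` names."""
    counts = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map hands results back to the calling thread,
        # so the Counter is only ever updated from one thread.
        for cast in pool.map(fetch_cast, movie_ids):
            counts.update(cast[:per_movie])
    return counts


def competition_rank(counts):
    """Return (rank, actor, appearances) rows; tied actors share a rank."""
    rows = []
    rank, previous = 0, None
    # Most appearances first; names break ties so the order is stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    for position, (actor, appearances) in enumerate(ordered, start=1):
        if appearances != previous:
            rank, previous = position, appearances
        rows.append((rank, actor, appearances))
    return rows


# Hypothetical stand-in for the IMDb lookup: movie id -> billed cast.
FAKE_CASTS = {
    1: ["Actor A", "Actor B", "Actor C"],
    2: ["Actor A", "Actor C"],
    3: ["Actor B", "Actor A"],
    4: ["Actor D"],
}

counts = count_lead_actors(FAKE_CASTS, FAKE_CASTS.get)
for rank, actor, appearances in competition_rank(counts):
    print(rank, actor, appearances)
```

With the sample data, Actor C and Actor D each appear once among the first two billed names, so they share rank 3. Doing the tally in the calling thread is a design choice that avoids locking around the shared counter.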


Python script to find Imdb rating

Posted in Uncategorized, tagged BeautifulSoup, command line, imdb, mechanize, movie, Python, rating, script on November 20, 2010

Here is a script I wrote last night. It finds the IMDb rating of a movie.

#ragsagar[at]gmail.com
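from mechanize import Browser, LinkNotFoundError
from BeautifulSoup import BeautifulSoup
from urllib import quote_plus
import re, sys

if len(sys.argv) != 2:
	print "\nSyntax: python %s 'Movie title'" % (sys.argv[0])
	sys.exit(1)

# URL-encode the title so spaces and characters such as '&' survive in the query string.
movie = quote_plus(sys.argv[1])

br = Browser()
try:
	br.open("http://www.imdb.com/find?s=tt&q=" + movie)
	# Follow the first search result that points at a title page (/title/tt<digits>).
	link = br.find_link(url_regex=re.compile(r"/title/tt\d+"))
	res = br.follow_link(link)
	soup = BeautifulSoup(res.read())
	title = soup.find('h1').contents[0].strip()
	rating = soup.find('span', attrs='rating-rating').contents[0]
	print "Movie : ", title
	print "Rating: ", rating, "/ 10.0"
except (LinkNotFoundError, AttributeError, IndexError):
	# No title link in the search results, or the page lacks the expected tags.
	print "Not Found!"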

Here is a screenshot of the output.
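
The two string-handling steps of the script, building the search URL and recognising a link to a title page, can be checked on their own. Below is a small Python 3 sketch using only the standard library. The helper names `build_search_url` and `first_title_link` and the sample hrefs are hypothetical; the pattern assumes title links look like `/title/tt` followed by digits, and the search URL is the one used in the script.

```python
import re
from urllib.parse import quote_plus

SEARCH_URL = "http://www.imdb.com/find?s=tt&q="  # same endpoint as the script
TITLE_LINK = re.compile(r"/title/tt\d+")         # assumed shape of a title link


def build_search_url(title):
    """URL-encode the title so spaces and '&' do not break the query string."""
    return SEARCH_URL + quote_plus(title)


def first_title_link(hrefs):
    """Return the first href that looks like a title page, or None."""
    for href in hrefs:
        if TITLE_LINK.search(href):
            return href
    return None


# Hypothetical hrefs, as they might appear on a search results page.
sample_hrefs = ["/name/nm0000001/", "/title/tt0000001/"]

print(build_search_url("Fast & Furious"))
print(first_title_link(sample_hrefs))
```

Requiring digits after `tt` keeps the pattern from matching unrelated paths that merely start with `/title/t`.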