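Posted on 12-4-9 via averiany涂鸦馆, by averainy
I came across this Python script for batch-querying the PR (PageRank) of websites on the 老王python blog. Webmasters should find it handy, so I am posting it here to share: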
# -*- coding: utf-8 -*-
# Python 2 script: reads one URL per line from info.txt and writes "url,pr" rows to pr.txt.
import re, urllib, httplib, time

def get_url(url):
    '''Return the host part of a URL.'''
    host_re = re.compile(r'^https?://(.*?)($|/)', re.IGNORECASE)
    # group(1) is the host, without the scheme or the trailing slash.
    return host_re.search(url).group(1)

def get_pr(url):
    '''Look up the PR of a URL through pr.chinaz.com.'''
    params = urllib.urlencode({'PRAddress': url})
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain",
               "User-agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
               "Referer": "http://pr.chinaz.com/?PRAddress=www.baidu.com"}
    # Step 1: fetch the query page and cut out the 32-character enkey token.
    conn = httplib.HTTPConnection("pr.chinaz.com")
    conn.request("GET", "", params, headers)
    response = conn.getresponse()
    data = response.read()
    datautf8 = data.decode('utf-8')
    posin = datautf8.find('enkey')
    keyinfo = datautf8[posin+6:posin+38]
    # Step 2: call the ajax endpoint with that token and the target URL.
    opener = urllib.FancyURLopener()
    opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)')]
    hosturl = "http://pr.chinaz.com/ajaxsync.aspx?at=pr&enkey=%s&url=%s" % (keyinfo, url)
    info = opener.open(hosturl).read()
    cinfo = info.decode('utf-8').encode('gbk')
    # Step 3: the first digit in the response is taken as the PR value.
    num_re = re.compile(r'[0-9]')
    pr_num = num_re.search(cinfo).group(0)
    print pr_num
    return pr_num

f = open('pr.txt', 'w')
for m in open('info.txt', 'r'):
    murl = m.strip()
    # checkurl = get_url(murl)
    try:
        prnum = get_pr(murl)
    except Exception, e:
        # A failed lookup is recorded as -1 and the loop moves on without waiting.
        f.write("%s,%s\n" % (murl, -1))
        continue
    f.write("%s,%s\n" % (murl, prnum))
    time.sleep(5)  # pause between successful lookups
f.close()
Source: 老王python
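The script above is written for Python 2, and its get_url helper is never called (the call is commented out). As a rough sketch only, assuming Python 3, the two local steps (reducing each input line to a host name, and producing "url,pr" rows with -1 for a failed lookup) might look like this. The names extract_host, run_batch, lookup, and delay are hypothetical and do not come from the original post. The lookup callback stands in for get_pr, so no request to pr.chinaz.com is made here, and passing the host rather than the raw line to the lookup is an assumption of this sketch.

```python
import time
from urllib.parse import urlsplit


def extract_host(url):
    """Return the host name of a URL, accepting input with or without a scheme."""
    url = url.strip()
    # urlsplit only fills in the network location when "://" is present,
    # so a bare host such as "www.baidu.com" gets a default scheme first.
    if "://" not in url:
        url = "http://" + url
    return urlsplit(url).hostname or ""


def run_batch(lines, lookup, delay=0.0):
    """Return "url,pr" rows; a lookup that raises is recorded as -1."""
    rows = []
    for line in lines:
        url = line.strip()
        host = extract_host(url)
        if not host:
            continue  # skip blank lines
        try:
            pr = lookup(host)  # hypothetical stand-in for get_pr
        except Exception:
            pr = -1
        else:
            time.sleep(delay)  # wait only after a successful lookup, as the script does
        rows.append("%s,%s" % (url, pr))
    return rows
```

With a Python 3 port of get_pr in hand, something like run_batch(open('info.txt'), get_pr, delay=5) would approximate the original loop. Keeping the lookup as a parameter also lets the file handling be tried out without touching the network.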