Scrapy + MySQL Crawler: Zhihu Hot List (Part 2)


pipelines.py

```python
import pymysql.cursors


class ZhihuspiderPipeline(object):
    def __init__(self):
        # Connect to the local MySQL database created for this project
        self.connect = pymysql.connect(
            host='127.0.0.1',
            port=3306,
            database='scrapymysql',
            user='root',
            password='password',
            charset='utf8'
        )
        self.cursor = self.connect.cursor()

    def process_item(self, item, spider):
        # Parameterized insert; the trailing comma makes a one-element tuple
        self.cursor.execute(
            """insert into zhihu_hot(tag) values (%s)""",
            (item['hot_topic'],)
        )
        self.connect.commit()
        return item
```
settings.py

```python
BOT_NAME = 'zhihuSpider'

SPIDER_MODULES = ['zhihuSpider.spiders']
NEWSPIDER_MODULE = 'zhihuSpider.spiders'

# Enable the MySQL pipeline (lower numbers run earlier in the pipeline chain)
ITEM_PIPELINES = {
    'zhihuSpider.pipelines.ZhihuspiderPipeline': 300,
}

ROBOTSTXT_OBEY = False
```
Once everything above is written, open a cmd window in the project's root folder and run `scrapy crawl zhihu` to start the spider.