Python crawler

Time:11-06

Project 1: Crawl course data from the booz learning platform, clean it, and store it in a database
Subproject 1: Create the database for storing booz course data
1. Create the database course
2. Create the course information table ifly_course_info, with fields for the record number (id), course name (name), course image (pic), course category, and number of participants (num)
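
The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the outline does not say which database server is used, so the standard-library sqlite3 module stands in for it here, and the column name type for the course category, the helper create_course_table, and the file name course.db are all hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for the real database server, and the
# column name "type" (course category) is a guess, not from the outline.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS ifly_course_info (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- record number
    name TEXT NOT NULL,                      -- course name
    pic  TEXT,                               -- course image link
    type TEXT,                               -- course category
    num  INTEGER DEFAULT 0                   -- number of participants
)
"""


def create_course_table(db_path="course.db"):
    """Open (or create) the database file and make sure the table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute(CREATE_TABLE_SQL)
    conn.commit()
    return conn
```

Calling create_course_table(":memory:") builds the same table in memory, which is convenient for trying out the later steps without touching a file.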
Subproject 2: Simulate a login to the booz learning platform with cookies
1. Obtain the login cookie information
2. On the booz course page, find the URL that returns the course information
3. Use the Scrapy framework with the login cookie to simulate a logged-in session and request the course information URL
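
One possible way to wire these steps together is sketched below. It assumes the login cookie has been copied from the browser as a single "name=value; name=value" header string. The helpers parse_cookie_header and build_course_request are hypothetical names introduced for this example, and the real course URL and cookie names are not given in the outline, so none are filled in here.

```python
def parse_cookie_header(header):
    """Turn a raw 'a=1; b=2' Cookie header string into a dict."""
    cookies = {}
    for part in header.split(";"):
        if "=" not in part:
            continue  # skip empty fragments such as a trailing ';'
        name, value = part.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


def build_course_request(url, cookie_header, callback=None):
    """Build a Scrapy request that carries the login cookies.

    Assumption: `url` is the course information URL found in step 2.
    """
    import scrapy  # imported lazily so the parser above works without Scrapy

    return scrapy.Request(
        url,
        cookies=parse_cookie_header(cookie_header),
        callback=callback,
    )
```

Inside a spider, the request returned by build_course_request would typically be yielded from start_requests, with callback set to the method that parses the course data.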
Subproject 3: Clean the data and store it in the database
1. Store the course data (course name, category, image link, number of participants) in the ifly_course_info table
2. Save the images locally
3. Query the data in the ifly_course_info table, count how many open courses and how many authorized courses there are, and plot the five courses with the highest participation as a bar chart (using either echarts or matplotlib)
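
The storage and analysis steps might look roughly like the sketch below. It makes several assumptions that are not in the outline: sqlite3 stands in for the real database, the category column is called type, scraped values arrive as strings in a dict, and all function names (open_db, clean_course, save_courses, count_by_type, top_five, plot_top_five) are hypothetical.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create ifly_course_info if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ifly_course_info ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, pic TEXT, type TEXT, num INTEGER)"
    )
    return conn


def clean_course(raw):
    """Strip whitespace and convert the participation count to an int."""
    return (
        raw["name"].strip(),
        raw["pic"].strip(),
        raw["type"].strip(),
        int(raw["num"]),
    )


def save_courses(conn, courses):
    """Insert cleaned (name, pic, type, num) tuples into the table."""
    conn.executemany(
        "INSERT INTO ifly_course_info (name, pic, type, num) VALUES (?, ?, ?, ?)",
        courses,
    )
    conn.commit()


def count_by_type(conn):
    """Return {category: number of courses}."""
    rows = conn.execute(
        "SELECT type, COUNT(*) FROM ifly_course_info GROUP BY type"
    )
    return dict(rows.fetchall())


def top_five(conn):
    """Return the five (name, num) rows with the highest participation."""
    rows = conn.execute(
        "SELECT name, num FROM ifly_course_info ORDER BY num DESC, name LIMIT 5"
    )
    return rows.fetchall()


def plot_top_five(rows, out_path="top5.png"):
    """Draw the top-five rows as a bar chart and save it to a file."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar([name for name, _ in rows], [num for _, num in rows])
    ax.set_xlabel("Course")
    ax.set_ylabel("Participants")
    ax.set_title("Top five courses by participation")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

A typical run would clean each scraped item with clean_course, pass the results to save_courses, and then hand the output of top_five to plot_top_five. The matplotlib import is kept inside plot_top_five so the database helpers can be used on their own.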