Spiderbuf
S07 - Scraping data loaded dynamically via AJAX
Published: 1718094739 (Unix timestamp)
Views: 745
# coding=utf-8
import requests
import json
url = 'https://spiderbuf.cn/web-scraping-practice/iplist?order=asc'
myheaders = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}
data_json ...
S06 - Analyzing the source of a page with an iframe and scraping its data
Published: 1718094696 (Unix timestamp)
Views: 740
# coding=utf-8
import requests
from lxml import etree
url = 'https://spiderbuf.cn/web-scraping-practice/inner'
myheaders = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}
html = req...
S05 - Scraping images from a web page and saving them locally
Published: 1718093764 (Unix timestamp)
Views: 879
# coding=utf-8
import requests
from lxml import etree
url = 'https://spiderbuf.cn/web-scraping-practice/scraping-images-from-web'
myheaders = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537....
S04 - Analyzing pagination parameters and scraping across pages
Published: 1718093716 (Unix timestamp)
Views: 827
# coding=utf-8
import requests
from lxml import etree
import re
base_url = 'https://spiderbuf.cn/web-scraping-practice/web-pagination-scraper?pageno=%d'
myheaders = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91...
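The `pageno=%d` placeholder in the excerpt above suggests that each page is reached by substituting a page number into `base_url`. A minimal sketch of that substitution follows; the helper name `build_page_urls` and the page count of 3 are assumptions made for illustration and do not come from the original article, and no network request is made.

```python
# Hypothetical helper, not taken from the original article: it expands the
# pageno=%d template seen in the excerpt into concrete page URLs.
base_url = 'https://spiderbuf.cn/web-scraping-practice/web-pagination-scraper?pageno=%d'


def build_page_urls(template, last_page):
    """Return the URLs for pages 1..last_page (inclusive)."""
    return [template % page for page in range(1, last_page + 1)]


# last_page=3 is an assumed value; the real page count has to be read from the site.
page_urls = build_page_urls(base_url, 3)
for page_url in page_urls:
    print(page_url)
```

Each generated URL could then be fetched in turn with the same request headers; how the article itself determines the last page is not visible in the truncated excerpt.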
S03 - Advanced lxml syntax and parsing practice
Published: 1718093665 (Unix timestamp)
Views: 817
# coding=utf-8
import requests
from lxml import etree
url = 'https://spiderbuf.cn/web-scraping-practice/lxml-xpath-advanced'
myheaders = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'...