H06 - A First Look at Browser Fingerprinting: How Selenium Gets Detected by Anti-Scraping Measures
# coding=utf-8
import base64
import hashlib
import time
import requests
from lxml import etree
from selenium import webdriver

base_url = 'https://www.spiderbuf.cn/playground/h06'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chr...
H05 - Defeating Timestamp-Based Anti-Scraping Through JavaScript Reverse Engineering
# coding=utf-8
import base64
import hashlib
import time
import requests
from lxml import etree
from selenium import webdriver

base_url = 'https://www.spiderbuf.cn/playground/h05'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chr...
H04 - JavaScript Encryption, Obfuscation, and Simple Anti-Debugging
# coding=utf-8
import requests
from lxml import etree
from selenium import webdriver
import time

base_url = 'https://www.spiderbuf.cn/playground/h04'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537....
H03 - How Scroll-Triggered Loading Works and How to Scrape It (Basics of Reverse Engineering Encrypted and Obfuscated JavaScript)
# coding=utf-8
import os.path
import requests
from lxml import etree
import time

base_url = 'https://www.spiderbuf.cn/playground/h03'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}

def getHTML...
H02 - Parsing a Complex Top-Rated Movie List Page (Douban Movie Clone): Advanced XPath Usage
# coding=utf-8
import os.path
import requests
from lxml import etree
import time

base_url = 'https://www.spiderbuf.cn/playground/h02'
myheaders = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}

def getHTML...