Spiderbuf
S01 - Getting Started with the requests and lxml Libraries
Published: 2024-06-11 (Unix timestamp 1718093235)
Views: 1360
# coding=utf-8
import requests
from lxml import etree

url = 'https://spiderbuf.cn/web-scraping-practice/requests-lxml-for-scraping-beginner'
html = requests.get(url).text
f = open('01.html', 'w', encoding='utf-8')
f.write(html)
f.close()
root = etree.HTM...
Scraping Web Pages with Selenium in Python
Published: 2024-06-11 (Unix timestamp 1718092674)
Views: 1072
# coding=utf-8
from selenium import webdriver

if __name__ == '__main__':
    url = 'http://www.example.com'
    client = webdriver.Chrome()
    client.get(url)
    html = client.page_source
    print(html)
    client.quit()
...
Parsing a JSON String in Python
Published: 2024-06-11 (Unix timestamp 1718092557)
Views: 950
# coding=utf-8
import json

if __name__ == '__main__':
    json_str = '{"website":"Spiderbuf", "url":"http://www.spiderbuf.cn","description":"Python爬虫练习网站"}'
    json_obj = json.loads(json_str)
    print(json_obj['website'])
...
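The excerpt above indexes the parsed object directly, which raises an exception if the text is not valid JSON or a key is missing. The following is a minimal sketch of a more defensive variant using only the standard `json` module. The helper name `parse_site` and the `"n/a"` default are assumptions made for illustration; they do not come from the article.

```python
import json


def parse_site(json_str: str) -> dict:
    """Parse a JSON object string; return {} if the text is not a valid JSON object.

    Hypothetical helper for illustration only.
    """
    try:
        data = json.loads(json_str)
    except json.JSONDecodeError:
        return {}
    # json.loads can also return lists, numbers, etc.; accept objects only.
    return data if isinstance(data, dict) else {}


record = parse_site('{"website": "Spiderbuf", "url": "http://www.spiderbuf.cn"}')

# dict.get avoids a KeyError when a field is absent.
website = record.get("website")
description = record.get("description", "n/a")

# ensure_ascii=False keeps non-ASCII text readable when serializing back to a string.
round_trip = json.dumps(record, ensure_ascii=False, sort_keys=True)
```

Returning an empty dict on failure is one possible design choice; code that needs to distinguish "invalid JSON" from "empty object" would let the exception propagate instead.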
Downloading and Saving Images in Python
Published: 2024-06-11 (Unix timestamp 1718092472)
Views: 970
# coding=utf-8
import requests

# Fetch a remote image. The url parameter is the full link to the image; the function returns the binary content of the response.
def get_content(url):
    # Store the User-Agent header in the variable myheaders
    myheaders = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}
    response = requests...
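The excerpt is cut off before the image is written to disk. The following is a rough sketch of the general fetch-then-save pattern, not the article's code. It assumes the standard-library `urllib.request` in place of `requests` so that it runs without third-party packages, and the names `fetch_bytes`, `save_bytes`, and `USER_AGENT` are hypothetical.

```python
from pathlib import Path
from urllib.request import Request, urlopen

# Hypothetical User-Agent string; some sites reject requests that send none.
USER_AGENT = "Mozilla/5.0 (compatible; example-scraper)"


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download a URL and return the raw response body as bytes."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request, timeout=timeout) as response:
        return response.read()


def save_bytes(path: str, content: bytes) -> int:
    """Write binary content to path, creating parent folders; return bytes written."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Images must be written in binary mode; text mode would corrupt the data.
    return target.write_bytes(content)
```

A call such as `save_bytes('./images/01.png', fetch_bytes(image_url))` would tie the two steps together, where `image_url` stands for the full link to the image.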
Reading and Writing Text Files in Python
Published: 2024-06-11 (Unix timestamp 1718092389)
Views: 642
# coding=utf-8
# Overwrite the file
def save_to_file(file_name, content):
    with open(file_name, 'w', encoding='utf-8') as f:
        f.write(content)

if __name__ == '__main__':
    save_to_file('./test.txt', '这是要写入的内容')
    # Write in a loop (append mode)
    with open('./test.txt', 'a', encoding='utf-8') as f:
        for i in range(0,10):
            f.write(str...
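The excerpt shows overwrite mode (`'w'`) and append mode (`'a'`) but is truncated before any reading happens. The following is a small sketch of all three operations side by side. `save_to_file` mirrors the function in the excerpt, while `append_lines` and `read_lines` are hypothetical helpers added for illustration.

```python
def save_to_file(file_name, content):
    """Overwrite the file with content ('w' truncates any existing file)."""
    with open(file_name, 'w', encoding='utf-8') as f:
        f.write(content)


def append_lines(file_name, lines):
    """Append each string as its own line ('a' keeps the existing content)."""
    with open(file_name, 'a', encoding='utf-8') as f:
        for line in lines:
            f.write(line + '\n')


def read_lines(file_name):
    """Return the file's lines without their trailing newline characters."""
    with open(file_name, 'r', encoding='utf-8') as f:
        return [line.rstrip('\n') for line in f]
```

Passing `encoding='utf-8'` explicitly in every call keeps the behavior the same across platforms whose default encodings differ.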