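Weibo limits the number of failed login attempts. On top of that, large numbers of accounts keep getting banned, and the banned ones have to be removed from the cookie pool, so a proxy is needed. I spent half a day on Baidu and found nothing usable. Switching to Google solved it in minutes. (Which raises the question: what is Baidu good for besides selling fake medicine?)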
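With Selenium + Chrome, a proxy that requires authentication cannot be configured through options, because Chrome's --proxy-server switch does not take a username and password. The workaround is to load a small extension that supplies the credentials.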
Original source: https://stackoverflow.com/questions/29983106/how-can-i-set-proxy-with-authentication-in-selenium-chrome-web-driver-using-pyth#answer-30953780 (Stack Overflow really is a great place.)
Here we go!
# -*- coding: utf-8 -*-
# @Time : 2017/11/15 9:50
# @Author : 哎哟卧槽
# @Site :
# @File : pubilc.py
# @Software: PyCharm
import string
import zipfile


def create_proxyauth_extension(proxy_host, proxy_port, proxy_username, proxy_password,
                               scheme='http', plugin_path=None):
    """Build a Chrome extension that handles proxy authentication.

    args:
        proxy_host (str): proxy address or domain name
        proxy_port (int): proxy port
        proxy_username (str): username
        proxy_password (str): password
    kwargs:
        scheme (str): proxy scheme, defaults to http
        plugin_path (str): absolute path of the extension
    return str -> plugin_path
    """
    if plugin_path is None:
        plugin_path = 'vimm_chrome_proxyauth_plugin.zip'

    manifest_json = """
    {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": [
            "proxy",
            "tabs",
            "unlimitedStorage",
            "storage",
            "<all_urls>",
            "webRequest",
            "webRequestBlocking"
        ],
        "background": {
            "scripts": ["background.js"]
        },
        "minimum_chrome_version":"22.0.0"
    }
    """

    # Fill the ${...} placeholders with the proxy settings.
    background_js = string.Template(
        """
        var config = {
            mode: "fixed_servers",
            rules: {
                singleProxy: {
                    scheme: "${scheme}",
                    host: "${host}",
                    port: parseInt(${port})
                },
                bypassList: ["foobar.com"]
            }
        };

        chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

        function callbackFn(details) {
            return {
                authCredentials: {
                    username: "${username}",
                    password: "${password}"
                }
            };
        }

        chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
        );
        """
    ).substitute(
        host=proxy_host,
        port=proxy_port,
        username=proxy_username,
        password=proxy_password,
        scheme=scheme,
    )

    # Pack both files into a zip archive that Chrome can load as an extension.
    with zipfile.ZipFile(plugin_path, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)

    return plugin_path
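If Chrome refuses to load the extension, it can help to read the archive back and confirm that both files are there and that the manifest parses. Below is a minimal sketch of such a check. The names `inspect_extension` and `REQUIRED_FILES` are made up for this illustration and are not part of the Stack Overflow answer; the sketch only assumes the archive layout produced by the function above.

```python
import json
import zipfile

# Files the extension archive is expected to contain (assumed from the code above).
REQUIRED_FILES = ("manifest.json", "background.js")


def inspect_extension(zip_path):
    """Return (sorted member names, parsed manifest) for a packed extension.

    Raises ValueError if a required file is absent from the archive.
    """
    with zipfile.ZipFile(zip_path) as zp:
        names = sorted(zp.namelist())
        missing = [name for name in REQUIRED_FILES if name not in names]
        if missing:
            raise ValueError("archive is missing: " + ", ".join(missing))
        # json.loads tolerates the leading and trailing whitespace in the manifest string.
        manifest = json.loads(zp.read("manifest.json").decode("utf-8"))
    return names, manifest
```

Calling `inspect_extension('vimm_chrome_proxyauth_plugin.zip')` after generating the plugin should return the two file names and the manifest as a dict, so a typo in the JSON shows up as an exception here instead of as a silent failure in the browser.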
Usage:
from selenium import webdriver
from common.pubilc import create_proxyauth_extension

proxyauth_plugin_path = create_proxyauth_extension(
    proxy_host="XXXXX.com",
    proxy_port=9020,
    proxy_username="XXXXXXX",
    proxy_password="XXXXXXX"
)

co = webdriver.ChromeOptions()
# co.add_argument("--start-maximized")
co.add_extension(proxyauth_plugin_path)

driver = webdriver.Chrome(executable_path="C:chromedriver.exe", chrome_options=co)
driver.get("http://ip138.com/")
print(driver.page_source)
Proxy without authentication:
from selenium import webdriver

options = webdriver.ChromeOptions()
# Replace ip:port with the address of your proxy.
options.add_argument('--proxy-server=http://ip:port')

driver = webdriver.Chrome(executable_path="C:chromedriver.exe", chrome_options=options)
driver.get("http://ip138.com/")
print(driver.page_source)
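When the proxy address comes from a pool, building the switch by hand invites typos. A small helper can assemble it from its parts. This is only a sketch: `proxy_server_argument` is a hypothetical name, and the port check is an added safeguard that the original post does not include.

```python
def proxy_server_argument(host, port, scheme="http"):
    """Return the Chrome command-line switch for an unauthenticated proxy."""
    port = int(port)
    # Reject ports outside the valid TCP range before Chrome sees them.
    if not 0 < port < 65536:
        raise ValueError("port out of range: %r" % port)
    return "--proxy-server={}://{}:{}".format(scheme, host, port)
```

The result can be passed straight to `options.add_argument(...)`, for example `options.add_argument(proxy_server_argument("127.0.0.1", 8080))`.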
That's all. So easy.
Please credit the source when reposting: 静觅 » 小白学爬虫-设置Selenium+Chrome代理