[277] requests + selenium == Introduction to the requestium module

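Sometimes you want to automate actions on the web: scraping a site, testing an application, or filling in a form, all without using an API. Python's excellent Requests library helps with these tasks. Unfortunately, many sites use JavaScript-heavy clients, which means the HTML that Requests fetches does not contain the form you want to automate, let alone let you fill it in. What comes back is mostly the empty DIVs that modern front-end libraries such as React or Vue populate inside the browser.
You could reverse engineer the JavaScript-generated code, but that would take hours to work out. Dealing with ugly JS code? Thanks, but no thanks. Another option is the Selenium library, which lets you interact with a browser programmatically and run JavaScript. That solves the problem, but it is far slower than Requests, which uses very few resources.
Wouldn't it be better to work mainly with Requests and switch seamlessly to Selenium only when it is needed? Take a look at Requestium, which works as a drop-in replacement for Requests and does the job well. It integrates Parsel, so the element selectors you write to query a page are particularly clean, and it provides helpers for common operations such as clicking elements and making sure content is actually rendered in the DOM. Another time saver for web automation!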
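To illustrate the problem described above, here is a minimal sketch that uses only the standard library. The HTML string is a made-up example of what a JavaScript-heavy page might return to a plain HTTP client, and the names `STATIC_SHELL`, `FormFinder` and `count_form_tags` are hypothetical, not part of Requests or Requestium.

```python
from html.parser import HTMLParser

# Hypothetical example of the static HTML a single-page app might serve:
# an empty mount point plus a script tag, with no form in the markup.
STATIC_SHELL = """
<html>
  <head><title>Sample app</title></head>
  <body>
    <div id="root"></div>
    <script src="/static/bundle.js"></script>
  </body>
</html>
"""


class FormFinder(HTMLParser):
    """Counts <form> and <input> start tags seen while parsing."""

    def __init__(self):
        super().__init__()
        self.form_tags = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("form", "input"):
            self.form_tags += 1


def count_form_tags(html):
    """Return how many <form>/<input> tags appear in the given HTML."""
    finder = FormFinder()
    finder.feed(html)
    return finder.form_tags


print(count_form_tags(STATIC_SHELL))  # 0: nothing to fill in until the JS runs
```

The form only exists after the browser executes the bundled script, which is why a browser driver is needed for that step.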
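Installation
pip install requestium
Then, if you plan to use the Selenium part of Requestium, download your preferred Selenium webdriver: Chromedriver or PhantomJS.
Usage
First create a session as you would with Requests, optionally adding arguments for the web driver.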
from requestium import Session, Keys

s = Session(webdriver_path='./chromedriver',
            browser='chrome',
            default_timeout=15,
            webdriver_options={'arguments': ['headless']})
You don't need to parse the response yourself; that happens automatically when you call xpath, css or re.
title = s.get('').xpath('//title/text()').extract_first(default='Default Title')
Regular expressions require less boilerplate than with Python's standard re module.
response = s.get('/sample_path')
# Extracts the first match
identifier = response.re_first(r'ID_\d\w\d', default='ID_1A1')
# Extracts all matches as a list
users = response.re(r'user_\d\d\d')
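For comparison, the following sketch shows roughly what the same lookups involve with only the standard re module. The helper `re_first` here is a hypothetical stand-in written for illustration, not Requestium's or Parsel's implementation, and the sample string is made up.

```python
import re


def re_first(text, pattern, default=None):
    """Return the first match of pattern in text, or default if there is none.

    If the pattern has a capturing group, the first group is returned;
    otherwise the whole match is returned.
    """
    match = re.search(pattern, text)
    if match is None:
        return default
    return match.group(1) if match.groups() else match.group(0)


sample = "user_001 logged in with ID_7B2, then user_042 followed"

print(re_first(sample, r"ID_\d\w\d", default="ID_1A1"))        # ID_7B2
print(re_first("no id here", r"ID_\d\w\d", default="ID_1A1"))  # ID_1A1
print(re.findall(r"user_\d\d\d", sample))                      # ['user_001', 'user_042']
```

The None check and the group handling are the boilerplate that a `default` argument on the response object saves.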
The Session object is just a regular Requests Session object, so you can use all of its methods.
s.post('/sample', data={'field1': 'data1'})
s.proxies.update({'http': '10.11.4.254:3128', 'https': '10.11.4.252:3128'})
You can switch to the Selenium webdriver to run any JS code.
s.driver.get('/sample/process')
The driver object is a Selenium webdriver object, so you can use any of the normal Selenium methods plus the new methods added by Requestium.
s.driver.find_element_by_xpath("//input[@class='user_name']").send_keys('James Bond', Keys.ENTER)
# New method which waits for element to load instead of failing, useful for single page web apps
s.driver.ensure_element_by_xpath("//div[@attribute='button']").click()
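The idea of waiting for an element instead of failing immediately can be sketched in plain Python as a polling loop. This is an illustration of the general technique only, not Requestium's actual implementation; `wait_for` and `fake_find_element` are hypothetical names, and the fake lookup stands in for a page whose element appears after a short delay.

```python
import time


def wait_for(condition, timeout=5.0, poll_interval=0.1):
    """Call condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(poll_interval)


# Stand-in for a page whose element only shows up on the third lookup.
attempts = {"count": 0}


def fake_find_element():
    attempts["count"] += 1
    return "<div button>" if attempts["count"] >= 3 else None


print(wait_for(fake_find_element, timeout=2.0, poll_interval=0.01))  # <div button>
```

A single lookup would have returned nothing here, which is the situation single page apps create while they are still rendering.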
Requestium also adds xpath, css, and re methods to the Selenium driver object.
if s.driver.re(r'ID_\d\w\d some_pattern'):
    print('Found it!')
Finally, you can switch back to using Requests.
s.post('/sample2', data={'key1': 'value1'})
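The elements returned by these methods have a new ensure_click method that makes clicks less prone to failure. This helps with many of Selenium's clicking issues.
s.driver.ensure_element_by_xpath("//li[@class='b1']", state='clickable', timeout=5).ensure_click()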
# === We also added these methods named in accordance to Selenium's api design ===
# ensure_element_by_id
# ensure_element_by_name
# ensure_element_by_link_text
# ensure_element_by_partial_link_text
# ensure_element_by_tag_name
# ensure_element_by_class_name
# ensure_element_by_css_selector
Add cookie
cookie = {"domain": "",
          "secure": False,
          "value": "sd2451dgd13",
          "expiry": 1516824855.759154,
          "path": "/",
          "httpOnly": True,
          "name": "sessionid"}
s.driver.ensure_add_cookie(cookie, override_domain='')
Using Requestium
from requestium import Session, Keys

# If you want requestium to type your username in the browser for you, write it in here:
reddit_user_name = ''

s = Session('./chromedriver', browser='chrome', default_timeout=15)
s.driver.get('')
s.driver.find_element_by_xpath("//a[@href='/login']").click()

print('Waiting for elements ')
s.driver.ensure_element_by_class_name("desktop-onboarding-sign-up__form-toggler",
                                      state='visible').click()

if reddit_user_name:
    s.driver.ensure_element_by_id('user_login').send_keys(reddit_user_name)
    s.driver.ensure_element_by_id('passwd_login').send_keys(Keys.BACKSPACE)
print('Please log-in in the chrome browser')

# Wait until the login dialog disappears, i.e. the user has logged in
s.driver.ensure_element_by_class_name("desktop-onboarding__title", timeout=60, state='invisible')
print('Thanks!')

if not reddit_user_name:
    reddit_user_name = s.driver.xpath("//span[@class='user']//text()").extract_first()

if reddit_user_name:
    # Carry the logged-in browser cookies over to the Requests session
    s.transfer_driver_cookies_to_session()
    response = s.get("/user/{}/".format(reddit_user_name))
    cmnt_karma = response.xpath("//span[@class='karma comment-karma']//text()").extract_first()
    reddit_golds_given = response.re_first(r"(\d+) gildings given out")
    print("Comment karma: {}".format(cmnt_karma))
    print("Reddit golds given: {}".format(reddit_golds_given))
else:
    print("Couldn't get user name")
Using Requests + Selenium + lxml
import re
from lxml import etree
from requests import Session
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# If you want requestium to type your username in the browser for you, write it in here:
reddit_user_name = ''

driver = webdriver.Chrome('./chromedriver')
driver.get('')
driver.find_element_by_xpath("//a[@href='/login']").click()

print('Waiting for elements ')
WebDriverWait(driver, 15).until(
    EC.visibility_of_element_located((By.CLASS_NAME, "desktop-onboarding-sign-up__form-toggler"))
).click()

if reddit_user_name:
    WebDriverWait(driver, 15).until(
        EC.presence_of_element_located((By.ID, 'user_login'))
    ).send_keys(reddit_user_name)
    driver.find_element_by_id('passwd_login').send_keys(Keys.BACKSPACE)
print('Please log-in in the chrome browser')

# Give the login dialog a moment to appear, then wait for it to go away
try:
    WebDriverWait(driver, 3).until(
        EC.presence_of_element_located((By.CLASS_NAME, "desktop-onboarding__title"))
    )
except TimeoutException:
    pass
WebDriverWait(driver, 60).until(
    EC.invisibility_of_element_located((By.CLASS_NAME, "desktop-onboarding__title"))
)
print('Thanks!')

if not reddit_user_name:
    tree = etree.HTML(driver.page_source)
    try:
        reddit_user_name = tree.xpath("//span[@class='user']//text()")[0]
    except IndexError:
        reddit_user_name = None

if reddit_user_name:
    s = Session()
    # Reddit will think we are a bot if we have the wrong user agent
    selenium_user_agent = driver.execute_script("return navigator.userAgent;")
    s.headers.update({"user-agent": selenium_user_agent})
    # Copy the browser's cookies into the Requests session by hand
    for cookie in driver.get_cookies():
        s.cookies.set(cookie['name'], cookie['value'], domain=cookie['domain'])
    response = s.get("/user/{}/".format(reddit_user_name))
    try:
        cmnt_karma = etree.HTML(response.content).xpath(
            "//span[@class='karma comment-karma']//text()")[0]
    except IndexError:
        cmnt_karma = None
    match = re.search(r"(\d+) gildings given out", response.text)
    if match:
        reddit_golds_given = match.group(1)
    else:
        reddit_golds_given = None
    print("Comment karma: {}".format(cmnt_karma))
    print("Reddit golds given: {}".format(reddit_golds_given))
else:
    print("Couldn't get user name")
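The cookie hand-off in the manual version can also be looked at in isolation. The sketch below uses only the standard library and assumes cookies shaped like the dictionaries Selenium's get_cookies() returns, with at least name and value keys. The helper `cookie_header` and the sample values are hypothetical, and building a raw header is a simplification of what a cookie jar does (it ignores domain and path matching).

```python
def cookie_header(driver_cookies):
    """Build a Cookie request header value from Selenium-style cookie dicts."""
    return "; ".join(
        "{}={}".format(cookie["name"], cookie["value"])
        for cookie in driver_cookies
    )


# Sample data in the shape returned by driver.get_cookies() (values made up).
sample_cookies = [
    {"name": "sessionid", "value": "sd2451dgd13", "domain": "example.com", "path": "/"},
    {"name": "theme", "value": "dark", "domain": "example.com", "path": "/"},
]

print(cookie_header(sample_cookies))  # sessionid=sd2451dgd13; theme=dark
```

This is the bookkeeping the Requestium version avoids by keeping one session object for both the browser and the HTTP client.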

Published: 2024-09-20 17:51:00

Article link: https://www.17tex.com/tex/3/98016.html
