Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API. Playwright is built to enable cross-browser web automation that is ever-green, capable, reliable and fast.
| | Linux | macOS | Windows |
|---|---|---|---|
| Chromium 124.0.6367.8 | ✅ | ✅ | ✅ |
| WebKit 17.4 | ✅ | ✅ | ✅ |
| Firefox 123.0 | ✅ | ✅ | ✅ |
Headless execution is supported for all browsers on all platforms. Check out system requirements for details.
Looking for Playwright for Python, .NET, or Java?
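Playwright simulates a user operating the browser, which suits pages that lazy-load their images. When the network request URL is complicated to construct, driving the browser can replace calling the API directly, and it avoids having direct API calls blocked.
Tasks can run in the background using headless mode.
Java example: scraping Baidu Images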
import cn.hutool.core.io.FileUtil;
import com.microsoft.playwright.*;

import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TestWeb {
    // Distinct image URLs collected while scrolling
    static Set<String> set = new HashSet<>();

    public static void main(String[] args) {
        try (Playwright playwright = Playwright.create()) {
            // Headless mode: the browser runs in the background without a window
            Browser browser = playwright.chromium().launch(new BrowserType.LaunchOptions()
                    .setHeadless(true));
            BrowserContext context = browser.newContext(
                    new Browser.NewContextOptions());
            Page page = context.newPage();
            // The word parameter is the URL-encoded search term "火影忍者" (Naruto)
            page.navigate("https://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&dyTabStr=MCwzLDEsMiw1LDYsNCw4LDcsOQ%3D%3D&word=%E7%81%AB%E5%BD%B1%E5%BF%8D%E8%80%85");
            int i = 0;
            // Scroll down 10 times
            while (i++ < 10) {
                List<ElementHandle> elementHandles = page.querySelectorAll("img.main_img");
                for (ElementHandle handle : elementHandles) {
                    set.add(handle.getAttribute("src"));
                }
                page.evaluate("window.scrollBy(0, 1500)");
                // Pause so lazy-loaded images can load (1 second is an assumed value)
                page.waitForTimeout(1000);
            }
            for (String s : set) {
                // Skip missing src values and inline base64 placeholders
                if (s == null || s.contains("base64")) {
                    continue;
                }
                FileUtil.appendString("<img src=\"" + s + "\">" + "\n", Paths.get("images.txt").toFile(), StandardCharsets.UTF_8);
            }
            browser.close();
        }
    }
}
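The filtering and output step of the example can also be pulled out into plain Java so it can be checked without launching a browser. The following is only a sketch: `SrcFilter`, `keepDownloadable`, and `toImgTags` are hypothetical names that are not part of Playwright or Hutool, and the sample URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Playwright or the example above.
class SrcFilter {

    // Returns the distinct src values in first-seen order, dropping
    // null or empty entries and inline base64 data.
    static List<String> keepDownloadable(Collection<String> srcValues) {
        Set<String> seen = new LinkedHashSet<>();
        for (String src : srcValues) {
            if (src == null || src.isEmpty() || src.contains("base64")) {
                continue;
            }
            seen.add(src);
        }
        return new ArrayList<>(seen);
    }

    // Renders one <img> tag per line, the format the example appends to images.txt.
    static String toImgTags(List<String> srcValues) {
        StringBuilder sb = new StringBuilder();
        for (String src : srcValues) {
            sb.append("<img src=\"").append(src).append("\">\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values standing in for what getAttribute("src") might return.
        List<String> collected = Arrays.asList(
                "https://example.com/a.jpg",
                null,
                "data:image/png;base64,AAAA",
                "https://example.com/a.jpg",
                "https://example.com/b.jpg");
        System.out.print(toImgTags(keepDownloadable(collected)));
    }
}
```

A `LinkedHashSet` is used here instead of the `HashSet` in the example so that the output keeps the order in which images appeared on the page. That is a design choice of this sketch, not something the original requires.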