mirror of
https://github.com/NaiboWang/EasySpider.git
synced 2025-04-21 09:35:14 +08:00
Compare commits
32 Commits
| SHA1 |
|---|
| fc5aa8368b |
| 793f028a00 |
| ae22977143 |
| 541b3c13d2 |
| a6192b730c |
| d39218f5fd |
| a94c45b36d |
| 0e8aba6b51 |
| e42ad07d80 |
| 2f6344d00b |
| bfa6c0de76 |
| b590cc22c5 |
| d69adacbd1 |
| 15654da7eb |
| 967f5b8033 |
| aa419ee845 |
| f005e48700 |
| 4e96ed7d50 |
| e3fecc8926 |
| 119cb99711 |
| f43bdd236d |
| 56f0847500 |
| 0df6cebd18 |
| 4b42f6300c |
| 2cf33794f1 |
| 9efd3b6efe |
| ad956be10d |
| 01de17d471 |
| 333dcd3ff4 |
| 555f02815c |
| 34ed41110a |
| 32459b622d |
14
.github/ISSUE_TEMPLATE.md
vendored
@ -9,3 +9,17 @@
 
 ## 如何复现 | Steps to Reproduce
 
+## 示例任务文件 | Example Task File
+
+Windows和Linux版本的软件设计的任务文件在软件目录下的`tasks`文件夹中,文件名为任务列表中`任务的ID号.json`;MacOS系统的任务文件目录请运行下面的命令打开tasks文件夹:
+
+The task file designed for the Windows and Linux versions of the software is in the `tasks` folder in the software directory, and the file name is `the ID number of the task.json` in the task list; the task file directory of the MacOS system is opened by running the following command:
+
+```bash
+cd /Users/$(whoami)/Library/Application\ Support/EasySpider/tasks
+open .
+```
+
+请将任务文件直接以文件的方式粘贴到这里,不要截图和打开复制里面的内容。
+
+Please paste the task file directly as a file here, do not take screenshots and open to copy the content.
|
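The lookup rule described in the new template text can be sketched in Python. This is a minimal sketch, not code from EasySpider: the helper name `task_file_path` and its parameters are hypothetical, and it assumes the macOS home directory is the `/Users/<user>` folder that the shell command above expands to.

```python
from pathlib import Path


def task_file_path(task_id: int, platform: str, software_dir: Path, home: Path) -> Path:
    """Return where the template says the task file for `task_id` lives."""
    if platform == "darwin":
        # macOS: ~/Library/Application Support/EasySpider/tasks
        tasks_dir = home / "Library" / "Application Support" / "EasySpider" / "tasks"
    else:
        # Windows and Linux: the `tasks` folder inside the software directory
        tasks_dir = software_dir / "tasks"
    return tasks_dir / f"{task_id}.json"


mac_path = task_file_path(323, "darwin", Path("/opt/EasySpider"), Path("/Users/alice"))
linux_path = task_file_path(323, "linux", Path("/opt/EasySpider"), Path("/home/alice"))
```

The platform string follows the values of Python's `sys.platform`, so a caller could pass `sys.platform` and `Path.home()` directly.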
@ -1,8 +1,29 @@
|
|||||||
Due to the complex security settings of MacOS, the issue of being unable to open software due to the "unverified developer" message may occur upon the first attempt to open the software. Please refer to the following GitHub document to see how to open software and perform tasks on your MacOS version:
|
Due to MacOS's complex security settings, software downloaded for the first time will warn that the developer is unverified and will not allow the application to run. Please follow these steps to unlock:
|
||||||
|
|
||||||
https://github.com/NaiboWang/EasySpider/wiki/MacOS-Guide
|
1. Open the system Terminal.
|
||||||
|
|
||||||
The main steps are as follows:
|
2. Navigate to the EasySpider software directory, such as:
|
||||||
|
|
||||||
|
cd ~/Downloads/EasySpider_MacOS
|
||||||
|
|
||||||
|
3. In the EasySpider directory, run the `first_time_run.sh` script to modify the package properties by using the following command:
|
||||||
|
|
||||||
|
bash first_time_run.sh
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
This will unlock EasySpider for both design and execution stages.
|
||||||
|
|
||||||
|
If you encounter errors such as the one below during the command execution, they can be ignored, and you may proceed to open the software after the command completes:
|
||||||
|
|
||||||
|
xattr: [Errno 13] Permission denied: 'EasySpider.app/Contents/Resources/app/node_modules/node-window-manager/build/node_gyp_bins/python3'
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
For another solution, refer to this video on how to open software and execute tasks in MacOS version: https://www.bilibili.com/video/BV1E34y137fT/
|
||||||
|
|
||||||
- Design phase - Apple Arm chip version of MacOS
|
- Design phase - Apple Arm chip version of MacOS
|
||||||
|
|
||||||
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/323.json
Normal file
@ -0,0 +1 @@
|
|||||||
|
{"id":323,"name":"新web采集任务","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"","update_time":"2024-08-10 17:29:04","version":"0.6.2","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}}]}
|
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/324.json
Normal file
File diff suppressed because one or more lines are too long
1
.temp_to_pub/EasySpider_MacOS/Sample Tasks/325.json
Normal file
@ -0,0 +1 @@
|
|||||||
|
{"id":325,"name":"百度一下,你就知道","url":"https://www.baidu.com","links":"https://www.baidu.com","create_time":"2024-12-30 22:37:29","update_time":"2024-12-30 22:37:43","version":"0.6.3","saveThreshold":10,"quitWaitTime":60,"environment":0,"maximizeWindow":0,"maxViewLength":15,"recordLog":1,"outputFormat":"csv","saveName":"current_time","dataWriteMode":1,"inputExcel":"","startFromExit":0,"pauseKey":"p","containJudge":false,"browser":"chrome","removeDuplicate":0,"desc":"https://www.baidu.com","inputParameters":[{"id":0,"name":"urlList_0","nodeId":1,"nodeName":"打开网页","value":"https://www.baidu.com","desc":"要采集的网址列表,多行以\\n分开","type":"text","exampleValue":"https://www.baidu.com"}],"outputParameters":[{"id":0,"name":"参数1_链接文本","desc":"","type":"text","recordASField":1,"exampleValue":"0暖心2024 总书记的贴心话"},{"id":1,"name":"参数2_链接地址","desc":"","type":"text","recordASField":1,"exampleValue":"https://www.baidu.com/s?wd=%E6%9A%96%E5%BF%832024+%E6%80%BB%E4%B9%A6%E8%AE%B0%E7%9A%84%E8%B4%B4%E5%BF%83%E8%AF%9D&sa=fyb_n_homepage&rsv_dl=fyb_n_homepage&from=super&cl=3&tn=baidutop10&fr=top1000&rsv_idx=2&hisfilter=1"}],"graph":[{"index":0,"id":0,"parentId":0,"type":-1,"option":0,"title":"root","sequence":[1,2],"parameters":{"history":1,"tabIndex":0,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0},"isInLoop":false},{"id":1,"index":1,"parentId":0,"type":0,"option":1,"title":"打开网页","sequence":[],"isInLoop":false,"position":0,"parameters":{"useLoop":false,"xpath":"","wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"url":"https://www.baidu.com","links":"https://www.baidu.com","maxWaitTime":10,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"cookies":""}},{"id":2,"index":2,"parentId":0,"type":1,"option":8,"title":"循环采集数据","sequence":[3],"isInLoop":false,"position":1,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"/html/body/div[1]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li/a[1]","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"scrollType":0,"scrollCount":1,"scrollWaitTime":1,"loopType":1,"pathList":"","code":"","waitTime":0,"exitCount":0,"exitElement":"//body","historyWait":2,"breakMode":0,"breakCode":"","breakCodeWaitTime":0,"skipCount":0,"allXPaths":["/html/body/div[1]/div[1]/div[5]/div[1]/div[1]/div[3]/ul[1]/li[1]/a[1]","//a[contains(., '0暖心2024 总')]","//a[@class='title-content c-link c-font-medium c-line-clamp1']","/html/body/div[last()-4]/div[last()-3]/div[last()-3]/div/div/div/ul/li[last()-9]/a"]}},{"id":3,"index":3,"parentId":2,"type":0,"option":3,"title":"提取数据","sequence":[],"isInLoop":true,"position":0,"parameters":{"history":4,"tabIndex":-1,"useLoop":false,"xpath":"","iframe":false,"wait":0,"waitType":0,"beforeJS":"","beforeJSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"waitElement":"","waitElementTime":10,"waitElementIframeIndex":0,"clear":0,"newLine":1,"params":[{"nodeType":1,"contentType":8,"relative":true,"name":"参数1_链接文本","desc":"","extractType":0,"relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"0暖心2024 总书记的贴心话"}],"unique_index":"8rtq2is658sm5b58osr","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0},{"nodeType":2,"contentType":0,"relative":true,"name":"参数2_链接地址","desc":"","relativeXPath":"","allXPaths":"","exampleValues":[{"num":0,"value":"https://www.baidu.com/s?wd=%E6%9A%96%E5%BF%832024+%E6%80%BB%E4%B9%A6%E8%AE%B0%E7%9A%84%E8%B4%B4%E5%BF%83%E8%AF%9D&sa=fyb_n_homepage&rsv_dl=fyb_n_homepage&from=super&cl=3&tn=baidutop10&fr=top1000&rsv_idx=2&hisfilter=1"}],"unique_index":"8rtq2is658sm5b58osr","iframe":false,"default":"","paraType":"text","recordASField":1,"beforeJS":"","beforeJSWaitTime":0,"JS":"","JSWaitTime":0,"afterJS":"","afterJSWaitTime":0,"downloadPic":0,"splitLine":0}]}}]}
|
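The sample task files store the workflow as a `graph` array in which each node names its children in a `sequence` list. The sketch below walks that structure depth-first. It is an illustration only: `walk_titles` is a hypothetical helper, the inline data is a trimmed stand-in for 325.json, and it assumes that `sequence` entries refer to node `id` values and that node 0 is the root, which matches the samples but is not confirmed elsewhere in this diff.

```python
import json


def walk_titles(task: dict) -> list:
    """Collect node titles depth-first, following each node's `sequence` list."""
    nodes = {node["id"]: node for node in task["graph"]}
    titles = []

    def visit(node_id: int) -> None:
        node = nodes[node_id]
        titles.append(node["title"])
        for child_id in node["sequence"]:
            visit(child_id)

    visit(0)  # assumption: node 0 is the root
    return titles


# Trimmed-down stand-in for 325.json; only the fields used here are kept.
sample = json.loads(
    '{"graph": ['
    '{"id": 0, "title": "root", "sequence": [1, 2]},'
    '{"id": 1, "title": "打开网页", "sequence": []},'
    '{"id": 2, "title": "循环采集数据", "sequence": [3]},'
    '{"id": 3, "title": "提取数据", "sequence": []}]}'
)
titles = walk_titles(sample)
```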
5
.temp_to_pub/EasySpider_MacOS/first_time_run.sh
Normal file
@ -0,0 +1,5 @@
+#!/bin/bash
+
+xattr -cr EasySpider.app
+xattr -cr easyspider_executestage
+xattr -cr easyspider_executestage_full
|
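For readers who would rather drive the same unlock step from Python, a rough equivalent of the script is sketched below. It is not part of the repository. `xattr -cr` clears all extended attributes recursively, and the sketch keeps the shell script's behaviour of carrying on when one command fails, which fits the guide's note that "Permission denied" errors can be ignored. The function names are hypothetical.

```python
import subprocess
import sys

TARGETS = ["EasySpider.app", "easyspider_executestage", "easyspider_executestage_full"]


def unlock_commands(targets=TARGETS):
    """Build one `xattr -cr <target>` command per bundled item."""
    return [["xattr", "-cr", target] for target in targets]


def run_unlock() -> None:
    for command in unlock_commands():
        # check=False: a non-zero exit (for example "Permission denied"
        # on a single file) does not stop the remaining commands.
        subprocess.run(command, check=False)


if __name__ == "__main__" and sys.platform == "darwin":
    run_unlock()
```

Like the shell script, this has to be run from inside the EasySpider directory, because the targets are relative paths.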
@ -1,6 +1,26 @@
|
|||||||
由于MacOS复杂的安全性设置,初次打开软件会显示未验证开发者从而不允许打开的问题,请参考以下视频来查看MacOS版本如何打开软件和执行任务:https://www.bilibili.com/video/BV1E34y137fT/
|
由于MacOS复杂的安全性设置,初次打开软件会显示未验证开发者从而不允许打开的问题,请通过以下方式来解锁:
|
||||||
|
|
||||||
主要步骤如下:
|
1. 打开系统terminal命令行窗口。
|
||||||
|
|
||||||
|
2. 切换到EasySpider软件目录,如:
|
||||||
|
|
||||||
|
cd ~/Downloads/EasySpider_MacOS
|
||||||
|
|
||||||
|
3. 在EasySpider目录下,使用以下命令运行目录下的`first_time_run.sh`脚本修改软件包属性:
|
||||||
|
|
||||||
|
bash first_time_run.sh
|
||||||
|
|
||||||
|
即可一键解锁并正常使用EasySpider,包括设计阶段程序和执行阶段程序。
|
||||||
|
|
||||||
|
|
||||||
|
执行命令时如果出现类似下面的错误可以忽略,执行完成之后即可打开软件:
|
||||||
|
|
||||||
|
xattr: [Errno 13] Permission denied: 'EasySpider.app/Contents/Resources/app/node_modules/node-window-manager/build/node_gyp_bins/python3'
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
以下是另一种方案,请参考以下视频来查看MacOS版本如何打开软件和执行任务:https://www.bilibili.com/video/BV1E34y137fT/
|
||||||
|
|
||||||
- 设计阶段 - Apple Arm芯片版MacOS
|
- 设计阶段 - Apple Arm芯片版MacOS
|
||||||
|
|
||||||
|
@ -2325,8 +2325,8 @@ if __name__ == '__main__':
 else:
 options.add_argument(
 f'--user-data-dir={c.user_folder}')
-print(f"Use specifed user data folder: {c.user_folder}", ", please note if you are using docker, this user folder path should be the path inside the docker container.")
-print(f"使用指定的用户信息目录: {c.user_folder}", ",请注意如果您正在使用docker,此用户文件夹路径应是容器内的路径。")
+print(f"Use specifed user data folder: {c.user_folder}, please note if you are using docker, this user folder path should be the path inside the docker container.")
+print(f"使用指定的用户信息目录: {c.user_folder},请注意如果您正在使用docker,此用户文件夹路径应是容器内的路径。")
 print(
 "如果报错Selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally,说明有之前运行的Chrome实例没有正常关闭,请关闭之前打开的所有Chrome实例后再运行程序即可。")
 print(
|
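The visible effect of this hunk comes from how `print` joins its arguments: positional arguments are separated by a space by default, so the old two-argument call printed a stray space before the comma. A small sketch with a made-up folder value and shortened messages:

```python
import io
from contextlib import redirect_stdout

user_folder = "/data/profile"  # hypothetical value, for illustration only


def capture(*args) -> str:
    """Return exactly what print(*args) would write to stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        print(*args)
    return buffer.getvalue()


# Old style: two arguments, joined by print's default separator (a space).
old_output = capture(f"Use folder: {user_folder}", ", please note the path.")
# New style: one f-string, no separator inserted.
new_output = capture(f"Use folder: {user_folder}, please note the path.")
```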
Binary file not shown.
@ -11,9 +11,10 @@ del out\EasySpider\resources\app\vs_BuildTools.exe
|
|||||||
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x32\EasySpider
|
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x32\EasySpider
|
||||||
rmdir /s /q ..\.temp_to_pub\EasySpider_windows_x32\Code
|
rmdir /s /q ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
mkdir ..\.temp_to_pub\EasySpider_windows_x32\Code
|
mkdir ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
@REM copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
@REM copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
@REM copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
|
copy ..\ExecuteStage\*.py ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x32\Code
|
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x32\Code
|
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x32\Code
|
||||||
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x32
|
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x32
|
||||||
|
@ -11,9 +11,10 @@ del out\EasySpider\resources\app\vs_BuildTools.exe
|
|||||||
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x64\EasySpider
|
move out\EasySpider ..\.temp_to_pub\EasySpider_windows_x64\EasySpider
|
||||||
rmdir /s /Q ..\.temp_to_pub\EasySpider_windows_x64\Code
|
rmdir /s /Q ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
mkdir ..\.temp_to_pub\EasySpider_windows_x64\Code
|
mkdir ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
@REM copy ..\ExecuteStage\easyspider_executestage.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
@REM copy ..\ExecuteStage\myChrome.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
@REM copy ..\ExecuteStage\utils.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
|
copy ..\ExecuteStage\*.py ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x64\Code
|
copy ..\ExecuteStage\requirements.txt ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x64\Code
|
copy ..\ExecuteStage\Readme.md ..\.temp_to_pub\EasySpider_windows_x64\Code
|
||||||
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x64
|
copy ..\ExecuteStage\myCode.py ..\.temp_to_pub\EasySpider_windows_x64
|
||||||
|
@ -245,6 +245,7 @@ async function findElementAcrossAllWindows(
|
|||||||
let handles = await driver.getAllWindowHandles();
|
let handles = await driver.getAllWindowHandles();
|
||||||
// console.log("handles", handles);
|
// console.log("handles", handles);
|
||||||
let content_handle = current_handle;
|
let content_handle = current_handle;
|
||||||
|
let old_handle = current_handle;
|
||||||
let id = -1;
|
let id = -1;
|
||||||
try {
|
try {
|
||||||
id = msg.message.id;
|
id = msg.message.id;
|
||||||
@ -310,7 +311,7 @@ async function findElementAcrossAllWindows(
|
|||||||
if (h != null && handles.includes(h)) {
|
if (h != null && handles.includes(h)) {
|
||||||
await driver.switchTo().window(h);
|
await driver.switchTo().window(h);
|
||||||
current_handle = h;
|
current_handle = h;
|
||||||
console.log("switch to handle: ", h);
|
console.log("Switch to handle: ", h);
|
||||||
}
|
}
|
||||||
element = await findElement(driver, By.xpath, xpath, iframe);
|
element = await findElement(driver, By.xpath, xpath, iframe);
|
||||||
break;
|
break;
|
||||||
@ -327,6 +328,12 @@ async function findElementAcrossAllWindows(
 }
 }
 if (element == null && notifyBrowser) {
+// 如果找不到元素,切换回原来的窗口
+if (old_handle != null && handles.includes(old_handle)) {
+await driver.switchTo().window(old_handle);
+current_handle = old_handle;
+console.log("Switch to handle: ", old_handle);
+}
 notify_browser(
 "无法找到元素,请检查XPath是否正确:" + xpath,
 "Cannot find the element, please check if the XPath is correct: " + xpath,
|
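The pattern these three hunks add (remember the starting window, search the others, and switch back when nothing is found) can be sketched in Python. This is a simplified illustration, not a port of the JavaScript: `FakeDriver` is a hypothetical stand-in that only mimics the Selenium-style attributes `window_handles`, `current_window_handle` and `switch_to.window`, and the search loop ignores the handle-priority logic of the real function.

```python
class FakeSwitchTo:
    def __init__(self, driver):
        self._driver = driver

    def window(self, handle):
        self._driver.current_window_handle = handle


class FakeDriver:
    """Stand-in exposing the few Selenium-style attributes this sketch touches."""

    def __init__(self, handles, current):
        self.window_handles = handles
        self.current_window_handle = current
        self.switch_to = FakeSwitchTo(self)


def find_across_windows(driver, find):
    """Try `find` in every window; return to the starting window if nothing matches."""
    old_handle = driver.current_window_handle
    for handle in driver.window_handles:
        driver.switch_to.window(handle)
        element = find(driver)
        if element is not None:
            return element
    # Nothing found: restore the window the caller started in, if it still exists.
    if old_handle in driver.window_handles:
        driver.switch_to.window(old_handle)
    return None


driver = FakeDriver(["a", "b", "c"], current="b")
missing = find_across_windows(driver, lambda d: None)

driver2 = FakeDriver(["a", "b"], current="a")
found = find_across_windows(
    driver2, lambda d: "element" if d.current_window_handle == "b" else None
)
```

Without the restore step, a failed search would leave the driver on whichever window was tried last, which is the situation the added `old_handle` lines appear to address.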
@ -20,9 +20,10 @@ rm out/EasySpider/resources/app/vs_BuildTools.exe
|
|||||||
mv out/EasySpider ../.temp_to_pub/EasySpider_Linux_x64/EasySpider
|
mv out/EasySpider ../.temp_to_pub/EasySpider_Linux_x64/EasySpider
|
||||||
rm -rf ../.temp_to_pub/EasySpider_Linux_x64/Code
|
rm -rf ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
mkdir ../.temp_to_pub/EasySpider_Linux_x64/Code
|
mkdir ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
# cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
# cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
# cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
|
cp ../ExecuteStage/*.py ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_Linux_x64/Code
|
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_Linux_x64/Code
|
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_Linux_x64/Code
|
||||||
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_Linux_x64
|
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_Linux_x64
|
||||||
|
@ -20,9 +20,10 @@ rm -r ../.temp_to_pub/EasySpider_MacOS/EasySpider.app/Contents/Resources/app/use
|
|||||||
rm -r ../.temp_to_pub/EasySpider_MacOS/EasySpider.app/Contents/Resources/app/TempUserDataFolder
|
rm -r ../.temp_to_pub/EasySpider_MacOS/EasySpider.app/Contents/Resources/app/TempUserDataFolder
|
||||||
rm -rf ../.temp_to_pub/EasySpider_MacOS/Code
|
rm -rf ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
mkdir ../.temp_to_pub/EasySpider_MacOS/Code
|
mkdir ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_MacOS/Code
|
# cp ../ExecuteStage/easyspider_executestage.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_MacOS/Code
|
# cp ../ExecuteStage/myChrome.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_MacOS/Code
|
# cp ../ExecuteStage/utils.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
|
cp ../ExecuteStage/*.py ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_MacOS/Code
|
cp ../ExecuteStage/requirements.txt ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_MacOS/Code
|
cp ../ExecuteStage/Readme.md ../.temp_to_pub/EasySpider_MacOS/Code
|
||||||
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_MacOS
|
cp ../ExecuteStage/myCode.py ../.temp_to_pub/EasySpider_MacOS
|
||||||
|
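All four build scripts make the same change: three per-file copies are commented out in favour of one `*.py` wildcard. A Python sketch of the wildcard copy is given below for comparison. It is not part of the build scripts, and `copy_python_sources` and the temporary folder layout are hypothetical.

```python
import glob
import shutil
import tempfile
from pathlib import Path


def copy_python_sources(source_dir: Path, target_dir: Path) -> list:
    """Copy every top-level `*.py` file from source_dir into target_dir."""
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(glob.glob(str(source_dir / "*.py"))):
        shutil.copy(path, target_dir)
        copied.append(Path(path).name)
    return copied


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "ExecuteStage"
    src.mkdir()
    for name in ("utils.py", "myChrome.py", "requirements.txt"):
        (src / name).write_text("")
    copied = copy_python_sources(src, Path(tmp) / "Code")
```

The wildcard picks up any Python module added to `ExecuteStage` later without another edit to the scripts. Non-Python files such as `requirements.txt` and `Readme.md` are not matched, which is why the scripts keep their explicit copy lines.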
@ -215,7 +215,7 @@ exports.start = function (port = 8074) {
 let item = {
 id: task.id,
 name: task.name,
-url: task.url,
+url: task.links.split("\n")[0],
 mtime: stat.mtime,
 links: task.links,
 desc: task.desc,
|
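The new `url` value is the first entry of the task's `links` field, which the sample task files describe as a list of URLs separated by newlines. A Python equivalent of the JavaScript expression, with a hypothetical helper name and placeholder URLs:

```python
def first_link(links: str) -> str:
    """Return the first entry of a newline-separated URL list."""
    return links.split("\n")[0]


single = first_link("https://www.baidu.com")
several = first_link("https://a.example\nhttps://b.example")
```

As in the JavaScript version, a string with no newline is returned unchanged.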
@ -61,8 +61,8 @@
|
|||||||
<textarea class="form-control"
|
<textarea class="form-control"
|
||||||
style="margin:0 auto;width:90%; color:black; height: 450px; min-height: 200px; background: white"
|
style="margin:0 auto;width:90%; color:black; height: 450px; min-height: 200px; background: white"
|
||||||
readonly>
|
readonly>
|
||||||
This software is intended for educational and communication purposes only. It is strictly prohibited to use the software for any illegal activities or operations, such as crawling government/military websites that are not allowed to be crawled. The user bears all consequences resulting from the use of this software and the author shall not be held responsible or liable in any way. Furthermore, the software is protected by patent rights. If you intend to use it for commercial purposes or profit-making activities, such as using the software for client orders, selling the collected data, please contact author: naibowang@foxmail.com for patent authorization and payment operations: https://www.patentguru.com/cn/search?q=一种自定义提取流程的服务封装系统
|
This software is intended for educational and communication purposes only. It is strictly prohibited to use the software for any illegal activities or operations, such as crawling government/military websites that are not allowed to be crawled. The user bears all consequences resulting from the use of this software and the author shall not be held responsible or liable in any way.
|
||||||
For individual users, EasySpider is a completely free and ad-free open-source software. The development and maintenance of the software rely solely on the author's voluntary efforts. Therefore, you can choose to support the author, allowing them to have more enthusiasm and energy to maintain this software. Alternatively, if you have profited from using this software, you are welcome to support the author through the following methods:
|
EasySpider is a completely free and ad-free open-source software. The development and maintenance of the software rely solely on the author's voluntary efforts. Therefore, you can choose to support the author, allowing them to have more enthusiasm and energy to maintain this software. Alternatively, if you have profited from using this software, you are welcome to support the author through the following methods:
|
||||||
|
|
||||||
1. PayPal account: naibowang, or scan the QR code provided in the software package.
|
1. PayPal account: naibowang, or scan the QR code provided in the software package.
|
||||||
2. Alipay account: naibowang@foxmail.com, or scan the QR code provided in the software package.
|
2. Alipay account: naibowang@foxmail.com, or scan the QR code provided in the software package.
|
||||||
@ -164,9 +164,9 @@ For individual users, EasySpider is a completely free and ad-free open-source so
|
|||||||
<textarea class="form-control"
|
<textarea class="form-control"
|
||||||
style="margin:0 auto;width:90%; color:black; height: 480px; min-height: 200px; background: white"
|
style="margin:0 auto;width:90%; color:black; height: 480px; min-height: 200px; background: white"
|
||||||
readonly>
|
readonly>
|
||||||
本软件仅供学习交流使用,严禁使用软件进行任何违法违规的操作,如爬取不允许爬取的政府/军事机关网站等。使用本软件所造成的一切后果由使用者自负,与作者本人无关,作者不会承担任何责任。同时,软件受到专利权保护,如要用于商业用途,如使用软件进行盈利接单,用于公司业务,或出售采集到的数据等,请邮件联系作者:naibowang@foxmail.com进行专利授权等付费操作:https://www.patentguru.com/cn/search?q=一种自定义提取流程的服务封装系统
|
本软件仅供学习交流使用,严禁使用软件进行任何违法违规的操作,如爬取不允许爬取的政府/军事机关网站等。使用本软件所造成的一切后果由使用者自负,与作者本人无关,作者不会承担任何责任。
|
||||||
|
|
||||||
对于个人使用者来说,易采集EasySpider是一款完全免费无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
易采集EasySpider是一款完全免费无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
||||||
|
|
||||||
1、支付宝账号:naibowang@foxmail.com,也可以扫描软件包中带的二维码。
|
1、支付宝账号:naibowang@foxmail.com,也可以扫描软件包中带的二维码。
|
||||||
2、微信收款:扫描软件包中带的二维码。
|
2、微信收款:扫描软件包中带的二维码。
|
||||||
|
2
ElectronJS/stealth.min.js
vendored
File diff suppressed because one or more lines are too long
1
ElectronJS/tasks/326.json
Normal file
File diff suppressed because one or more lines are too long
2
ExecuteStage/.vscode/launch.json
vendored
@ -12,7 +12,7 @@
|
|||||||
"justMyCode": false,
|
"justMyCode": false,
|
||||||
// "args": ["--ids", "[7]", "--read_type", "remote", "--headless", "0"]
|
// "args": ["--ids", "[7]", "--read_type", "remote", "--headless", "0"]
|
||||||
// "args": ["--ids", "[9]", "--read_type", "remote", "--headless", "0", "--saved_file_name", "YOUTUBE"]
|
// "args": ["--ids", "[9]", "--read_type", "remote", "--headless", "0", "--saved_file_name", "YOUTUBE"]
|
||||||
"args": ["--ids", "[89]", "--headless", "0", "--user_data", "0", "--keyboard", "0",
|
"args": ["--ids", "[0]", "--headless", "0", "--user_data", "0", "--keyboard", "0",
|
||||||
"--read_type", "remote",
|
"--read_type", "remote",
|
||||||
]
|
]
|
||||||
// "args": "--ids '[97]' --user_data 1 --server_address http://localhost:8074 --config_folder '/Users/naibo/Documents/EasySpider/ElectronJS/' --headless 0 --read_type remote --config_file_name config.json --saved_file_name"
|
// "args": "--ids '[97]' --user_data 1 --server_address http://localhost:8074 --config_folder '/Users/naibo/Documents/EasySpider/ElectronJS/' --headless 0 --read_type remote --config_file_name config.json --saved_file_name"
|
||||||
|
@ -73,13 +73,13 @@ desired_capabilities["pageLoadStrategy"] = "none"
|
|||||||
|
|
||||||
|
|
||||||
class BrowserThread(Thread):
|
class BrowserThread(Thread):
|
||||||
def __init__(self, browser_t, id, service, version, event, saveName, config, option):
|
def __init__(self, browser_t, id, service, version, event, saveName, config, option, commandline_config=""):
|
||||||
Thread.__init__(self)
|
Thread.__init__(self)
|
||||||
self.logs = io.StringIO()
|
self.logs = io.StringIO()
|
||||||
self.log = bool(service.get("recordLog", True))
|
self.log = bool(service.get("recordLog", True))
|
||||||
self.browser = browser_t
|
self.browser = browser_t
|
||||||
self.option = option
|
self.option = option
|
||||||
self.config = config
|
self.commandline_config = commandline_config
|
||||||
self.version = version
|
self.version = version
|
||||||
self.totalSteps = 0
|
self.totalSteps = 0
|
||||||
self.id = id
|
self.id = id
|
||||||
@ -108,6 +108,8 @@ class BrowserThread(Thread):
|
|||||||
os.mkdir(self.downloadFolder + "/files")
|
os.mkdir(self.downloadFolder + "/files")
|
||||||
if not os.path.exists(self.downloadFolder + "/images"):
|
if not os.path.exists(self.downloadFolder + "/images"):
|
||||||
os.mkdir(self.downloadFolder + "/images")
|
os.mkdir(self.downloadFolder + "/images")
|
||||||
|
if not os.path.exists(self.downloadFolder + "/screenshots"):
|
||||||
|
os.mkdir(self.downloadFolder + "/screenshots")
|
||||||
self.getDataStep = 0
|
self.getDataStep = 0
|
||||||
self.startSteps = 0
|
self.startSteps = 0
|
||||||
try:
|
try:
|
||||||
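The hunk adds a third `exists`/`mkdir` pair for the new `screenshots` folder. One alternative, shown here only as a sketch and not as what the commit does, is a loop over the folder names with `os.makedirs(..., exist_ok=True)`, which creates a folder when it is missing and does nothing when it already exists. The helper name is hypothetical.

```python
import os
import tempfile


def ensure_download_folders(download_folder: str) -> list:
    """Create the files/images/screenshots subfolders if they are missing."""
    created = []
    for name in ("files", "images", "screenshots"):
        path = os.path.join(download_folder, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created


# Demo in a throwaway directory; calling twice shows the function is repeatable.
with tempfile.TemporaryDirectory() as tmp:
    folders = ensure_download_folders(tmp)
    all_exist = all(os.path.isdir(folder) for folder in folders)
    second_call_ok = ensure_download_folders(tmp) == folders
```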
@ -1136,7 +1138,7 @@ class BrowserThread(Thread):
|
|||||||
return index, element
|
return index, element
|
||||||
|
|
||||||
# 对循环的处理
|
# 对循环的处理
|
||||||
def loopExecute(self, node, loopValue, clickPath="", index=0):
|
def loopExecute(self, node, loopValue, loopPath="", index=0):
|
||||||
time.sleep(0.1) # 第一次执行循环的时候强制等待1秒
|
time.sleep(0.1) # 第一次执行循环的时候强制等待1秒
|
||||||
thisHandle = self.browser.current_window_handle # 记录本次循环内的标签页的ID
|
thisHandle = self.browser.current_window_handle # 记录本次循环内的标签页的ID
|
||||||
try:
|
try:
|
||||||
@ -1868,9 +1870,17 @@ class BrowserThread(Thread):
 width = size["width"]
 height = size["height"]
 # 调整浏览器窗口的大小
+if self.commandline_config["headless"] == 1: # 无头模式下,截取整个网页的高度
+page_width = self.browser.execute_script(
+"return document.body.scrollWidth")
+page_height = self.browser.execute_script(
+"return document.body.scrollHeight")
+self.browser.set_window_size(page_width, page_height)
+time.sleep(1)
+else:
 self.browser.set_window_size(width, height)
 element.screenshot("Data/Task_" + str(self.id) + "/" + self.saveName +
-"/" + str(time.time()) + ".png")
+"/screenshots/" + str(time.time()) + ".png")
 # 截图完成后,将浏览器的窗口大小设置为原来的大小
 self.browser.set_window_size(width, height)
 elif p["contentType"] == 8:
||||||
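The sizing decision in this hunk can be isolated into a small function for illustration. The sketch below is not the project's code: `FakeBrowser` is a hypothetical stand-in that implements only `execute_script` and `set_window_size`, the two WebDriver calls the hunk uses, and the one-second pause after resizing is left out.

```python
class FakeBrowser:
    """Stand-in for a WebDriver; returns fixed page dimensions."""

    def __init__(self, scroll_width, scroll_height):
        self._values = {
            "return document.body.scrollWidth": scroll_width,
            "return document.body.scrollHeight": scroll_height,
        }
        self.window_size = None

    def execute_script(self, script):
        return self._values[script]

    def set_window_size(self, width, height):
        self.window_size = (width, height)


def resize_for_screenshot(browser, headless, width, height):
    """Pick the window size used just before taking the element screenshot."""
    if headless:
        # Headless: grow the window to the full scrollable page.
        page_width = browser.execute_script("return document.body.scrollWidth")
        page_height = browser.execute_script("return document.body.scrollHeight")
        browser.set_window_size(page_width, page_height)
    else:
        # Windowed: use the width and height computed earlier in the hunk.
        browser.set_window_size(width, height)
    return browser.window_size


headless_size = resize_for_screenshot(FakeBrowser(1280, 5400), True, 800, 600)
windowed_size = resize_for_screenshot(FakeBrowser(1280, 5400), False, 800, 600)
```

In both branches the original code resets the window to `width` and `height` after the screenshot, so the resize only affects the capture itself.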
@ -2181,7 +2191,7 @@ class BrowserThread(Thread):
|
|||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
# 如果需要调试程序,请在命令行参数中加入--keyboard 0 来禁用键盘监听以提升调试速度
|
# 如果需要调试程序,请在命令行参数中加入--keyboard 0 来禁用键盘监听以提升调试速度
|
||||||
# If you need to debug the program, please add --keyboard 0 in the command line parameters to disable keyboard listening to improve debugging speed
|
# If you need to debug the program, please add --keyboard 0 in the command line parameters to disable keyboard listening to improve debugging speed
|
||||||
config = {
|
commandline_config = {
|
||||||
"ids": [0],
|
"ids": [0],
|
||||||
"saved_file_name": "",
|
"saved_file_name": "",
|
||||||
"user_data": False,
|
"user_data": False,
|
||||||
@ -2196,7 +2206,7 @@ if __name__ == '__main__':
|
|||||||
"docker_driver": "",
|
"docker_driver": "",
|
||||||
"user_folder": "",
|
"user_folder": "",
|
||||||
}
|
}
|
||||||
c = Config(config)
|
c = Config(commandline_config)
|
||||||
print(c)
|
print(c)
|
||||||
options = webdriver.ChromeOptions()
|
options = webdriver.ChromeOptions()
|
||||||
driver_path = "chromedriver.exe"
|
driver_path = "chromedriver.exe"
|
||||||
@ -2438,7 +2448,7 @@ if __name__ == '__main__':
|
|||||||
event = Event()
|
event = Event()
|
||||||
event.set()
|
event.set()
|
||||||
thread = BrowserThread(browser_t, id, service,
|
thread = BrowserThread(browser_t, id, service,
|
||||||
c.version, event, c.saved_file_name, config=config, option=tmp_options[i])
|
c.version, event, c.saved_file_name, config=config, option=tmp_options[i], commandline_config=c)
|
||||||
print("Thread with task id: ", id, " is created")
|
print("Thread with task id: ", id, " is created")
|
||||||
threads.append(thread)
|
threads.append(thread)
|
||||||
thread.start()
|
thread.start()
|
||||||
|
73
Readme.md
@ -1,34 +1,39 @@
|
|||||||
# 易采集/EasySpider: Visual Code-Free Web Crawler
|
# 易采集/EasySpider: Visual Code-Free Web Crawler
|
||||||
|
|
||||||
一个可视化浏览器自动化测试/数据采集/爬虫软件,可以使用图形化界面,无代码可视化的设计和执行任务。只需要在网页上选择自己想要操作的内容并根据提示框操作即可完成任务的设计和执行。同时软件还可以单独以命令行的方式进行执行,从而可以很方便的嵌入到其他系统中。
|
一个**完全免费**(**包括商业使用和二次开发**)的可视化浏览器自动化测试/数据采集/爬虫软件,可以使用图形化界面,无代码可视化的设计和执行任务。只需要在网页上选择自己想要操作的内容并根据提示框操作即可完成任务的设计和执行。同时软件还可以单独以命令行的方式进行执行,从而可以很方便的嵌入到其他系统中。
|
||||||
|
|
||||||
|
A **completely free (including for commercial use and secondary development)** visual browser automation test/data collection/crawler software, which can be used to design and execute tasks in a code-free visual way. You only need to select the content you want to operate on the web page and follow the prompts to complete the design and execution of the task. At the same time, the software can also be executed separately in the command line, so that it can be easily embedded into other systems.
|
||||||
|
|
||||||
|
|
||||||
A visual browser automation test/data collection/crawler software, which can be used to design and execute tasks in a code-free visual way. You only need to select the content you want to operate on the web page and follow the prompts to complete the design and execution of the task. At the same time, the software can also be executed separately in the command line, so that it can be easily embedded into other systems.
|
|
||||||
|
|
||||||
<a href="https://trendshift.io/repositories/3367" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3367" alt="NaiboWang%2FEasySpider | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
|
<a href="https://trendshift.io/repositories/3367" target="_blank"><img src="https://trendshift.io/api/badge/repositories/3367" alt="NaiboWang%2FEasySpider | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
|
||||||
|
|
||||||
## 下载易采集/Download EasySpider
|
## 下载易采集/Download EasySpider
|
||||||
|
|
||||||
进入 [Releases Page](https://github.com/NaiboWang/EasySpider/releases) 下载最新版本。如果下载速度慢,可以考虑中国境内下载地址:[中国境内下载地址](https://www.easyspider.cn/download.html)。
|
进入 [Releases Page](https://github.com/NaiboWang/EasySpider/releases) 下载最新版本。如果下载速度慢,可以考虑中国境内下载地址:[中国境内下载地址](https://www.easyspider.net/download.html)。
|
||||||
|
|
||||||
Refer to the [Releases Page](https://github.com/NaiboWang/EasySpider/releases) to download the latest version of EasySpider.
|
Refer to the [Releases Page](https://github.com/NaiboWang/EasySpider/releases) to download the latest version of EasySpider.
|
||||||
|
|
||||||
## 赞助者/Sponsors | [First Sponsor: CapSolver](https://www.capsolver.com/zh?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider)
|
## 赞助者/Sponsors
|
||||||
|
|
||||||
<a target="_blank" href="https://www.proxy302.com/?ref=wangnaibo"><img src="media/Proxy302.jpg" width=850></img></a>
|
<a target="_blank" href="https://get.brightdata.com/njx4r"><img src="media/BrightData.svg" width=850></img></a>
|
||||||
|
|
||||||
[Proxy302](https://www.proxy302.com/?ref=wangnaibo)是一个全球代理IP自助超市。按需付费,无需套餐捆绑购买;无阶梯式定价,充值即可使用所有类型的代理IP;免费测试,注册获取$1测试额度。覆盖全球240+国家和地区,6500万个住宅IP可供选择。Proxy302可配合EasySpider进行数据采集。
|
[亮数据BrightData](https://get.brightdata.com/njx4r)是代理市场领导者,覆盖全球的7200万IP,提供真人住宅IP,即时批量采集网络公开数据,成功率亲测有保证。需要性价比高代理IP的可**点击上方图片注册**后联系中文客服,开通后免费试用,**现在有首充多少就送多少的活动**。BrightData可配合EasySpider进行数据采集。
|
||||||
|
|
||||||
<a target="_blank" href="https://get.brightdata.com/naibowang"><img src="media/BrightData.png" width=850></img></a>
|
<a target="_blank" href="https://www.ipwo.net/?code=KK9YVWI2L"><img src="media/IPWO_Proxy.gif" width=850></img></a>
|
||||||
|
|
||||||
[亮数据BrightData](https://get.brightdata.com/naibowang)是代理市场领导者,覆盖全球的7200万IP,提供真人住宅IP,即时批量采集网络公开数据,成功率亲测有保证。需要性价比高代理IP的可**点击上方图片注册**后联系中文客服,开通后免费试用,**现在有首充多少就送多少的活动**。BrightData可配合EasySpider进行数据采集。
|
[IPWO](https://www.ipwo.net/?code=KK9YVWI2L)支持免费测试,作为行业领先的代理IP提供商,拥有 9000万+真实住宅IP,覆盖200+国家和地区,支持无限并发,可用率高达99.9%,帮助用户轻松突破地理限制,实现高效、安全的全球网络访问,与EasySpider完美结合,助力数据采集,尽享无缝体验。
|
||||||
|
|
||||||
|
<!-- <a target="_blank" href="https://www.capsolver.com/zh?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider"><img src="media/capsolver.jpg" width=850></img></a> -->
|
||||||
<a target="_blank" href="https://www.capsolver.com/zh?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider"><img src="media/capsolver.jpg" width=850></img></a>
|
|
||||||
|
|
||||||
<!-- [](https://www.capsolver.com/zh?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) -->
|
<!-- [](https://www.capsolver.com/zh?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) -->
|
||||||
|
|
||||||
[Capsolver.com](https://www.capsolver.com/?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) is an AI-powered service that specializes in solving various types of captchas automatically. It supports captchas such as [reCAPTCHA V2](https://docs.capsolver.com/guide/captcha/ReCaptchaV2.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [reCAPTCHA V3](https://docs.capsolver.com/guide/captcha/ReCaptchaV3.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [DataDome](https://docs.capsolver.com/guide/captcha/DataDome.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [AWS Captcha](https://docs.capsolver.com/guide/captcha/awsWaf.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [Geetest](https://docs.capsolver.com/guide/captcha/Geetest.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), and Cloudflare [Captcha](https://docs.capsolver.com/guide/antibots/cloudflare_turnstile.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) / [Challenge 5s](https://docs.capsolver.com/guide/antibots/cloudflare_challenge.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [Imperva / Incapsula](https://docs.capsolver.com/guide/antibots/imperva.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), among others.
|
<!-- [Capsolver.com](https://www.capsolver.com/?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) is an AI-powered service that specializes in solving various types of captchas automatically. It supports captchas such as [reCAPTCHA V2](https://docs.capsolver.com/guide/captcha/ReCaptchaV2.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [reCAPTCHA V3](https://docs.capsolver.com/guide/captcha/ReCaptchaV3.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [DataDome](https://docs.capsolver.com/guide/captcha/DataDome.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [AWS Captcha](https://docs.capsolver.com/guide/captcha/awsWaf.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [Geetest](https://docs.capsolver.com/guide/captcha/Geetest.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), and Cloudflare [Captcha](https://docs.capsolver.com/guide/antibots/cloudflare_turnstile.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider) / [Challenge 5s](https://docs.capsolver.com/guide/antibots/cloudflare_challenge.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), [Imperva / Incapsula](https://docs.capsolver.com/guide/antibots/imperva.html?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), among others.
|
||||||
For developers, Capsolver offers API integration options detailed in their [documentation](https://docs.capsolver.com/?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), facilitating the integration of captcha solving into applications. They also provide browser extensions for [Chrome](https://chromewebstore.google.com/detail/captcha-solver-auto-captc/pgojnojmmhpofjgdmaebadhbocahppod) and [Firefox](https://addons.mozilla.org/es/firefox/addon/capsolver-captcha-solver/), making it easy to use their service directly within a browser. Different pricing packages are available to accommodate varying needs, ensuring flexibility for users.
|
For developers, Capsolver offers API integration options detailed in their [documentation](https://docs.capsolver.com/?utm_source=github&utm_medium=banner_github&utm_campaign=easyspider), facilitating the integration of captcha solving into applications. They also provide browser extensions for [Chrome](https://chromewebstore.google.com/detail/captcha-solver-auto-captc/pgojnojmmhpofjgdmaebadhbocahppod) and [Firefox](https://addons.mozilla.org/es/firefox/addon/capsolver-captcha-solver/), making it easy to use their service directly within a browser. Different pricing packages are available to accommodate varying needs, ensuring flexibility for users. -->
|
||||||
|
|
||||||
|
<!-- <a target="_blank" href="https://www.proxy302.com/?ref=wangnaibo"><img src="media/Proxy302.jpg" width=850></img></a>
|
||||||
|
|
||||||
|
[Proxy302](https://www.proxy302.com/?ref=wangnaibo)是一个全球代理IP自助超市。按需付费,无需套餐捆绑购买;无阶梯式定价,充值即可使用所有类型的代理IP;免费测试,注册获取$1测试额度。覆盖全球240+国家和地区,6500万个住宅IP可供选择。Proxy302可配合EasySpider进行数据采集。
|
||||||
|
|
||||||
<a target="_blank" href="https://www.123proxy.cn/?utm_source=EasySpider"><img src="media/123proxy.png" width=850></img></a>
|
<a target="_blank" href="https://www.123proxy.cn/?utm_source=EasySpider"><img src="media/123proxy.png" width=850></img></a>
|
||||||
|
|
||||||
@ -36,13 +41,13 @@ For developers, Capsolver offers API integration options detailed in their [docu
|
|||||||
|
|
||||||
<a target="_blank" href="https://koala-ip.com/"><img src="media/Koala-IP.png" width=850></img></a>
|
<a target="_blank" href="https://koala-ip.com/"><img src="media/Koala-IP.png" width=850></img></a>
|
||||||
|
|
||||||
[Koala-IP](https://koala-ip.com/)提供海量低价高质量代理IP服务,致力于为客户提供[最优价格](https://zh-cn.koala-ip.com/var-ip)和最稳定的代理IP解决方案。无论你是需要网络爬虫、数据抓取、隐私保护还是跨地域访问,[Koala-IP(中文)](https://zh-cn.koala-ip.com/) 都能满足你的所有需求。[立即注册Koala-IP](https://koala-ip.com/admin/register),享受超高性价比的代理IP服务,提升你的业务效益!
|
[Koala-IP](https://koala-ip.com/)提供海量低价高质量代理IP服务,致力于为客户提供[最优价格](https://zh-cn.koala-ip.com/var-ip)和最稳定的代理IP解决方案。无论你是需要网络爬虫、数据抓取、隐私保护还是跨地域访问,[Koala-IP(中文)](https://zh-cn.koala-ip.com/) 都能满足你的所有需求。[立即注册Koala-IP](https://koala-ip.com/admin/register),享受超高性价比的代理IP服务,提升你的业务效益! -->
|
||||||
|
|
||||||
## 官方网站/Official Website
|
## 官方网站/Official Website
|
||||||
|
|
||||||
访问易采集官网:www.easyspider.cn
|
访问易采集官网:[www.easyspider.cn](http://www.easyspider.cn)
|
||||||
|
|
||||||
Visit the official website of EasySpider: www.easyspider.net
|
Visit the official website of EasySpider: [www.easyspider.net](http://www.easyspider.net)
|
||||||
|
|
||||||
## 软件使用示例/Software Usage Example
|
## 软件使用示例/Software Usage Example
|
||||||
|
|
||||||
@ -74,7 +79,7 @@ More features please scroll to the bottom of this page to view.
|
|||||||
|
|
||||||
## 支持作者/Support Author
|
## 支持作者/Support Author
|
||||||
|
|
||||||
易采集EasySpider是一款完全免费无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
易采集EasySpider是一款完全免费且使用中无广告的开源软件,软件开发和维护全靠作者用爱发电,因此您可以选择支持作者让作者有更多的热情和精力维护此软件,或者您使用了此软件进行了盈利,欢迎您通过下面的方式支持作者:
|
||||||
|
|
||||||
1. Github Sponsor:直接点击右侧**Sponsor**按钮赞助。
|
1. Github Sponsor:直接点击右侧**Sponsor**按钮赞助。
|
||||||
2. 支付宝账号:naibowang@foxmail.com,也可以扫描下方二维码。
|
2. 支付宝账号:naibowang@foxmail.com,也可以扫描下方二维码。
|
||||||
@ -156,13 +161,41 @@ This software is for learning and communication only. **It is strictly forbidden
|
|||||||
|
|
||||||
For the crawler operations of government and military websites, **the author will not answer any questions** in order to avoid violating relevant national laws, regulations and policies.
|
For the crawler operations of government and military websites, **the author will not answer any questions** in order to avoid violating relevant national laws, regulations and policies.
|
||||||
|
|
||||||
同时,软件受到专利权保护,如要用于商业用途,如使用软件进行盈利接单,出售采集到的数据,或将软件集成到自己的系统中等,请邮件联系作者:naibowang@foxmail.com
|
EasySpider遵循AGPL-3.0协议,**任何个人和企业都可以免费使用软件本身或使用源代码进行二次开发,无需联系作者进行商业(专利)授权**,但需要注意AGPL-3.0协议的相关规则:
|
||||||
|
|
||||||
Meanwhile, the software is protected by patent rights. If it is used for commercial purposes, such as using the software to make profits, selling the collected data, or integrating the software into your own system, please contact the author by email: naibowang@foxmail.com
|
EasySpider complies with the AGPL-3.0 agreement. **Any individual or enterprise can use the software for free and use the software source code for secondary development without contacting the author for commercial (patent) authorization.** However, it is necessary to pay attention to the related rules of the AGPL-3.0 agreement:
|
||||||
|
|
||||||
<!-- [杭州天勤知识产权代理有限公司](http://www.tqip.com/)进行专利授权等付费操作。 -->
|
### 1. Copyleft(传染性) / Copyleft (Viral Clause)
|
||||||
|
- **衍生作品 / Derivative Works**
|
||||||
|
- 任何基于 AGPL 代码的修改或衍生作品,必须**以相同许可证(AGPL-3.0)发布**。
|
||||||
|
- Any modifications or derivative works based on AGPL code must be **licensed under AGPL-3.0**.
|
||||||
|
- **联动范围 / Scope of Copyleft**
|
||||||
|
- 若 AGPL 代码与其他代码结合(如静态链接、紧密集成),整个作品需遵守 AGPL。
|
||||||
|
- If AGPL code is combined with other code (e.g., static linking), the entire work must comply with AGPL.
|
||||||
|
|
||||||
<!-- At the same time, the software is protected by patent rights. If it is used for commercial purposes, such as using the software to make profits, selling the collected data, etc., please contact [Hangzhou Tianqin Intellectual Property Agency Co., Ltd.](http://www.tqip.com/) for patent authorization and other paid operations. -->
|
### 2. 网络使用条款 / Network Use Clause
|
||||||
|
- **SaaS 触发开源义务 / SaaS Trigger**
|
||||||
|
- 若软件以服务形式提供(如网站、API),必须向所有用户公开**完整对应源代码**(包括修改后的代码)。
|
||||||
|
- If the software is provided as a service (e.g., website, API), the **full corresponding source code** (including modifications) must be made available to all users.
|
||||||
|
- **用户权利 / User Rights**
|
||||||
|
- 服务的接收者可通过下载或书面请求获取源码。
|
||||||
|
- Service recipients may obtain the source code via download or written request.
|
||||||
|
|
||||||
|
### 3. 源码提供要求 / Source Code Provision
|
||||||
|
- **二进制分发 / Binary Distribution**
|
||||||
|
- 必须附带源码或提供获取渠道(如下载链接)。
|
||||||
|
- Source code must be included or a download link provided.
|
||||||
|
- **网络服务场景 / Network Service Scenario**
|
||||||
|
- 需通过服务界面**显式提供源码链接**,或向用户书面承诺提供源码。
|
||||||
|
- The service interface must **explicitly provide a source code link** or provide users a written offer for the source code.
|
||||||
|
|
||||||
|
### 4. 专利授权 / Patent Grant
|
||||||
|
- 贡献者自动授予用户与软件相关的专利许可,禁止专利诉讼。
|
||||||
|
- Contributors automatically grant users patent rights related to the software, and prohibit patent litigation.
|
||||||
|
|
||||||
|
### 5. 免责声明 / Disclaimer
|
||||||
|
- 软件按“原样”提供,作者**不承担任何责任**(无担保条款)。
|
||||||
|
- The software is provided "as is" with **no warranties or liabilities**.
|
||||||
|
|
||||||
|
|
||||||
## 答疑QQ群
|
## 答疑QQ群
|
||||||
|
Binary file not shown.
Before Width: | Height: | Size: 207 KiB |
398
media/BrightData.svg
Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 152 KiB |
BIN
media/IPWO_Proxy.gif
Normal file
Binary file not shown.
After Width: | Height: | Size: 1.2 MiB |