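Add a "keep scrolling until the page content stops changing" scroll type; the single-element loop now also exits by default when the page content stops changing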

naibo 2023-07-12 04:19:06 +08:00
parent ef0acf838b
commit 0ed3818eaf
28 changed files with 115 additions and 139059 deletions

8 binary files not shown (new files, 456 KiB each).

View File

@ -1,3 +1,4 @@
sys_version.json
node_modules/ node_modules/
out/ out/
chromedrivers/ chromedrivers/

View File

@ -3,7 +3,7 @@
EasySpider分三部分 EasySpider分三部分
1. 主程序:在`ElectronJS`文件夹下。 1. 主程序:在`ElectronJS`文件夹下。
2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作控制台”的代码,打包后的扩展在此目录下的`EasySpider_zh.crx`文件。 2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作台”的代码,打包后的扩展在此目录下的`EasySpider_zh.crx`文件。
3. 执行阶段程序:在`ExecuteStage`文件夹下。 3. 执行阶段程序:在`ExecuteStage`文件夹下。
此部分为`主程序`的编译说明。 此部分为`主程序`的编译说明。
@ -53,7 +53,7 @@ chrome_linux64/ # for linux x64
chrome_mac64/ # for mac x64 chrome_mac64/ # for mac x64
``` ```
然后,从下面的页面下载和**自己安装的Chrome版本一致**的Chromedriver[https://chromedriver.chromium.org/downloads](https://chromedriver.chromium.org/downloads)把chromedriver放入刚刚的chrome文件夹内并更名为下面的格式 然后,从下面的页面下载和**自己安装的Chrome版本一致**的Chromedriver[https://chromedriver.chromium.org/downloads](https://chromedriver.chromium.org/downloads)把chromedriver放入刚刚的`chrome`文件夹内,并更名为下面的格式:
``` ```
chromedriver_win32.exe # for windows x32 chromedriver_win32.exe # for windows x32

View File

@ -38,6 +38,29 @@ let driverPath = "";
let chromeBinaryPath = ""; let chromeBinaryPath = "";
let execute_path = ""; let execute_path = "";
console.log(process.arch); console.log(process.arch);
// Ask Windows 7 users running the x64 build to switch to the x32 build.
// On non-Windows platforms `wmic` is unavailable, so the callback returns early via the error branch.
exec(`wmic os get Caption`, function(error, stdout, stderr) {
    if (error) {
        console.error(`执行的错误: ${error}`);
        return;
    }
    if (stdout.includes('Windows 7')) {
        console.log('Windows 7');
        // sys_version.json is read for its `arch` field.
        let sys_version = fs.readFileSync(path.join(__dirname, `sys_version.json`), 'utf8');
        sys_version = JSON.parse(sys_version);
        if (sys_version.arch === 'x64') {
            dialog.showMessageBoxSync({
                type: 'error',
                title: 'Error',
                message: 'Windows 7系统请下载使用x32版本的软件不论Win 7系统为x64还是x32版本。\nFor Windows 7, please download and use the x32 version of the software, regardless of whether the Win 7 system is x64 or x32 version.',
            });
        }
    } else {
        console.log('Not Windows 7');
    }
});
if (process.platform === 'win32' && process.arch === 'ia32') { if (process.platform === 'win32' && process.arch === 'ia32') {
driverPath = path.join(__dirname, "chrome_win32/chromedriver_win32.exe"); driverPath = path.join(__dirname, "chrome_win32/chromedriver_win32.exe");
chromeBinaryPath = path.join(__dirname, "chrome_win32/chrome.exe"); chromeBinaryPath = path.join(__dirname, "chrome_win32/chrome.exe");

View File

@ -136,7 +136,7 @@
style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">查看/管理/执行任务</a> style="margin-top: 15px; width: 300px;height:60px;padding-top:12px;color:white">查看/管理/执行任务</a>
</p> </p>
<p> <p>
<a href="https://www.easyspider.cn?lang=zh" target="_blank" style="text-align: center; font-size: 18px">点此访问官网查看教程</a> <a href="https://www.easyspider.cn?lang=zh" target="_blank" style="text-align: center; font-size: 18px">点此访问官网查看文档/视频教程</a>
</p> </p>
<div class="img-container"> <div class="img-container">
<!-- <h5>出品方</h5>--> <!-- <h5>出品方</h5>-->
@ -167,6 +167,7 @@
<a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a> <a @click="step = 0" class="btn btn-outline-primary btn-lg"style="margin-top: 10px; width: 322px;height:45px;padding-top:5px">返回首页</a>
</p> </p>
</div> </div>
<div v-else-if="step == 2"> <div v-else-if="step == 2">
<h4 style="margin-top: 20px">指定用户信息目录</h4> <h4 style="margin-top: 20px">指定用户信息目录</h4>

View File

@ -55,7 +55,7 @@
<div style="text-align: left;margin: 10px;font-size:15px!important">Click button above and then click the flowchart to insert.</div> <div style="text-align: left;margin: 10px;font-size:15px!important">Click button above and then click the flowchart to insert.</div>
</div> </div>
</div> </div>
<div style="margin-top:20px;border: solid;height:850px;overflow: auto;width:58%;float:right"> <div style="margin-top:20px;border: solid;height:850px;overflow: auto;width:58%;float:right" id="flowchart_graph">
<div id="0" class="clk" data="0"> <div id="0" class="clk" data="0">
</div> </div>
<div style="border-radius: 50%;width: 40px;height: 40px;border:solid black;margin: 5px auto;background-color:lightcyan"> <div style="border-radius: 50%;width: 40px;height: 40px;border:solid black;margin: 5px auto;background-color:lightcyan">
@ -110,6 +110,7 @@
<option value = 0>No scrolling</option> <option value = 0>No scrolling</option>
<option value = 1>Scroll one screen</option> <option value = 1>Scroll one screen</option>
<option value = 2>Scroll to the end</option> <option value = 2>Scroll to the end</option>
<option value = 3>Keep scrolling until the page data does not change</option>
</select> </select>
<label>Scroll Times (the wait time after scrolling <b>ineffective</b> when the scrolling type is set to <b>no scrolling</b>):</label> <label>Scroll Times (the wait time after scrolling <b>ineffective</b> when the scrolling type is set to <b>no scrolling</b>):</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
@ -155,6 +156,7 @@
<option value = 0>No Scrolling</option> <option value = 0>No Scrolling</option>
<option value = 1>Scroll one screen</option> <option value = 1>Scroll one screen</option>
<option value = 2>Scroll to the end</option> <option value = 2>Scroll to the end</option>
<option value = 3>Keep scrolling until the page data does not change</option>
</select> </select>
<label>Scroll Times:</label> <label>Scroll Times:</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
@ -449,7 +451,7 @@
<input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input> <input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input>
</div> </div>
<!-- 这里添加退出循环条件,找不到元素肯定退出循环 --> <!-- 这里添加退出循环条件,找不到元素肯定退出循环 -->
<label v-if='parseInt(loopType) == 0'>Max Loop time0 means infinite:</label> <label v-if='parseInt(loopType) == 0'>Max Loop time (0 means infinite until cannot find the element or page content doesn't change):</label>
<input onkeydown="inputDelete(event)" required v-if='parseInt(loopType) == 0' class="form-control" type="number" v-model.number='nowNode["parameters"]["exitCount"]'></input> <input onkeydown="inputDelete(event)" required v-if='parseInt(loopType) == 0' class="form-control" type="number" v-model.number='nowNode["parameters"]["exitCount"]'></input>
<div id="breakAdvanced" v-if='nowNode["parameters"]["loopType"] < 5'> <div id="breakAdvanced" v-if='nowNode["parameters"]["loopType"] < 5'>
@ -475,6 +477,7 @@
<option value = 0>No Scrolling</option> <option value = 0>No Scrolling</option>
<option value = 1>Scroll one screen</option> <option value = 1>Scroll one screen</option>
<option value = 2>Scroll to the end</option> <option value = 2>Scroll to the end</option>
<option value = 3>Keep scrolling until the page data does not change</option>
</select> </select>
<label>Scroll Times:</label> <label>Scroll Times:</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
@ -537,10 +540,10 @@
</div> </div>
<div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel" aria-hidden="true"> <div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel" aria-hidden="true">
<div class="modal-dialog modal-lg"> <div class="modal-dialog modal-xl">
<div class="modal-content"> <div class="modal-content">
<div class="modal-header"> <div class="modal-header">
<h4 class="modal-title" id="myModalLabel">Save Task</h4> <h4 class="modal-title" id="myModalLabel">Save Task (Can press Ctrl + S to open this modal)</h4>
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">&times;</button> <button type="button" class="close" data-dismiss="modal" aria-hidden="true">&times;</button>
</div> </div>
<div class="modal-body" style="height:400px;overflow: auto"> <div class="modal-body" style="height:400px;overflow: auto">
@ -558,7 +561,7 @@
<option value = "txt">TXT</option> <option value = "txt">TXT</option>
<option value = "mysql">MySQL Database</option> <option value = "mysql">MySQL Database</option>
</select> </select>
<label>Export File Name/Database Table Name (The keyword "current_time" will be replaced with the timestamp when the task is executed):</label> <label>Export File Name/Database Table Name (Can use ../ to represent relative path to change the file save location,the keyword "current_time" will be replaced with the timestamp when the task is executed):</label>
<input onkeydown="inputDelete(event)" value="current_time" id="saveName" class="form-control"></input> <input onkeydown="inputDelete(event)" value="current_time" id="saveName" class="form-control"></input>
<label>Is it an extreme anti-scraping website like Cloudflare (<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">Watch Tutorial</a>)?</label> <label>Is it an extreme anti-scraping website like Cloudflare (<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">Watch Tutorial</a>)?</label>
<select id="cloudflare" name="cloudflare" class="form-control"> <select id="cloudflare" name="cloudflare" class="form-control">
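
Note on the export-name label in this file: a minimal sketch of how a name containing `../` and `current_time` could resolve to a file path. The `Data` base folder, the timestamp format, and the `resolve_save_path` helper are all assumptions made for illustration; this diff does not show how EasySpider actually performs the substitution.

```python
import posixpath
import time
from typing import Optional


def resolve_save_path(
    save_name: str,
    base_dir: str = "Data",          # assumed default output folder (hypothetical)
    extension: str = "csv",
    stamp: Optional[str] = None,     # injectable so the result can be reproduced
) -> str:
    """Replace the `current_time` keyword and resolve `../` against base_dir."""
    if stamp is None:
        # Assumed timestamp format; the real format is not shown in the diff.
        stamp = time.strftime("%Y_%m_%d_%H_%M_%S")
    file_name = save_name.replace("current_time", stamp) + "." + extension
    # posixpath keeps the separator the same on every OS for this illustration.
    return posixpath.normpath(posixpath.join(base_dir, file_name))


# A leading "../" moves the file out of the base folder.
example = resolve_save_path("../exports/run_current_time", stamp="2023_07_12_04_19_06")
print(example)
```

Passing `stamp` explicitly is only there to make the example deterministic; a real run would use the time at which the task starts.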

View File

@ -46,6 +46,13 @@ let app = new Vue({
paraIndex: 0, //当前参数的index paraIndex: 0, //当前参数的index
XPaths: "", //xpath列表 XPaths: "", //xpath列表
}, },
mounted: function() {
    // setTimeout(function () {
    //     $("#flowchart_graph")[0].scrollTo(0, 10000);
    //     window.scrollTo(0, 10000);
    //     console.log("scroll")
    // }, 500);
},
watch: { watch: {
nowArrow: { //变量发生变化的时候进行一些操作 nowArrow: { //变量发生变化的时候进行一些操作
deep: true, deep: true,
@ -498,6 +505,7 @@ function toolBoxKernel(e, para = null) {
return t; return t;
} }
option = 0; option = 0;
} }
$(".options").mousedown(function() { $(".options").mousedown(function() {

View File

@ -55,7 +55,7 @@
<div style="text-align: left;margin: 10px;font-size:15px!important">提示:点击上方操作按钮后点击要放置元素的位置处的箭头,可按取消操作按钮取消。</div> <div style="text-align: left;margin: 10px;font-size:15px!important">提示:点击上方操作按钮后点击要放置元素的位置处的箭头,可按取消操作按钮取消。</div>
</div> </div>
</div> </div>
<div style="margin-top:20px;border: solid;height:850px;overflow: auto;width:58%;float:right"> <div style="margin-top:20px;border: solid;height:850px;overflow: auto;width:58%;float:right" id="flowchart_graph">
<div id="0" class="clk" data="0"> <div id="0" class="clk" data="0">
</div> </div>
<div style="border-radius: 50%;width: 40px;height: 40px;border:solid black;margin: 5px auto;background-color:lightcyan"> <div style="border-radius: 50%;width: 40px;height: 40px;border:solid black;margin: 5px auto;background-color:lightcyan">
@ -110,8 +110,9 @@
<option value = 0>不滚动</option> <option value = 0>不滚动</option>
<option value = 1>向下滚动一屏</option> <option value = 1>向下滚动一屏</option>
<option value = 2>滚动到底部</option> <option value = 2>滚动到底部</option>
<option value = 3>一直滚动直到页面内容无变化(需设置好滚动后的等待时间,等待时间太短容易检测不到新数据)</option>
</select> </select>
<label>滚动次数(滚动类型设置为<b>不滚动</b><b>无效</b></label> <label>滚动次数(滚动类型设置为<b>不滚动</b><b>一直滚动</b>时请忽略此项</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
<label>滚动后等待时间(秒):</label> <label>滚动后等待时间(秒):</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
@ -155,8 +156,9 @@
<option value = 0>不滚动</option> <option value = 0>不滚动</option>
<option value = 1>向下滚动一屏</option> <option value = 1>向下滚动一屏</option>
<option value = 2>滚动到底部</option> <option value = 2>滚动到底部</option>
<option value = 3>一直滚动直到页面内容无变化(需设置好滚动后的等待时间,等待时间太短容易检测不到新数据)</option>
</select> </select>
<label>滚动次数(滚动类型设置为<b>不滚动</b><b>无效</b></label> <label>滚动次数(滚动类型设置为<b>不滚动</b><b>一直滚动</b>时请忽略此项</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
<label>滚动后等待时间(秒):</label> <label>滚动后等待时间(秒):</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
@ -449,7 +451,7 @@
<input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input> <input onkeydown="inputDelete(event)" required class="form-control" type="number" v-model.number='nowNode["parameters"]["waitTime"]'></input>
</div> </div>
<!-- 这里添加退出循环条件,找不到元素肯定退出循环 --> <!-- 这里添加退出循环条件,找不到元素肯定退出循环 -->
<label v-if='parseInt(loopType) == 0'>最多执行循环次数0代表无限循环直到找不到元素为止</label> <label v-if='parseInt(loopType) == 0'>最多执行循环次数0代表无限循环直到找不到元素或数据变化为止):</label>
<input onkeydown="inputDelete(event)" required v-if='parseInt(loopType) == 0' class="form-control" type="number" v-model.number='nowNode["parameters"]["exitCount"]'></input> <input onkeydown="inputDelete(event)" required v-if='parseInt(loopType) == 0' class="form-control" type="number" v-model.number='nowNode["parameters"]["exitCount"]'></input>
<div id="breakAdvanced" v-if='nowNode["parameters"]["loopType"] < 5'> <div id="breakAdvanced" v-if='nowNode["parameters"]["loopType"] < 5'>
@ -475,8 +477,9 @@
<option value = 0>不滚动</option> <option value = 0>不滚动</option>
<option value = 1>向下滚动一屏</option> <option value = 1>向下滚动一屏</option>
<option value = 2>滚动到底部</option> <option value = 2>滚动到底部</option>
<option value = 3>一直滚动直到页面内容无变化(需设置好滚动后的等待时间,等待时间太短容易检测不到新数据)</option>
</select> </select>
<label>滚动次数(滚动类型设置为<b>不滚动</b><b>无效</b></label> <label>滚动次数(滚动类型设置为<b>不滚动</b><b>一直滚动</b>时请忽略此项</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollCount']" type="number" required></input>
<label>滚动后等待时间(秒):</label> <label>滚动后等待时间(秒):</label>
<input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input> <input onkeydown="inputDelete(event)" class="form-control" v-model.number="nowNode['parameters']['scrollWaitTime']" type="number" required></input>
@ -537,10 +540,10 @@
</div> </div>
<div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel" aria-hidden="true"> <div class="modal fade" id="myModal" tabindex="-1" role="dialog" aria-labelledby="myModalLabel" aria-hidden="true">
<div class="modal-dialog modal-lg"> <div class="modal-dialog modal-xl">
<div class="modal-content"> <div class="modal-content">
<div class="modal-header"> <div class="modal-header">
<h4 class="modal-title" id="myModalLabel">保存任务</h4> <h4 class="modal-title" id="myModalLabel">保存任务可按Ctrl+S调出此窗口</h4>
<button type="button" class="close" data-dismiss="modal" aria-hidden="true">&times;</button> <button type="button" class="close" data-dismiss="modal" aria-hidden="true">&times;</button>
</div> </div>
<div class="modal-body" style="height:400px;overflow: auto"> <div class="modal-body" style="height:400px;overflow: auto">
@ -558,7 +561,7 @@
<option value = "txt">TXT</option> <option value = "txt">TXT</option>
<option value = "mysql">MySQL数据库</option> <option value = "mysql">MySQL数据库</option>
</select> </select>
<label>导出文件名/数据库表格名称名称中的“current_time”会被替换为执行任务时的时间戳</label> <label>导出文件名/数据库表格名称(可使用../表示相对路径以改变文件保存位置,名称中的“current_time”会被替换为执行任务时的时间戳</label>
<input onkeydown="inputDelete(event)" value="current_time" id="saveName" class="form-control"></input> <input onkeydown="inputDelete(event)" value="current_time" id="saveName" class="form-control"></input>
<label>是否为Cloudflare等极端反爬网站<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">查看Cloudflare设计和执行教程</a></label> <label>是否为Cloudflare等极端反爬网站<a href="https://www.bilibili.com/video/BV1Ph4y1E7R9/" target="_blank">查看Cloudflare设计和执行教程</a></label>
<select id="cloudflare" name="cloudflare" class="form-control"> <select id="cloudflare" name="cloudflare" class="form-control">

View File

@ -24,7 +24,7 @@
text-overflow: ellipsis; text-overflow: ellipsis;
overflow: hidden; overflow: hidden;
white-space: nowrap; white-space: nowrap;
max-width: 400px; max-width: 600px;
min-width: 150px; min-width: 150px;
} }
@ -190,7 +190,7 @@
<th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th> <th style="text-align: center">{{"Parameter Name~参数名称" | lang}}</th>
<th style="text-align: center">{{"Invoke Name~调用名称" | lang}}</th> <th style="text-align: center">{{"Invoke Name~调用名称" | lang}}</th>
<th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th> <th style="text-align: center">{{"Parameter Type~参数类型" | lang}}</th>
<th>{{"Parameter Value~参数值" | lang}}</th> <th>{{"Parameter Value~参数值(如想对多个网页执行此任务,可在“打开网页”操作内填入多行网址)" | lang}}</th>
</tr> </tr>
<tr v-for="i in task.inputParameters.length" v-if="task.inputParameters.length>0"> <tr v-for="i in task.inputParameters.length" v-if="task.inputParameters.length>0">
@ -222,6 +222,7 @@
</button> </button>
<!-- <button style="margin-left: 5px;" v-on:click="remoteExcuteInstant" class="btn btn-primary">Directly Run Remotely</button> --> <!-- <button style="margin-left: 5px;" v-on:click="remoteExcuteInstant" class="btn btn-primary">Directly Run Remotely</button> -->
<label style="margin-top: 15px;display: block">{{"You can also use the XPath Helper extension to test XPaths when executing the task:~执行任务的过程中也可以随时使用XPath Helper扩展来调试XPath。" | lang}}</label> <label style="margin-top: 15px;display: block">{{"You can also use the XPath Helper extension to test XPaths when executing the task:~执行任务的过程中也可以随时使用XPath Helper扩展来调试XPath。" | lang}}</label>
<label style="margin-top: 15px;display: block">{{"如果想进行更复杂的操作,如设置无头模式,设置定时执行等,请使用下方的命令行执行任务选项并配置好命令行参数。~ If you want to perform more complex operations, such as setting headless mode, setting scheduled execution, etc., please use the command line to execute the task and configure the command line parameters below." | lang}}</label>
<div style="margin-bottom: 10px;"> <div style="margin-bottom: 10px;">
<label style="margin-top: 10px;">{{"Execution ID (EID):~执行ID" | lang}}</label> <label style="margin-top: 10px;">{{"Execution ID (EID):~执行ID" | lang}}</label>
<input class="form-control" v-model="ID"></input> <input class="form-control" v-model="ID"></input>

File diffs suppressed for 6 files because one or more lines are too long

View File

@ -12,7 +12,7 @@
"justMyCode": false, "justMyCode": false,
// "args": ["--id", "[7]", "--read_type", "remote", "--headless", "0"] // "args": ["--id", "[7]", "--read_type", "remote", "--headless", "0"]
// "args": ["--id", "[9]", "--read_type", "remote", "--headless", "0", "--saved_file_name", "YOUTUBE"] // "args": ["--id", "[9]", "--read_type", "remote", "--headless", "0", "--saved_file_name", "YOUTUBE"]
"args": ["--id", "[2]", "--headless", "0", "--user_data", "1"] "args": ["--id", "[90]", "--headless", "0", "--user_data", "1"]
} }
] ]
} }

View File

@ -3,7 +3,7 @@
EasySpider分三部分 EasySpider分三部分
1. 主程序:在`ElectronJS`文件夹下。 1. 主程序:在`ElectronJS`文件夹下。
2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作控制台”的代码。 2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作台”的代码。
3. 执行阶段程序:在`ExecuteStage`文件夹下。 3. 执行阶段程序:在`ExecuteStage`文件夹下。
此部分为`执行阶段程序`的编译说明。 此部分为`执行阶段程序`的编译说明。
@ -20,8 +20,8 @@ This section covers the compilation instructions for the `Execution stage progra
## 环境构建/Environment Setup ## 环境构建/Environment Setup
1. 安装python3.6及以上版本并添加至系统环境变量:[https://www.python.org/downloads/](https://www.python.org/downloads/)。 1. 安装Python 3.7及以上版本并添加至系统环境变量:[https://www.python.org/downloads/](https://www.python.org/downloads/)。
2. 安装pip3 并添加至系统环境变量Windows安装python后会自带pipLinux和MacOS安装方式请自行搜索 2. 安装`pip3`并添加至系统环境变量Windows安装python后会自带pipLinux和MacOS安装方式请自行搜索
3. 安装执行阶段需要的依赖库: 3. 安装执行阶段需要的依赖库:
```sh ```sh
@ -30,7 +30,7 @@ This section covers the compilation instructions for the `Execution stage progra
----- -----
1. Install Python 3.6 or higher version and add it to the system environment variables: [https://www.python.org/downloads/](https://www.python.org/downloads/). 1. Install Python 3.7 or higher version and add it to the system environment variables: [https://www.python.org/downloads/](https://www.python.org/downloads/).
2. Install pip3 and add it to the system environment variables. (On Windows, pip is automatically installed with Python. For Linux and macOS, please refer to the appropriate installation instructions). 2. Install pip3 and add it to the system environment variables. (On Windows, pip is automatically installed with Python. For Linux and macOS, please refer to the appropriate installation instructions).
3. Install the required dependencies for the execution stage by running: 3. Install the required dependencies for the execution stage by running:

View File

@ -266,6 +266,7 @@ class BrowserThread(Thread):
scrollType = int(para["scrollType"]) scrollType = int(para["scrollType"])
try: try:
if scrollType != 0 and para["scrollCount"] > 0: # 控制屏幕向下滚动 if scrollType != 0 and para["scrollCount"] > 0: # 控制屏幕向下滚动
if scrollType == 1 or scrollType == 2:
for i in range(para["scrollCount"]): for i in range(para["scrollCount"]):
self.Log("Wait for set second after screen scrolling") self.Log("Wait for set second after screen scrolling")
body = self.browser.find_element( body = self.browser.find_element(
@ -278,6 +279,27 @@ class BrowserThread(Thread):
time.sleep(para["scrollWaitTime"]) # 下拉完等待 time.sleep(para["scrollWaitTime"]) # 下拉完等待
except: except:
pass pass
elif scrollType == 3:
    bodyText = ""
    i = 0
    while True:
        newBodyText = self.browser.page_source
        if newBodyText == bodyText:  # page source unchanged since the last scroll
            print("页面已检测不到新内容,停止滚动。")
            print("No new content detected on the page, stop scrolling.")
            break
        else:
            bodyText = newBodyText
        body = self.browser.find_element(
            By.CSS_SELECTOR, "body", iframe=para["iframe"])
        body.send_keys(Keys.END)
        print("滚动到底部,第", i + 1, "次。")
        print("Scroll to the bottom, the", i + 1, "time.")
        i = i + 1
        try:
            time.sleep(para["scrollWaitTime"])  # wait after scrolling down
        except:
            pass
except: except:
self.Log('Time out after set seconds when scrolling. ') self.Log('Time out after set seconds when scrolling. ')
self.recordLog('Time out after set seconds when scrolling') self.recordLog('Time out after set seconds when scrolling')
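
Note on the `scrollType == 3` branch above: the idea is to snapshot the page source, scroll to the end, wait, and stop once two consecutive snapshots match. A standalone sketch of that idea follows. The `scroll_until_stable` helper, the `max_scrolls` safety cap, and the `FakePage` stand-in are assumptions for illustration and are not part of the commit. As the hunk reads, the new branch also appears to sit under the existing `para["scrollCount"] > 0` guard, so the scroll count may still need to be positive for this mode to run.

```python
import time
from typing import Callable, Optional


def scroll_until_stable(
    get_source: Callable[[], str],
    scroll_to_end: Callable[[], None],
    wait_seconds: float = 1.0,
    max_scrolls: int = 1000,  # hypothetical safety cap, not in the commit
) -> int:
    """Scroll until two consecutive page snapshots are identical.

    Returns the number of scrolls performed.
    """
    previous: Optional[str] = None  # no snapshot yet, so the first check never matches
    scrolls = 0
    while scrolls < max_scrolls:
        current = get_source()
        if current == previous:
            break  # nothing new was loaded since the last scroll
        previous = current
        scroll_to_end()
        scrolls += 1
        time.sleep(wait_seconds)  # give lazy-loaded content time to arrive
    return scrolls


class FakePage:
    """Hypothetical stand-in for a browser: loads 3 extra chunks, then stops."""

    def __init__(self, chunks: int = 3) -> None:
        self.remaining = chunks
        self.source = "<body>chunk</body>"

    def scroll(self) -> None:
        if self.remaining > 0:
            self.remaining -= 1
            self.source += "<p>more</p>"


page = FakePage()
count = scroll_until_stable(lambda: page.source, page.scroll, wait_seconds=0)
print(count)
```

One extra scroll is always needed to confirm that nothing changed, which is why a wait time that is too short can end the loop early: the snapshot is taken before the new content has arrived.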
@ -589,9 +611,18 @@ class BrowserThread(Thread):
if int(node["parameters"]["loopType"]) == 0: # 单个元素循环 if int(node["parameters"]["loopType"]) == 0: # 单个元素循环
# 无跳转标签页操作 # 无跳转标签页操作
count = 0 # 执行次数 count = 0 # 执行次数
bodyText = "-"
while True: # do while循环 while True: # do while循环
try: try:
finished = False finished = False
newBodyText = self.browser.page_source
if newBodyText == bodyText:  # page content has not changed
    print("页面已检测不到新内容,停止循环。")
    print("No new content detected on the page, stop loop.")
    finished = True
    break
else:
    bodyText = newBodyText
element = self.browser.find_element( element = self.browser.find_element(
By.XPATH, node["parameters"]["xpath"], iframe=node["parameters"]["iframe"]) By.XPATH, node["parameters"]["xpath"], iframe=node["parameters"]["iframe"])
for i in node["sequence"]: # 挨个执行操作 for i in node["sequence"]: # 挨个执行操作
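
Note on the single-element loop change above: each iteration now compares the current page source with the one seen at the start of the previous iteration and stops when they match, in addition to stopping when the element cannot be found. A minimal sketch of that control flow, with the browser replaced by plain callables. The `run_single_element_loop` helper, returning `None` for a missing element, and the placement of the `exit_count` check are assumptions for illustration; the rest of the real loop body is not visible in this hunk.

```python
from typing import Any, Callable, Optional


def run_single_element_loop(
    get_source: Callable[[], str],
    find_element: Callable[[], Optional[Any]],
    run_sequence: Callable[[Any], None],
    exit_count: int = 0,  # 0 means no fixed limit, as the UI label describes
) -> int:
    """Repeat the sequence until the page stops changing, the element
    disappears, or exit_count iterations have run. Returns the iteration count."""
    previous: Optional[str] = None
    count = 0
    while True:
        current = get_source()
        if current == previous:
            break  # page content unchanged since the previous iteration
        previous = current
        element = find_element()
        if element is None:
            break  # element not found: always ends the loop
        run_sequence(element)
        count += 1
        if exit_count > 0 and count >= exit_count:
            break
    return count


# Hypothetical paginator: the "next" button stays visible on the last page,
# but clicking it no longer changes anything.
pages = ["page 1", "page 2", "page 3"]
state = {"index": 0}


def click_next(_element: Any) -> None:
    if state["index"] < len(pages) - 1:
        state["index"] += 1


iterations = run_single_element_loop(
    get_source=lambda: pages[state["index"]],
    find_element=lambda: "next-button",
    run_sequence=click_next,
)
print(iterations)
```

A whole-source comparison is simple, but it can misfire in both directions: pages that embed changing values such as timestamps or rotating ads never look unchanged, and a sequence that legitimately leaves the page untouched ends the loop after one pass.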

View File

@ -3,10 +3,10 @@
EasySpider分三部分 EasySpider分三部分
1. 主程序:在`ElectronJS`文件夹下。 1. 主程序:在`ElectronJS`文件夹下。
2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作控制台”的代码,打包后的扩展在`ElectronJS`目录下的`EasySpider_zh.crx`文件。 2. 浏览器扩展:在`Extension`文件夹下,为浏览器的“操作台”的代码,打包后的扩展在`ElectronJS`目录下的`EasySpider_zh.crx`文件。
3. 执行阶段程序:在`ExecuteStage`文件夹下。 3. 执行阶段程序:在`ExecuteStage`文件夹下。
此部分为`浏览器扩展`的编译说明,**本节的所有命令都在`manifest_v3`文件夹内执行**。 此部分为`浏览器扩展`的编译说明,**本节的所有命令都在`manifest_v3`文件夹内执行**,即你需要先`cd manifest_v3`
----- -----
@ -16,7 +16,7 @@ EasySpider is divided into three parts:
2. Browser extension: Located in the Extension folder, i.e., the `EasySpider_en.crx` file in the `ElectronJS` folder. 2. Browser extension: Located in the Extension folder, i.e., the `EasySpider_en.crx` file in the `ElectronJS` folder.
3. Execution stage program: Located in the ExecuteStage folder. 3. Execution stage program: Located in the ExecuteStage folder.
This section covers the compilation instructions for the `Browser extension`, **all commands in this section are executed in the `manifest_v3` folder**. This section covers the compilation instructions for the `Browser extension`, **all commands in this section are executed in the `manifest_v3` folder**, i.e., you need to `cd manifest_v3` first.
## 环境构建/Environment Setup ## 环境构建/Environment Setup

View File

@ -18,9 +18,9 @@ A visual code-free/no-code web crawler/spider, just select the content you want
### 示例1/Example 1 ### 示例1/Example 1
(右键)选中一个大商品块 -> 自动检测到同类型商品块 -> 点击“选中全部”选项 -> 点击“选中子元素”选项 -> 点击“采集数据”选项,即可采集到所有商品的所有信息,并分成不同字段保存。 (右键)选中一个大商品块 -> 软件自动检测到同类型商品块 -> 点击“选中全部”选项 -> 点击“选中子元素”选项 -> 点击“采集数据”选项,即可采集到所有商品的所有信息,并分成不同字段保存。
(Right click) Select a large product block -> Click the 'Select All' option -> Click the 'Select Child Elements' option -> Click the 'Collect Data' option, you can collect the information of all products, and will be saved by sub-field. (Right click) Select a large product block -> The software will automatically detect similar blocks -> Click the 'Select All' option -> Click the 'Select Child Elements' option -> Click the 'Collect Data' option, you can collect the information of all products, and will be saved by sub-field.
![animation_zh](media/animation_zh.gif) ![animation_zh](media/animation_zh.gif)

Binary file not shown (before: 220 KiB, after: 230 KiB).